UPDATED 18:04 EDT / APRIL 10 2014

The Data Economy: Understanding the Hadoop-Data warehouse balance of power

Will Hadoop replace your enterprise data warehouse (EDW)?

This question, or some variation thereof, has been making the rounds lately. Just this week I’ve read two good posts on the topic (this one from Matt Asay and this one from Timo Elliot), and my Twitter feed is full of related commentary.

The answer to this question has significant ramifications for data warehouse vendors and the $10 billion-plus EDW market, so it’s not surprising it’s getting so much attention. So what’s the answer?

Well, it depends on what you mean by “replace.” Sorry for the nuance, but nuance is required in this case.

Wikibon agrees with Asay, Elliot and others that Hadoop is not going to outright replace your EDW. The EDW is a mature technology that supports many mission-critical workloads related to business intelligence reporting. Many executives and managers rely on these reports to run their businesses. Hadoop is not capable of supporting many of these mission-critical workloads with the levels of performance, reliability, security or usability required.

However, Hadoop is capable of supporting some non-mission-critical (but often storage- and/or compute-intensive) EDW workloads and does so at a fraction of the cost. The most obvious of these workloads is data transformation, but there are others. Enterprise practitioners are already beginning to shift these workloads from the EDW to Hadoop, resulting in lower costs and better performing data warehouses.

Matt Brandwein, Director of Product Marketing at Cloudera, gave a great example during a recent webinar (which is definitely worth watching in full). He cited the case of one company that discovered 5% of workloads in its EDW were consuming 60% of EDW compute resources. The company shifted these workloads (in this case ETL jobs) to Hadoop, saving money and freeing up CPU in the EDW for higher-value workloads.
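To make that kind of analysis concrete, here is a minimal Python sketch of how a team might rank EDW workloads by their share of compute and flag heavy ETL jobs as candidates for offloading. Everything in it is an assumption for illustration: the `Workload` record, the `offload_candidates` helper, the 10% threshold and the sample figures are hypothetical, not data from the webinar or any vendor's tooling.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str         # hypothetical job name
    kind: str         # e.g. "etl" or "reporting"
    cpu_hours: float  # CPU consumed over the measurement window


def offload_candidates(workloads, kind="etl", min_share=0.10):
    """Return (name, share) pairs for workloads of the given kind whose
    share of total CPU is at least min_share, heaviest first."""
    total = sum(w.cpu_hours for w in workloads)
    if total == 0:
        return []
    shares = [
        (w.name, w.cpu_hours / total)
        for w in workloads
        if w.kind == kind
    ]
    heavy = [s for s in shares if s[1] >= min_share]
    return sorted(heavy, key=lambda s: s[1], reverse=True)


# Illustrative numbers only, chosen so one ETL job dominates CPU usage.
sample = [
    Workload("nightly_sales_etl", "etl", 600.0),
    Workload("exec_dashboard", "reporting", 150.0),
    Workload("finance_close_report", "reporting", 200.0),
    Workload("adhoc_queries", "reporting", 50.0),
]

for name, share in offload_candidates(sample):
    print(f"{name}: {share:.0%} of EDW CPU")
```

In practice the CPU figures would come from the warehouse's own query logs or workload management reports, and the threshold would depend on how much headroom the EDW needs for its higher-value reporting jobs.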

(Of course, Hadoop is also a great platform for a number of other workloads that aren’t possible with conventional EDW technology, including large-scale exploratory analytics and crunching unstructured and multi-structured data.)

The answer to the original question, then, is that Hadoop will replace the EDW for specific workloads, but not the EDW itself. This means data warehouse vendors now face competition from commercial Hadoop vendors for some of the same dollars related to these overlapping workloads, and growth rates for data warehouse vendors are likely to slow if not stagnate. But it’s not a zero-sum game, as Asay points out, nor is Hadoop an existential threat to EDW vendors.

In fact, all that new data created in Hadoop could make its way to the EDW eventually, actually resulting in more data under management for EDWs (and more revenue for EDW vendors). And EDW vendors are introducing new capabilities that allow better integration with Hadoop but solidify the EDW as the dominant platform in the relationship between the two (see Teradata’s recent QueryGrid release).

But that’s not the only possibility. As the open source community and Hadoop vendors add more analytic capabilities to Hadoop and improve enterprise-grade features and performance, this paradigm could be flipped on its head. Eventually, Hadoop could support more valuable workloads, such as advanced analytics and business intelligence, than the EDW. In this scenario, Hadoop serves as the dominant data management platform in the enterprise, with the EDW serving as a tactical adjunct tool for important but less valuable tasks.

Consider the improvements made to Hadoop over just the last year. In April 2013, Cloudera introduced Impala, enabling SQL-like capabilities on Hadoop. In the fall, Hortonworks (with significant contributions from the open source community) debuted YARN, or Yet Another Resource Negotiator, transforming Hadoop from a one-trick pony (MapReduce) to a multi-application framework. And just today, MapR announced it has added Apache Spark to its enterprise distribution, bringing in-memory processing to Hadoop.

As more analytic capabilities such as these are developed and security, data governance and reliability improve, it’s not inconceivable that the balance of power between Hadoop and the EDW could shift.

Today, in all but the most sophisticated enterprises, the EDW performs the vast majority of high-value workloads, while Hadoop covers many important but less glamorous tasks. Eventually, though, the EDW could complement Hadoop, rather than the other way around.
