The State of Hadoop 2014: Who’s using it and why

Now more than eight years old, the Apache Hadoop platform for processing and storing Big Data is on the verge of hitting the big time. Or at least that’s what the industry keeps on telling us anyway. But how big is Hadoop really? Are many organizations actually using it? If so, what are they using it for?

Reliable figures on the actual size of the Hadoop market are difficult to come by, although the companies leading the Hadoop charge are adamant that they’re beating customers away with a stick. Cloudera alone claims to be adding between 50 and 60 new customers every quarter.

The clues certainly point to rapid adoption. The money being thrown around (for example, Intel’s $900 million investment in Cloudera and HP’s deal with Hortonworks) and the size of the largest players both suggest an extremely fast-growing market.

Who’s using Hadoop?


It’s almost impossible to pin down just how many Hadoop users there are, but it’s clear that adoption isn’t quite as widespread as some have claimed. A few years ago, Deloitte somewhat optimistically forecast that by the end of 2012 more than 90 percent of the Fortune 500 would likely have at least some big data initiatives under way, which would imply that Hadoop was part of the mix.

But IDC’s most recent “Trends in Enterprise Hadoop Deployments” report found that only 32 percent of enterprises had actually deployed Hadoop, with another 36 percent planning to do so in the next 12 months. IDC’s report seems to tally with a similar report from Gartner last year, which found that 30 percent of large organizations had already invested in Big Data technology, with an additional 34 percent planning to do so in the next 24 months. However, Gartner estimated that the number of organizations that had actually deployed Hadoop was far lower than expected.

“Adoption is still at the early stages with less than eight percent of all respondents indicating their organization has deployed big data solutions,” said Frank Buytendijk, research vice president at Gartner. “Twenty percent are piloting and experimenting, 18 percent are developing a strategy, 19 percent are knowledge gathering, while the remainder has no plans or don’t know.”

Further evidence of Hadoop’s modest adoption comes from InformationWeek’s 2014 State of Database Technology Survey, which states that “Hadoop is in production or pilot by only 13 percent of the 956 respondents.” Compare that with traditional databases like Microsoft SQL Server (75 percent) or Oracle (47 percent) and it’s clear Hadoop still has a ways to go.

Hadoop in the real world


So what about the organizations that do use Hadoop? What are they actually using it for? IDC says the vast majority of users combine Hadoop with other databases to perform Big Data analysis. Nearly 39 percent of respondents say they use NoSQL databases like HBase, Cassandra and MongoDB, and nearly 36 percent say they are using Greenplum and Vertica in conjunction with Hadoop.

Moving beyond “traditional Hadoop”, Gartner recently conducted a survey among existing Hadoop users to find out what the second most-popular type of processing on Hadoop was, after MapReduce. Here’s what it found:

  • 53 percent are doing interactive SQL
  • 18 percent are running database management systems
  • 14 percent are doing stream processing
  • 9 percent are running search
  • 6 percent are running graph applications

That interactive SQL has become so popular with Hadoop users is a sign of how far things have come. Hadoop vendors are recognizing the platform’s limitations and seeking to address them. Marilyn Matz, CEO and co-founder of Paradigm4, recently described how most major vendors are adding SQL functionality to address the limitations of MapReduce, and to accommodate a preference for a higher-level query language over low-level programming languages like Java.

Use cases of Hadoop


It’s important to look at the industries that are running Hadoop too. A recent CB Insights survey of 350 venture capital-backed companies sheds some light on this. Not surprisingly, Business Intelligence, Analytics & Performance Management was the leader, closely followed by two ad tech related areas — Advertising, Sales and Marketing Tech and Advertising Networks & Exchanges.

Chart: Hadoop industries. Image credit: CB Insights


More interesting, perhaps, are the kinds of projects these industries are running with Hadoop. It turns out Hadoop is an extremely versatile tool with potentially hundreds of different applications. A 2012 article in GigaOM illustrates ten of the most common use cases of Hadoop besides advertising. They include eCommerce, infrastructure management, energy discovery, energy savings, image processing, fraud detection and health care.

Cloudera has a number of case studies on its site highlighting the different things its customers are doing with Hadoop. These include the eCommerce site Shopzilla, which deployed Cloudera’s solution to accommodate its requirement to process and deliver insights on millions of pageviews or ten billion ad bid requests daily; and Treato, a health information portal that uses Hadoop to streamline access to thousands of community sites and forums.

Hadoop’s prospects


With so many industries seeing value in Hadoop despite its relatively low rate of current enterprise adoption, it’s easy to see why there’s so much optimism about the future.

In its Big Data Vendor Revenue and Market Forecast 2013-2017, Wikibon said it expects rapid growth for the Big Data market as a whole, with revenues set to rise from $18.6 billion in 2013 to $50.1 billion by 2017. Furthermore, the evidence suggests that Hadoop will account for a large slice of this market, with Wikibon noting that 62 percent of respondents to its survey expect to optimize enterprise data warehouses by offloading data and batch workloads (ETL) to Hadoop, and 69 percent of respondents expect to make enterprise-wide data available for analytics in Hadoop.

These findings were echoed by more recent research from Allied Market Research, which forecasts that the global Hadoop market will grow at a compound annual growth rate (CAGR) of 58.2 percent between 2013 and 2020. It put Hadoop’s market value at $2.0 billion in 2013, rising to a staggering $50.2 billion by 2020.
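For readers who want to sanity-check a forecast like this one, the arithmetic behind a compound annual growth rate is simple. The short Python sketch below is our own illustration, not Allied Market Research's methodology: the function names `cagr` and `project` are hypothetical, and the only inputs are the $2.0 billion, $50.2 billion and 58.2 percent figures quoted above.

```python
def cagr(start_value, end_value, years):
    """Return the compound annual growth rate as a fraction (0.25 means 25% a year)."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Grow start_value at a fixed annual rate for the given number of years."""
    return start_value * (1 + rate) ** years


# Figures quoted in the article, in billions of US dollars.
START_2013 = 2.0
END_2020 = 50.2
QUOTED_RATE = 0.582
YEARS = 2020 - 2013  # seven compounding periods

# Growth rate implied by the two market-size figures.
implied_rate = cagr(START_2013, END_2020, YEARS)

# Market size reached by compounding the quoted rate from the 2013 base.
projected_2020 = project(START_2013, QUOTED_RATE, YEARS)

print(f"Implied CAGR: {implied_rate:.1%}")
print(f"2020 market at the quoted CAGR: ${projected_2020:.1f} billion")
```

Run as written, the implied rate lands within half a percentage point of the quoted 58.2 percent, and compounding the quoted rate gets within about a billion dollars of the 2020 figure. The small gap is most likely down to rounding in the published numbers, though that is an assumption on our part.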

With that kind of money on the table, it seems the race for Hadoop dominance has only just begun.

photo credit: RayMorris1 via photopin cc
