UPDATED 12:26 EDT / NOVEMBER 15 2011


The Stakes are High in the Hadoop Distribution Race

A few weeks back I wrote that the focus in the Hadoop community was shifting from the infrastructure layer – the plumbing, if you will – to the analytics and application layers. That premise was backed up last week at Hadoop World by a number of important industry players, including Cloudera’s Jeff Hammerbacher and Tresata’s Abhi Mehta.

While this is good news for the enterprise – after all, Big Data analytics and applications are where enterprises will achieve real business value from Hadoop – the implications are just as significant for Hadoop distribution vendors. That’s because if one or another of the vendors can establish its Hadoop distribution as the end-users’ favorite, it could spur a very lucrative chain of events for that vendor.

Specifically, if a particular distribution gains momentum, whether due to its ease of deployment or its associated support services, analytics and application vendors will increasingly begin tailoring their products for that distribution. This in turn will lead to increased adoption of that distribution, as enterprises will naturally gravitate to the Hadoop distribution for which there are the most robust analytic platforms and applications. The result will be a huge customer base and a river of revenue for the winning vendor.

Currently, most Big Data analytics platform and application vendors write multiple versions of their products to work with each of the top distributions, including Cloudera’s CDH3, Hortonworks’ HDP, and MapR’s M5. Take Datameer, for example. At Hadoop World, I spoke with Joe Nicholson, the company’s Vice President of Marketing. He said Datameer customizes its Hadoop-based business intelligence platform to work seamlessly with each of the major Hadoop distributions, but that doing so requires additional engineering resources and effort on Datameer’s part.

In a similar vein, Karmasphere today announced a partnership to make its Big Data analytics platform compatible with Hortonworks’ HDP, just a few weeks after inking a similar deal with Cloudera.

If companies like Datameer and Karmasphere only had to write their applications to one distribution, they could spend less time making sure their products play nicely with multiple distributions and more time developing the products themselves and innovating truly game-changing features. This would obviously benefit the dominant Hadoop distribution vendor, but it would also be good news for end-users in the form of potentially faster development of Big Data applications.

The risk of such a scenario, of course, is that the Hadoop distribution vendor that comes out on top will secure what is essentially a monopoly and limit customer choice. Because Hadoop is an open source project, however, customers will always have the option of using the Apache Hadoop distribution for free, though doing so will require significant internal engineering resources.

As for who’s winning the Hadoop distribution race at the moment, check out Wikibon’s recent analysis. For a complete picture of the Hadoop Landscape, also check out Wikibon’s Big Data Manifesto.

Service providers, too, would benefit should a particular Hadoop distribution become the industry standard. Like on-premises Big Data analytics platform and application vendors, Big-Data-as-a-Service providers like Tresata could spend less time on distribution customization and more on innovation. Big Data consulting services providers, meanwhile, could spend less time advising enterprises on which Hadoop distribution to go with and more time identifying and deploying Big Data analytic use cases that deliver significant business value and competitive differentiation.

