UPDATED 12:48 EDT / MAY 09 2011

Will EMC’s Greenplum-Hadoop Gambit Pay Off?

Wikibon believes EMC’s success or failure in the commercial Hadoop market depends on four critical factors.

1. EMC’s reception by the open source Apache Hadoop community. Credibility is critical to gaining adoption in a young, open source technology community like Hadoop. Traditionally, such credibility is contingent on making significant contributions to the open source project in question. In this case, EMC’s contributions to the Apache Hadoop project are negligible. The company has few, if any, engineers regularly contributing to the project, whereas Cloudera has dozens of engineers contributing to Apache Hadoop. Indeed, Cloudera’s entire raison d’être revolves around strengthening the Apache Hadoop project to in turn bolster its own commercial Hadoop distribution. Instead of slowly trying to earn credibility by contributing to open source Hadoop organically, Wikibon believes EMC will attempt to hire its way into the community, acquiring respected engineers from Yahoo and others with significant Hadoop experience. This strategy may work in the long term, but it will not happen overnight for EMC.

(Read Wikibon’s full analysis of EMC’s move into the commercial Hadoop market here.)

2. EMC’s ability to position its Greenplum HD appliance as the most enterprise-ready commercial Hadoop distribution. In order to dislodge Cloudera from the top of the Hadoop food chain, EMC must position its Greenplum HD appliance as the most stable, highest-performing enterprise-class Hadoop product on the market. This also means sowing doubt among the Hadoop community as to the robustness of Cloudera’s Hadoop distribution. And there is an opening for EMC to do so successfully. There is significant whitespace in Cloudera’s Hadoop distribution, gaps that Cloudera itself is well aware of and is actively trying to fill. EMC’s challenge is to exploit Cloudera’s shortcomings while building its own credibility. EMC is a marketing machine, and will undoubtedly use its vast resources to fight an image war with Cloudera. Cloudera, meanwhile, has had paltry marketing capabilities of its own to date and is vulnerable to a full-scale marketing attack by EMC.

3. EMC’s ability to successfully integrate its Greenplum data warehouse appliance and commodity hardware with the Hadoop framework. EMC can’t just talk a good game, however. It must also deliver a well-integrated appliance that seamlessly combines its analytic database, commodity servers and proprietary storage technology with the open source Hadoop framework. EMC is in a good position to do so, as Greenplum’s MPP architecture, which runs analytics jobs in parallel, would seem a natural fit with Hadoop’s distributed nature. Its Isilon storage line is also purpose-built for large, unstructured storage. Cloudera definitely has the edge, however, when it comes to the Hadoop distribution itself. EMC must innovate and bring value to its own distribution to edge out Cloudera.

4. EMC’s ability to develop new sales channels and distribution methods. EMC is at heart an infrastructure vendor. Its forte is selling hardware to storage administrators. Selling analytics software, even wrapped in an appliance with preconfigured hardware, is a very different business, a fact EMC is no doubt well aware of. At present, most Hadoop “buyers” are data scientists and line-of-business end users looking to exploit Hadoop to solve a particular business problem in an end-run around IT. EMC’s recent marketing efforts have begun targeting this new constituency (e.g., EMC’s Data Scientist Summit running in conjunction with EMC World this week), but it will take time for EMC’s sales team to master the subtleties of selling to this market. Cloudera, meanwhile, has ingratiated itself with data scientists and the larger Hadoop ecosystem. However, the quest for Hadoop market share is not a zero-sum game and Wikibon believes both strategies could work. Missing from the discussion is the possibility of delivering Hadoop and other Big Data technologies as a service. Delivering Hadoop in the cloud makes intuitive sense, and we believe it is a strategy EMC and other commercial Hadoop providers should explore with service providers.

Jeff Kelly is a Principal Research Contributor at Wikibon.org. He focuses on trends in business analytics and big data technologies. Reach Jeff by email at jeff.kelly@wikibon.org or Twitter at @jeffreyfkelly.

