Today Cloudera COO Kirk Dunn and NetApp Senior Director of Data Center Solutions Jeff O’Neal appeared on our show theCube live at HadoopWorld, followed by Cloudera CTO Amr Awadallah. The execs talked about Hadoop washing, Cloudera’s unique differentiators, how Hadoop is different from Linux, Cloudera’s new partnership with NetApp and more.
Dunn provided an excellent litmus test for whether a technology is truly disruptive. If a technology is disruptive, he said, you won’t be able to hire people who know how to work with it. You’ll have to create and train those skill sets in-house.
SiliconAngle founder and theCube co-host John Furrier asked Dunn whether he was starting to see companies doing “Hadoop washing,” similar to cloud washing and open washing. Dunn said that he’s seeing a little of it, and that what you should mostly watch out for is the types of workloads that companies are using Hadoop for. Hadoop really isn’t good for relational data applications, Dunn pointed out, so there’s no reason to use it as a stand-in for those solutions. He said that the sorts of applications that Hadoop is good for, like recommendation engines, are becoming more mainstream, and as that happens you’ll actually see less Hadoop washing. Wikibon founder and theCube co-host Dave Vellante said he thinks that Hadoop washing will be rather difficult, but to watch out for big data washing instead.
O’Neal talked about NetApp’s partnership with Cloudera to create a new solution called NetApp Open Solution for Hadoop, which provides Cloudera Enterprise running on NetApp’s OnTap. O’Neal clarified that OnTap isn’t being open sourced, but the Open Solution for Hadoop is running the open source Hadoop core. Dunn said the solution is great for workloads that are both storage intensive and compute intensive.
John asked Awadallah to talk about how he feels about, rather than thinks about, the state of the big data market. Awadallah said he’s very emotional about it right now because of the new $40 million round of funding that just closed.
John then asked Awadallah to compare and contrast Hadoop with Linux, noting that many call Cloudera the Red Hat of the Hadoop world. Awadallah explained that Hadoop is similar to Linux in that it provides a file system and a platform for running applications. But while Linux is designed to run on a single machine, Hadoop is designed to run on clusters. He said the similarities between Cloudera’s business model and Red Hat’s end at open source. The big difference, he said, is that Red Hat creates its own Linux distribution, while Cloudera uses the core Apache Hadoop code and contributes back to the Apache Hadoop project. To provide differentiation, Cloudera builds its own proprietary tools, but it doesn’t make changes to the core Hadoop project. If I understand his position, he means that Red Hat created a new Linux distribution instead of using and contributing to an existing Linux distro and then adding its own tools on top of it.
Awadallah did emphasize that, like Red Hat, Cloudera’s core business is in services. “I could give you an airplane,” he said. “But you’d still need people to support it.” Taking the metaphor further, he explained that you could hire your own mechanics to work on the plane, but what Cloudera does is offer the services of its very experienced mechanics.