Hortonworks: the Red Hat of Hadoop | #RHSummit

Red Hat has long been lauded for their approach of focusing on and building a sound infrastructure upon which all of their subsequent innovations were built. Their open source approach was and continues to be fairly unique, leading many tech pundits to claim their model was a one-off success. Most people take it as a statement of fact that there will never be another Red Hat of anything. However, John Furrier, founder of SiliconANGLE, posits that Hortonworks, with their similar DNA being applied in the data world, is, in fact, the Red Hat of Hadoop. “The discipline required,” he says, “really is a long game.”

Furrier, along with co-host Jeff Kelly, welcomed John Kreisa, Vice President of Strategic Marketing for Hortonworks, to a special session of theCUBE at this week’s Red Hat Summit.

Bringing up Intel’s recent strategic investment in Cloudera, Furrier claimed this is viewed as a significant validation for the Big Data space. “That’s big news,” he said. “That’s got everyone’s attention. It takes Hadoop to the top of the front page of the business press.” Intel commented in an earlier interview on theCUBE that, despite the Cloudera announcement, it intends to drive innovation to other Hadoop distro providers upstream. This is important because it telegraphs that Intel doesn’t want to alienate competitors in the open source ecosystem.

Furrier then asked Kreisa to address the Cloudera news and how it impacts Hortonworks and relates to other conversations happening at this week’s summit.  Agreeing with Furrier’s contention, Kreisa said, “It’s a good validation for the market, in general. That the large vendors continue to invest in the community, much like a lot of the partnerships that we form, it’s making sure there is investment at various levels, whether it’s engineering or elsewhere.”

He sees the Intel/Cloudera announcement as being important for the continued drive of the technology to the next generation and moving it steadily forward to the enterprise. “You’ve got to be the company that can really innovate on that technology,” said Kreisa.

For more on Intel’s perspective, be sure to watch an earlier interview on theCUBE with Doug Fisher, who speaks in detail on his company’s investment in Cloudera. {see Editor’s Note below}


Strategic Partnerships

Moving on to his next line of questioning, Furrier noted, “You’ve been very successful with your partnerships. Talk about the ecosystem and specifically your relationship with Red Hat.”

With a year’s experience working with OpenStack, Kreisa said, “First of all, it’s a great partnership. Integrating OpenStack with Hortonworks’ open source platform in order to allow Hadoop to be deployed in that infrastructure [has been key].” He continued, “There is a particular simpatico nature to the way that Red Hat works with communities and the way that we work with communities.”

Hortonworks, like Red Hat, identifies upstream communities that are working on projects that perhaps aren’t being addressed elsewhere, and curates that work downstream and into development. Furrier asked Kreisa to explain the concept of upstream to those perhaps not familiar with the term.

“There are open source projects,” Kreisa began, “that a very broad community of developers are working on. So, when I say upstream, I mean perhaps they are working in some Apache project that someone is developing and contributing code to.”

A single project can receive input from as many as thousands of developers, and that shared project is what is considered to be upstream. Kreisa further explained that “what Hortonworks does is… we work in that upstream notion and then curate that down and take the most stable versions of each of those open source projects from the other upstreams. We test and integrate that together and then apply a very detailed and rigorous level of testing and put it out there as a platform.”

Cooperation or Competition?

Kelly inquired whether there was tension between the open source community and vendors that want to push one particular project or feature over another. “I’m sure there are a lot of interesting conversations behind the scenes,” he said.

“Those conversations do happen,” Kreisa admitted. “But we offer a level of commitment to them and that drives them to want to work with Hortonworks. We guarantee that the changes that they want to get will be committed into the core Apache Hadoop.”

This is important because Hortonworks is not aiming to lock anyone into a particular version of the platform. “We want to make sure it’s all out there for everyone to benefit from, which is a key part of our strategy and it’s why people want to work with us,” Kreisa stated.

The Advent of Data First

A concept discussed often on theCUBE, and one beginning to gain traction in tech circles, is the mantra of Data First. “It’s coming,” Furrier stated. “Developers are acting on data as a resource.” He then asked, “Where does Data First fit into the architecture?”

“It’s interesting you bring up that meme,” Kreisa began, “because we’ve been saying it wasn’t Hadoop that disrupted the data center. It was data that disrupted the data center.”

He continued by pointing out that individual organizations have realized they can, and want to, capture new kinds of data and exploit them for new applications that previously, due to financial or technological constraints, simply couldn’t be developed. “That’s what Hadoop is enabling and Hadoop is kind of a Data First architecture.”

Hadoop’s strength is its ability to work with several different forms of unstructured data, with no pre-processing necessary before landing it. “I can land video and log files and machine generated data and blogs and tweets into this giant pool of storage in the Hadoop distributed file system and then begin to process and iterate on it,” Kreisa explained. “As I heard someone say, torture the data until it reveals its value.”

Kreisa recommends that any organization interested in moving into the Big Data sphere identify a single use case, regardless of industry, and work that use case to a point of success for their own organization. “It usually starts, from our standpoint, regardless of industry, with a line of business-driven initiatives. [They may want] better service for customers or to capture better prospects or better predictive and proactive maintenance or better processing of healthcare data. Whatever it might be, it’s a line of business driven to create a single analytic application.”

Likening the current maturity of Big Data to a game of baseball, Kreisa states we are definitely at the top of the first inning. “It’s early on. Hadoop, as a technology, has a long way to go. All of these technologies will continue to evolve and figure out new ways to exploit what they have.”

{Editor’s Note: We previously wrote “In an earlier interview on theCUBE, Doug Fisher of Intel made clear that despite the Cloudera announcement, they view Hortonworks as a strong partner.” Intel did not make that statement on theCUBE. That statement was deleted from the story.}