UPDATED 13:07 EST / APRIL 19 2016

Cloudera’s Olson sees innovation flourishing amid consolidation

With Pivotal Software Inc.’s announcement that it is formally abandoning Hadoop development in favor of standardizing on Hortonworks Inc.’s platform, the field of active competitors in the Hadoop market has been culled to just a handful. With that as a backdrop, we thought it was a good time to check in with Mike Olson, who co-founded Cloudera, the first commercial Hadoop company, in 2008. Cloudera is considered the market leader in Hadoop-related platforms. With massive funding from Intel Capital and others, it’s well-positioned to see the market through to maturity.

With Pivotal effectively dropping out of the race, the number of major big data companies is shrinking. Is this consolidation good for customers?

Leadership in the industry is getting concentrated in the companies that actually work on the products. Hortonworks is our competitor, but they’re also a close collaborator, and that just wasn’t true of Pivotal.

There are still plenty of options in the market, from Hortonworks, MapR [Technologies Inc.] and others. Plus there are cloud services. I think early entrants who misjudged the market have folded their tents and gone home.

It’s been a year since the Open Data Platform (ODP) initiative was launched. How much of a factor has it been in the market?

It’s been entirely a non-issue for us. Our stand hasn’t changed. There’s a place to do cooperative development and that’s in the Apache Software Foundation. ODP launched with 15 members and says it’s “doubled” to 25 – which is a funny doubling – but they’re not the companies that are leading in Hadoop.

I don’t believe that [ODP’s stated intention to define] a standard for HDFS is what’s needed. That technology hasn’t changed in years. The conspicuous absence of us, MapR, Microsoft, Amazon and Google [in ODP says a lot]. The majority of the market isn’t there.

With Apache Spark, Flink, Storm and other analytic engines stealing much of the thunder in the industry recently, is there a risk that Hadoop will be marginalized?

Remember that Hadoop was originally just the HDFS storage system and MapReduce. Today Hadoop is 30 different products bundled together. The real strength of Hadoop is its ability to absorb complementary technologies within the same framework. Technologies like Spark, Impala and Apache Solr deliver another analytic or query framework on top of that storage. So we’re seeing alternative engines come into the market and offer a variety of options. I believe that what we ship 10 years from now will look very little like what we ship today.

What are the most important factors that you see holding back the growth of big data applications in enterprises?

We’ve crossed the chasm from evangelizing technology to talking about business value. We are now seeing customers roll out enterprise-wide, and the folks who started with us eight years ago are now very large consumers of our technology. But half the customers have only started in the last 12 months. There’s still plenty of opportunity for enterprise-wide expansion.

I would love to see more end-user focused applications. Customers don’t care about all these Apache project animals. They want stuff that’s going to benefit their business. This is beginning to happen with the partners we’re working with.

A recent study by Xplenty Ltd. found that one-third of business intelligence professionals spend over half their time cleaning up raw data to load into analytics platforms. What are you doing to help with that problem?

Just one third? We hear that it’s up to 90 percent. The guys who do big data are mostly data cleansing jockeys who do data science in their spare time. We’ve got great relationships with partners that are helping to organize and automate this process [but we aren’t solving it ourselves]. If we tried to be all things to all customers we’d be terrible at it. And we’d slow the Darwinian evolution that’s driving so much value.

Why aren’t there more packaged big data applications?

Customers have overwhelmingly built their own application portfolios because it’s a competitive advantage for them. Also, at the time many of them started, there were no packaged applications. So they almost had to write their own.

You don’t get a big market if you rely on customers having Java developers in the basement. I do think there’s a big market for ISVs, but we’ve had to wait until the market was stable and mature enough. We have more than 2,000 partners, and many are now building impressive applications.

Are you interested in offering any of your own branded services as cloud options? 

We actually started the company to offer our own hosted Hadoop. Customers loved it, but then they also asked us to run their web servers and Oracle servers, and we weren’t very good at that, so we focused on on-premise delivery.

Our charter is to make it easy for customers to run our software on whatever infrastructure they want. We’re working with major IaaS vendors to make sure we provide all the power and elasticity they need. But we won’t get into providing Hadoop as a service.

Wikibon’s George Gilbert recently wrote that Spark and streaming applications – rather than Hadoop – are poised to drive the big data market’s growth going forward. Do you agree?

We believe so deeply in the promise of Spark that we figured out a way to incorporate it into the platform years ago. There are some issues involved – scale, manageability, integration with security frameworks for compliance – that need to be addressed, but MapReduce had all those same problems and we solved them.

Spark’s a rocket ship right now but it’s not the only platform that’s seeing rapid adoption. The fastest growing part of our product line is actually Impala, because it’s a way to use an existing SQL app instead of writing a new Spark app. We believe the combination of these tools is a lot more powerful than any one alone.
