UPDATED 15:45 EDT / MARCH 13 2018

Cloudera founder Mike Olson: ‘We’re moving from automating processes to automating decisions’

The growing momentum of big data in the cloud has been described as a threat to Cloudera Inc., which is ironic given that the company’s original business plan was to sell big data as a cloud service. The market wasn’t ready in 2008, so Cloudera shifted to selling an integrated platform that combines various open-source projects with proprietary extensions.

Three major startups emerged from the Hadoop ecosystem: Cloudera, MapR Technologies Inc. and Hortonworks Inc. Cloudera was the most prominent of the three, in no small part because of the $740 million investment it received from Intel Corp. in 2014. Its stock-market performance since going public last April has been underwhelming, but few would deny that it’s a market leader. Chief Strategy Officer Mike Olson, one of Cloudera’s four founders, joined SiliconANGLE recently for a telephone interview on where big data’s going next.

Gartner has estimated that 85 percent of enterprise big data projects failed. Does that surprise you?

I don’t understand who they’re talking to. We’re a high-growth company in the $300 million-plus forecast range and most of our business is still on-prem. People aren’t shutting down large football fields of Teradata [Corp.], but the opportunity was never to displace data warehouses. It was to capture more data than we could before and see what we could do if we had better tools to derive value. I think Gartner is comparing the wrong things. The growth rate of machine learning is zero if you compare it against traditional markets because there is no traditional market. But it has created value in huge new ways.

It’s true that the biggest growth area has been in cloud databases like Cloudera Altus and Redshift on Amazon Web Services. We’ve believed since the early days that much of the action would be in cloud services, and that’s why we named the company Cloudera. We still believe that, but I think there’s enormous potential and success in on-prem deployments.

Three years ago, Cloudera defined its mission around a unified platform built on Hadoop, the pioneering framework for managing big data. Today, you don’t even mention Hadoop in your description. What changed?

We talked about Hadoop in the early days because people needed to know what the platform was and what we had. Today it’s a much richer environment. We’re seeing native machine learning using Spark integrated with AWS storage buckets. There’s no Hadoop in there. This platform is doing more than just what Hadoop did. Our original platform was Hadoop and MapReduce. Today, we ship 26 different open-source projects, 18 of which were created by Clouderans. Hadoop is always going to be part of the foundation of the company — and we’re proud that we spotted it so early — but it’s only a small part of what we do today.

What do enterprises need to do better to get more out of big data? Technology? Culture? Governance?

The answer is yes. Technically, the platforms have matured to the point that they solve a lot of customer problems. But this is still a relatively new platform. If you think of it in the lifetime of the relational database market, it’s still 1988. It’s early days for the broad range of applications that enterprises need to address. That means customers often don’t understand the use cases they need to attack.

Then there are cultural things. For example, we have the security features to be GDPR [General Data Protection Regulation]-compliant, but the hard part is integrating those capabilities with your business processes so you know if you’ve properly provisioned them. The technology is there, but you need your business processes to take advantage of the platform.

In what ways has your business diverged from its original plan?

When we went public last year, I dug up my Series A pitch deck, and I was surprised and pleased to see that everything we predicted in that deck has happened. We were wrong to some degree about the order. For example, our first platform was a hosted solution offering managed Hadoop to big bank clients, but we couldn’t get anybody to buy it. We were right about trajectory, though.

One area where innovation has moved faster than I would have bet is enterprise adoption of machine learning. That’s been driven by two things. One was that Spark made it a lot easier to build training models. More than that, though, there’s been enormous progress in machine learning models themselves, and God bless the developers for releasing them in open source. I would not have bet on the innovation we’ve seen in those platforms.

This is a big opportunity for clients to automate decisions. In the ’90s and ’00s, the industry wrapped software around business processes. Over the next two decades we’re going to wrap software around decisions, automating bets people are going to make.

Is the cloud a threat to Cloudera?

If we’re just an on-prem vendor five years from now, we’ll be a footnote. Our big opportunity is to help clients migrate to the cloud and offer portability between cloud and on-premises. Because of the bets we made early on, we enable you to make that migration without coding to proprietary APIs [application programming interfaces]. We have good partnerships with all of the hyperscale cloud providers. Granted, they compete with us at some level, but my opportunity isn’t to beat Redshift. It’s to help customers who want to train machine learning models to deliver that capability across all the cloud providers. We aim to integrate and provide all the portability customers want with the regulatory and compliance capabilities they need.

What company do you believe will emerge as the leader in machine learning?

Among the public companies, IBM has articulated a vision with Watson that we think is right. The idea of enabling enterprises to do machine learning workloads in production is interesting. Palantir [Technologies Inc.] is very successful at machine learning infrastructure for defense and government built on proprietary intellectual property. What we’ve never seen before is the innovation in algorithms coming out of Google, Amazon, Microsoft and some of the social media platforms. Who’s going to be No. 1? It’s going to be a fight.

How do you believe Cloudera will describe itself five years from now?

I don’t know for sure, but I do think this idea that we’re automating capabilities that used to be gut-driven is going to be a fundamental transition for our enterprise clients. Making decisions based on data and spotting patterns that are invisible to the human eye will be the way successful companies execute. I’d like Cloudera to be a leader in delivering those capabilities. But what will be exciting in five years I don’t know. Five years ago, I would have never said machine learning would be where it is now.

Photo: Robert Hof/SiliconANGLE
