Big data is a hot topic right now, but when it comes to cloud implementation, there’s still a lot to learn. Today’s first big data panel at VMworld welcomed Cloudera co-founder Amr Awadallah and Richard McDougall, CTO of Application Infrastructure at VMware, to discuss big data, the cloud and virtualization with theCube co-hosts John Furrier and Dave Vellante. Looking at how VMware has changed over the years, Furrier sought a partner perspective, directing his first question to Awadallah.
“VMware is changing beyond just the virtualization layer, growing into layers above that, as we saw with the acquisitions of Zimbra and SpringSource,” Awadallah begins. “Regardless, storage is very important at both layers. Traditionally I’ve seen VMware focused on central storage, and I predict over the next few years that will change, with more storage moving onto servers themselves.”
In fact, storage would become an important aspect of the panel discussion, as would VMware’s goals around expanding the role of virtualized environments. There are a few ways VMware is going about this, from product delivery to the commercialization of the open source cloud for the enterprise. VMware is still relatively early in its commercialized open source cloud offering, but it’s helping to drive the community, McDougall points out.
“Looking at the core values, VMware is growing well,” McDougall says. “People want big data resources, so core features in management and other functionality let people leverage that more deeply at the platform level. It’s better able to be exploited.”
A key mechanism in this commercialization process is Hadoop, an open source software framework being utilized by both Cloudera and VMware for enterprise solutions. So what is the enterprise looking for right now? Do they even know what they need from this still-fresh offering?
“They’re about big data in many ways,” Awadallah points out. “Having a cloud that’s scalable and can do your bidding–that’s exactly what Hadoop offers. The first change an organization is looking to make is behavioral. They want to be more agile and adaptive, and not be locked in to a given language or schema. Hadoop gives them that agility. Then there’s the commodity hardware trend–we have boxes with multiple cores and disks, and that allows us to push big data to greater scales than we have been able to reach before.”
Easy integration and on-demand instances are two enterprise requirements when it comes to Hadoop integration, but this grand simplification of the cloud also leads to the commoditization of entire industries, across both PCs and servers. HP is already scaling back its PC efforts, and though it is a high-profile VMware partner on the server side, it may have to adapt in this area as well.
McDougall says that the configuration of servers and storage will change, with an “initial amount of Hadoop being deployed, and the commoditization of storage later. The change will be in the way storage is provisioned, and at a reduction in cost.” He goes on to note that smaller customers want to be able to deploy Hadoop quickly, as well as share the workload. It’s a form of time-sharing a platform, if you will.
Furrier then asks whether either company is worried about a fragmentation of Hadoop. As we’ve noted on SiliconAngle, there are several companies coming out with their own “flavor” of Hadoop, many of them commercialized services designed around enterprise needs. For VMware, its Cloud Foundry product is an encompassing framework that developers can write on top of, with a backend for system scheduling and other administrative needs. Looking ahead, McDougall mentions there may be more to come on Hadoop integration.
For Cloudera, a business focused entirely on Hadoop deployment for the enterprise, there are two areas the company is looking at, starting with the scalability of people. “Hadoop is good at scaling processing, but how can we get an IT admin to manage 10k nodes?” Awadallah asks. “Cloudera enables one IT manager to scale this. We launched Cloudera CMS Express, which brings down the barrier of Hadoop installation…we want to make sure they can use Hadoop with the existing infrastructure they already have.”
With such a focus on making Hadoop work for the enterprise, Cloudera has a unique perspective on the industry. But competition is building–in the last six months alone there have been several commercialization initiatives gaining attention, from Hortonworks, OpenStack, VMware and others. Cloudera does, however, have a three-year jump on its rivals. “We know how to use this technology and what the needs are to develop it,” Awadallah asserts. But is Hadoop itself ready for the enterprise? Not quite, Awadallah concedes. “But it’s getting there quickly.”
So when it comes to the democratization of IT and the commercialization of open source solutions, will we eventually lose the chain of command in IT administration? McDougall thinks so. “At the cloud layer it’s our job to make this run quickly, and you don’t want a business unit having to make a lot of decisions,” he says. “At a higher level you’d expect Hadoop to move toward multitenant Hadoop services. This job is going to need 100 nodes–give me that straight away.”
It’s an on-demand cloud service the industry is working toward, where managing environments and disparate software is as easy as setting up a website on GoDaddy, or managing a server. This will present a great deal of opportunity for the startup world, Awadallah and McDougall agree, and it will also entail a major shift in storage, with less centralized solutions and more localized offerings making the entire deployment and management process more efficient. One area of great promise for entrepreneurs lies in building on this wave of technology: data’s value is being uncovered every day, and the ability to run data sets against each other in a more automated and autonomous fashion is fertile ground for startups looking to the future of business.