

Amr Awadallah, CTO & Co-Founder, Cloudera, discussed the current Hadoop ecosystem, the open source / proprietary business model, and Big Data challenges with theCUBE co-hosts Dave Vellante and John Furrier, live at the 2013 edition of Hadoop Summit in San Jose.
“The ecosystem is definitely growing,” Awadallah said. Five years ago it was just Cloudera; no other commercial vendor was trying to enable Hadoop in the enterprise. Now there are about ten of them, including both big companies and newcomers like Hortonworks. “That’s a healthy thing, that’s a sign of a growing market,” said Awadallah. If Hadoop were just hype, new companies and new ventures wouldn’t enter the market, he explained. “It’s a very healthy sign of maturity.”
Asked what still needs to be done, Awadallah said: “Our vision for Cloudera from day one was we need to build this data system, and on top of that data, work any workloads.” Now, the company has Impala for interactive analytics, a strong partnership with SAS, and has just announced Search as a workload. YARN is also a critical part of the Cloudera platform and has been for over a year. “We were the first vendor to bring it to a Hadoop distribution. It’s a fundamental part of our platform to help us coordinate all the workloads in the platform.”
“Our focus should be to continue to work as a community to push the platform forward,” Awadallah stated. Cloudera wants to see the platform continue to evolve. His key request toward that goal: “please don’t be just takers.” Some companies want to take from the open source community and never give back. “That is a selfish behavior and it won’t help the platform in the long run.”
Asked how Cloudera is allotting resources between open source and proprietary components, Awadallah explained: “Our core platform, CDH, is open source. All that we put there is open source.” The recently released Cloudera Search is also open source.
“To have a successful open source company you need to have a very good engine between the business model and the product roadmap,” he explained. If the product is 100 percent open source, that creates two problems: a lack of differentiation, and a business that becomes purely about endurance and maintenance. “That is why we have a combination of open source architects and proprietary architects.”
Commenting on the Red Hat partnership with Hortonworks, Awadallah said that what it implies is running MapReduce on top of Red Hat storage instead of HDFS. It came with a shift in messaging from Hortonworks. While a year ago the company forecast that half of the world’s data would be hosted by Hadoop, it has since changed the wording to “processed by Hadoop” by 2015. “HDFS is very core to the Hadoop platform,” he said; how scalable, how reliable and how economical it is are key factors. “We need the storage of Hadoop to stay inside Hadoop,” not to be fragmented, Awadallah stated.
Asked about his own forecast for the future, Awadallah said, “the majority of world data that has to do with analytics will run on Hadoop.” Some data is not suitable for Hadoop: stored streaming video files, for example, which make up a huge proportion of the world’s data.
Where future plans are concerned, Cloudera will continue to grow. “We want to be one of the very few companies to take an open source model and turn it into a large, publicly traded corporation,” Awadallah said.
At an international level, “Europe is definitely our next big focus. It’s growing very quickly,” and Awadallah estimated it was two years behind the US. Cloudera is also looking at China, but does not have a big presence there. It does have a strong presence in Japan, another fast-growing market.
The company still focuses on “having a single platform storing all of your data, and allow you to extract value from your workloads,” Awadallah said. The issue to solve is how to “bring all your data apps to your data, as opposed to having your data go to them.” The main hurdle is making sure that the YARN vision grows, so that many workloads can run side by side without stepping on each other’s toes.