What comes after non-stop Hadoop? | #HadoopSummit

Providing continuous availability to its customers is WANdisco’s principal concern, according to the company’s CTO, Jagane Sundar. In a live interview with theCUBE co-hosts John Furrier and Dave Vellante at this year’s Hadoop Summit, he explained that this concern has led to what the company now calls non-stop HBase, the next step after non-stop Hadoop.

Talking about the innovation WANdisco has brought on top of HBase, Sundar explained that “the biggest problem is region server failure.” If a region server fails, all of its regions have to be moved to other servers, and that process takes time. WANdisco’s solution is to store data on three different servers. “If one region server fails, it’s a non-event in our system.”

What will it take to make Hadoop enterprise-ready?


Asked to comment on the need for an enterprise-grade Hadoop, the paramount role security plays, and how WANdisco explains to customers that it is enterprise-ready, Sundar said, “Security is a first order concern of ours. If you have more data centers and are copying data to other data centers, you are compromising data.” The solution is a single HDFS, a single point of authentication. “That is step one in an enterprise secure Hadoop.”

Kelly, citing a recent Wikibon survey, said 70 percent of Hadoop practitioners had data spread across multiple data centers. Sundar explained that WANdisco allows those deployments to act like a single deployment, saying, “The point is that the services that most of these enterprises offer cannot be limited to a single data center anymore. You need your data to be available continuously in all the data centers.”

“Enterprise Hadoop has to be continuously available, secure,” and allow enterprise customers to use Big Data in the same way they are used to dealing with other databases, Sundar explained. “The first and biggest impact that the enterprises are going to have, the volume of data going into the system will grow by orders of magnitude.”

Asked which applications are most commonly supported by multiple Hadoop deployments, Sundar said his company’s expertise is mostly in the financial services segment. Organizations in this sector have applications running mostly on Tiers 2 and 3. WANdisco technology gives them the ability to run Tier 1 applications and to treat failover as an essential part of the system. It allows them to have the data available for Tier 1 apps with virtually no failover.

On the Hadoop marketplace


Commenting on the company’s interaction with other players in the Hadoop market, Sundar said, “We’re based on open source Apache Hadoop, so it was easy to get it to run on Cloudera and Hortonworks,” as well as on other distributions.

Asked for his opinion of the market in terms of the level of competition and how WANdisco approaches customers, Sundar said, “We don’t sell religion, we sell bibles. The decision has been made before we go in. I see both Cloudera and Hortonworks as strong players in this market.”

As for the future, he said he saw the market as having more than one popular distribution. Sundar explained that innovation from a hundred companies was being poured into the open source world.

Commenting on how that trend will shape the future of Apache Hadoop, Sundar explained that he sees “a clear parallel between the Linux world and the Hadoop environment at this point.”

While the Linux project puts out the kernel, it takes further work and innovation to package it as an operating system. “There will be enhancements, improvements, added by vendors or partners such as ourselves,” said Sundar.

Asked about the difference between Cloudera and Hortonworks, Sundar said, “I believe that Cloudera has a stronger enterprise focus and Hortonworks has a stronger open source roots focus. There will be two winners in this equation.”