Hadoop Summit Recap: Then and Now

The sixth annual Hadoop Summit is only three weeks away. The event will be hosted by Yahoo! and Hortonworks, two Big Data leaders that have come a long way in the past year. Now is as good a time as any to reflect on the state of the industry in 2012.

Going into last year’s event, Hortonworks was a fledgling startup with plenty of money in the bank and a team of experienced Yahoo! engineers, but no product to speak of. That changed at Hadoop Summit 2012, where Hortonworks Data Platform (HDP) made its official debut.

Hortonworks CEO Rob Bearden opened the conference with a keynote speech that highlighted his company’s commitment to making Hadoop viable for the enterprise, a point that he reiterated in a follow-up interview with SiliconAngle founder John Furrier. Bearden voiced his concerns over fragmentation in the Hadoop ecosystem, and told Furrier that a single, open-source framework is required to achieve a “critical mass of adoption.”

Mohr Davidow Ventures’ Geoffrey Moore, the author of “Crossing the Chasm,” shared his own take on Big Data adoption in a separate interview with Furrier. He drew comparisons between Hadoop and earlier trends that disrupted IT, and provided a business perspective on the value of analytics.

Big Data is creating new opportunities for organizations across all verticals, but traditional database systems still have a place in modern IT environments. PayPal director of engineering Anil Madan told Wikibon analyst Jeff Kelly that relational databases are still the best option for basic money management functions that require accuracy, integrity, and security rather than raw speed. We’ll find out later this month whether that is still the case.

The upcoming Hadoop Summit will feature many of the pundits and industry insiders who attended last year’s event, as well as new faces from the business world and academia. Tresata founder Abhishek Mehta is particularly looking forward to hearing from Tathagata Das and Reynold S. Xin, two UC Berkeley students who will host a session entitled “What’s New in the Berkeley Data Analytics Stack.” Mehta explained why in a recent interview with SiliconAngle’s Kristin Feledy:

“[They] are going to talk about some of Tresata’s favorite projects: they’re gonna talk about Spark, which has some really cool functionality that allows companies and applications like ours to run real-time on Hadoop. They’re gonna briefly preview Tachyon (their alpha stage in-memory file system), [and] they’re gonna have Spark Streaming, which is rapidly becoming a must-watch project.”

For Mehta’s full take, check out the video below.