Hadoop + YARN : Key to Big Data Platforms in Enterprise | #hadoopsummit

Arun Murthy

Arun Murthy, Founder and Architect, Hortonworks, discussed YARN and Hadoop as a viable solution for enterprises with theCUBE co-hosts Dave Vellante and John Furrier, live at the 2013 Hadoop Summit.

John Furrier pointed out that there is a lot of pressure to deliver a viable platform for enterprises and asked Murthy where the industry stood in that respect. Murthy replied that Hadoop and YARN are going to be “a big, big piece of the puzzle,” as “YARN allows you to interact with data in ways that were never possible before.”

Betting on YARN


Because the company is betting that all data is going to end up in Hadoop, Murthy said, “we need to be a platform, not just a solution.” A solution solves a distinct set of problems, whereas a platform allows people to build solutions on top of it, add enhancements, and innovate.

“YARN is a re-imagination, re-architecture of Hadoop itself,” Murthy said. It is the second generation of the architecture and allows you to run multiple apps on the same platform. It gives users the ability to run not just MapReduce, but 7 other algorithms along with it, all on one platform. The key part, he added, was that “you get significantly more value off of your existing investment.”

  • Knitting YARN into the community

One of the biggest strengths of the Hadoop community, Murthy said, is that “it’s very pragmatic,” as it is made up not just of developers but also of professionals on support teams, the people who make sure that implemented platforms and solutions actually work.

Asked about compliance with YARN and Hortonworks Data Platform, he said that the company runs a YARN certification program for certifying applications and tools on the platform, so that potential customers can be sure they are getting “good support and feedback from vendors.”

  • Knitting YARN into the business 

Talking about the company's customer pool, Murthy said it was spread across different industries and levels of savviness. In many cases, enterprises have 7-8 implementations of Hadoop, and Hortonworks is then called in to help them understand the security, compliance, and auditing aspects and how to run their applications on it. “That is one of the reasons Knox was born,” he added, to “take enterprise requirements and solve them inside the platform.”

The company talks both to customers who understand YARN and its use cases and know specifically what to do with it, and to those who are just starting out and need to be shown how Hadoop works, how to run jobs, and how to secure it before taking it to the next level. A very common use case is real-time event processing: companies want insight from analyzing real-time media exchanges and pricing items based on clicks.

The key message from Hadoop Summit is, according to Murthy, “let’s try and make the core of the system better. If you can help us make the platform and core better, everyone benefits.”