UPDATED 15:20 EDT / AUGUST 06 2012

NEWS

With YARN, Hadoop No Longer a One Trick Pony

Last week my colleague John Furrier reported that YARN, also known as Next-Generation MapReduce, was upgraded to a full-fledged Apache Hadoop sub-project. While YARN is still considered alpha-quality, the move is a good sign for Hadoop. Here’s why.

Critics often cite Hadoop’s inability to process data with any method other than MapReduce as evidence that Hadoop isn’t enterprise-ready. And indeed this is a shortcoming. MapReduce is great for batch processing large volumes of distributed data, but it’s less than ideal for real-time data processing, graph processing and other non-batch methods.

YARN is the open source community’s effort to overcome this limitation and transform Hadoop from a one-trick pony into a truly comprehensive Big Data management and analytics platform. Specifically, YARN gives each application running on Hadoop its own ApplicationMaster. As described at Hadoop.Apache.org, the ApplicationMaster is a framework-specific library that works with a global ResourceManager to obtain cluster resources for its application. Because each application brings its own ApplicationMaster, applications can process data using any of a number of frameworks, traditional MapReduce among them.
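To make the division of labor concrete, here is a minimal toy sketch in Python. It is not the YARN API: the class names mirror the concepts above, but every method, field, and application name is a hypothetical stand-in chosen for illustration. It shows one global ResourceManager handing out containers from a single shared cluster to two ApplicationMasters, each tied to a different processing framework.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy stand-in for the global ResourceManager (hypothetical, not the YARN API)."""
    total_containers: int
    allocated: dict = field(default_factory=dict)  # app name -> containers held

    def free(self) -> int:
        # Containers not yet handed to any application.
        return self.total_containers - sum(self.allocated.values())

    def request(self, app_name: str, wanted: int) -> int:
        # Grant up to `wanted` containers, capped by what is still free.
        granted = min(wanted, self.free())
        self.allocated[app_name] = self.allocated.get(app_name, 0) + granted
        return granted

    def release(self, app_name: str) -> None:
        # Return all of an application's containers to the shared pool.
        self.allocated.pop(app_name, None)


@dataclass
class ApplicationMaster:
    """Toy per-application master: framework-specific, negotiates with the RM."""
    app_name: str
    framework: str
    containers: int = 0

    def negotiate(self, rm: ResourceManager, wanted: int) -> int:
        self.containers = rm.request(self.app_name, wanted)
        return self.containers


# One shared cluster, two applications using different frameworks.
rm = ResourceManager(total_containers=10)
batch = ApplicationMaster("nightly-report", framework="mapreduce")
graph = ApplicationMaster("friend-graph", framework="graph-processing")

batch.negotiate(rm, 6)
graph.negotiate(rm, 6)  # only 4 containers remain, so the grant is capped
print(batch.framework, batch.containers)
print(graph.framework, graph.containers)
print("free:", rm.free())
```

In this simplified model the ResourceManager knows nothing about MapReduce or graph processing; it only tracks capacity, while framework-specific logic stays in each ApplicationMaster. That separation is the property that lets different kinds of applications share one cluster.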

You can get all the technical details here, but the important takeaway for CIOs and others responsible for Big Data investments is that YARN enables enterprises to wring significantly more value from Hadoop by allowing both MapReduce-focused applications and applications for other data processing frameworks to run on the same cluster.

Hortonworks’ Arun Murthy, who plays a major role in developing YARN, explains its significance:

“People are not going to be comfortable buying a $5 million Hadoop cluster just to do MapReduce and a $2 million cluster to do something else. If you can allow them to run both apps in the same cluster, it’s not only easier for you in terms of a CapEx perspective … it’s also easier from an operational perspective because you don’t have to have two separate sets of people managing your clusters or two sets of tools for managing your clusters.”

So instead of maintaining one large-scale Hadoop cluster to support historical analysis of Big Data sets (along with dedicated staff and software to manage the cluster) and a separate cluster, staff and software to support real-time end-user-facing Big Data applications, you can deploy just one cluster for both. That results in significant savings in time, money and manpower, making Hadoop a much more attractive option for the enterprise.

Of course, YARN is not quite ready for prime time, and there are other areas that need improvement for Hadoop to be considered a comprehensive data management platform. But the upgrade to its own sub-project means that YARN will receive even more attention from committers and develop that much faster. Once YARN is stable enough for production-level deployments, Murthy said, Hortonworks will integrate it into its own Hadoop distribution, the Hortonworks Data Platform.

Murthy gives a succinct explanation of YARN and its benefits in a video from Hadoop Summit 2012.



