UPDATED 06:31 EDT / JUNE 27 2014

Hortonworks lights up Hadoop: Apache Spark declared YARN-ready

Hortonworks said Apache Spark, a new technology that’s quickly gaining interest for in-memory-accelerated machine learning and other forms of high-volume data analysis, can now plug into Apache YARN, the resource-management layer introduced last year with Apache Hadoop 2.0.

Apache Spark is a high-speed engine for large-scale data processing that was released as version 1.0.0 in May. It’s designed to run much faster than Hadoop’s MapReduce, and is capable of tackling more specialized applications. Spark is now ready to run as a technology preview on the Hortonworks Data Platform (HDP), with a production-certified release set for later this year.

Hortonworks is a little late to the Apache Spark game. Back in February, Cloudera added support for Spark, using its Cloudera Manager software for deployment, management and monitoring. MapR followed with its own Spark deployment in April. Now Hortonworks is joining in, stressing that its version is 100 percent open source and uses YARN to monitor and manage the components.

In an interview with V3.co.uk, Hortonworks vice president of corporate strategy Shaun Connolly said developers using the Scala language were particularly interested in Spark. Spark allows them to perform analysis on Hadoop data for customer segmentation and other advanced techniques like classifying and clustering datasets. Now that Spark is YARN-ready, users can run Spark applications in a Hadoop cluster alongside other workloads, rather than in a separate cluster.

“Since Spark has requirements that are much heavier on memory and CPU, YARN-enabling it will ensure that the resources of a Spark user don’t dominate the cluster when SQL or MapReduce users are running their application,” said Connolly.

To ensure everything runs smoothly, Hortonworks is teaming up with Databricks – a company founded by Apache Spark’s creators – to make sure new apps and tools built on Spark are compatible.

“With the designation of Apache Spark as YARN-ready, enterprises can rest assured that Spark can run simultaneously and effectively with other mission-critical applications,” said Databricks business development executive Arsalan Tavakoli-Shiraji in a statement.

The HDP 2.1 Tech Preview Component of Apache Spark can now be downloaded for free and installed on the current HDP 2.0 distro. Hortonworks says its HDP 2.1 release, which will include Spark, is expected to be ready in the next few months.

photo credit: Striking Photography by Bo Insogna via photopin cc
