UPDATED 12:01 EDT / SEPTEMBER 09 2015

NEWS

Cloudera backs Spark as successor to MapReduce in Hadoop

Hadoop is entering a new chapter in its evolution with the launch of an ambitious community effort from Cloudera Inc. that aims to replace MapReduce as its default data processing engine. The proposed successor is predictably Apache Spark, the speedy in-memory alternative that has been gaining steam among adopters in the last few years.

The Hadoop distributor claims that industry interest has reached the point where the engine is now the single most widely used component in the entire upstream ecosystem, with 200 of its own customers having adopted it over the past 18 months alone. The new One Platform Initiative represents its response to that shift.

The push will concentrate on bringing the integration between Spark and the other projects in the Hadoop universe up to par with that of MapReduce, which has a considerable head start because the framework was built around it from the outset. Cloudera is already well into its effort, having contributed more than 370 patches to the in-memory engine so far.

That adds up to about 43,000 lines of code, a sizable portion of which is designed to help Spark work better with essential components such as the YARN resource manager, which makes it possible to run multiple different analytics workloads on the same Hadoop clusters. The One Platform Initiative will expand upon that integration with support for several other complementary technologies, particularly on the security front.

One of the first items on the agenda is support for Intel Corp.’s Advanced Encryption libraries, which Cloudera plans to follow up with more granular access controls. The ultimate goal is to help Spark live up to the security standards of even the most heavily regulated sectors, especially the banking and medical industries, which are at the forefront of Spark adoption.

At the same time, the company will work to enhance the engine’s core data crunching capabilities in two ways: by developing new management features that help organizations scale their deployments more effectively, and by improving its emerging stream processing component. Both are essential to the continued growth of Spark.

But what gives special urgency to the One Platform Initiative is the fact that the engine can work without Hadoop. That means that if Cloudera doesn’t make it not only easy but also appealing to deploy Spark on the framework, it risks losing the growing number of its customers moving away from MapReduce in the long term. That risk is all the greater in view of IBM Corp.’s recent commitment to invest a billion dollars into accelerating the development of the engine.

Photo via AdjencaJA
