UPDATED 14:02 EDT / NOVEMBER 03 2011

Another Hadoop Alternative: Spark

I just published a list of Apache Hadoop alternatives, but here’s another one for the list: Spark. Spark is a distributed in-memory data analytics platform that uses the Scala programming language. IBM claims that Spark should be much faster than Hadoop because it uses in-memory analytics instead of Hadoop’s cluster file system approach. Spark was developed at the UC Berkeley AMP Lab along with Mesos, which is now an Apache Incubator project.

According to a recent paper on Spark from IBM:

Spark is an open source cluster computing environment similar to Hadoop, but it has some useful differences that make it superior in certain workloads—namely, Spark enables in-memory distributed datasets that optimize iterative workloads in addition to interactive queries.

Spark is implemented in the Scala language and uses Scala as its application framework. Unlike Hadoop, Spark and Scala create a tight integration, where Scala can easily manipulate distributed datasets as locally collective objects.

Although Spark was created to support iterative jobs on distributed datasets, it’s actually complementary to Hadoop and can run side by side over the Hadoop file system. This behavior is supported through a third-party clustering framework called Mesos. Spark was developed at the University of California, Berkeley, Algorithms, Machines, and People Lab to build large-scale and low-latency data analytics applications.
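To make the in-memory point concrete, here’s a toy sketch in plain Python. It is not Spark’s API, and every name in it (SlowStore, step, iterate_reloading, iterate_cached) is made up for illustration. It only shows why an iterative job benefits from keeping its dataset in memory: the first loop goes back to storage on every pass, while the second loads the data once and reuses it.

```python
# Toy illustration only: plain Python, not Spark's API.
# All names here (SlowStore, step, iterate_reloading, iterate_cached)
# are hypothetical and exist just for this sketch.


class SlowStore:
    """Stands in for a disk-backed store and counts how often it is read."""

    def __init__(self, records):
        self._records = list(records)
        self.reads = 0

    def load(self):
        # Each call models one full pass over storage.
        self.reads += 1
        return list(self._records)


def step(guess, data):
    """One iteration: move the guess halfway toward the mean of the data."""
    mean = sum(data) / len(data)
    return guess + 0.5 * (mean - guess)


def iterate_reloading(store, iterations):
    """Re-read the dataset from storage on every iteration."""
    guess = 0.0
    for _ in range(iterations):
        guess = step(guess, store.load())
    return guess


def iterate_cached(store, iterations):
    """Read the dataset once, keep it in memory, and reuse it."""
    data = store.load()
    guess = 0.0
    for _ in range(iterations):
        guess = step(guess, data)
    return guess


reloading_store = SlowStore([2.0, 4.0, 6.0])
cached_store = SlowStore([2.0, 4.0, 6.0])

result_reloading = iterate_reloading(reloading_store, 10)
result_cached = iterate_cached(cached_store, 10)

# Both approaches compute the same answer; only the number of storage
# reads differs (one per iteration versus one in total).
print(reloading_store.reads, cached_store.reads)
```

Both loops arrive at the same result. What changes is how many times the data has to be fetched, which is the cost that an in-memory dataset is meant to avoid on iterative workloads.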

Spark is currently in use at Conviva.

Services Angle

Spark is a fresh approach that demonstrates that Hadoop isn’t necessarily the be-all and end-all of big data analytics. There’s quite a bit of room for improvement on Hadoop’s model, whether that’s through Hadoop distributions that add tools to the Hadoop stack or through alternatives like Spark and the others I’ve written about. Most of these tools don’t yet have the traction that Hadoop has, but the market is still open.

Photo by Jonas Maaløe Jespersen

