UPDATED 12:29 EST / SEPTEMBER 11 2015

Spark 1.5 puts the pedal to the metal on in-memory analytics

Apache Spark expanded its lead as the preferred candidate for the open-source community’s new flagship analytics engine this week with the release of a landmark update that substantially improves processing speeds across every supported workload type. Much of that gain comes from an overhaul of the underlying execution engine, an effort known in the community as Project Tungsten, that has been in the works for several quarters.

Like most of the other leading analytics technologies developed under the umbrella of the Apache Software Foundation, Spark is written mainly in Scala and runs on the Java Virtual Machine, whose managed runtime removes the need for programmers to worry about the nuances of how their code is executed. The project’s backers have given up some of that convenience to squeeze more performance out of the underlying hardware.

Spark now circumvents the standard Java mechanism for managing data in memory in favor of its own specialized binary format, which saves space and reduces the overhead the runtime’s garbage collector spends figuring out which pieces of data can be deleted once they’re no longer needed. But that still doesn’t fully accommodate every workload, which is why the engine now generates optimized low-level code itself for some of its more advanced components instead of leaving execution entirely to the virtual machine.
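The gist of that specialized format can be sketched in a few lines. The code below is a toy illustration, not Spark’s actual implementation: it packs records into one flat binary buffer with fixed offsets, so there is no per-record object for a garbage collector to track, and fields are read back by offset arithmetic.

```python
import struct

# One record: a 64-bit integer key plus a 64-bit float value,
# laid out back to back in a single contiguous buffer.
RECORD = struct.Struct("<qd")  # 16 bytes per record

def pack_records(rows):
    """Encode (key, value) pairs into one flat binary buffer."""
    buf = bytearray(RECORD.size * len(rows))
    for i, (key, value) in enumerate(rows):
        RECORD.pack_into(buf, i * RECORD.size, key, value)
    return bytes(buf)

def unpack_record(buf, i):
    """Random access to record i by offset -- no per-row objects."""
    return RECORD.unpack_from(buf, i * RECORD.size)

rows = [(1, 0.5), (2, 1.25), (3, -3.0)]
buf = pack_records(rows)
print(unpack_record(buf, 1))  # (2, 1.25)
```

Compared with a list of boxed objects, the buffer’s size is exactly 16 bytes per record, which is the kind of space saving and collector relief the new format is after.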

Standing out in particular are the data management functions that Spark borrows from the world of relational databases, implemented in a dedicated component that lets business analysts carry out analytics using familiar structured queries. As an added bonus, the new release makes it possible to visualize the execution paths of those queries in order to identify ways to improve response times.
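An execution path of this kind is essentially a tree of operators. The sketch below is purely illustrative (Spark’s real visualization lives in its web UI, and the operator names here are hypothetical): it builds a tiny query plan and renders it in an explain-style indented listing.

```python
# A minimal operator tree, printed the way explain-style output
# typically looks: each child indented under its parent.
class PlanNode:
    def __init__(self, name, *children):
        self.name = name
        self.children = children

    def explain(self, depth=0):
        """Return the plan as indented text lines, root first."""
        lines = ["  " * depth + "+- " + self.name]
        for child in self.children:
            lines.extend(child.explain(depth + 1))
        return lines

# Hypothetical plan: scan a table, filter it, then aggregate.
plan = PlanNode("Aggregate [sum(amount)]",
                PlanNode("Filter [region = 'EU']",
                         PlanNode("Scan [orders]")))
print("\n".join(plan.explain()))
```

Reading such a tree from the bottom up shows where time goes, which is what makes visualizing it useful for tuning response times.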

Spark 1.5 also targets a more mathematically oriented audience with expanded support for the R statistical modeling language, which is likewise aimed at letting users employ syntax they already know. Except instead of structured queries, the integration is geared toward creating machine learning algorithms like the kind used in recommendation systems and several other popular use cases for the engine.
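To make the recommendation use case concrete, here is a toy sketch of the underlying idea (this is not Spark’s MLlib API, and the titles and ratings are made up): items are scored against each other by cosine similarity over their user-rating vectors, and the closest item becomes the recommendation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical data: each item's ratings from three users.
ratings = {
    "film_a": [5, 4, 0],
    "film_b": [4, 5, 1],
    "film_c": [0, 1, 5],
}

def most_similar(item):
    """Recommend the item whose rating pattern is closest."""
    others = (i for i in ratings if i != item)
    return max(others, key=lambda i: cosine(ratings[item], ratings[i]))

print(most_similar("film_a"))  # film_b
```

Production systems run this kind of computation over millions of vectors, which is exactly the scale an engine like Spark is built to distribute.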

Another fast-rising application for Spark that often goes hand in hand with machine learning is stream processing, which is also receiving a boost in the form of reliability improvements and a new throttling feature meant to prevent clusters from ingesting more data than they can handle. That’s useful for dealing with sudden input spikes that can potentially compromise the service levels of a deployment if left unchecked.
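The throttling idea can be sketched in miniature. The class below is an assumption-laden illustration, not Spark’s implementation: it caps how many records are admitted per batch interval, so a sudden spike queues up instead of flooding the cluster.

```python
# Toy per-batch rate limiter: admit at most max_per_batch records,
# defer the rest so they wait for a later batch interval.
class RateLimiter:
    def __init__(self, max_per_batch):
        self.max_per_batch = max_per_batch

    def admit(self, incoming):
        """Split a burst into what fits this batch and what must wait."""
        taken = incoming[: self.max_per_batch]
        deferred = incoming[self.max_per_batch :]
        return taken, deferred

limiter = RateLimiter(max_per_batch=100)
spike = list(range(250))  # sudden input spike
taken, deferred = limiter.admit(spike)
print(len(taken), len(deferred))  # 100 150
```

The deferred records are simply processed in later intervals, trading a little latency during spikes for predictable service levels.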

But as big an improvement as the update represents, it’s still only the tip of the iceberg of what’s to come now that IBM Corp. has committed a billion dollars and several thousand engineers to accelerating the development of Spark. One of the first additions in the pipeline is a library called SystemML, derived from IBM’s Watson work, that automatically optimizes machine learning algorithms for fast execution.

