UPDATED 09:02 EDT / FEBRUARY 23 2016

NEWS

Alluxio, the in-memory store for Apache Spark, hits version 1.0

As the Hadoop File System continues to lose traction among Spark adopters, new and more sophisticated storage frameworks are starting to take its place. One of the most popular options is the open-source Alluxio (previously known as Tachyon), which is moving under the wing of a dedicated foundation this morning on the occasion of its first major release reaching general availability.

The launch is the culmination of a three-year development effort supported by some of the biggest names in the technology world that began with the work of a single doctoral candidate at UC Berkeley. Haoyuan Li witnessed the rise of Spark firsthand during his studies at the university’s AMPLab, where the analytics engine had gotten its start in 2010, and identified a bottleneck that was holding back early implementation attempts: The handful of data stores that were able to effectively support in-memory processing at the time all relied on replication for fault tolerance.

The records in a Spark cluster would be copied across multiple servers to ensure that they could still be accessed if a node malfunctioned. The approach remains the preferred method of maintaining the reliability of the analytics engine to this day, even as the amount of information that organizations are processing grows at an accelerating rate. As a result, more and more bandwidth is used replicating data, which leaves less for other tasks and thus ultimately impedes processing. Haoyuan foresaw the challenge and devised an alternative fault-tolerance technique that would go on to form the basis of Alluxio.

From the moment a record is ingested by Spark, the platform registers every change made to it in a special log that is kept readily accessible at all times. Should the server that hosts the file fail during analysis, Alluxio can have another machine pick up the slack, redo all the calculations that were performed in the run-up to the malfunction and continue from there as if nothing happened. The mechanism takes advantage of the fact that processing power is much more abundant than bandwidth in the enterprise to drastically improve cluster performance.
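The recovery idea described above can be illustrated with a small, self-contained Python sketch. It is not Alluxio's API or implementation: the names LineageLog, Worker, ingest, record, replay and recover are hypothetical, and the log here also holds a copy of the source data as a stand-in for whatever durable storage the original input lives in. The sketch only shows the principle of recomputing lost data from a log of changes rather than keeping replicas.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A transform takes a list of values and returns a new list of values.
Transform = Callable[[List[int]], List[int]]


@dataclass
class LineageLog:
    """Hypothetical change log: source data plus the ordered transforms applied to it."""
    sources: Dict[str, List[int]] = field(default_factory=dict)
    steps: Dict[str, List[Transform]] = field(default_factory=dict)

    def ingest(self, name: str, data: List[int]) -> None:
        # Assumption: the source copy stands in for durable input storage.
        self.sources[name] = list(data)
        self.steps[name] = []

    def record(self, name: str, transform: Transform) -> None:
        # Register a change without copying the resulting data anywhere.
        self.steps[name].append(transform)

    def replay(self, name: str) -> List[int]:
        # Redo every recorded calculation, in order, starting from the source.
        data = list(self.sources[name])
        for transform in self.steps[name]:
            data = transform(data)
        return data


class Worker:
    """Hypothetical node that keeps working data in memory only."""

    def __init__(self, log: LineageLog) -> None:
        self.log = log
        self.memory: Dict[str, List[int]] = {}

    def load(self, name: str, data: List[int]) -> None:
        self.log.ingest(name, data)
        self.memory[name] = list(data)

    def apply(self, name: str, transform: Transform) -> None:
        self.log.record(name, transform)
        self.memory[name] = transform(self.memory[name])

    def recover(self, name: str) -> List[int]:
        # Rebuild data this node never held by replaying the log.
        self.memory[name] = self.log.replay(name)
        return self.memory[name]


def double(values: List[int]) -> List[int]:
    return [v * 2 for v in values]


def keep_large(values: List[int]) -> List[int]:
    return [v for v in values if v > 4]


log = LineageLog()
primary = Worker(log)
primary.load("readings", [1, 2, 3, 4])
primary.apply("readings", double)
primary.apply("readings", keep_large)
before_failure = primary.memory["readings"]

# Simulate the primary failing: a fresh worker rebuilds the data from the log.
replacement = Worker(log)
recovered = replacement.recover("readings")
```

In this sketch the only thing shared between machines is the log, which is small compared with the data itself. The cost of a failure is extra computation on the replacement worker, which matches the trade-off the article describes between processing power and bandwidth.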

Banking giant Barclays PLC claims that its data scientists were able to reduce the duration of certain analyses from hours to minutes using Alluxio. The framework enables developers to work faster as well by hiding the complexity of its internals behind a programming interface that makes it relatively straightforward to control the flow of information. Records may be imported into memory from a variety of third-party systems and automatically moved to disk for permanent storage after processing is complete.

Alluxio can handle the latter task by itself or hand off the analyzed data to external storage systems such as GlusterFS and OpenStack Swift. The framework also provides integration with a number of open-source execution engines to accommodate organizations whose needs may not be fully met by Spark.

Image via Geralt

