Alluxio stokes data orchestration ambitions with $50M funding round
Distributed filesystem developer Alluxio Inc. today announced $50 million in new funding, bringing its total funds raised to date to $70 million.
The company also released version 2.7 of its Data Orchestration Platform, claiming a fivefold improvement in input/output efficiency for machine learning training applications, along with other performance gains.
The company will use the proceeds from the oversubscribed Series C funding round to expand into the Europe/Middle East/Africa and Asia-Pacific regions as well as to boost research and development. Founder Haoyuan Li said the funds give Alluxio a nice cushion as the company was already cash-flow positive in the most recent quarter.
“This easily gives us five years-plus of runway,” he said. “We keep closing big deals and the big bottleneck today is how fast we can hire people. We have more requests from the community than we can handle.”
Alluxio makes an in-memory virtual storage layer that harmonizes data from multiple back-end stores for use by open-source computing frameworks like Apache Spark, Apache HBase and Presto. It uses intelligent caching to anticipate which data the frameworks will request. Last fall it expanded its capacity to support billions of files.
Li said the company is gaining traction in particular with firms that have voracious data appetites and run multiple open-source projects. He said eight of the 10 largest internet companies and five of the six largest public cloud providers use the platform, which comes in both community and commercial editions.
New features in release 2.7 are aimed at artificial intelligence workloads, particularly those that use the TensorFlow and PyTorch frameworks. The firm has also “doubled down on making it easy to run Kubernetes with us,” Li said. Alluxio now supports a native Container Storage Interface driver for Kubernetes, as well as a Kubernetes operator for machine learning.
Among the new features is optimized support for Nvidia Corp.’s Data Loading Library, a Python library that supports central processing unit and graphics processing unit execution for data loading and pre-processing to accelerate deep learning. Data loading has been parallelized to improve speed, and data management jobs can now be batched using an embedded execution engine, which reduces the resource requirements of the management controller.
A new feature called Shadow Cache for the Presto open-source distributed SQL query engine delivers better reporting on the impact of cache size on response times, thereby significantly reducing management overhead, the company said.
Built originally for on-premises use, Alluxio is now increasingly being taken to the cloud by customers, Li said. Although nothing has been announced, he said it’s likely the company will introduce a fully managed cloud service.
The financing round was led by an unnamed global investment firm with participation from existing investors that include Andreessen Horowitz LLC, Seven Seas Partners and Volcanics Venture.
Photo: Flickr CC