

Alluxio Inc., maker of a virtual distributed file system for data science and analytics workloads, Wednesday released a new version that expands its metadata service and enables unified management across hybrid and multiple clouds.
Users can now manage namespaces with billions of files without the need for third-party tools, and a new management console makes it easier to connect an analytics cluster to multiple data sources both in the cloud and on premises.
Alluxio specifically targets data science and analytics users and has landed seven of the top 10 internet companies as customers, the company said. Its technology abstracts and virtualizes data for delivery to popular open-source analytics engines such as Apache Spark, Presto, Flink and Hive. It uses a global namespace, caching and in-memory metadata to track where data resides and how it changes at its source, thereby avoiding the need to replicate it.
Using Alluxio can improve the productivity of data modelers fourfold, said Chief Executive Haoyuan Li, who co-created the technology while a graduate student at the University of California at Berkeley. “The cost of training the model goes from $1 million to $200,000 and the time required from one year to three months,” he said.
The expanded metadata service moves the product further away from its Hadoop roots and improves support for cloud-native and container-based deployment. “We started in the Hadoop world and so required users to have that dependency,” Li said. “Now it’s completely removed.”
The management hub provides a wizard-based approach to connecting data sources across multiple locations as well as configuration and monitoring of Alluxio clusters. That permits data from sources such as Hadoop HDFS, Amazon Web Services Inc.’s S3 and Google LLC’s Cloud Storage to be combined.
In an effort to reduce barriers to adoption, the console also simplifies the process of configuring and launching a cluster and improves monitoring to reduce operational costs. Alluxio previously shipped with an open-source console that had only basic monitoring features and no configuration options, Li said.
New support for Terraform, an open-source toolset for managing infrastructure as code, makes it easier to launch preconfigured clusters programmatically with a single command. This version also integrates with Vault to provide secure, centralized management of sensitive information across clouds and data centers. Other enhancements include simpler cluster management and support for Java 11.