UPDATED 09:42 EST / NOVEMBER 28 2017

BIG DATA

Cloudera rolls first application of its shared-data platform into beta test

Cloudera Inc. is rolling out the first application built on top of the Altus platform as a service it introduced in May.

Cloudera Altus Analytic DB, which will soon enter beta test, is a cloud-based data warehouse designed for self-service use that works on a shared foundation of data, catalog and directory services to eliminate the need for duplicate data and extract/transform/load procedures.

The company said it’s attacking the problem of data silos that proliferate as multiple users demand access to big data stores. Supporting multiple copies of the same data complicates consistency, security and governance, and transient clusters often disappear, the company said. Business users are also often prevented from performing their own ad hoc analysis because data is tied up by engineers.

Altus Analytic DB leverages the Shared Data Experience, a set of tools that enables multiple users to work from the same data and catalog using the tools they prefer, including SQL, Python and R. Altus Data Engineering enables data scientists and engineers to quickly provision Apache Spark, Apache Hive, Hive on Spark and MapReduce capacity on cloud-native infrastructure, with initial support for Amazon Web Services Inc.’s S3 object storage and planned support for Microsoft Corp.’s Azure at an unspecified later date.

Data engineers can read from and write to cloud object storage without the need for data replication or changes to file formats, the company said. There’s no need to move data into a database and no preprocessing.

“The query engine can work against the data directly,” said Greg Rahn, director of product management at Cloudera. “You can load raw data into S3, then use something like Impala to convert to an optimized format. Or you can do transformation along the way and input directly into a Spark job, which outputs Apache Parquet files to S3. The query engine needs only to know the definition of the table.”
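As a rough illustration of what Rahn describes, the sketch below builds the kind of table definition that points a SQL engine at Parquet files already sitting in S3. The statement follows Impala's CREATE EXTERNAL TABLE syntax, but the table name, columns and bucket path are hypothetical and do not come from Cloudera.

```python
def external_parquet_table_ddl(table, columns, s3_location):
    """Build an Impala-style CREATE EXTERNAL TABLE statement for Parquet files in S3.

    The data stays where it is; the statement only records the schema and
    the storage location so the query engine can read the files in place.
    """
    column_list = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n"
        f"  {column_list}\n"
        f")\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_location}'"
    )


# Hypothetical table, columns and bucket, used only for illustration.
ddl = external_parquet_table_ddl(
    "web_events",
    [("event_time", "TIMESTAMP"), ("user_id", "BIGINT"), ("url", "STRING")],
    "s3a://example-bucket/warehouse/web_events/",
)
print(ddl)
```

The resulting statement would be submitted through whatever SQL client is in use; nothing in it copies or reformats the underlying files.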

The product is functionally identical to Apache Impala, a massively parallel analytical database for Hadoop that Cloudera developed and released under an open-source license in 2012. “For folks who are familiar with Impala workloads, it’s a seamless transition to the cloud,” Rahn said.

With Altus, “business analysts get immediate self-service analytic access to the full breadth of data and can support a wide range of workloads,” said Alex Gutow, a Cloudera senior product marketing manager. “There’s no end to the number of users or use cases, and workloads are predictable, no matter how many other workloads are running.” Usage doesn’t affect performance or conformance with service-level agreements, she added.

Because the need to move data into a database is eliminated, shared data and associated data schemas and structures are always available for access by nearly any tool the user chooses, the company said. Cloudera has a network of more than 3,000 partners and works with all of the leading business intelligence tools, Gutow said. The company provided testimonials from Arcadia Data Inc., Informatica Corp., Tableau Software Inc., Qlik Inc. and Zoomdata Inc.

Pricing wasn’t specified. Customers can join a waiting list to get more details on availability.

Image: Flickr CC
