UPDATED 15:05 EDT / JULY 20 2017

BIG DATA

Cloudera aims to change the way data is engineered

Developing an accurate data science model is a challenging process on its own. Scaling the model from a development environment to a production cluster presents another set of operational challenges that Cloudera Inc. aims to address with two new product offerings: Data Science Workbench and Altus.

Mark Grover (pictured, left), software engineer at Cloudera Inc., explained some of the operational challenges in data science. “There is this dichotomy, as a data scientist. I want to have the latest and greatest tools, the latest version of Python, the latest notebook kernel. … However, on the other side of this dichotomy, the [information technology] world wants to make sure all tools are compliant and data is secure,” he said.

Grover and colleague Jennifer Wu (pictured, right), director of cloud management at Cloudera, spoke with David Goad (@davidgoad) and George Gilbert (@ggilbert41), co-hosts of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during Spark Summit in San Francisco, California. They discussed Cloudera’s new product offerings. (* Disclosure below.)

A seamless production experience

The disconnect between a typical data scientist’s working environment and a production cluster is exactly what Data Science Workbench aims to alleviate.

“Data Science Workbench runs on the same cluster that is being managed by Cloudera Manager … it allows you to move your development machine learning algorithms from your Data Science Workbench to production much easier because it’s all running on the same hardware and system,” Grover explained.

Altus, a new platform for consuming data science services, publicly launched just two weeks ago. Wu explained how Altus is changing the data science developer experience.

“It is a platform-as-a-service offering designed to leverage the agility and scale of cloud, and make a very easy-to-use experience to expose Cloudera capacity for data engineering type of workloads,” she said. “They’ll be able to do things like [extract, transform and load], large-scale data processing, productionized machine learning workflows in the cloud.”

The focus of the product has been on improving the end-user experience for data scientists.

“We wanted to abstract away the cloud and cluster operations and make the end user experience very easy. So jobs and workloads are first-class objects; you can do things like submit jobs, clone jobs, troubleshoot jobs. We wanted to make this very easy for the data engineering end user,” Wu concluded.

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of Spark Summit 2017. (* Disclosure: Databricks Inc. sponsored this Spark Summit 2017 segment on SiliconANGLE Media’s theCUBE. Neither Databricks nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
