UPDATED 12:00 EST / JUNE 18 2018

Hortonworks extends cloud support with new data platform

Big-data management firm Hortonworks Inc. today announced the third version of its core Hortonworks Data Platform along with expanded partnerships with cloud computing leaders, including Google LLC, Microsoft Corp. and IBM Corp.

HDP 3.0, announced at the opening of the company’s DataWorks Summit in San Jose, California, is meant to let enterprises run data-intensive applications more easily across computing environments, whether in the cloud or in on-premises data centers.

Planned for general availability in the third quarter, it builds upon the latest version of the open-source Apache Hadoop platform, which the company said sets it apart from Hadoop distributions offered by other companies.

“We’re seeing this huge migration to modern data architectures that includes more cloud content than ever before,” Hortonworks Chief Technology Officer Scott Gnau (pictured) told SiliconANGLE. “That means customers will have data in the data center, the cloud and everywhere in between. The idea is to create a seamless experience.”

HDP 3.0 comes with several new features. One is the ability to deploy applications quickly across computing environments using containers, a method of packaging up applications so they can run unchanged in the cloud or in private data centers of various kinds. “If you have HDP 3.0 running on Google, AWS or Azure, it can run [the] same applications the same way,” Gnau said.

Another is support for deep-learning applications, which use artificial neural networks to do image and speech recognition and other data-intensive artificial intelligence jobs. HDP 3.0 enables data scientists to share access to servers using graphics processing units, the highly parallel chips used widely for training and running machine learning models.

The new platform also offers improved query optimization through use of a real-time database, the company said, so more data can be processed faster whether it’s in the cloud or on-premises. It’s enabled by the open-source Hadoop data warehouse Apache Hive.

In addition, the 3.0 version provides support for all the major cloud data stores, including Amazon Web Services Inc.’s S3, Microsoft’s Windows Azure Storage Blob and Google Cloud Storage. Gnau said that enables companies to move data to where it can be used most efficiently, for example moving data back from S3 into the Hadoop Distributed File System to get higher performance for some applications.

Hortonworks announced expanded partnerships with several cloud providers as well. That includes optimizing HDP and the Hortonworks DataFlow, or HDF, data analytics platform for Google Cloud Platform. “Our partnership with Hortonworks will give customers the ability to quickly run data analytics, machine learning and streaming analytics workloads in GCP while enabling a bridge to hybrid or cloud-native data architectures,” Sudhir Hasbe, a director of product management at Google Cloud, said in a statement.

An extended partnership with Microsoft will allow customers, among other things, to deploy HDP, HDF and Hortonworks DataPlane Service, or DPS, which manages data of different types and sources, natively on the Azure cloud platform. They’re already available on AWS.

And IBM announced that it’s now offering a new service called IBM Hosted Analytics with Hortonworks as an integrated service on the IBM Cloud. More specifically, it combines HDP, IBM’s Db2 Big SQL database and the IBM Data Science Experience. Rob Thomas, general manager of IBM Analytics, in a blog post likened the evolution of companies’ use of data to the development of the interstate highway system.

Not least, HDP 3.0 includes improved security and governance to comply with the European Union’s recently imposed General Data Protection Regulation and other data governance rules. That means the lineage of data being used can be tracked from its origin to the data lake in which it resides for use in various applications.

Firms that enable the use of massive amounts of data have been under a microscope lately thanks to high-profile breaches and misuse of people’s data, such as Cambridge Analytica’s unauthorized use of Facebook Inc. data in the U.S. presidential election in 2016. Gnau made the case that the centralized control Hortonworks can offer in its platform enables companies to avoid such problems.

“We can offer common data governance,” he said. “We know where the data is, who copied it, and what happened when it got there. Data in this century is the wealth creator,” he added, and it’s “insanity” not to control it.

Gnau delved deeper into GDPR and other data governance issues in a recent interview on theCUBE, SiliconANGLE’s livestreaming studio, which will be covering the DataWorks Summit this week.

Photo: SiliconANGLE
