UPDATED 09:27 EDT / SEPTEMBER 10 2018

BIG DATA

Hortonworks launches initiative to define unified hybrid cloud platform for big data

Saying the boundaries between cloud and on-premises computing have blurred to the point that distinctions are no longer meaningful, Hortonworks Inc. today is launching an initiative to define a single hybrid architecture for multicloud environments.

The company will initially be joined in what it calls the Open Hybrid Architecture Initiative by IBM Corp. and Red Hat Inc., with more partner announcements promised in the future. Hortonworks called the campaign “a broad effort across the open-source communities, the partner ecosystem and Hortonworks platforms to enable a consistent experience by bringing the cloud architecture on-premises for the enterprise.”

The so-called “multicloud” is one of the hottest trends in enterprise information technology right now. As organizations increasingly virtualize new applications to run on nearly any kind of infrastructure, they are also seeking to move workloads easily between public cloud vendors and their own private clouds. An International Data Corp. survey of more than 6,000 cloud-using organizations found that three-quarters already work with more than one cloud vendor. IDC expects multicloud services to be a $68 billion market by 2021.

Among the key milestones in the Open Hybrid Architecture Initiative will be recasting storage interfaces to support both file system interfaces and object stores, moving compute resources into containers, creating shared services for areas like metadata and security, and standardizing on orchestration tools for consistent management of services and workloads.

Avoiding cloud lock-in

“Customers don’t want to get locked into a cloud vendor and they want a consistent management and security architecture,” said Arun Murthy (pictured), chief product officer at Hortonworks. “This formalizes the process and broadens knowledge across the community, partners and ecosystems.”

It also works to Hortonworks’ advantage, since various technologies the company developed and contributed to the open-source community are at the core of the platform it proposes. Red Hat wins because its OpenShift container management platform is set to be the standard for Kubernetes orchestration. IBM sells the Hortonworks Data Platform as its preferred Hadoop implementation and also partners with Red Hat in a number of areas.

Hortonworks has a track record of pulling together partners around open-source-focused initiatives. Three years ago it stirred up controversy with the creation of the Open Data Platform initiative, an effort to define a consistent set of standards for the Hadoop ecosystem while also marginalizing “open core” competitors that sell proprietary extensions to open-source standards. Two years earlier, it launched the Stinger initiative in an effort to rally developers around the Apache Hive query language.

Hortonworks thinks lightning can strike again with this latest project because the market is ready to build cross-platform big data workloads. Hadoop was developed with the assumption that storage and computing would be tightly coupled, but with 100-megabit connections to the cloud now commonplace, “we need to rethink some of the fundamental assumptions,” Murthy said. “We can now optimize in different ways and do it consistently whether on-prem or in the cloud.”

Three-phase approach

In the first phase of the project, Hortonworks will containerize the core Hadoop platform and DataFlow streaming analytics engine using DataPlane, a cloud service that corrals data from multiple sources and locations. Containers are lightweight, isolated software packages that share the host operating system’s kernel and enable portability across a wide variety of infrastructure.

The second phase involves separating storage and computing by adopting a scalable file system and object-store interface based on Ozone, Hortonworks’ extensions to the Hadoop File System that handle information stored as objects like multimedia files.

Finally, the company plans to lead an effort to containerize other essential big data services using OpenShift for Kubernetes orchestration. Among the services being targeted are Apache Knox for secure gateway access to clusters, Apache Atlas for data governance and Apache Ranger for security.

Hortonworks isn’t committing to a date for completing the integration work, but “we see a lot of these phases playing out over the next year,” Murthy said. “We’ve done a lot of work to put meat on the bones technically, but the ecosystem has to adapt.”

Murthy spoke with theCUBE, SiliconANGLE Media’s livestreaming studio, at Hortonworks’ DataWorks Summit in June.

Photo: SiliconANGLE
