UPDATED 11:56 EDT / FEBRUARY 01 2018

BIG DATA

Hortonworks streaming analytics platform gets better flow management and governance

Hortonworks Inc. today brushed up its Hortonworks DataFlow streaming analytics platform with improved support for complex processes and the ability to share and publish data flows directly to production.

HDF is an open-source toolset that combines governance, security and management for applications involving real-time processing. It bundles multiple open-source projects, including the Apache NiFi integrated data logistics platform, the Apache Kafka message broker, the Apache Storm event processor and the Druid columnar distributed database.

The new release will be particularly useful for companies in regulated environments that need to rigorously document and govern their data, said Scott Gnau, Hortonworks’ chief technology officer. With the go-live date for the European Union’s General Data Protection Regulation less than four months away, he said, “having a trusted provenance of data is going to be more important than in the past.”

To that end, HDF can now be integrated with the Apache Atlas data governance and metadata framework, Hortonworks’ SmartSense problem resolution and optimization software and the Apache Knox authentication gateway. The integrations are meant to provide better manageability of and access to data when HDF is colocated with Hortonworks Data Platform.

“We’re extending all the governance we’ve developed into the entire stack, so there’s one set of governance, metadata tags and processes,” Gnau said. Security is provided by a combination of Knox and Apache Ranger.

Portable data flows

The NiFi Registry, a new Apache sub-project, facilitates the development, management and portability of data flows. It can abstract data flow schemas and programs so that users can track and monitor data flow changes at a granular level. Schemas can also be stored in a repository for sharing and versioning. Because data flows can be exported and imported, they can be moved easily from one environment to another, Hortonworks said.
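
The idea of versioned, portable flows can be illustrated with a toy sketch. The code below is not the NiFi Registry API: the class `ToyFlowRegistry`, its methods and the flow name `ingest` are hypothetical names chosen only for illustration. It assumes a flow can be represented as a plain dictionary and shows each commit being kept as a numbered version, with one version exported from a development registry and imported into a production one.

```python
import copy
import json


class ToyFlowRegistry:
    """In-memory stand-in for a versioned flow store (NOT the NiFi Registry API)."""

    def __init__(self):
        # Maps a flow name to its list of snapshots, oldest first.
        self._versions = {}

    def commit(self, name, flow):
        """Store a snapshot of the flow and return its 1-based version number."""
        history = self._versions.setdefault(name, [])
        history.append(copy.deepcopy(flow))
        return len(history)

    def get(self, name, version=None):
        """Return a copy of the requested version (the latest if version is None)."""
        history = self._versions[name]
        snapshot = history[-1] if version is None else history[version - 1]
        return copy.deepcopy(snapshot)

    def export_flow(self, name, version=None):
        """Serialize one version to JSON so it can be moved to another environment."""
        return json.dumps(self.get(name, version), sort_keys=True)

    def import_flow(self, name, payload):
        """Commit a JSON payload that was exported from another registry."""
        return self.commit(name, json.loads(payload))


# Hypothetical flow definitions: two versions committed in development.
dev = ToyFlowRegistry()
dev.commit("ingest", {"processors": ["read_source"]})
dev.commit("ingest", {"processors": ["read_source", "filter_rows"]})

# Promote version 2 to a separate production registry.
prod = ToyFlowRegistry()
prod.import_flow("ingest", dev.export_flow("ingest", version=2))
```

In this sketch the earlier version stays retrievable after the promotion, which is the property that makes change tracking and rollback possible.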

HDF 3.1 also adds new capabilities to improve streaming data operations in Hortonworks Streaming Analytics Manager. A new “test mode” gives developers the ability to build applications using mock data and create unit tests for integration into continuous integration and delivery environments.
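
What testing against mock data looks like in general can be sketched as follows. This is not Streaming Analytics Manager's test mode or its API: the function `count_by_sensor` and the sample events are hypothetical, and the example uses Python's standard `unittest` module simply to show a streaming step being checked against fixed mock input, the kind of test a continuous integration job could run.

```python
import unittest


def count_by_sensor(events):
    """Hypothetical streaming step: count events per sensor id."""
    counts = {}
    for event in events:
        counts[event["sensor"]] = counts.get(event["sensor"], 0) + 1
    return counts


# Mock data standing in for a live event stream.
MOCK_EVENTS = [
    {"sensor": "a", "value": 1.0},
    {"sensor": "b", "value": 2.5},
    {"sensor": "a", "value": 0.3},
]


class CountBySensorTest(unittest.TestCase):
    def test_counts_mock_events(self):
        self.assertEqual(count_by_sensor(MOCK_EVENTS), {"a": 2, "b": 1})

    def test_empty_stream(self):
        self.assertEqual(count_by_sensor([]), {})


# Run the tests programmatically, as a build script might.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountBySensorTest)
result = unittest.TextTestRunner().run(suite)
```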

A new operations module provides for testing, debugging, troubleshooting and monitoring of the deployed applications, saving developer time. Improvements to Apache Ambari and Apache Ranger now automate the processes for managing Apache NiFi resources, such as adding a new NiFi node to an existing cluster without manually updating node information. Updates to Ranger allow for group-based policies to be defined for NiFi resources.

“With a lot of these tools, developers have to create a test cluster to prove out their algorithms,” Gnau said. “We’re making it easy to publish to the production system. They can make changes in their laptop version and then republish those changes out to production.”

HDF now also supports Apache Kafka 1.0 for easier management, visualization and navigation around Kafka clusters. That integration also enables advanced security options and more stringent message processing semantics.

The new release is available today. All Hortonworks enhancements have been incorporated into their respective open-source projects.

Image: Pixabay
