Hortonworks streaming analytics platform gets better flow management and governance
Hortonworks Inc. today updated its Hortonworks DataFlow streaming analytics platform with improved support for complex processes and the ability to share data flows and publish them directly to production.
HDF is an open-source toolset that provides governance, security and management for applications involving real-time processing. It combines multiple open-source projects, including the Apache NiFi integrated data logistics platform, Apache Kafka message broker, Apache Storm event processor and Druid columnar distributed database.
The new release will be particularly useful for companies in regulated environments that need to rigorously document and govern their data, said Scott Gnau, Hortonworks’ chief technology officer. With the go-live date for the European Union’s General Data Protection Regulation less than four months away, he said, “having a trusted provenance of data is going to be more important than in the past.”
To that end, HDF can now be integrated with the Apache Atlas data governance and metadata framework, Hortonworks' SmartSense problem-resolution and optimization software and the Apache Knox authentication gateway to improve manageability of and access to data when HDF is colocated with the Hortonworks Data Platform.
“We’re extending all the governance we’ve developed into the entire stack, so there’s one set of governance, metadata tags and processes,” Gnau said. Security is provided by a combination of Knox and Apache Ranger.
Portable data flows
The NiFi Registry, a new sub-project of Apache NiFi, facilitates the development, management and portability of data flows. It abstracts data flow schemas and programs so users can track and monitor data flow changes at a granular level. Schemas can also be stored in a repository for sharing and versioning. Exporting and importing data flows lets those flows be moved easily from one environment to another, Hortonworks said.
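For illustration only, the sketch below shows how a bucket for versioned flows might be created and listed through the NiFi Registry's REST interface. It assumes a registry instance on localhost at the default port 18080 with the standard REST base path; the bucket name and endpoint details are assumptions, not taken from the announcement.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class NiFiRegistryBucketSketch {
    public static void main(String[] args) throws Exception {
        // Assumed NiFi Registry endpoint; adjust host, port and security settings for a real deployment.
        String base = "http://localhost:18080/nifi-registry-api";
        HttpClient client = HttpClient.newHttpClient();

        // Create a bucket to hold versioned data flows (the name "production-flows" is illustrative).
        HttpRequest createBucket = HttpRequest.newBuilder()
                .uri(URI.create(base + "/buckets"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"name\":\"production-flows\"}"))
                .build();
        System.out.println(client.send(createBucket, HttpResponse.BodyHandlers.ofString()).body());

        // List buckets; each versioned flow committed from a NiFi canvas is stored under one of these buckets.
        HttpRequest listBuckets = HttpRequest.newBuilder()
                .uri(URI.create(base + "/buckets"))
                .GET()
                .build();
        System.out.println(client.send(listBuckets, HttpResponse.BodyHandlers.ofString()).body());
    }
}
```

In practice, a NiFi canvas is pointed at the registry and process groups are committed to a bucket as new versions, which is what makes the export/import workflow between environments possible.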
HDF 3.1 also adds new capabilities to improve streaming data operations in Hortonworks Streaming Analytics Manager. A new “test mode” gives developers the ability to build applications using mock data and create unit tests for integration into continuous integration and delivery environments.
A new operations module enables testing, debugging, troubleshooting and monitoring of deployed applications, saving developer time. Improvements to Apache Ambari and Apache Ranger now automate the processes for managing Apache NiFi resources, such as adding a new NiFi node to an existing cluster without manually updating node information. Updates to Ranger allow group-based policies to be defined for NiFi resources.
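As a rough illustration of what such a group-based policy could look like, this sketch posts a policy to Ranger's public REST API granting a hypothetical "analysts" group read-only access to a NiFi resource. The Ranger host and port, service name, resource path and credentials are all assumptions rather than details from the release.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class RangerNiFiGroupPolicySketch {
    public static void main(String[] args) throws Exception {
        // Assumed Ranger admin endpoint and default credentials; replace with real values.
        String rangerUrl = "http://localhost:6080/service/public/v2/api/policy";
        String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes());

        // Grants the "analysts" group read-only access to the NiFi /flow resource.
        // Service and resource names follow the Ranger NiFi plugin's conventions but are illustrative here.
        String policyJson = """
            {
              "service": "hdf_nifi",
              "name": "analysts-read-flow",
              "isEnabled": true,
              "resources": {
                "nifi-resource": { "values": ["/flow"], "isExcludes": false, "isRecursive": false }
              },
              "policyItems": [
                { "groups": ["analysts"],
                  "accesses": [ { "type": "READ", "isAllowed": true } ] }
              ]
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(rangerUrl))
                .header("Content-Type", "application/json")
                .header("Authorization", "Basic " + auth)
                .POST(HttpRequest.BodyPublishers.ofString(policyJson))
                .build();

        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```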
“With a lot of these tools, developers have to create a test cluster to prove out their algorithms,” Gnau said. “We’re making it easy to publish to the production system. They can make changes in their laptop version and then republish those changes out to production.”
HDF now also supports Apache Kafka 1.0 for easier management, visualization and navigation around Kafka clusters. That integration also enables advanced security options and more stringent message processing semantics.
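The announcement doesn't spell out which semantics are meant, but Kafka 1.0 carries the idempotent and transactional producer features introduced in the 0.11 line, so a minimal sketch of a producer using them might look like the following. The broker address, topic and transactional ID are placeholders, not values from the release.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import java.util.Properties;

public class TransactionalProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");               // placeholder broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("enable.idempotence", "true");                        // de-duplicates retried sends
        props.put("transactional.id", "hdf-demo-producer");             // placeholder transactional ID

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            try {
                producer.beginTransaction();
                producer.send(new ProducerRecord<>("events", "sensor-1", "{\"temp\":21.5}"));
                producer.commitTransaction();                            // records become visible atomically
            } catch (KafkaException e) {
                // In production code, a ProducerFencedException should close the producer instead of aborting.
                producer.abortTransaction();
                throw e;
            }
        }
    }
}
```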
The new release is available today. All Hortonworks enhancements have been incorporated into their respective open-source projects.