UPDATED 09:00 EDT / SEPTEMBER 11 2017

BIG DATA

Data Artisans offers smoother on-ramp to stream processing with Apache Flink

Organizations that want to implement stream processing are about to get another boost with a new release of the commercial version of Apache Flink from data Artisans GmbH, a company founded by Flink’s creators.

Apache Flink was created to replace the MapReduce component in the big-data software framework Hadoop with a faster engine that can handle high volumes of real-time data for uses such as credit card activity monitoring, machine learning and business intelligence. Flink has been overshadowed by Apache Spark and its Spark Streaming extension, but Spark isn’t a true streaming engine. Rather, it processes streaming data in small chunks, a process called “micro-batching.”

The distinction is important because micro-batching is stateless, meaning that only simple processes can be performed on data, such as transforming or storing it. Stateful platforms like Flink enable computation to be performed on the stream, a key capability for real-time analytics. “Stateless stream processing is simple. Stateful processing is a lot harder,” said Stephan Ewen, data Artisans’ chief technology officer.
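The difference between stateless and stateful processing can be sketched in a few lines of plain Python. This is only an illustration of the concept, not Flink's API: the function names, the event format and the sample card IDs are all hypothetical.

```python
from collections import defaultdict

# Hypothetical event stream: (card_id, amount) pairs arriving in order.
events = [("card-1", 20.0), ("card-2", 5.0), ("card-1", 30.0)]

def to_cents(stream):
    """Stateless: each output depends only on the current event."""
    for card, amount in stream:
        yield card, int(round(amount * 100))

def running_totals(stream):
    """Stateful: keeps a per-card total that survives between events."""
    totals = defaultdict(float)  # the state carried across the stream
    for card, amount in stream:
        totals[card] += amount
        yield card, totals[card]

if __name__ == "__main__":
    print(list(to_cents(events)))
    print(list(running_totals(events)))
```

The first function could restart at any point without losing anything. The second cannot, because its totals would be gone, which hints at why a production engine has to snapshot and restore state.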

Berlin-based data Artisans debuted its first fully supported commercial distribution of Flink, called dA Platform, a year ago, a few months after raising $6 million in initial funding. With the release of the new dA Platform 2, the company has added features to reduce the cost and trouble involved in implementing Flink widely across an organization.

“Last year we worked with a good number of streaming users and observed the same pattern over and over again,” Ewen said. “It’s easy to get started with Flink, but difficult to perform upgrades without carrying forward all the previous apps. They don’t have the manpower to implement all the infrastructure pieces.”

The highlight of the new platform is Application Manager, a package of tools that streamlines the process of deploying and maintaining real-time streaming applications in production. Developers can specify processes such as updating, testing and committing code through integration with popular automation platforms such as Jenkins and Puppet.

Flink provides out-of-the-box tools for automating processes like starting and stopping jobs and triggering snapshots, but they require considerable skill to use, Ewen said. “We’re raising the abstraction level so that all you worry about is moving the application to a newer version or a different cluster,” he said.

The Application Manager also enables Flink to be integrated with popular open-source logging and metric systems such as Grafana and Logstash, as well as container orchestration platforms such as Kubernetes. The company provides application programming interfaces that can be used to integrate with other platforms as well.

dA Platform 2 is available now through an early access program. Production availability is scheduled for October. Pricing is either on a per-node or annual license basis, but specifics weren’t announced.

Image: Flickr CC
