Understanding Apache Flink, an open-source stream processing framework, requires a fresh take on data processing. Flink goes beyond the capabilities of the last generation of stream processors by enabling new types of applications and microservices, said Jamie Grier (pictured), director of applications engineering at data Artisans GmbH.
“Stream processors in the past have been oriented toward analytics alone; that’s been the real sweet spot. Whereas Flink is a technology that enables you to build much more complex event- and time-driven applications in a much more flexible way,” Grier said.
Meeting up with George Gilbert (@ggilbert41), host of theCUBE, SiliconANGLE Media’s mobile live streaming studio, on the ground at the Flink Forward event in San Francisco, California, Grier provided an in-depth view of the technology that drives Apache Flink. (*Disclosure below.)
Breaking down the benefits of Flink
As a stateful stream processor, Flink runs a continuous programming model that consumes events one at a time, updating and managing the associated data structures fault-tolerantly and at scale. Grier explained that the open-source stream processor also lets users schedule work to happen at specific times or once certain data is complete.
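The mechanics Grier describes, consuming one event at a time, updating keyed state, and scheduling work for later, can be sketched with a toy model. This is an illustrative simplification, not Flink's actual API; the class and method names are invented for the example:

```python
import heapq

# Toy model of a stateful, event-at-a-time processor: each event updates
# keyed state, and callbacks can be scheduled to fire once the event-time
# clock (the watermark) passes a chosen timestamp.
class ToyStreamProcessor:
    def __init__(self):
        self.state = {}   # keyed state, e.g. running counts per sensor
        self.timers = []  # (fire_time, key) min-heap of scheduled work
        self.fired = []   # results emitted when timers fire

    def process(self, key, value, timestamp):
        # Consume one event at a time, updating managed state.
        self.state[key] = self.state.get(key, 0) + value
        # Schedule work for later, e.g. one second after this event.
        heapq.heappush(self.timers, (timestamp + 1000, key))

    def advance_watermark(self, watermark):
        # Fire every timer whose time has passed, emitting state snapshots.
        while self.timers and self.timers[0][0] <= watermark:
            _, key = heapq.heappop(self.timers)
            self.fired.append((key, self.state[key]))

proc = ToyStreamProcessor()
proc.process("sensor-a", 3, timestamp=0)
proc.process("sensor-a", 4, timestamp=500)
proc.advance_watermark(2000)
print(proc.fired)  # both timers fire after the state accumulated to 7
```

In Flink itself, this pairing of per-event state updates with time-based callbacks is what distinguishes a stateful stream processor from a stateless pipeline.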
Grier said that Flink provides much more than data processing. A simple micro-database is effectively built into the stream processor; users can perform CRUD (create, read, update and delete) operations on that state, but the potential to reach further exists.
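The "micro-database" idea can be pictured as a small per-key store driven by incoming events. The sketch below is hypothetical, its names do not correspond to Flink's state API, but it shows the CRUD shape Grier refers to:

```python
# Illustrative per-key state store supporting create/read/update/delete,
# the operations an event-driven application performs on its state.
class KeyedState:
    def __init__(self):
        self._store = {}

    def create(self, key, value):
        self._store.setdefault(key, value)

    def read(self, key):
        return self._store.get(key)

    def update(self, key, fn):
        if key in self._store:
            self._store[key] = fn(self._store[key])

    def delete(self, key):
        self._store.pop(key, None)

state = KeyedState()
state.create("user-42", {"clicks": 0})
state.update("user-42", lambda s: {**s, "clicks": s["clicks"] + 1})
print(state.read("user-42"))  # {'clicks': 1}
state.delete("user-42")
print(state.read("user-42"))  # None
```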
“You can build any kind of logic you can think of that’s driven by consuming events. I tell people all the time, ‘Flink allows you to do this consuming of events, updating data structures of your own choosing, and does it fault-tolerantly and at scale. Build whatever you want out of that.’ And what people are building are things that are truly not really expressible as an analytics job; it’s more just building applications,” Grier explained.
He gave the example of setting Flink up to manage market trades, where events happening in the market modify one’s position, or state. Grier pointed out that this illustrates how Flink is not confined to the category of analytics. He also explained how Flink keeps state consistent with the input streams.
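The trading example reduces to folding market events into per-symbol position state. The event fields and symbol below are invented for illustration:

```python
# Sketch of the trading example: each market event updates a per-symbol
# position, which is the application's state. Field names are hypothetical.
def apply_trade(positions, event):
    symbol = event["symbol"]
    qty = event["qty"] if event["side"] == "buy" else -event["qty"]
    positions[symbol] = positions.get(symbol, 0) + qty
    return positions

positions = {}
for ev in [
    {"symbol": "ACME", "side": "buy", "qty": 100},
    {"symbol": "ACME", "side": "sell", "qty": 30},
]:
    apply_trade(positions, ev)
print(positions)  # {'ACME': 70}
```

Nothing here is an aggregate query over a dataset; it is an application whose state is driven by the event stream, which is the distinction Grier draws.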
“That’s one of the core reasons why stream processors need to have state — so they can provide strong guarantees about correctness,” he said, going into detail about the importance of managing state within the stream processor and keeping it consistent with the input.
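Why consistency between input and state matters can be shown with a minimal sketch: if the input position and the state are snapshotted together, recovery after a failure neither loses nor double-counts events. This is a toy model; real Flink achieves the same effect with distributed checkpoint barriers rather than a loop counter:

```python
# Toy checkpointing sketch: the input offset and the accumulated state are
# saved atomically as one snapshot, so a restart resumes consistently.
events = [5, 1, 7, 2]

def run(events, checkpoint):
    offset, state = checkpoint
    for i in range(offset, len(events)):
        state += events[i]
        if (i + 1) % 2 == 0:
            checkpoint = (i + 1, state)  # snapshot position + state together
    return state, checkpoint

# First run "crashes" after checkpointing at offset 2.
_, ckpt = run(events[:2], (0, 0))
# Recovery restores both from the same snapshot and resumes.
state, _ = run(events, ckpt)
print(state)  # 15, the exact sum of all events: none lost, none repeated
```

If the offset and state were persisted separately, a crash between the two writes would replay events into stale state, breaking exactly the correctness guarantee Grier describes.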
He also drew a distinction between state and storage, noting that Flink state is not long-term storage; its purpose is to hold in-flight data only while it is still being modified, he explained.
Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of Flink Forward 2017. (*Disclosure: TheCUBE is a paid media partner at Flink Forward. The conference sponsor, data Artisans, does not have editorial oversight of content on theCUBE or SiliconANGLE.)