Coverage from SiliconANGLE's livestreaming video studio

UPDATED 19:15 EDT / JUNE 07 2017

What new apps are chugging on Spark 2.2’s real-time engine?

Apache Spark 2.2 has achieved event-by-event data streaming by trimming some fat from its execution process. So what new applications will the leaner, meaner engine drive online?

“Since we began structured streaming, we tried to make sure the API [Application Programming Interface] is not tied in with micro-batching in any way, and so this is the next step to actually eliminate that from the engine,” said Matei Zaharia (pictured), chief technologist and co-founder of Databricks Inc., a cloud big data service founded by the creators of Spark.

Untying that knot for good frees Spark 2.2 to stream a single event at a time with 1 millisecond of latency — effectively, true real-time, Zaharia told George Gilbert (@ggilbert41) and David Goad (@davidgoad), co-hosts of theCUBE, SiliconANGLE Media’s mobile live streaming studio, during Spark Summit 2017 in San Francisco, California. (* Disclosure below.)
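To make the latency difference concrete, here is a toy model in plain Python (not Spark code). It assumes a hypothetical 500-millisecond batch interval and takes the 1-millisecond per-event figure from the interview; the function names are illustrative only.

```python
# Toy model, not Spark's implementation: how long an event waits before it is
# processed under micro-batching versus event-at-a-time handling.

def micro_batch_latency(arrival_ms: int, batch_interval_ms: int) -> int:
    """An event is held until the next batch boundary after it arrives."""
    next_boundary = (arrival_ms // batch_interval_ms + 1) * batch_interval_ms
    return next_boundary - arrival_ms

def per_event_latency(processing_ms: int = 1) -> int:
    """An event is handled on arrival, so only processing time counts.
    The 1 ms default is the figure quoted in the article."""
    return processing_ms

# Hypothetical arrival times (ms) and an assumed 500 ms batch interval.
arrivals = [10, 250, 499]
batch_waits = [micro_batch_latency(t, 500) for t in arrivals]
event_waits = [per_event_latency() for _ in arrivals]
```

In this sketch, an event that lands just after a batch boundary waits almost the full interval, while the event-at-a-time path pays only the processing cost regardless of when the event arrives.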

To take full advantage of Spark 2.2’s streaming engine, however, users can integrate it with their own databases through Spark APIs. By contrast, “If you want to do these transactions on a file system, there will be basically some performance constraints to doing that,” Zaharia said.

The engine enables various next-gen continuous streaming data applications. Automated decision-making apps on websites for loan approval, for instance, are one type. “But it could be in an even lower latency, like say stock-market style of place or Internet of Things or industrial monitoring and making decisions there,” he said.

Continuous stream-to-stream Extract-Transform-Load can produce new data streams from existing ones without losing anything to latency, Zaharia continued. This may not sound exciting, but it could boost the performance of microservices-based applications, he added.
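The idea of deriving one stream from another, record by record, can be sketched with plain Python generators. This is only an illustration of the pattern, not Spark's API; the sensor records, field names and alert threshold are all hypothetical.

```python
# Illustrative stream-to-stream ETL with generators (not Spark code).
# Each stage consumes one event at a time and yields to the next stage,
# so no stage waits for a batch to fill up.
from typing import Any, Dict, Iterable, Iterator

def parse(raw_lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Extract: turn 'sensor_id, celsius' text lines into records."""
    for line in raw_lines:
        sensor_id, reading = line.split(",")
        yield {"sensor_id": sensor_id.strip(), "celsius": float(reading)}

def to_fahrenheit(events: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Transform: add a derived field to each record."""
    for event in events:
        yield {**event, "fahrenheit": event["celsius"] * 9 / 5 + 32}

def alerts(events: Iterable[Dict[str, Any]],
           threshold_f: float = 100.0) -> Iterator[Dict[str, Any]]:
    """Load: emit a new stream holding only readings above the threshold."""
    for event in events:
        if event["fahrenheit"] > threshold_f:
            yield event

# Hypothetical input stream; in practice this would be an unbounded source.
source = iter(["a, 20.0", "b, 40.0", "a, 37.0"])
derived = alerts(to_fahrenheit(parse(source)))
results = list(derived)
```

Because the stages are chained lazily, each input line flows through all three steps before the next line is read, which is the property that makes a derived stream usable by downstream services without added delay.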

Databricks Serverless announcement

In the same vein of microservices and next-gen apps, Databricks announced at the Summit its Databricks Serverless platform for running Spark and data applications.

“Serverless computing is this idea of: Users can just submit a query or a computation. They don’t have to configure the hardware at all,” Zaharia said. “So far, [serverless computing] has been very successful with stateless workloads, such as SQL [Structured Query Language] or Amazon Lambda [serverless compute], which is just functions serving a webpage,” he said.

Databricks now extends this to Spark and big data, Zaharia concluded.

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s independent editorial coverage of Spark Summit 2017. (* Disclosure: Databricks Inc. sponsored this Spark Summit 2017 segment on SiliconANGLE Media’s theCUBE. Neither Databricks nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
