UPDATED 15:24 EDT / FEBRUARY 17 2016


Spark solutions: Making life easier for data scientists | #SparkSummit

by Betsy Amy-Vogt

“We are trying to make it easy to plug these models into the same interface …”

“We have focused on making it easy for people to contribute to Spark …”

“Just compute the thing you need and get out of there …”

There’s a common theme to theCUBE’s conversation with Matei Zaharia, CTO of Databricks, Inc. and creator of Apache Spark: making life easier for data scientists. But rather than adding to the pool of existing solutions, Zaharia is focused on more complex processing to solve problems that do not have existing solutions.

In an interview at Spark Summit East 2016 at the New York Hilton Midtown in NYC, Zaharia talked with Jeff Frick and George Gilbert, cohosts of theCUBE, from the SiliconANGLE Media team.

Real-time vs. streaming

“Everyone wants real-time,” stated Gilbert, as he invited Zaharia to clarify the confusion between real-time and streaming. Zaharia described the differences in detail before the topic moved specifically to Spark Streaming and the internal operations of the engine, which is designed for high-volume batch operations.

“We design our roadmap and engine based on what people want,” Zaharia said. Most use cases do not require both huge volume and true real-time processing, and Spark’s latency of a few hundred milliseconds is considered real-time for most operations.

Geeking out with the rockstar of Spark Summit

In this in-depth discussion, Zaharia – the rockstar of Spark Summit East – shared his knowledge and opinions on Spark SQL [Spark’s module for working with structured data], Catalyst [a query optimization framework for Spark], DataFrames [a distributed collection of data organized into named columns], and machine learning systems, as well as the exponential growth of the Spark community, new solutions from Databricks, and the future of Big Data.

Watch the full video interview below, and be sure to check out more of SiliconANGLE and theCUBE’s coverage of Spark Summit East 2016. Also join in on the conversation by CrowdChatting with theCUBE hosts.

Photo by SiliconANGLE
