UPDATED 16:20 EST / JULY 25 2017

BIG DATA

Could Apache Spark become a universal computation engine?

Spark Summit keynotes are known for their surprises, and this year the stand-out changes were in data streaming, with sub-millisecond latency predicted for some workloads. With multiple avenues open for potential success, the community is watching as Spark matures to fulfill its promise. But does that promise include becoming a database?

Exploring the gap between theoretical possibilities and reality, Matthew Hunt (pictured), technologist at Bloomberg LP, discussed the maturation of Spark with George Gilbert (@ggilbert41) and David Goad (@davidgoad), co-hosts of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during this year’s Spark Summit event in San Francisco, California.

As a pioneer of streaming media, Bloomberg has a long history developing apps for news and finance and has developed its own relational database, ComDB2. “Everyone needs a database,” Hunt said, adding that most companies do not have the resources to develop their own. This leads to the question: Can Spark become a database?

Hunt believes that Spark has the promise to become a universal computation engine. Describing a universal system as having a distributed file store, a database with transactional semantics, extensible analytics and the ability to stream data in, he asked, “How close can you come to that?”

Maturity means real-world use

Although the dream might be a universal system, the more practical question is how to make Spark and existing databases work well together.

“If you have to master 5,000 skills and 200 different products, that’s a huge impediment for real-world usage,” said Hunt, who sees practical usage coalescing around a smaller set of options.

Hunt predicted that Apache Arrow, which powers columnar in-memory analytics, is about to explode because “it lets you connect these systems radically more efficiently in a standardized way.”

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of Spark Summit 2017. (* Disclosure: Databricks Inc. sponsored this Spark Summit 2017 segment on SiliconANGLE Media’s theCUBE. Neither Databricks nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Video by SiliconANGLE
