UPDATED 12:00 EDT / JUNE 06 2017

BIG DATA

Exploring Apache Spark’s challenges and opportunities at Spark Summit 2017

What are the current challenges and opportunities for Apache Spark in the enterprise? Who’s making the best effort at Spark products and ecosystem development?

To answer these questions and others, SiliconANGLE is at Spark Summit 2017, taking place in San Francisco with exclusive commentary and interviews from the mobile studio theCUBE. (* Disclosure below.)

As Apache Spark continues to tackle some of the biggest data-wrangling obstacles in enterprise information technology, the platform is expected to heighten its influence, closing complexity gaps in the Hadoop open-source ecosystem.

Spark, however, does face its own challenges as it scales up to enterprise information technology environments, and community efforts are looking to address potential bottlenecks and other performance pitfalls. Ongoing enhancements for intelligent automation and code consolidation could help Spark circumvent Hadoop’s fate, as many top industry players boost support for the open-source platform.

Cloudera Inc. recently launched Altus, a platform as a service with integrated support for Spark that gives data engineers on-demand, elastic infrastructure to streamline the creation and management of data pipelines.

Databricks Inc., a startup founded by the creators of Apache Spark, has also been busy simplifying Spark for big data analytics. At Spark Summit, the company announced Databricks Serverless, the first fully managed computing platform for Apache Spark. It automates the management of big data workloads, removing the complexity and cost of users managing their own Spark clusters.

“One of the current challenges for Apache Spark as a community is defining a clear, bounded set of use cases where it supports the artificial intelligence and deep learning pipeline alongside emerging deep learning de facto standards, such as TensorFlow, Caffe2, MXNet and other broadly adopted deep learning tools,” said James Kobielus, an analyst at Wikibon, owned by the same parent company as SiliconANGLE.

At this year’s Spark Summit, Databricks launched Deep Learning Pipelines, an open-source package intended to help enterprises of all sizes scale deep learning and to make it easier to use deep learning with Apache Spark, TensorFlow and other tools.

MapR Technologies Inc. recently announced new features for its Native Spark Connector for MapR-DB, which allows for deep Apache Spark integration. The company will showcase the MapR Converged Data Platform along with its updated Spark Connector at this week’s Spark Summit. Sameer Nori, director of partner solutions at MapR, will also present on how leading practitioners have been able to scale their machine learning deployments in production.

“The Spark community has not yet placed a huge emphasis on tooling to enable a robust industrial-grade DevOps pipeline for machine learning apps,” said Kobielus. “Given how increasingly integral Spark-based machine-learning apps are to enterprise development initiatives, we expect to see a growing number of solution providers at this year’s Spark Summit with high-quality DevOps offerings for converged teams of data scientists and traditional programmers.”  

Keynote speakers for Spark Summit 2017 include Matei Zaharia, co-founder and chief technologist at Databricks; Eric Siegel, founder and author for Predictive Analytics World; Ali Ghodsi, chief executive officer and co-founder at Databricks; Christopher Ré, associate professor at Stanford; Michael Greene, vice president for the Software and Service Group at Intel Corp.; Ion Stoica, professor at UC Berkeley AMP/RISELab and executive chairman at Databricks; Matt Fryer, vice president and chief data science officer at Hotels.com; Michael Armbrust, software engineer at Databricks; Tim Hunter, software engineer at Databricks; and Wes Kerr, senior data scientist at Riot Games Inc.

How to watch theCUBE interviews

There are various ways to watch all of theCUBE interviews that will be taking place at Spark Summit 2017, including SiliconANGLE TV and YouTube. You can also get all the coverage from the event on SiliconANGLE.

TheCUBE’s coverage runs today from 9:45 a.m. to 5 p.m. PDT.

SiliconANGLE TV

You can watch all of theCUBE’s exclusive interviews and commentary from Spark Summit 2017 on the dedicated SiliconANGLE TV page.

Watch on the SiliconANGLE YouTube channel

All of theCUBE interviews from Spark Summit 2017, which runs until June 7, will also be uploaded to SiliconANGLE’s dedicated YouTube channel.

Cubecasts

SiliconANGLE also offers podcasts of archived interview sessions on both SoundCloud and iTunes, which you can enjoy while on the go.

Guests who will be interviewed on theCUBE

Tuesday, June 6

Guests who will be interviewed on theCUBE include Spark Summit keynote speakers Matei Zaharia and Michael Greene.

Other guests include Nathan Murith, senior software development manager at Autodesk Inc.; and Octavian Tanase, vice president, data ONTAP software and systems group at NetApp Inc.

Wednesday, June 7

Guests who will be interviewed on theCUBE include Spark Summit keynote speakers Ali Ghodsi and Matt Fryer.

TheCUBE’s guest list is still being finalized and will be updated on the dedicated SiliconANGLE TV page.

Live stream of Spark Summit

If you are unable to join the Apache Spark community at Spark Summit in San Francisco, you can watch the event’s official live web stream.

The Spark Summit live stream will start today at 9 a.m. PDT, and you can register on the official event page.

(* Disclosure: Some segments on SiliconANGLE Media’s theCUBE are sponsored by Databricks or other companies. Sponsors have no editorial control over content on theCUBE or SiliconANGLE.)

