UPDATED 02:31 EST / OCTOBER 28 2016

Databricks brings deep learning capabilities to Apache Spark

Databricks Inc. is turning its attention to machine learning by adding support for deep learning on its cloud-based Apache Spark platform.

Databricks is adding support for graphics processing units (GPUs) and integrating popular deep learning libraries to accelerate these kinds of workloads on the Spark platform.

With GPU support plus pre-installed libraries and examples, Databricks users will be able to use GPUs for an array of machine learning tasks on Spark, including image processing and text analysis. The company says users can expect a tenfold speedup in deep learning, along with automated configuration of GPU machines and smoother integration with Spark clusters.

Today’s announcement follows Databricks’ earlier release of TensorFrames, a software library that enables the popular deep learning framework TensorFlow to run on Spark.

“The enhancements announced today simplify deep learning on Spark by adding out-of-the-box support for using TensorFrames with GPUs — specialized hardware that can perform an impressive amount of deep learning-specific computations in parallel,” the company said in a statement. “With Databricks, data teams can easily conduct deep learning on highly optimized hardware with a few clicks or API calls.”

The addition of GPU support matters because deep learning, while an extremely powerful way to model data, is computationally very expensive on conventional CPU-based machines. GPUs can dramatically reduce this cost because they carry out many computations in parallel.

In a blog post, Databricks highlights the cost benefits of using GPUs in a benchmark test of a simple numerical task:

“We compared optimized code written in Scala and run on top-of-the-line compute intensive machines in AWS (c3.8xlarge) against standard GPU hardware (g2.2xlarge),” wrote Databricks engineers Tim Hunter, Joseph Bradley and Yandong Mao. “Using TensorFlow as the underlying compute library, the code is 3X shorter, and about 4X less expensive (in $ cost) to run on a GPU cluster.”

[Figure: CPU and GPU deep learning comparison]
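
To make the "4X less expensive" comparison concrete, here is a minimal sketch of how a dollar-cost ratio of this kind is typically computed: hourly price per machine, times the number of machines, times the runtime. The function names and every number below are hypothetical placeholders chosen for illustration. They are not AWS prices or figures from the Databricks benchmark, which the article does not report.

```python
def job_cost(hourly_price_usd: float, machines: int, runtime_hours: float) -> float:
    """Total dollar cost of a job: price per machine-hour x machines x hours."""
    return hourly_price_usd * machines * runtime_hours


def cost_ratio(baseline_cost: float, alternative_cost: float) -> float:
    """How many times cheaper the alternative is than the baseline."""
    return baseline_cost / alternative_cost


# Hypothetical placeholder values, NOT real prices or measured runtimes.
cpu_cost = job_cost(hourly_price_usd=2.0, machines=4, runtime_hours=1.0)
gpu_cost = job_cost(hourly_price_usd=1.0, machines=4, runtime_hours=0.5)
ratio = cost_ratio(cpu_cost, gpu_cost)

print(f"CPU cluster: ${cpu_cost:.2f}, GPU cluster: ${gpu_cost:.2f}, ratio: {ratio:.1f}x")
```

The point of the sketch is that a cost advantage can come from a lower hourly price, a shorter runtime, or both, so a raw speedup figure and a cost ratio need not be the same number.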

Databricks says support for deep learning will allow organizations to perform data wrangling, interactive exploration, stream data processing and other advanced analytics techniques alongside deep learning on a single platform. With these capabilities, organizations can avoid unwanted system complexity and simplify the development of deep learning applications such as accurate cancer detection for healthcare providers, faster drug discovery for pharmaceutical companies and more capable AI for tasks such as language translation.

Image credit: Unsplash via pixabay.com
