UPDATED 19:57 EDT / JULY 20 2017

BIG DATA

Can big data DevOps see what abstraction is hiding?

DevOps practices that have worked well in other areas of information technology can’t hack it in big data, according to Ash Munshi (pictured), chief executive officer of Pepperdata Inc.

“While agile and all that still works, the tools don’t work,” Munshi said in an interview at this year’s Spark Summit in San Francisco, California.

Tightening the loop leading from development to operations is trickier due to the mass of data in the middle and the number of machines computing it, Munshi told David Goad (@davidgoad) and George Gilbert (@ggilbert41), co-hosts of theCUBE, SiliconANGLE Media’s mobile live streaming studio. (* Disclosure below.)

There could be thousands of machines working to solve one problem. This obviously points to infrastructure abstraction and virtualization as possible fixes, but, by themselves, they’re half-baked solutions, Munshi stated.

Apache Spark’s data engine abstracts the paradigm developers write against, which is wonderful, Munshi explained, since it simplifies the code-writing process.

“The problem when you abstract is, what does that abstraction do down in the hardware, and where am I losing performance?” he said.

Spark’s user interface provides some information about processor and memory resource consumption and the state of the garbage collector, Munshi stated. “What it doesn’t do is give you a time-series view of what’s going on,” he said.

Visibility ties up loose ends?

With blow-by-blow visibility, Pepperdata’s recently announced Code Analyzer for Apache Spark allows users to pinpoint performance issues in their code. A second, complementary Pepperdata release is the Application Profiler. This tool analyzes all data from completed applications in the Spark History Server. If it discovers faulty executors, it highlights them so developers can click on them for explanations and suggested cures.

Pepperdata customers have found that the Application Profiler is a useful prognosticator as well, Munshi stated.

“If the Application Profiler comes back and says, ‘Everything is green; there’s no critical issues there,’ then they’re saying, ‘OK, fine. Put it on the production cluster,’” he said.

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of Spark Summit 2017. (* Disclosure: DataBricks Inc. sponsored this Spark Summit 2017 segment on SiliconANGLE Media’s theCUBE. Neither DataBricks nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
