UPDATED 12:00 EDT / SEPTEMBER 06 2017

BIG DATA

Ready-made apps, operations mindset enable faster data science applications

Developing accurate data science models takes so much effort that many companies overlook the challenges of bringing those models to production. To ease this process, DataTorrent Inc. is using its open-source engine Apache Apex to help businesses make better use of real-time big data analytics. The company’s co-founder and chief strategy officer, Phu Hoang (pictured), is drawing on his years of engineering experience bringing complex infrastructure stacks to a production-worthy state in Yahoo’s early days.

“Very quickly we learned that at the pace of scale of data that we were generating that we couldn’t use [current enterprise] software, and we were kind of on our own,” Hoang said. “So we had to invent approaches to do that. The thing we knew a lot was commodity servers on racks. So, we ended up saying, ‘How do I solve this big data processing problem using that hardware?’ … We started to iterate around how to do distributed processing across many hundreds of servers.”

Hoang spoke with John Furrier (@furrier), host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, at theCUBE’s studios in Palo Alto, California. They discussed the mindset and strategy required to bring data science applications to production quickly.

From dev to prod

DataTorrent applies the same operations-driven mentality from Hoang’s Yahoo days to helping companies bring big data applications to production. All of its engineers are trained to live and breathe optimization for stability and robust operation at scale.

“Our DNA is all about ops. We think that, especially with big data, there are lots of ways to do prototypes and get some proof of concept going. But getting that to production to run it 24×7 and never lose data, that really has been hard,” Hoang said.

A key to smooth productization of data science applications has been the use of large building blocks that address the majority of customer-driven use cases. These building blocks can take the form of ready-made apps that require only minor tweaking to fit a customer’s needs.

“As we continue to learn in working with our customers and starting to see the patterns … putting kind of a bigger functional block together so that it’s easier to build a big data application at this next layer — machine learning, rule engines, whatever. But how do you piece that together in a way that is 80 percent done so that the customer only has the last mile?” Hoang asked. 

Watch the complete video interview below. (* Disclosure: DataTorrent Inc. sponsored this segment on SiliconANGLE Media’s theCUBE. Neither DataTorrent Inc. nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
