BIG DATA
Simply developing accurate data science models takes so much effort that many companies overlook the challenges of bringing those models to production. To ease that process, DataTorrent Inc. is using its open-source engine, Apache Apex, to help businesses make better use of real-time big data analytics. The company’s co-founder and chief strategy officer, Phu Hoang, is drawing on engineering experience from Yahoo’s early days, when he helped bring complex infrastructure stacks to a production-worthy state.
“Very quickly we learned that at the pace of scale of data that we were generating that we couldn’t use [current enterprise] software, and we were kind of on our own,” Hoang said. “So we had to invent approaches to do that. The thing we knew a lot was commodity servers on racks. So, we ended up saying, ‘How do I solve this big data processing problem using that hardware?’ … We started to iterate around how to do distributed processing across many hundreds of servers.”
Hoang spoke with John Furrier (@furrier), host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, at theCUBE’s studios in Palo Alto, California. They discussed the mindset and strategy required to bring data science applications to production quickly.
DataTorrent applies the same operations-driven mentality from Hoang’s Yahoo days to help companies bring big data applications to production. All of its engineers are trained to optimize for stability and robust operation at scale.
“Our DNA is all about ops. We think that, especially with big data, there are lots of ways to do prototypes and get some proof of concept going. But getting that to production to run it 24×7 and never lose data, that really has been hard,” Hoang said.
One key to moving data science applications into production smoothly has been the use of large building blocks that cover the majority of customer use cases. These building blocks can take the form of ready-made apps that need only minor tweaking to fit a customer’s needs.
“As we continue to learn in working with our customers and starting to see the patterns … putting kind of a bigger functional block together so that it’s easier to build a big data application at this next layer — machine learning, rule engines, whatever. But how do you piece that together in a way that is 80 percent done so that the customer only has the last mile?” Hoang asked.
(* Disclosure: DataTorrent Inc. sponsored this segment on SiliconANGLE Media’s theCUBE. Neither DataTorrent Inc. nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)