

After years of presentations that focused on how to analyze, enhance and even expand views of data as it landed in the cluster, Hortonworks Inc. finally admitted that it conveniently ignored how to actually build a process that streamed in the data itself.
With the company’s announcement this week of the Streaming Analytics Manager as part of Hortonworks DataFlow 3.0, it took a major step toward giving business analysts the ability to create streaming applications without having to write a single line of code.
The new streaming data tool was demonstrated during today’s keynote at DataWorks Summit in San Jose, California, in a presentation by Joseph Witt, senior director of engineering for Hortonworks, and George Job Vetticaden, vice president of Hortonworks product management and emerging products.
“Before today, we just hand-waved at how to do stream processing,” Witt said.
The company’s SAM has changed that dynamic. In response to concerns that the process for building streaming analytics needed to become easier, Hortonworks has introduced a tool that uses a simple drag-and-drop interface to build an application in real time.
“We’ve shielded a lot of hairy details away from the developer. It’s not just easier, but quite fun,” Vetticaden said.
SAM includes a schema registry that lets applications interact with each other across streaming engines like Apache NiFi, which automates the flow of data between systems, and Apache Storm, an open-source distributed real-time computation system. In the DataWorks Summit keynote this morning, the two Hortonworks executives built a sample application that visualized data streams for a fleet of trucks, while predicting which vehicles and drivers would exceed the speed limit on a particular route.
“These are predictive analytics that work without writing any code,” Vetticaden said.
The keynote session also offered a look at how the various Apache Hadoop-based tools are being used to address critical needs in the enterprise. (Apache Hadoop is an open-source software framework used for storing, processing and analyzing big data.)
Sumeet Singh, senior director for cloud and big data platforms at Yahoo Inc., described how the company is relying on Apache Hive — a data warehouse software project built on top of Hadoop — to process half a billion records for each database query.
“Apache Hive is one of the predominant technologies that we’ve been shaping,” Singh said.
Singh said that Yahoo has introduced GPU and high-memory servers to facilitate the integration of machine learning into its operations. The company has also been running Caffe, a deep learning framework, and TensorFlowOnSpark, which brings TensorFlow programs onto Apache Spark clusters, over the past two years.
“Open source is big for us,” Singh added.
The presentations from the Yahoo and Hortonworks executives underscored the growing influence of data science in the enterprise, as companies look for simplicity and a return on their information technology investment. This is leading to more focus on how to frame the big data conversation and what tools, like Hortonworks’ SAM, make the most sense.
“You don’t monetize the data,” said Bill Schmarzo, chief technology officer for the big data practice at Dell EMC, Dell Technologies Inc.’s infrastructure group. “You’re going to monetize the insights that come from the data.”
Schmarzo, who spoke at the DataWorks keynote session this morning, teaches a class in Silicon Valley on how to get business people to think like data scientists. “It’s not about technology; it’s about business models,” he said.
Schmarzo challenged the gathering to better understand the economic value of data and create business models with analytics to deliver real results to the bottom line. Business executives live by the “four M’s,” which are “make me more money,” he said.
Watch the complete keynote video below, and be sure to check out more of SiliconANGLE’s and theCUBE’s independent editorial coverage of DataWorks Summit US 2017.