UPDATED 09:00 EDT / SEPTEMBER 23 2015

NEWS

Wikibon says Hortonworks DataFlow is a stream processor with a twist

Hortonworks Inc.’s DataFlow, which the company brought to market thanks to its purchase of Onyara Inc., is much more than just another stream processor. It has a unique set of capabilities that makes it hard to classify and that answers needs in the Internet-of-Things (IoT) and Internet-of-Anything (IoAT) domains, writes Wikibon Big Data Analyst George Gilbert. But Hortonworks’ obvious intent to combine DataFlow with its Hadoop distribution signals the beginning of fragmentation of the Hadoop environment. Hadoop is entering an era similar to that of the fragmented Unix environment of the 1990s.

DataFlow does the job of a stream processor but, unlike most stream processors, is bi-directional, having a separate channel to send and receive commands that control devices and applications. It’s designed to extend beyond the data center to the edge of complex networks, and it has the resilience, lineage and security capabilities of traditional databases.

These extra qualities make it ideal for the IoT, a decentralized environment. The IoT will use intelligent endpoint devices to gather large quantities of data. It will often use remote computing devices to capture, analyze and store data close to the point of generation rather than trying to send huge volumes through the network to a central data center. And those remote devices also need to be controlled from a central location. A smart electrical grid, for instance, not only needs to monitor the power usage of all appliances in every home, but also to adjust temperature settings when it knows a house is empty. Having two channels makes this task much simpler to accomplish.
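To make the two-channel idea concrete, here is a minimal sketch of an edge agent that publishes telemetry on one channel while receiving control commands on a separate one. This illustrates the pattern only, not Hortonworks’ actual API; every name in it (EdgeAgent, telemetry_out, commands_in, set_temperature) is a hypothetical stand-in.

import queue
import time

class EdgeAgent:
    """Hypothetical edge device with two logical channels (not the DataFlow API)."""
    def __init__(self):
        self.telemetry_out = queue.Queue()  # data channel: device -> data center
        self.commands_in = queue.Queue()    # command channel: data center -> device
        self.setpoint = 68                  # thermostat setting, degrees F

    def read_meter(self):
        # Stand-in for a real sensor read.
        return {"watts": 1200, "setpoint": self.setpoint, "ts": time.time()}

    def step(self):
        # Outbound: publish one telemetry sample toward the data center.
        self.telemetry_out.put(self.read_meter())
        # Inbound: apply any pending control command from the data center.
        try:
            cmd = self.commands_in.get_nowait()
            if cmd["op"] == "set_temperature":
                self.setpoint = cmd["value"]
        except queue.Empty:
            pass

agent = EdgeAgent()
agent.step()                                                   # normal sample
agent.commands_in.put({"op": "set_temperature", "value": 60})  # house is empty
agent.step()                                                   # command applied
while not agent.telemetry_out.empty():
    print(agent.telemetry_out.get())

A stream processor with only the first queue could report the house’s power draw but could never turn the thermostat down; the second, inbound channel is what the smart-grid scenario requires.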

However, the Onyara purchase is also the latest symptom of a gradual splintering of the Hadoop environment, Gilbert writes. Until recently, Hadoop vendors all provided the same open-source core capabilities and differentiated on manageability. Cloudera Inc.’s Manager and Navigator, for example, did not change the core compute engines such as MapReduce, Hive and Pig. Cloudera ships its own analytic MPP SQL database, Impala, but this uses the standard Parquet data format and Hive HCatalog, so data is not locked in.
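Gilbert’s no-lock-in point rests on open formats: because Impala writes standard Parquet, any Parquet-aware tool can read the same bytes back. The sketch below shows that round trip, using the pyarrow library as a stand-in for “any other engine”; the file name and table contents are invented for illustration.

import pyarrow as pa
import pyarrow.parquet as pq

# Stand-in for a Parquet file an engine such as Impala might have written.
writer_table = pa.table({"region": ["east", "west"], "sales": [100, 250]})
pq.write_table(writer_table, "sales.parquet")

# A completely different Parquet-aware reader can open the same file,
# because Parquet is an open format rather than a vendor-private layout.
reader_table = pq.read_table("sales.parquet")
print(reader_table.schema)      # column names and types travel with the file
print(reader_table.to_pydict()) # data is fully recoverable by any reader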

The fast growth of the Hadoop market, however, is beginning to splinter the community, Gilbert writes. Hortonworks has always been strongly committed to using Apache projects for core compute engines and management tools. With DataFlow, stream processing, which Gilbert says is becoming a core compute engine in its own right, may now differ from one vendor to the next.

