UPDATED 08:30 EDT / JUNE 05 2015

Hadoop gets another open-source stream processing engine courtesy of DataTorrent

A year to the week after launching its homegrown in-memory stream processing platform into general availability, DataTorrent Inc. is releasing the code for the core execution engine under a free license. The move levels the playing field against the open-source alternatives that have hit the scene since then.

There are no fewer than three other options for analyzing real-time data on Hadoop in the upstream ecosystem, all of which enjoy significant community support. Apache Spark’s popularity has helped drive much attention to its stream processing extension, while the more specialized Storm and Samza benefit from the continued backing of their respective big-name creators, Twitter and LinkedIn.

But while DataTorrent can’t boast the same widespread acceptance as Spark Streaming, nor the clout of the social networking titans, its development team, which is led by two of the executives who helped spearhead the creation of Hadoop at Yahoo, has nonetheless managed to put forth a strong contender. One of the main features setting its software apart is usability.

While its more entrenched rivals are notoriously difficult to implement, DataTorrent’s RTS platform provides features that simplify many of the operational tasks that make stream processing so hard to perform at large scale. The software can quickly redistribute work from malfunctioning nodes to the rest of the cluster and automatically recognize newly added ones.

That allows organizations to keep up with the growth of their data more easily and potentially handle up to billions of events per second, which makes RTS highly competitive in performance with Storm and Samza. Spark Streaming, meanwhile, suffers from latency issues, owing to the batch-oriented nature of its underlying architecture, that make it less suitable for truly real-time use cases.

DataTorrent hopes that releasing the core software, code-named Project Apex, under the same open license as Hadoop will help provide the catalyst needed to translate that technical advantage into a competitive one. As Google has demonstrated with its Kubernetes project, offering code for free can be the best way to drive adoption.

And once organizations are using Project Apex, it should become a much simpler matter for DataTorrent to upsell them to its commercial version, which has been updated alongside the release with new value-added features to help take advantage of the platform’s performance. Arguably the biggest improvement is to the process of routing information into Hadoop.

New connectors can automatically convert data from sources such as MapReduce-based clusters and traditional business intelligence solutions into a format compatible with RTS 3, which also packs a visual interface for customizing those streams. After everything is inside Hadoop, another new console enables analysts to visualize useful patterns in dashboards that they can then share with business users.

Photo by Nimish Gogri via Flickr
