UPDATED 09:02 EDT / JULY 30 2015

NEWS

DataTorrent debuts free data integration service to complement its ultra-fast Spark alternative

DataTorrent Inc. is wasting no time adjusting to its new status as an open-source company. Merely two months after releasing its homegrown data crunching engine for Hadoop under an Apache 2.0 license, the analytics provider is launching a free companion tool designed to help users move their information into the analytics framework more easily.

There are already plenty of options in the open-source ecosystem and beyond for transferring large amounts of information among applications, which constitutes the crux of the problem. The large enterprises and other tech-savvy organizations where Hadoop is finding use nowadays draw data from a wide range of sources that handle distribution in different ways.

DataTorrent dtIngest provides a high-level interface for managing the flow of information across all the protocols, messaging services and storage systems involved in the process. The list of supported technologies includes Kafka, which is often used in combination with Hadoop to handle real-time information, Amazon Inc.’s cloud-based S3 object store and the JMS transfer standard, among others.

The unified nature of dtIngest has the added benefit of facilitating centralized security in the form of encryption and compression algorithms that are automatically applied to data ingested through the system. The software also performs a number of optimizations on top of that, including fusing small files into large batches so as to avoid a situation where Hadoop runs out of memory addresses in which to keep the information.

That’s a very serious problem in stream processing use cases where billions of individual data points can be crisscrossing the network at any given second, which happens to be one of the main applications for DataTorrent’s analytics engine. RTS, known as Project Apex in its recently introduced open-source incarnation, is described as being able to easily handle that kind of traffic with high reliability and low latency.
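
To make the small-file optimization concrete, here is a minimal sketch of the general idea of fusing many small files into larger batches. It is an illustration only and not dtIngest's actual implementation: the function name `batch_small_files`, the `(name, size)` input format and the `target_bytes` threshold are all hypothetical.

```python
from typing import Iterable, List, Tuple


def batch_small_files(
    files: Iterable[Tuple[str, int]], target_bytes: int
) -> List[List[str]]:
    """Greedily group (name, size_in_bytes) pairs into batches.

    Hypothetical helper for illustration; not part of dtIngest.
    Files are taken in order, and a batch is closed as soon as adding
    the next file would push it past target_bytes. A single file that
    is larger than target_bytes ends up in a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0

    for name, size in files:
        # Close the current batch if this file would overflow it.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size

    # Keep whatever is left over as the final batch.
    if current:
        batches.append(current)
    return batches
```

Each resulting batch could then be written out as a single larger file, so the cluster tracks a handful of big objects rather than a very large number of tiny ones.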

Both the data crunching engine and dtIngest can run on any Hadoop cluster running version 2.0 or above. The solutions are available for download immediately from DataTorrent’s site.

