

DataTorrent Inc. is wasting no time adjusting to its new status as an open-source company. Merely two months after releasing its homegrown data crunching engine for Hadoop under an Apache 2.0 license, the analytics provider is launching a free companion tool designed to help users move their information into the analytics framework more easily.
There are already plenty of options in the open-source ecosystem and beyond for transferring large amounts of information among applications, and that abundance is the crux of the problem. The large enterprises and other tech-savvy organizations where Hadoop is finding use nowadays draw data from a wide range of sources that handle distribution in different ways.
DataTorrent dtIngest provides a high-level interface for managing the flow of information across all the protocols, messaging services and storage systems involved in the process. The list of supported technologies includes Kafka, which is often used in combination with Hadoop to handle real-time information, Amazon Inc.’s cloud-based S3 object store and the JMS messaging standard, among others.
The unified nature of dtIngest has the added benefit of centralizing data protection and handling, with encryption and compression automatically applied to data ingested through the system. The software also performs a number of optimizations on top of that, including fusing small files into larger batches so as to avoid a situation where Hadoop runs out of memory for keeping track of the information.
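To make the small-file optimization concrete, here is a minimal Python sketch of the general idea of grouping many small files into larger batches before they are written out. It is not dtIngest's actual implementation, which the article does not describe; the function name `batch_small_files`, the `target_bytes` parameter and the sample file sizes are all hypothetical, chosen only for illustration.

```python
from typing import Iterable, List, Tuple


def batch_small_files(
    files: Iterable[Tuple[str, int]], target_bytes: int
) -> List[List[str]]:
    """Group (name, size) pairs into batches of roughly target_bytes each.

    Hypothetical helper for illustration only. Files are taken in order,
    and a new batch is started whenever adding the next file would push
    the current batch past the target size. A single file larger than
    the target ends up in a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0

    for name, size in files:
        # Close the current batch if this file would overflow it.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size

    # Flush whatever is left over as the final batch.
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Made-up file names and sizes in bytes.
    sample = [("a.log", 40), ("b.log", 50), ("c.log", 30), ("d.log", 200), ("e.log", 10)]
    for batch in batch_small_files(sample, target_bytes=100):
        print(batch)
```

The fewer, larger objects that result mean the cluster has far fewer individual files to keep track of, which is the point of the optimization described above.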
That’s a very serious problem in stream processing use cases where billions of individual data points can be crisscrossing the network at any given moment, which happens to be one of the main applications for DataTorrent’s analytics engine. RTS, known as Project Apex in its recently introduced open-source incarnation, is described as being able to easily handle that kind of traffic with high reliability and low latency.
Both the data crunching engine and dtIngest can run on any Hadoop cluster running version 2.0 or above. The solutions are available for download immediately from DataTorrent’s site.