UPDATED 12:32 EDT / JULY 27 2011

NEWS

Syncsort Tries To Pump Up MapReduce

A number of data integration vendors have developed connectors that allow companies to move data between Hadoop clusters and analytic databases, enterprise applications and other sources.

But one data integration vendor, Syncsort, is also focusing on speeding up data movement within Hadoop clusters. The Woodcliff Lake, N.J.-based company is developing a pluggable sort that it says significantly boosts the performance of MapReduce jobs by replacing the native sort function currently built into the Apache framework.


Specifically, Syncsort says its DMExpress Hadoop Edition, which includes the pluggable sort, is capable of “delivering up to 2x faster performance than native Hadoop deployments.” DMExpress Hadoop Edition also includes a graphical user interface that simplifies and speeds up the creation of MapReduce jobs, so developers need not write complicated scripts in Java or Pig.

Both the new sort capabilities and the GUI are another step toward making Hadoop, and MapReduce in particular, enterprise-ready. Indeed, two reasons there aren’t more Hadoop deployments supporting business-critical applications are (1) a lack of experienced Hadoop developers with the skills required to write MapReduce jobs and (2) a lack of the reliable, fast performance needed to meet tight service-level agreements. Syncsort’s latest modules address both issues.

Marketing intelligence firm comScore is currently using DMExpress Hadoop Edition to process and analyze multiple terabytes of online records per day.

The new sort feature is still a work in progress, however, and Syncsort is aiming for general availability (GA) in the fourth quarter. The company is also still waiting to hear whether and when the feature will be adopted by the Apache Hadoop project, a process that can drag on for weeks or months depending on the complexity of the new feature in question.

Another critical question for Syncsort is how it brings the sort feature to market. The company says it plans to keep its sort feature open to the community and is currently in discussion with one or more open source Hadoop distribution vendors. Syncsort may also enlist the help of services vendors like Cognizant and Accenture to spur adoption, according to the company.

Syncsort has a long and successful track record in the mainframe data integration and protection business, and its data integration accelerator is enjoying increased adoption as a way to boost the performance of traditional ETL jobs run with Informatica and other tools. Enterprises that are reluctant to adopt Hadoop to support mission-critical applications because of performance issues should keep an eye on Syncsort’s Hadoop platform. In particular, its pluggable sort feature has the potential to simplify and speed up MapReduce jobs and, in turn, improve Hadoop’s true business value (see video below).

