DDN’s hScaler is the “Best Appliance for Hadoop So Far” – Breaking Analysis

There’s plenty of Hadoop fun to go ’round at this week’s Strata conference, as Big Data comes into its own and companies find they’re ready to execute on years of research and exploration.  DataDirect Networks is joining the party with its latest storage solution, which it says runs Hadoop faster and more efficiently than commodity clusters. The company credits this feat of engineering to a number of homegrown technologies that are included in the offering.

The newly announced hScaler appliance leverages DDN’s SFA shared storage architecture to deliver an easy-to-manage, “robust and high performance server framework” that can improve analytics performance by up to seven times. More importantly, the box comes pre-integrated with an ETL engine that eliminates the fatal bottlenecks that plague traditional Hadoop deployments. Wikibon’s David Floyer explained the significance of this technology on this morning’s NewsDesk segment (full video below).

“One of the biggest problems in keeping the processes going are errors, errors [that stem] from a read that doesn’t go right or from a write that doesn’t complete correctly. If that happens in the traditional systems, you get that break and you then lose all of the pipeline – you have to throw away stuff you brought in, you have to go back to fix that particular I/O problem and then restart the pipeline. The result of that is much slower throughput,” Floyer says. “DDN has… a capability of correcting any I/O, either read or write, in flight.”
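
To make Floyer's point concrete, here is a toy model in Python that counts I/O operations under the two failure-handling strategies he describes. It is only an illustrative sketch, not DDN's implementation: the function names, the block counts, and the assumption that each faulty block fails exactly once are all hypothetical.

```python
def ops_with_pipeline_restart(num_blocks, failing_blocks):
    """Count I/Os when any error discards progress and restarts the pipeline.

    Assumption (hypothetical): each block in failing_blocks fails once,
    is fixed, and then succeeds on later passes.
    """
    pending_failures = set(failing_blocks)
    ops = 0
    while True:
        restarted = False
        for block in range(num_blocks):
            ops += 1  # one I/O attempt on this block
            if block in pending_failures:
                pending_failures.discard(block)  # the fault is fixed...
                restarted = True                 # ...but the work so far is thrown away
                break
        if not restarted:
            return ops


def ops_with_inflight_correction(num_blocks, failing_blocks):
    """Count I/Os when a failed read or write is simply retried in place."""
    failures = set(failing_blocks)
    ops = 0
    for block in range(num_blocks):
        ops += 1  # first attempt
        if block in failures:
            ops += 1  # retry only the failed I/O; the pipeline keeps going
    return ops


if __name__ == "__main__":
    print(ops_with_pipeline_restart(10, [4, 8]))     # 24
    print(ops_with_inflight_correction(10, [4, 8]))  # 12
```

In this model, two faults in a ten-block pipeline double the work under the restart strategy, while in-flight correction adds only one extra I/O per fault. Real pipelines are far longer, so the gap would be expected to grow accordingly.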

DataDirect says that hScaler can deliver 1.4 million sustained I/Os per second. The company also claims that the platform is four times denser than conventional systems, and says that integration with DirectMon makes it much easier (and thus cheaper) to manage.

Big Data is only one of the trends that caught DDN’s attention in recent months. Late last year the company set aside $100 million for the development of next-generation architectures that can support Exascale environments – that is, systems with more than 1,000 petabytes of capacity. Alternative solutions are needed because disk-based storage ceases to be cost-effective at this scale.