EMC today announced a new Big Data Analytics appliance that brings together the company’s Greenplum analytic database, its MapR-based Hadoop distribution and its Chorus development/collaboration platform under one roof.
The platform, called the Unified Analytics Platform, is more or less a repackaging and integration of existing technologies with the goal of providing Data Scientists, business analysts and, to some extent, business users a single environment to collaborate and conduct analysis on both structured and unstructured Big Data.
UAP will be available in Q1 2012 as either an on-premise or cloud appliance, or as a stand-alone software offering, according to Greenplum executives.
Speaking at a live webcast accompanying the announcement, EMC COO Pat Gelsinger said UAP makes it simple for skilled Data Scientists and less-savvy business users to collaborate on data exploration. It also allows enterprises to embed Big Data-enabled predictive analytics into business processes to better meet the needs of customers, à la Amazon.com. “Big Data is transforming business,” he said.
Scott Yara, a Greenplum Co-Founder and the company’s Senior Vice President of Products, told analysts in a pre-brief earlier this week that Greenplum had the vision for a UAP-type product for years, but was not in a position to deliver because the Greenplum database is not optimized to process unstructured data. Once acquired by EMC, the company had the resources and cachet to partner with MapR to produce its own Hadoop distribution, filling in this important gap.
The EMC/Greenplum Hadoop distribution, which uses MapR’s proprietary NFS as its storage layer, debuted in May, followed by the release of the latest version of the Greenplum Data Computing Appliance in September. UAP appears to be the next evolution of the platform, adding Chorus, which sports a Facebook-like UI, as well as unified management capabilities to the mix.
EMC Gets Aggressive on Big Data Analytics
EMC is clearly making an aggressive play at the Big Data market, having invested heavily in Big Data technology, training and services in the last year.
The new UAP is a comprehensive, compelling and easy-to-understand offering for EMC customers looking to leverage Big Data for competitive advantage. It offers the flexibility of either on-premise or in-the-cloud deployment, a single platform for processing both structured and unstructured data, and a unified environment for Data Scientists and others to collaborate and explore data. Its appliance model also means customers should be able to easily drop UAP into existing data centers and be off and running in short order. This is in contrast to IBM, which clearly has all the Big Data pieces but has yet to put forth a coherent strategy or streamlined offering, and Oracle, whose answer to just about every Big Data question is Exadata.
EMC should also be applauded for its efforts around Big Data Analytics education and training. The company held a well-received Data Scientist Summit in May, has built up an internal team of Data Scientists to consult with customers, and recently announced a new vendor-neutral training class aimed at helping statisticians, quants and others make the leap to full-fledged Data Scientist. Greenplum also plans to make available a 1,000-node Greenplum workbench cluster for Data Scientists to experiment with for free in Q1.
Having said that, EMC is clearly taking a largely proprietary approach to Big Data, which brings with it the risk of vendor lock-in for customers. Enterprises that base their Big Data Analytics practices around EMC’s UAP will soon find it difficult, if not impossible, to migrate to a competing platform or approach due to the technical challenges and costs involved.
Specifically, EMC’s Hadoop distribution, while it includes some important open source components, relies on MapR’s proprietary storage layer – the very heart of any Hadoop deployment. Enterprises that architect Big Data processes around NFS must perform significant re-architecting to migrate to HDFS, not to mention squander financial investments already made in EMC’s platform.
UAP will likely appeal to existing large enterprise EMC customers already heavily invested in the company’s storage, backup and recovery, and virtualization technology. But it’s hard to see why non-EMC customers, or even cash-strapped SMB EMC customers, wouldn’t at least experiment first with other open source, more mature, less expensive Big Data options such as Cloudera or Hortonworks.