Oracle Releases Big Data Appliance with Cloudera Distribution for Hadoop

Oracle Big Data Appliance

Today Oracle is releasing Oracle Big Data Appliance, which was first announced at Oracle OpenWorld last fall. Oracle also announced today that the appliance will include the Cloudera Distribution Including Hadoop and Cloudera Manager, along with a set of connectors for integrating data stored in Hadoop and/or Oracle NoSQL Database with Oracle Database 11g.

The Cloudera Distribution Including Hadoop is Cloudera’s open source Hadoop distribution, which bundles many of the other Hadoop-related tools from Apache. The project is similar to Apache Bigtop. Cloudera Manager is the company’s proprietary Hadoop management tool. Oracle’s announcement in the fall made no mention of a Cloudera partnership, and a slide shown during that announcement mentioned “Oracle Hadoop Tools” being included.

The appliance will run on Oracle Linux (which is based on Red Hat Enterprise Linux) and will also include an open source distribution of the R statistical programming language, Oracle NoSQL Database Community Edition (based on Berkeley DB) and the Oracle Java HotSpot Virtual Machine.

Here are the specs for the appliance:

18 Oracle Sun servers with a total of:

  • 864 GB main memory;
  • 216 CPU cores;
  • 648 TB of raw disk storage;
  • 40 Gb/s InfiniBand connectivity between nodes and other Oracle engineered systems; and,
  • 10 Gb/s Ethernet data center connectivity.
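For a sense of scale, the totals above can be broken down per server. The following is a rough sketch in Python, not a figure from Oracle: it assumes all 18 servers are configured identically, which the announcement does not state, and it assumes HDFS runs with its default replication factor of three. The variable and function names are made up for illustration, and the replicated-capacity figure ignores space needed for the operating system and temporary data.

```python
# Totals quoted in the spec list above.
NODES = 18
TOTAL_MEMORY_GB = 864
TOTAL_CORES = 216
TOTAL_RAW_DISK_TB = 648

# Assumption (not from the announcement): HDFS keeps three copies of
# each block, which is the stock default for dfs.replication.
HDFS_REPLICATION = 3


def per_node(total, nodes=NODES):
    """Split a cluster-wide total evenly across nodes (assumes identical servers)."""
    return total / nodes


memory_per_node_gb = per_node(TOTAL_MEMORY_GB)
cores_per_node = per_node(TOTAL_CORES)
disk_per_node_tb = per_node(TOTAL_RAW_DISK_TB)

# Upper bound on unique data the cluster could hold once every block
# is stored HDFS_REPLICATION times.
replicated_capacity_tb = TOTAL_RAW_DISK_TB / HDFS_REPLICATION

print(f"Memory per node: {memory_per_node_gb:g} GB")
print(f"Cores per node: {cores_per_node:g}")
print(f"Raw disk per node: {disk_per_node_tb:g} TB")
print(f"Capacity at {HDFS_REPLICATION}x replication: {replicated_capacity_tb:g} TB")
```

Under those assumptions each server works out to 48 GB of memory, 12 cores and 36 TB of raw disk, and the cluster could hold at most about 216 TB of unique data. These are the kinds of numbers to set against a comparable cluster of commodity servers.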

According to Oracle’s announcement, “The integrated Oracle and Cloudera architecture has been fully tested and validated by Oracle, who will also collaborate with Cloudera to provide support for Oracle Big Data Appliance.”

The Oracle Big Data Appliance joins EMC Greenplum’s MapR-based appliance in the Hadoop appliance market. Cloudera’s partnership with Oracle diverges from its partnerships with companies like Dell and SGI, who have been critical of the appliance approach. But this is a big win for Cloudera. I also count it as a win for Oracle customers, who will get Cloudera’s mature stack of tools instead of new tools from Oracle. Support for R is also an attractive proposition. The real question is whether it’s worth buying an appliance like this instead of building a cluster of commodity servers.