UPDATED 07:38 EDT / JANUARY 14 2014

Cloudera: Impala is faster than Hive, and here are the numbers to prove it

Cloudera is on a journey to make Hadoop enterprise-ready, and central to that mission is the integration of enterprise capabilities into the Big Data stack. That includes high availability, recoverability and business intelligence features like structured query, a gap that the company is working to bridge with Impala. Launched in October 2012 and released for general availability in May 2013, the open source SQL-on-Hadoop solution is considerably faster than Hive and, according to a new internal benchmark, faster than a leading DBMS as well.

In the first part of the benchmark test, Cloudera pitted Impala 1.1.1 against the latest release of the Hive data warehouse, which runs on YARN, in a 3TB environment consisting of five Hadoop nodes, each with an 8-core processor and 96GB of memory. Impala outperformed Hive by a factor of 6 to 69 across three categories: interactive query, reporting and deep analysis.

Afterwards, Impala 1.2.2 and a parallel relational database, referred to only as DBMS-Y due to “restrictive proprietary licensing agreement” terms, were run against 30 terabytes of TPC-DS data on a 20-node cluster. The SQL engine emerged victorious once again, outperforming the commercial database by a factor of two on average, though it came in behind on 3 of the 20 queries tested.

“Interactive exploratory business intelligence is a mainstay workload of the Enterprise Data Hub,” noted Mike Olson, the founder and chairman of Cloudera. “One year ago, when we released Impala to open source, we knew that it had the potential to eventually play on the same field as some very mature analytic DBMSs, but the results of these performance benchmark tests exceed our very high expectations.”

Noticeably absent from the study are rival SQL-on-Hadoop solutions like Hadapt and the Hortonworks-sponsored Stinger project, but Cloudera maintains that Impala is the “fastest, most functional and proven way to run SQL on Hadoop data,” with more than 5,000 corporate users in various industries.

image source Cloudera
