UPDATED 09:50 EDT / JULY 25 2016

Wikibon says it’s too early for Big Data performance benchmarks

Several Wikibon clients have asked about performance benchmarks for Big Data systems. The problem, writes Wikibon Big Data and Analytics Analyst George Gilbert, is that the technology is too immature. Meaningful benchmarks are based on standard workloads, but a “common Big Data workload remains an alien concept,” he adds. Yahoo! Inc., for instance, created the Yahoo! Cloud Serving Benchmark (YCSB) in 2010 to benchmark key-value scale-out NoSQL databases. That standard seems to be losing favor, however, because these databases are deployed in widely varying scenarios that it does not cover.

Apache Spark is particularly difficult to benchmark because each new point release enables new classes of complex workloads, “which are neither cheap nor easy to translate into benchmarks,” Gilbert contends. The most commonly used benchmark for Big Data systems today is the Transaction Processing Performance Council’s TPC-DS 2.x, which is designed to benchmark SQL decision support and can be targeted at Hadoop. However, the products being tested are so immature that “none that we know of actually can run all 90+ queries in the TPC-DS test suite unmodified.”

In general, attempts to apply existing standard benchmarks to Big Data workloads have been ineffective, and the product benchmarks that have been published typically aren’t useful. Because the technology has not yet matured to the point of having standard workloads, customers find that published benchmarks often do not apply to their situations.

The full report, which is available to Wikibon Premium subscribers, looks at published benchmarks for several prominent products and points out the serious weaknesses in each. Gilbert recommends that users who need benchmarks should run their own stylized workloads based on their intended usage scenarios and not expect those to compare closely to the workloads or experience of other users.

