

Artificial intelligence can be extremely resource-intensive. Generally, AI practitioners seek out the fastest, most scalable, most power-efficient and lowest-cost hardware, software and cloud platforms to run their workloads.
As the AI arena shifts toward workload-optimized architectures, there’s a growing need for standard benchmarking tools to help machine learning developers and enterprise information technology professionals assess which target environments are best suited for any specific training or inferencing job. Historically, the AI industry has lacked reliable, transparent, standard and vendor-neutral benchmarks for flagging performance differences between different hardware, software, algorithms and cloud configurations that might be used to handle a given workload.
In a key AI industry milestone, the newly formed MLPerf open-source benchmark group last week announced the launch of a standard suite for benchmarking the performance of ML software frameworks, hardware accelerators and cloud platforms. The group — which includes Google, Baidu, Intel, AMD and other commercial vendors, as well as research universities such as Harvard and Stanford — is attempting to create an ML performance-comparison tool that is open, fair, reliable, comprehensive, flexible and affordable.
Available on GitHub and currently in preliminary release 0.5, MLPerf provides reference implementations for some bounded use cases that predominate in today’s AI deployments:
The first MLPerf release focuses on ML training benchmarks applicable to these jobs. Currently, each MLPerf reference implementation addressing a particular AI use case provides the following:
The MLPerf group has published a repository of reference implementations for the benchmark. Reference implementations are valid as starting points for benchmark implementations, but they are not fully optimized and are not intended to be used for performance measurements on target production AI systems. Currently, the published MLPerf benchmarks have been tested on the following reference configuration:
The MLPerf group plans to release each benchmark (a specific problem to be solved using specific AI models) in two modes:
Each benchmark runs until the target metric is reached, at which point the tool records the result. The MLPerf group currently publishes benchmark metrics in terms of average “wall clock” time needed to train a model to a minimum quality. The tool takes job costs into consideration only so long as prices do not vary with the time of day at which the jobs are run. For each benchmark, the target metric is based on the original publication result, minus a small delta to allow for run-to-run variance.
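To make that scoring rule concrete, here is a minimal Python sketch of a run-until-target timing loop. It is an illustration only, not MLPerf's actual harness code: the train_one_epoch, evaluate and target_quality names are hypothetical stand-ins for whatever a given reference implementation supplies.

```python
import time

def run_benchmark(train_one_epoch, evaluate, target_quality, max_epochs=100):
    """Hypothetical sketch: train until a target quality metric is reached,
    then report elapsed wall-clock time, roughly mirroring how an MLPerf
    training result is scored."""
    start = time.time()
    for _ in range(max_epochs):
        train_one_epoch()              # one pass over the training data
        quality = evaluate()           # e.g., top-1 accuracy or BLEU score
        if quality >= target_quality:  # target = published result minus a small delta
            return time.time() - start # wall-clock seconds to reach target quality
    raise RuntimeError("Target quality not reached within max_epochs")
```

In practice a benchmark harness would also fix random seeds, log intermediate quality checkpoints and average results over several runs to smooth out run-to-run variance.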
The MLPerf group plans to update published benchmark results every three months. It will publish a score that summarizes performance across its entire set of closed and open benchmarks, calculated as the geometric mean of results for the full suite. It will also report the power consumed by mobile devices and on-premises systems to execute benchmark tasks, as well as the cost for cloud-based systems performing those tasks.
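For illustration, the following Python sketch shows how a suite-wide score could be aggregated as a geometric mean of per-benchmark results; the summary_score function and its example inputs are hypothetical, not MLPerf's published scoring code.

```python
import math

def summary_score(results):
    """Geometric mean of per-benchmark results, mirroring the suite-wide
    aggregation MLPerf describes (function name and inputs are hypothetical)."""
    return math.prod(results) ** (1.0 / len(results))

# Example with three arbitrary per-benchmark results:
print(summary_score([2.0, 8.0, 4.0]))  # -> approximately 4.0
```

A geometric mean is a natural choice here because it weights every benchmark's relative improvement equally, so a large gain on one workload cannot mask regressions on the others.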
The next version of the benchmarking suite, slated for August release, will run on a range of AI frameworks. Subsequent updates will add support for inferencing workloads, eventually extending to those running on embedded client systems. The group plans to incorporate any benchmarking advances developed in “open” benchmarks into future versions of the “closed” benchmarks, and to evolve the reference implementations to incorporate more hardware capacity and optimized configurations for a range of workloads.
MLPerf is not the first industry framework for benchmarking AI platforms’ performance on specific workloads, though it certainly has the broadest participation and the most ambitious agenda. Going forward, Wikibon expects these established benchmarking initiatives to converge or align with MLPerf: