MLCommons releases results of its latest MLPerf AI inference benchmark test
MLCommons today released the latest results of its MLPerf Inference benchmark test, which compares the speed of artificial intelligence systems from different hardware makers.
MLCommons is an industry organization that develops open-source AI tools. As part of its work, it runs benchmark tests that measure the speed of AI-optimized hardware systems, which helps data center operators compare the performance of different suppliers’ products when purchasing new hardware.
Today, MLCommons released the results from the latest installment of its MLPerf Inference test. MLPerf Inference is designed to compare how well a data center system performs inference, or the task of running an AI model that has already been trained.
More than 20 companies participated in the latest installment of the test. The participants included Nvidia Corp., the top supplier of graphics processing units for data centers, as well as Intel Corp. and several other major chipmakers.
The companies compared the speed of their AI systems by having them perform inference using six neural networks. The six neural networks are each focused on a different use case, namely image classification, object detection, medical image segmentation, speech-to-text, language processing and e-commerce recommendations.
The participants in the MLPerf Inference test generated 5,300 individual performance results, 37% more than in the previous round. They also submitted 2,400 measurements of how much electricity their systems used while performing inference.
Nvidia’s flagship data center GPU, the H100, set multiple performance records during the test. The H100 (pictured) can perform certain inference tasks up to 30 times faster than Nvidia’s previous flagship data center GPU. It features more than 80 billion transistors, as well as a range of machine learning optimizations not included in the company’s earlier products.
“In their debut on the MLPerf industry-standard AI benchmarks, NVIDIA H100 Tensor Core GPUs set world records in inference on all workloads, delivering up to 4.5x more performance than previous-generation GPUs,” Dave Salvator, a senior product marketing manager at Nvidia, detailed in a blog post today. “The H100, aka Hopper, raised the bar in per-accelerator performance across all six neural networks in the round.”
Compared with Nvidia’s previous-generation flagship GPU, the H100 delivered its most significant performance improvement when running the BERT-large neural network. BERT-large is a neural network optimized for natural language processing. It’s based on the Transformer architecture, an approach to designing AI models that is widely used in the field.
Nvidia’s H100 chip includes a module optimized specifically for running AI models based on the Transformer architecture. According to Nvidia, the module selectively lowers the numerical precision of calculations, which reduces the amount of data that neural networks have to process to produce results. The less data a neural network must process to complete a computation, the faster it can make decisions.
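As a rough illustration of that idea (not Nvidia’s actual Transformer Engine, which switches between precision formats in hardware), the short Python sketch below shows how casting a hypothetical layer’s data to a lower-precision format shrinks the number of bytes the hardware has to move and multiply.

```python
# Illustrative sketch only: lower numerical precision means less data to move
# and multiply. This contrasts FP32 and FP16 in NumPy; it is not Nvidia's
# hardware implementation.
import numpy as np

# Hypothetical activation and weight matrices for one layer.
activations_fp32 = np.random.rand(1024, 4096).astype(np.float32)
weights_fp32 = np.random.rand(4096, 4096).astype(np.float32)

# Casting to half precision halves the bytes that memory and compute units handle.
activations_fp16 = activations_fp32.astype(np.float16)
weights_fp16 = weights_fp32.astype(np.float16)

print(f"FP32 activations: {activations_fp32.nbytes / 1e6:.1f} MB")  # ~16.8 MB
print(f"FP16 activations: {activations_fp16.nbytes / 1e6:.1f} MB")  # ~8.4 MB

# The matrix multiply produces numerically close results either way; on
# accelerators with native low-precision units, the reduced-precision path runs faster.
out_fp32 = activations_fp32 @ weights_fp32
out_fp16 = (activations_fp16 @ weights_fp16).astype(np.float32)
print("Max difference:", np.abs(out_fp32 - out_fp16).max())
```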
The H100 is not the only product that Nvidia evaluated as part of the MLPerf Inference test. The company also tested the speed of its Jetson Orin system-on-chip, a power-efficient processor designed to power robots. The processor delivered up to five times the performance of Nvidia’s previous-generation product while using half as much electricity.
Image: Nvidia