UPDATED 11:00 EDT / JULY 17 2023

MLCommons announces MedPerf, a new benchmark for validating medical AI models

The Medical Working Group of the open machine learning consortium MLCommons today announced the availability of a new and open benchmarking platform called MedPerf.

MedPerf matters, the group said, because it enables medical-focused artificial intelligence models to be validated on diverse, real-world healthcare data without revealing that data. The group hopes MedPerf’s availability will help to “catalyze wider adoption of medical AI,” resulting in more efficient and cost-effective clinical practices.

MLCommons is a collaborative engineering organization focused on developing the AI ecosystem through benchmarks, public datasets and research. It’s best known for its MLPerf AI benchmarks, which have become established as the AI industry standard for testing and validating AI models.

In an upcoming article in Nature Machine Intelligence, the Medical Working Group of MLCommons explains that medical AI has tremendous potential to advance healthcare by supporting and contributing to the evidence-based practice of medicine, personalizing patient treatment, reducing costs and improving both healthcare provider and patient experiences. However, one of the biggest challenges in unlocking this potential is the need for a systematic, quantitative method for evaluating the performance of AI models on large-scale, heterogeneous datasets that can capture a diverse range of patient populations.

MedPerf has been built to address this challenge, and the group claims it offers several benefits to the medical community. First, it delivers a consistent, rigorous methodology for quantitatively evaluating the performance of medical AI models in real-world applications.

MedPerf also provides researchers with a technical approach to quantify model generalizability across institutions, with full data privacy and protection of each model’s intellectual property, by ensuring that any data used never leaves the healthcare provider’s systems. In addition, its collaborative design method supports a neutral and scientific approach to clinical validation of AI, while simultaneously illuminating use cases where superior AI models can improve clinical efficiency.
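To make the idea concrete, here is a minimal Python sketch of evaluation in which data stays on site. It is an illustration under stated assumptions, not MedPerf's actual code or API: the names `Site`, `evaluate`, `federated_accuracy` and `threshold_model`, the toy records and the use of accuracy as the metric are all hypothetical. The model travels to each site, each site scores it locally, and only summary counts are returned.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical model type: maps one numeric feature to a 0/1 prediction.
Model = Callable[[float], int]


@dataclass
class Site:
    """Hypothetical healthcare provider that keeps its records local."""
    name: str
    records: Sequence[Tuple[float, int]]  # (feature, label) pairs; never shared

    def evaluate(self, model: Model) -> Dict[str, object]:
        # Scoring happens on site; only counts leave, not patient records.
        correct = sum(1 for x, y in self.records if model(x) == y)
        return {"site": self.name, "n": len(self.records), "correct": correct}


def federated_accuracy(model: Model, sites: List[Site]) -> Dict[str, object]:
    """Combine per-site reports into overall and per-site accuracy."""
    reports = [site.evaluate(model) for site in sites]
    total = sum(r["n"] for r in reports)
    correct = sum(r["correct"] for r in reports)
    per_site = {r["site"]: r["correct"] / r["n"] for r in reports if r["n"] > 0}
    overall = correct / total if total > 0 else float("nan")
    return {"overall": overall, "per_site": per_site}


def threshold_model(x: float) -> int:
    # Toy stand-in for a medical AI model.
    return 1 if x >= 0.5 else 0


# Toy data, invented for illustration only.
sites = [
    Site("hospital_a", [(0.9, 1), (0.2, 0), (0.6, 0)]),
    Site("hospital_b", [(0.4, 0), (0.7, 1)]),
]
result = federated_accuracy(threshold_model, sites)
print(result)
```

Comparing the per-site scores with the overall score gives a rough picture of how well a model generalizes across institutions, which is the kind of question the platform is described as addressing.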

MLCommons said its existing benchmarks have had a positive impact on AI development in multiple industries, and it believes a similar benchmark for medical AI will accelerate development in the healthcare industry. It also expects MedPerf to speed medical AI adoption by giving developers a way to better serve underrepresented patient populations.

“MedPerf aims to advance research related to data utility, model utility, robustness to noisy annotations, and understanding of model failures,” MLCommons explained. “If a critical mass of AI researchers adopts these benchmarking standards, healthcare decision makers will see substantial benefits from aligning with this effort to increase benefits for their patient populations.”

MedPerf itself has already been validated in multiple settings, including a key use case for the Federated Tumor Segmentation Challenge and four other academic pilot studies.

Image: Freepik
