Congratulations to MLPerf Inference v6.0


David Kanter

Apr 3, 2026, 12:05:27 PM
to inference, public

Hi Everyone,


Good morning! On behalf of our remarkable team and our entire community of submitters, it is my delight to announce the MLPerf Inference v6.0 results, released two days ago.


This is our first major benchmark release of the year, and I'm thrilled at the velocity and progress we've achieved. AI moves at a tremendous pace, and this round introduces five new models and updates one model for a lower-latency scenario.


We also set a new record for participation, with 24 organizations, five newly available processors, and exciting new entrants from both industry and academia. Huge thanks to all submitters for pushing boundaries and ensuring the benchmark remains essential for everyone deploying, tuning, and developing AI.


Check out the full results and details here:
MLPerf Inference v6.0 Results Blog

Help us spread the word! Please like, share, and repost on your favorite platform.

Here are some highlights from the latest v6.0 Results:

● Four new datacenter benchmarks: the GPT-OSS 120B open-weight LLM, DLRMv3 sequential recommendation, a vision-language model, and text-to-video generation

● A new edge benchmark: YOLOv11 single-shot object detection

● A DeepSeek R1 interactive scenario with robust participation and tighter latency constraints, reflecting real-world deployment in agentic and LLM-powered applications

● Llama 2 70B remains the most popular test, now drawing 24 submissions, with some systems showing 50% performance gains over the last round


I want to offer particular congratulations and a warm welcome to our new submitters: Inventec, Netweb Technologies, and Stevens Institute of Technology. A special shoutout to our academic participants; this benchmark is powered by all corners of the community.


Participants and Organizations: AMD, ASUSTeK, Cisco, CoreWeave, Dell, GATEOverflow, GigaComputing, Google, Hewlett Packard Enterprise, Intel, Inventec Corporation, KRAI, Lambda, Lenovo, MangoBoost, MiTAC, Nebius, Netweb Technologies India Limited, NVIDIA, Oracle, Quanta Cloud Technology, Red Hat, Stevens Institute of Technology, and Supermicro. 


Partners & Working Group Members: As always, MLPerf is a team activity, and this round in particular has a huge cast of heroes and heroines:

  • Miro Hodak & Frank Han, co-chairs of the Inference WG

  • Viraat Chandra (Task Force Chair, NVIDIA), Miro Hodak (AMD), Zhihan Jiang (NVIDIA), and Shobbit Verma (NVIDIA), for driving the DeepSeek update and GPT-OSS

  • Shang Wang (Task Force Chair, NVIDIA), Kshetrajna Raghavan (Shopify), and John Calderon (NVIDIA), who led the creation of the VLM benchmark with the Qwen model and the Shopify dataset

  • Linjian Ma (Task Force Chair, Meta), for creating DLRMv3

  • Tin-Yin Lai (Task Force Chair, NVIDIA) and Akshat Tripathi (Task Force Chair, KRAI), for creating the text-to-video benchmark

  • Manpreet Sokhi (Task Force Chair, Dell) and Dwith Chenna (AMD), for the creation of the YOLO benchmark

  • Scott Wasson and Reilly Fairbanks, who shepherded such a tremendous release to the finish line

  • Oana Balmau and Scott Wasson, who led the review process

  • Dave Graham and Lori Blonn, who drove the blog and handled our marketing and communications during one of the busiest pushes of the year

  • Kevin Schofeld, who wrote and revised the press release

  • Pablo Gonzalez Mesa, Tanvi Gour, Arav Agarwal, Joe Karmarcik, and Karl Pietri, for handling all our IT and infrastructure


Explore the full results and technical details at MLCommons: https://mlcommons.org/2026/04/mlperf-inference-v6-0-results/


Again, congrats to everyone. Let's keep doing our part to make AI faster, more capable, and more energy efficient. Let me know if you have any questions or feedback!



Thanks,



David Kanter

Founder, Head of MLPerf

MLCommons
