UPDATED 21:04 EDT / MARCH 11 2026

Nvidia’s Nemotron 3 Super model for agentic systems launches with five times higher throughput

With so much talk about its upcoming Vera Rubin graphics processing units, it’s easy to forget that Nvidia Corp. doesn’t just supply the hardware for artificial intelligence.

It also develops its own series of AI models, and today it announced the availability of its most capable model so far. The company said Nemotron 3 Super is aimed at running complex agentic AI systems at large scale, combining advanced reasoning skills with rapid processing speeds to efficiently perform tasks that require extreme accuracy.

Nemotron 3 Super is a 120-billion-parameter open model based on a hybrid mixture-of-experts architecture. It combines three innovations to achieve up to five times higher throughput and twice the accuracy of the previous-generation Nemotron Super model, Nvidia said.

According to Nvidia, Nemotron 3 Super is designed to tackle two major constraints facing agentic AI systems that aim to automate complex tasks on behalf of their users. The first is an explosion of context. Nvidia said that multi-agent workflows typically generate up to 15 times more tokens than standard chat interactions, because at each turn the accumulated context, including tool outputs and intermediate reasoning, has to be resent to the model.
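The effect of resending context on every turn can be illustrated with a rough sketch. The function name, step count and tokens-per-step figure below are hypothetical and are not taken from Nvidia; the sketch only shows why the total number of tokens a model reads grows much faster than the number of new tokens produced, and it does not attempt to reproduce the 15-times figure.

```python
def tokens_processed(steps: int, tokens_per_step: int, resend_history: bool) -> int:
    """Total input tokens read by a model over a workflow of `steps` turns.

    Hypothetical model: every turn adds `tokens_per_step` new tokens
    (tool output, intermediate reasoning and so on).
    """
    total = 0
    history = 0
    for _ in range(steps):
        if resend_history:
            # The model re-reads everything accumulated so far plus the new tokens.
            total += history + tokens_per_step
        else:
            # Each token is read exactly once.
            total += tokens_per_step
        history += tokens_per_step
    return total


if __name__ == "__main__":
    steps, per_step = 10, 500  # assumed values, for illustration only
    once = tokens_processed(steps, per_step, resend_history=False)
    resent = tokens_processed(steps, per_step, resend_history=True)
    print(f"read once: {once}, with resending: {resent}, ratio: {resent / once:.1f}x")
```

Under these assumed numbers the total grows roughly with the square of the number of steps, which is why longer agentic workflows are hit hardest.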

The second constraint is known as the “thinking tax.” Complex agents must reason at every step of a task, which makes it impractical to use very large models for each step: The more parameters a model has, the more compute each step costs and the slower the model responds.

To get around these problems, Nemotron 3 Super has a 1 million-token context window that allows it to retain full workflow state in memory and prevent “goal drift,” Nvidia said. Moreover, only 12 billion of its 120 billion parameters are active during inference, which is the process of running trained models to generate predictions or produce conclusions on new data.
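The arithmetic behind the active-parameter figure is simple enough to check by hand. The sketch below uses the two parameter counts from the article; the estimate of about two floating-point operations per active parameter per token is a common rule of thumb and an assumption here, not a number from Nvidia.

```python
TOTAL_PARAMS = 120e9   # total parameters, from the article
ACTIVE_PARAMS = 12e9   # parameters active per token, from the article


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used for any single token."""
    return active / total


def forward_flops_per_token(active_params: float) -> float:
    """Rough forward-pass cost per token.

    Assumption: about 2 FLOPs (one multiply, one add) per active parameter.
    """
    return 2.0 * active_params


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    sparse = forward_flops_per_token(ACTIVE_PARAMS)
    dense = forward_flops_per_token(TOTAL_PARAMS)
    print(f"active share: {frac:.0%}")
    print(f"estimated compute vs. a dense 120B model: {sparse / dense:.0%}")
```

Note that this only describes compute per token. In a mixture-of-experts model, all of the experts generally still have to be held in memory, so the saving applies to processing cost rather than to model size.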

Nvidia said Nemotron 3 Super runs in NVFP4 precision on its Blackwell GPUs, which allows it to reduce its memory requirements and speed up inference by up to four times over what can be achieved on its previous-generation Hopper platform.
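To see why a 4-bit format such as NVFP4 reduces memory requirements, a back-of-the-envelope estimate of weight storage is enough. The helper below is a hypothetical sketch: it counts only the weights of a 120 billion-parameter model and ignores the scaling factors that block-scaled formats store alongside the values, as well as the KV cache and activations, so real footprints will be somewhat higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width,
    with no overhead for scaling factors or metadata.
    """
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    params = 120e9  # parameter count from the article
    for name, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{name}: about {weight_memory_gb(params, bits):.0f} GB of weights")
```

Halving the bit width halves the weight footprint, which also means fewer bytes have to be moved from memory for every generated token.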

Nemotron 3 Super can be downloaded from build.nvidia.com, OpenRouter and Hugging Face. In addition, Perplexity Inc. is making the model available in its AI search engine and in its “Computer” AI agent system. Generative AI coding applications such as CodeRabbit, Factory and Greptile are also adding the model to their lineups, while the life sciences organizations Edison Scientific and Lila Sciences will use it to power agents for data science, deep literature research and molecular understanding.

Companies including Amdocs Ltd., Palantir Technologies Inc., Cadence Design Systems Inc. and Dassault Systèmes SA are also using Nemotron 3 Super to automate workflows in telecommunications, cybersecurity, semiconductor design and manufacturing. Finally, Dell Technologies Inc. and Hewlett Packard Enterprise Co. will offer access to the model through their respective agent hubs.

The launch of Nemotron 3 Super comes ahead of Nvidia’s annual GTC conference, which is set to kick off next week on March 16, when the company is expected to reveal more about its next-generation GPU platforms among other announcements.

Image: Nvidia
