UPDATED 12:00 EDT / APRIL 28 2026


Nvidia introduces Nemotron 3 Nano Omni with vision and speech for powerful agentic AI use 

Nvidia Corp. today launched a powerful reasoning artificial intelligence model that unifies text, vision and speech, capable of acting as the “brains” of faster, smarter agentic AI applications. 

Dubbed Nemotron 3 Nano Omni, the new state-of-the-art model weighs in at about 30 billion parameters and uses a mixture-of-experts architecture to deliver very low latency with a high degree of flexibility and control.

Nvidia combined vision and audio encoders with its 30B-AD3B hybrid MoE architecture to eliminate the need for separate perception modules, allowing the model to handle perception and reasoning in a single network. The company said this design improves efficiency at scale and delivers up to nine times the throughput of other open omni models on the market.

“To build useful agents, you can’t wait seconds for a model to interpret a screen,” said Gautier Cloix, chief executive of H Company. “By building on Nemotron 3 Nano Omni, our agents can rapidly interpret full HD screen recordings — something that wasn’t practical before.” 

The result is lower cost and higher scalability. Thanks to its smaller size, the model can also be compressed enough to run on higher-end consumer hardware while still executing efficiently in enterprise cloud deployments.

The company said the model is designed to run alongside proprietary cloud models or other open Nvidia Nemotron models, such as Nemotron 3 Nano for high-frequency execution or Nemotron 3 Super for complex planning.

The new model allows for rapid understanding of documents, computer displays, voice activity, video and more. This makes it a suitable interface for working with people and a bridge to more complex machine states. It can take conversational replies from a user and quickly turn them into reasoning.

Nvidia said the Nemotron family – including Ultra, Super and Nano – has seen over 50 million downloads in the past year. The Omni variant extends the family’s capabilities into the multimodal and agentic domains. 

The new model is now available on Hugging Face, OpenRouter and build.nvidia.com as an Nvidia NIM microservice. As an open, lightweight model, it's also designed for developers to build on and deploy on local hardware such as the Nvidia DGX Spark.
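Since NIM microservices typically expose an OpenAI-compatible chat-completions API, a developer could query the model with a standard HTTP request. The sketch below assembles and optionally sends such a request; the base URL and model identifier are illustrative assumptions, not identifiers confirmed by Nvidia.

```python
# Hedged sketch: NIM microservices generally expose an OpenAI-compatible
# chat-completions endpoint. The base URL and model id below are
# assumptions for illustration, not confirmed identifiers.
import json
import urllib.request

BASE_URL = "https://integrate.api.nvidia.com/v1"   # assumed NIM endpoint
MODEL_ID = "nvidia/nemotron-3-nano-omni"           # hypothetical model id


def build_chat_request(prompt: str) -> dict:
    """Assemble an OpenAI-style chat-completions payload."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
        "max_tokens": 256,
    }


def send(payload: dict, api_key: str) -> dict:
    """POST the payload; requires a valid API key and network access."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Build a request locally; send() would dispatch it given a real key.
payload = build_chat_request("Summarize the key points of this transcript.")
print(payload["model"])
```

The same payload shape should work against any OpenAI-compatible host listed in the article, such as OpenRouter, by swapping the base URL and model id.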

Image: Nvidia
