

Nvidia Corp. is looking to capitalize on the agentic artificial intelligence trend not only by providing the underlying infrastructure, but also by supplying the models that power these next-generation autonomous agents.
At its GTC 2025 annual conference today, the company unveiled a new family of Llama Nemotron AI models with advanced reasoning capabilities. Based on Meta Platforms Inc.’s renowned open-source Llama models, they’re designed to provide developers with a strong foundation on which they can build advanced AI agents that perform tasks on behalf of their users with minimal supervision.
Nvidia explained that it took Meta's Llama models and refined them with post-training techniques to strengthen their multistep math, coding, complex decision-making and reasoning skills. Thanks to those refinements, Nvidia claims the Llama Nemotron models are 20% more accurate than the Llama models they're based on, while running inference five times faster, enabling them to handle more complex tasks at lower operational cost.
The Llama Nemotron models are being made available through Nvidia’s NIM microservices platform in three sizes – Nano, Super and Ultra – optimized for different kinds of applications.
According to Nvidia, Llama Nemotron Nano is designed for deployment on personal computers, edge devices and other low-powered hardware. The Super model is optimized to run on a single graphics processing unit, while the Ultra version targets maximum performance on multi-GPU servers.
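NIM microservices expose an OpenAI-compatible HTTP interface, so a locally deployed Nemotron model can in principle be queried with a few lines of client code. The sketch below is illustrative only: the endpoint, port and model identifier are assumptions for the example, not official values from the announcement.

```python
# Minimal sketch: querying a locally deployed Llama Nemotron NIM microservice.
# The base URL, port and model name below are assumed placeholders, not official values.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local NIM endpoint
    api_key="not-needed-for-local-nim",   # local deployments typically don't require a key
)

response = client.chat.completions.create(
    model="llama-nemotron-nano",  # hypothetical model identifier; check the NIM catalog
    messages=[
        {"role": "system", "content": "You are a reasoning assistant."},
        {"role": "user", "content": "Plan the steps to reconcile two expense reports."},
    ],
    temperature=0.2,
)

print(response.choices[0].message.content)
```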
The company said it carried out its post-training refinements using the Nvidia DGX Cloud platform with high-quality synthetic data from Nvidia Nemotron, plus its own proprietary, curated datasets. To keep things open, it’s making these datasets, the tools it used and details of its post-training optimization techniques publicly available, so everyone can see the improvements and develop their own foundational reasoning models.
Although the models were announced only today, Nvidia has already amassed an impressive list of partners that are using Llama Nemotron to create powerful new AI agents. For instance, Microsoft Corp. is making the models available on its cloud-based Azure AI Foundry service, where they'll also be listed as an option for customers creating new agents with the Azure AI Agent Service for Microsoft 365.
Another partner, SAP SE, is utilizing the Llama Nemotron models to improve the capabilities of its AI assistant Joule and its SAP Business AI solutions portfolio. And others, including Accenture Plc, Atlassian Corp., Box Inc. and ServiceNow Inc., are also working with Nvidia to ensure their customers can access the Llama Nemotron models.
Of course, for anyone looking to create AI agents, the underlying large language models are just one part of the equation. There’s also the infrastructure to consider, the tools needed to piece them together, the all-important data pipelines to provide them with knowledge, and much more.
Nvidia is catering to most of these needs, announcing a host of additional agentic AI building blocks at GTC 2025 today.
They include the new Nvidia AI-Q Blueprint, which is a framework that enables developers to connect knowledge bases to AI agents that can act autonomously. The Blueprint was built with Nvidia NIM microservices and integrates with Nvidia NeMo Retriever, making it simple for AI agents to retrieve multimodal data in various formats.
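The pattern the Blueprint describes, retrieving relevant knowledge and handing it to a reasoning agent, can be sketched generically. The in-memory retriever below is a toy stand-in for NeMo Retriever, included purely to illustrate the retrieve-then-reason flow; it is not the Blueprint's actual API.

```python
# Illustrative sketch of a retrieve-then-reason loop; the retriever is a toy
# in-memory stand-in, not NeMo Retriever or the AI-Q Blueprint itself.
from difflib import SequenceMatcher

KNOWLEDGE_BASE = [
    "Invoices over $10,000 require two approvals.",
    "Travel expenses must be filed within 30 days.",
    "Contractors are paid net-45 from invoice receipt.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by a crude string-similarity score (placeholder for a real retriever)."""
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda doc: SequenceMatcher(None, query, doc).ratio(),
                    reverse=True)
    return ranked[:k]

def build_agent_prompt(task: str) -> str:
    """Ground the agent's next reasoning step in retrieved context before calling the model."""
    context = "\n".join(retrieve(task))
    return f"Context:\n{context}\n\nTask: {task}\nRespond with the steps you would take."

print(build_agent_prompt("When must an employee submit travel expenses?"))
```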
Meanwhile, the new Nvidia AI Data Platform is a customizable reference design that's being made available to the world's leading storage providers. The idea is to help storage infrastructure providers such as Dell Technologies Inc., Hewlett Packard Enterprise Co., Hitachi Vantara, IBM Corp., NetApp Inc., Nutanix Inc., Vast Data Inc. and Pure Storage Inc. develop more efficient data platforms for agentic AI inference workloads.
By combining highly optimized storage resources with Nvidia's accelerated computing hardware, the company says developers will see major performance gains in AI reasoning, because information will flow smoothly from database to model.
There are also updated Nvidia NIM microservices, which optimize agentic AI inference to support continuous learning and adaptability. Using these microservices, customers will be able to reliably deploy the latest and most powerful agentic AI models, including Nvidia's Llama Nemotron and alternatives from the likes of Meta, Microsoft and Mistral AI.
Finally, Nvidia said it’s enhancing its NeMo microservices, which provide a framework for developers to build robust and efficient data flywheels. This is key to ensuring that AI agents can learn continuously based on both human- and AI-generated feedback.
Sticking with AI agents, Nvidia also revealed it's expanding its partnership with Oracle Corp. to bring agentic AI to Oracle Cloud Infrastructure. Under the deal, Nvidia is bringing its accelerated GPUs and inference software to Oracle's cloud infrastructure and making them compatible with Oracle's generative AI services.
The move will help accelerate AI agent development on OCI. All told, more than 160 Nvidia AI tools and NIM microservices are now available natively through the OCI console. The companies also announced they're working to accelerate vector search on the Oracle Database 23ai platform.
Moving away from its focus on AI agents, Nvidia also provided an update on its expanded collaboration with Google LLC, revealing a series of initiatives that aim to improve access to AI and its underlying tooling.
Nvidia said it will become the first organization to leverage Google DeepMind’s SynthID, which directly embeds digital watermarks into AI-generated images, video and text. That helps preserve the integrity of AI outputs. SynthID is initially being integrated with Nvidia’s Cosmos World foundation models, where it will provide safeguards against misinformation and wrongful attribution.
Elsewhere, Nvidia worked with Google DeepMind researchers to optimize Gemma, a family of open-source, lightweight AI models, to run on its GPUs, and the two companies are also collaborating on an initiative to build AI-powered robots with grasping skills, among various other projects.
“It’s a great joy to see Google and Nvidia researchers and engineers collaborate to solve incredible challenges, from drug discovery to robotics,” said Nvidia Chief Executive Jensen Huang.