UPDATED 09:00 EDT / APRIL 23 2025

Nvidia announces general availability of NeMo tools for building AI agents

Nvidia Corp. today announced the general availability of NeMo microservices, a set of tools designed to help developers get artificial intelligence agents up and running faster by tapping into AI inference and information systems at scale.

Agents have become a focal point for creating “digital teammates” capable of driving workforce productivity for knowledge and service workers by taking orders, discovering information and doing proactive work.

Unlike AI chatbots, agents can take autonomous actions with little or no human oversight, but they need data to make accurate and efficient decisions as part of their reasoning. This is particularly true when the data is proprietary knowledge, which might be locked behind company firewalls, or rapidly changing real-time information.

“Without a constant stream of high-quality inputs — from databases, user interactions or real-world signals — an agent’s understanding can weaken, making responses less reliable, which makes agents less productive,” said Joey Conway, senior director of generative AI software for enterprise at Nvidia.

To help developers rapidly build and deploy agents, Nvidia is releasing NeMo microservices, including Customizer, Evaluator, Guardrails, Retriever and Curator. They are designed to make it easier for enterprise AI engineers to build agentic AI applications that scale and access data.

Customizer assists with large language model fine-tuning by providing up to 1.8 times higher training throughput. It provides an application programming interface that allows developers to curate models rapidly so that the models fit a dataset before they are deployed. Evaluator simplifies the evaluation of AI models and workflows against custom and industry benchmarks with just five API calls.

Guardrails runs atop an AI model or agent to keep it from behaving in a way that is either unsafe or out of bounds. It can provide additional compliance with 1.4x efficiency and only a half-second more latency. Retriever, announced at GTC 2025, allows developers to build agents that can extract data from systems and accurately process it, enabling them to build complex AI data pipelines such as retrieval-augmented generation.

“NeMo microservices are easy to operate and can run on any accelerated computing infrastructure, both on-premises and the cloud, while providing enterprise-grade security, stability and support,” added Conway.

Nvidia designed the NeMo tools so that developers with general AI knowledge can access them via API calls to get AI agents up and running. Enterprises are now beginning to build complex multi-agent systems in which hundreds of expert agents collaborate to achieve unified goals while working alongside human teammates.

Broad support for numerous models and partners

NeMo microservices support a large number of popular open AI models, including Meta Platforms Inc.’s Llama, Microsoft Corp.’s Phi family of small language models, Google LLC’s Gemma, and Mistral.

Nvidia’s Llama Nemotron Ultra, currently ranking as the top open model on scientific reasoning, coding and complex math benchmarks, is also accessible through the microservices.

Numerous leading AI service providers, including Cloudera Inc., Datadog Inc., Dataiku, DataRobot Inc., DataStax Inc., SuperAnnotate AI Inc. and Weights & Biases Inc., have included NeMo microservices in their platforms. Developers can start using these microservices in their processes today through popular AI frameworks such as CrewAI, Haystack by Deepset, LangChain, LlamaIndex and Llamastack.

Using the new NeMo microservices, Nvidia partners and tech companies have built AI agent platforms and onboarded digital teammates to get more work done.

For example, AT&T Inc. used NeMo Customizer and Evaluator to increase AI agent accuracy by fine-tuning a Mistral 7B model for personalized services, preventing fraud and optimizing network performance. And BlackRock Inc. is working with the microservices in its Aladdin tech platform to unify investment management through a common data language.

Image: Nvidia
