UPDATED 18:00 EST / MARCH 18 2024

Nvidia’s new microservices APIs promise to speed up AI development

Nvidia Corp. said today it’s adding a new, microservices-based layer to its popular Nvidia AI Enterprise platform, giving generative artificial intelligence model developers, platform providers and others the ability to run custom AI models in any location.

The new microservices were announced at the Nvidia GTC 2024 conference in San Jose alongside a major update to the Nvidia Edify platform, which is a multimodal architecture for visual generative AI workloads. Nvidia Edify gains 3D asset generation capabilities, alongside more controls over generative AI image generation, the company said.

Generative AI microservices to simplify model deployment

Nvidia said the Nvidia NIM microservices, built atop its Nvidia CUDA platform, enable the optimized inference of popular generative AI models from both itself and its partner ecosystem. Besides NIM, it also announced the launch of new, Nvidia CUDA-X microservices-based accelerated software development kits, libraries and tools for tasks such as retrieval-augmented generation, guardrails, data processing, high-performance computing and more.

The company explained that NIM microservices are prebuilt containers for reducing the deployment time of inference software such as Triton Inference Server and TensorRT-LLM from weeks to a matter of minutes. The microservices come with industry-standard application programming interfaces for AI domains such as language and drug discovery, and make it simpler for developers to build AI applications that can leverage their own data, on any platform, including cloud servers, on-premises systems and even workstations and laptops.

The NIM microservices cater to Nvidia’s own catalog of models, as well as those from partners such as AI21 Labs Inc., Adepts Inc., Cohere Inc., Getty Images Holdings Inc. and Shutterstock Inc., plus open-source models from the likes of Meta Platforms Inc., Hugging Face Inc., Stability AI Ltd. and Google LLC.

The company said customers can access NIM microservices via the Nvidia AI Enterprise platform, as well as Microsoft Azure AI, Google Cloud Vertex AI, Google Kubernetes Engine and Amazon SageMaker, and integrate with AI frameworks, including LangChain, LlamaIndex and Deepset.

Early access customers include ServiceNow Inc., which is using NIM to develop and deploy a series of more cost-effective, domain-specific generative AI copilots. Others include companies such as Adobe Inc., CrowdStrike Holdings Inc., Getty Images, SAP SE and Shutterstock.

Nvidia co-founder and Chief Executive Jensen Huang said the NIM microservices can be thought of as the building blocks enterprises need to become AI companies. “Established enterprise platforms are sitting on a goldmine of data that can be
transformed into generative AI copilots,” he explained.

CUDA-X Microservices for RAG, Data Processing, Guardrails, HPC

As for the CUDA-X microservices, they provide the building blocks for essential AI development tasks such as data preparation, customization and training development. They include Nvidia Riva, which is a microservice for customized speech and translation AI, Nvidia cuOptCUDA-X Microservices for RAG, data processing, guardrails, HPC for routing optimization and Nvidia Earth-2 for high resolution climate and weather forecasting.

Others include NeMo Retriever microservices, which make it simple to link AI applications with proprietary business data so they can generate more accurate and contextually relevant responses. These RAG capabilities enable organization to feed more data to their copilots, chatbots and generative AI productivity tools, the company said.

The Nvidia CUDA-X microservices are also available Nvidia AI Enterprise 5.0, and will be supported on public cloud infrastructure platforms, on-premises server systems, including Nvidia-certified systems from the likes of Dell Technologies Inc., Hewlett Packard Enterprise Co. and Lenovo Group Ltd. They’re also compatible with infrastructure software platforms such as VMware Inc.’s Private AI Foundation with Nvidia and Red Hat OpenShift. AI and machine learning ecosystem partners, including Anyscale Inc., Dataiku Inc. and Weights & Biases Inc., are also adding support for CUDA-X microservices.

Nvidia Edify enhances visual generative AI development

Nvidia said its Edify platform for visual generative AI models is being enhanced with various new APIs that enable superior control for image and scene generation. The Edify AI Models can be accessed as an API through Nvidia NIM or via Nvidia Picasso, which is an AI development foundry built on the Nvidia DGX Cloud platform.

One early customer is the livestreaming platform BeLive Studios Ltd., which has used Nvidia Picasso and Edify to create real-time generative AI that automates the creation of visual scenes.

Shutterstock is another early adopter, partnering with HP Inc. to demonstrate how Edify 3D can enhance customized 3D printing with various AI-generated designs. This, the company said, will enable designers to quickly iterate on new prototypes to aid in product design. In addition, Shutterstock has also created an Edify-powered tool to light 3D scenes using 360-degree HDRi environments generated from text and image prompts.

Meanwhile, Getty Images has announced new Edify-powered APIs to enhance its image generation AI tools. They include an API for inpainting, which enables users to add, remove or replace objects in an image, plus outpainting, which can be used to extend the creative canvas. Getty Images is also adding Edify-based APIs that provide more control over generative AI image output and the ability to fine-tune the Edify foundation models to a company’s brand and visual style.

Other APIs offered by Getty deliver sketch, depth and segmentation features, allowing users to provide a sketch as a prompt, follow the composition of reference images with a depth map and segment parts of an image to add, remove or retouch any character or object.

The new Edify APIs are designed to give users much greater control and flexibility over the output of image-based generative AI tools, making them much more viable for creative design processes.

Images: Nvidia

A message from John Furrier, co-founder of SiliconANGLE:

Support our mission to keep content open and free by engaging with theCUBE community. Join theCUBE’s Alumni Trust Network, where technology leaders connect, share intelligence and create opportunities.

15M+ viewers of theCUBE videos, powering conversations across AI, cloud, cybersecurity and more
11.4k+ theCUBE alumni — Connect with more than 11,400 tech and business leaders shaping the future through a unique trusted-based network.

About SiliconANGLE Media

SiliconANGLE Media is a recognized leader in digital media innovation, uniting breakthrough technology, strategic insights and real-time audience engagement. As the parent company of SiliconANGLE, theCUBE Network, theCUBE Research, CUBE365, theCUBE AI and theCUBE SuperStudios — with flagship locations in Silicon Valley and the New York Stock Exchange — SiliconANGLE Media operates at the intersection of media, technology and AI.

Founded by tech visionaries John Furrier and Dave Vellante, SiliconANGLE Media has built a dynamic ecosystem of industry-leading digital media brands that reach 15+ million elite tech professionals. Our new proprietary theCUBE AI Video Cloud is breaking ground in audience interaction, leveraging theCUBEai.com neural network to help technology companies make data-driven decisions and stay at the forefront of industry conversations.

Nvidia’s new microservices APIs promise to speed up AI development

Generative AI microservices to simplify model deployment

CUDA-X Microservices for RAG, Data Processing, Guardrails, HPC

Nvidia Edify enhances visual generative AI development

Images: Nvidia

A message from John Furrier, co-founder of SiliconANGLE:

LATEST FROM THECUBE

UPCOMING CUBE EVENTS

RECENT CUBE EVENTS

CES 2026

AWS re:Invent 2025

Microsoft Ignite 2025

SC25

Refresh North America 2025

Nvidia’s new microservices APIs promise to speed up AI development

Generative AI microservices to simplify model deployment

CUDA-X Microservices for RAG, Data Processing, Guardrails, HPC

Nvidia Edify enhances visual generative AI development

Images: Nvidia

A message from John Furrier, co-founder of SiliconANGLE:

LATEST STORIES

LATEST STORIES

CES 2026

AWS re:Invent 2025

Microsoft Ignite 2025

SC25

Refresh North America 2025

Cookies