UPDATED 16:30 EST / JULY 29 2024

AI

Nvidia expands microservices library and support for 3D and robotic model creation

Nvidia Corp. announced today at the Siggraph conference in Denver that it’s significantly expanding its library of Nvidia Inference Microservices to encompass physical environments, advanced visual modeling and a wide variety of vertical applications.

Among the highlights are the availability of Hugging Face Inc.’s inference-as-a-service on the Nvidia cloud and expanded support for three-dimensional training and inferencing.

NIM is a set of containerized microservices delivered as part of the Nvidia AI Enterprise suite that simplifies and speeds artificial intelligence model deployment. Each is an optimized inference engine tailored for various hardware setups and accessible via application program interfaces to reduce latency and operational costs as well as improve performance and scalability. Developers can use NIMs to deploy AI applications quickly without extensive customization and to fine-tune models with proprietary data.

Nvidia said Hugging Face will offer inferencing-as-a-service on top of Nvidia’s DGX cloud, giving Hugging Face’s 4 million developers faster performance and easier access to serverless inferencing. Hugging Face provides a platform specialized for natural language processing and machine learning development and staging as well as a library of pre-trained models for NLP tasks such as text classification, translation and question answering. It also offers a large repository of datasets that are optimized for use with Transformers, an open-source Python library that provides resources for working with NLP models.

Nvidia announced generative physical AI advancements including its Metropolis reference workflow for building interactive visual AI agents. Metropolis is a collection of developer workflows and tools to build, deploy and scale and generative AI applications across all types of hardware. It also announced new NIM microservices that help developers train physical machines to handle complex tasks.

3D worlds

Today’s announcements include three new Fast Voxel Database NIM microservices that support new deep learning frameworks for three-dimensional worlds. FVDB is a new deep-learning framework for generating AI-ready virtual representations of the real world. It’s built on top of OpenVDB, an industry-standard library of structures and programs for simulating and rendering sparse volumetric data such as water, fire, smoke and clouds.

FVDB provides four times the spatial scale of prior frameworks, 3.5 times the performance and access to a large library of real-world datasets. It simplifies processes by combining functions that previously required multiple deep-learning libraries.

Also being announced are the three microservices — USD Code, USD Search and USD Validate — that use the Universal Scene Description open-source interchange format for creating arbitrary 3D scenes.

USD Code can answer OpenUSD knowledge questions and generate Python code, USD Search enables natural language access to massive libraries of OpenUSD 3D and image data. USD Validate checks the compatibility of uploaded files against OpenUSD release versions and generates a fully rendered path traced image using Omniverse cloud APIs.

“We built the world’s first generative AI models that can understand OpenUSD-based language, geometry, materials, physics and spaces,” said Rev Lebaredian, Nvidia’s vice president of Omniverse and simulation technology.

Physical AI support

Nvidia said its NIMs tailored for physical AI support speech and translation, vision and realistic animation and behavior. Visual AI agents use computer vision capabilities to perceive and interact with the physical world and perform reasoning tasks.

They’re powered by a new class of generative AI models called vision language models that enable enhanced decision-making, accuracy, interactivity and performance. Nvidia’s AI and DGX supercomputers can be used to train physical AI models and its Omniverse and OVX supercomputers can be applied to refine skills in a digital twin.

Applications include robotics, and in line with that, Nvidia said it will provide the world’s leading robot manufacturers, AI model developers and software makers a suite of services, models and computing platforms to develop, train and build the next generation of humanoid robotics (pictured).

Offerings include NIM microservices and frameworks for robot simulation and learning, the OSMO orchestration service for running multistage robotics workloads and an AI- and simulation-enabled teleoperation workflow that significantly reduces the amount of human demonstration data required to train robots.

Generative AI’s visual output is typically “random and inaccurate, and the artist can’t edit finite details exactly how they want,” Lebaredian said. “With Omniverse and NIM microservices, the designer or artist builds a ground-truth 3D scene that conditions the generative AI. They assemble their scene in Omniverse, which lets them aggregate brand approved assets like a Coke bottle and various models for props and the environment into one scene.”

Getty Images Holdings Inc.’s 4K image generation API and Shutterstock Inc.’s 3D asset generation will be available as Nvidia NIMs for image generation using text or image prompts. Both use Nvidia Edify, a multimodal architecture for visual generative AI.

“We’ve been investing in OpenUSD since 2016, making it, and therefore Omniverse, easier and faster for industrial enterprises and physical AI developers to develop performant models,” Lebaredian said. Nvidia has also been working with Apple Inc., which co-founded the Alliance for Open USD, to build a hybrid rendering pipeline stream from its Graphics Delivery Network to Apple Vision Pro. Software development kits and APIs that enable this on Omniverse are now available through an early access program.

Developers can use like NIM microservices and Omniverse Replicator to build generative AI-enabled synthetic data pipelines, addressing a shortage of real-world data that often limits model training.

Coming soon as NIMs or USD Layout, USD Smart Material and FDB Mesh Generation, which generates an OpenUSD-based mesh rendered by Omniverse APIs.

Image: Nvidia

A message from John Furrier, co-founder of SiliconANGLE:

Your vote of support is important to us and it helps us keep the content FREE.

One click below supports our mission to provide free, deep, and relevant content.  

Join our community on YouTube

Join the community that includes more than 15,000 #CubeAlumni experts, including Amazon.com CEO Andy Jassy, Dell Technologies founder and CEO Michael Dell, Intel CEO Pat Gelsinger, and many more luminaries and experts.

“TheCUBE is an important partner to the industry. You guys really are a part of our events and we really appreciate you coming and I know people appreciate the content you create as well” – Andy Jassy

THANK YOU