UPDATED 11:00 EDT / NOVEMBER 15 2023

AI

Nvidia announces generative AI foundry on Azure and new foundational models

Pushing the envelope for generative artificial intelligence model training and deployment workflows, Nvidia Corp. today said that it’s bringing an AI foundry to Microsoft Corp.’s Azure cloud that will allow any customer to roll their own chatbot, predictive or image-generating AI.

The company announced the offering during Microsoft’s Ignite 2023 conference today, calling it a generative AI factory provided by Nvidia and hosted on Azure that brings together the company’s best-in-breed AI enterprise infrastructure and tools for developer productivity.

“This entire end-to-end workflow with all of its pieces, all of the infrastructure and software, are now available on Microsoft Azure,” said Manuvir Das, vice president of enterprise computing at Nvidia. “This means that any customer can come into Azure and procure the required components from Azure marketplace.”

Check out our full coverage of Microsoft Ignite in these stories:

The offering includes the DGX Cloud platform, which is a powerful cloud hardware offering that allows companies to provision and run on-demand AI workloads on cloud infrastructure, now available through the Azure Marketplace. Companies can spin up DGX Cloud instances with access to eight A100 80-gigabyte graphical processing units capable of multinode scale, providing extreme power for AI processes such as training and fine-tuning. DGX Cloud was previously available on Oracle Cloud.

Nvidia also announced that it plans to add even more computing power to Azure by bringing its newly announced H200 Tensor Core GPU on board next year to support even larger workloads. This new GPU model is purpose-built for the biggest AI needs, including LLMs and generative AI models. In comparison to previous generations, it brings 141 gigabytes of HBM3e memory, 1.8 times more, and is capable of 4.8 terabits per second of peak memory bandwidth, an increase of 1.4 times.

Nvidia-optimized AI foundation models

To support the acceleration of custom-built generative AI models by the industry, Nvidia announced the release of its own family of generative AI foundation models, called Nemotron-3 8B, and endpoints for optimized open-source models.

The Nemotron-3 8B family is a set of 8 billion parameter LLM models that are optimized to run on Nvidia hardware designed for industry customers looking to build secure, accurate generative AI applications. The models are multilingual and the company said that they were “trained on responsibly sourced datasets” and capable of performance comparable to much larger models for enterprise deployments.

Out of the box, the Nemotron-3 8B models are proficient in over 50 different languages, including English, German, Russian, Spanish, French, Japanese, Chinese, Korean, Italian and Dutch.

“The new Nvidia family of Nemotron-3 8B models are also included to support the creation of today’s most advanced enterprise chat and Q&A applications, for a broad range of industries including healthcare, telecommunications and financial services,” said Erik Pounds, senior director of AI software at Nvidia.

The company also provides a curated set of optimized models for users to access including common community-sourced models such as Meta Platform Inc.’s Llama 2, Paris-based startup Mistral AI’s Mistral and the image-generating model Stable Diffusion XL.

Foundation models are designed to be customized with proprietary data from enterprise customers to fine-tune them with their own data to specialize them to their own specific use cases. Once tailored to their specific needs they can be deployed virtually anywhere to be used in AI-based applications.

Image: Nvidia

A message from John Furrier, co-founder of SiliconANGLE:

Support our open free content by sharing and engaging with our content and community.

Join theCUBE Alumni Trust Network

Where Technology Leaders Connect, Share Intelligence & Create Opportunities

11.4k+  
CUBE Alumni Network
C-level and Technical
Domain Experts
15M+ 
theCUBE
Viewers
Connect with 11,413+ industry leaders from our network of tech and business leaders forming a unique trusted network effect.

SiliconANGLE Media is a recognized leader in digital media innovation serving innovative audiences and brands, bringing together cutting-edge technology, influential content, strategic insights and real-time audience engagement. As the parent company of SiliconANGLE, theCUBE Network, theCUBE Research, CUBE365, theCUBE AI and theCUBE SuperStudios — such as those established in Silicon Valley and the New York Stock Exchange (NYSE) — SiliconANGLE Media operates at the intersection of media, technology, and AI. .

Founded by tech visionaries John Furrier and Dave Vellante, SiliconANGLE Media has built a powerful ecosystem of industry-leading digital media brands, with a reach of 15+ million elite tech professionals. The company’s new, proprietary theCUBE AI Video cloud is breaking ground in audience interaction, leveraging theCUBEai.com neural network to help technology companies make data-driven decisions and stay at the forefront of industry conversations.