Nvidia and Microsoft partner to build AI supercomputer
Nvidia Corp. today announced that it’s partnering with Microsoft Corp. to build a supercomputer optimized for running artificial intelligence software.
The system will be one of the most powerful AI supercomputers in the world, the companies stated. It will be implemented in Microsoft’s Azure public cloud platform.
“AI technology advances as well as industry adoption are accelerating,” said Manuvir Das, the vice president of enterprise computing at Nvidia. “The breakthrough of foundation models has triggered a tidal wave of research, fostered new startups and enabled new enterprise applications. Our collaboration with Microsoft will provide researchers and companies with state-of-the-art AI infrastructure and software to capitalize on the transformative power of AI.”
As part of the collaboration, Microsoft will equip Azure with tens of thousands of Nvidia graphics processing units. GPUs are widely used in supercomputers because of their ability to speed up AI and scientific applications. Microsoft will implement the GPUs alongside several other technologies from Nvidia, including the chipmaker’s Quantum-2 series of network switches.
Azure already includes a substantial number of GPUs from Nvidia. Azure customers can access multiple compute instances powered by Nvidia’s A100 chip, which was the chipmaker’s flagship data center graphics card when it launched in 2020. Through the partnership announced today, Microsoft will introduce Azure cloud instances powered by Nvidia’s current flagship data center GPU, the H100 chip.
The H100 made its debut in March. It features 80 billion transistors and can train AI models up to six times faster than Nvidia’s previous-generation A100 graphics card. The H100 also includes optimizations that allow it to run Transformer models more efficiently. Transformers are a type of advanced neural network widely used for tasks such as natural language processing.
The upcoming H100-powered Azure instances that Microsoft plans to launch will use Nvidia’s Quantum-2 InfiniBand switch series to manage network traffic. The switches can process 400 gigabits of traffic per second per network port, twice as much as Nvidia’s previous-generation hardware.
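To put the per-port figure in more familiar units, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the 200-gigabit figure for the previous generation is inferred from the “twice as much” comparison above, the 100-gigabyte payload is a hypothetical example, and real-world throughput would be lower because of protocol overhead.

```python
# Rough unit conversion for the port speeds described above.
# Assumption: 8 bits per byte, no protocol overhead, one port in isolation.

QUANTUM_2_GBPS = 400                    # gigabits per second per port (from the article)
PREVIOUS_GEN_GBPS = QUANTUM_2_GBPS / 2  # inferred from "twice as much"


def gigabits_to_gigabytes(gigabits_per_second: float) -> float:
    """Convert a line rate in gigabits/s to gigabytes/s."""
    return gigabits_per_second / 8


def transfer_seconds(payload_gigabytes: float, gigabits_per_second: float) -> float:
    """Idealized time to move a payload over a single port."""
    return payload_gigabytes / gigabits_to_gigabytes(gigabits_per_second)


if __name__ == "__main__":
    payload_gb = 100  # hypothetical payload size, chosen only for illustration
    for name, rate in [("Quantum-2", QUANTUM_2_GBPS), ("previous gen", PREVIOUS_GEN_GBPS)]:
        print(f"{name}: {gigabits_to_gigabytes(rate):.0f} GB/s per port, "
              f"{transfer_seconds(payload_gb, rate):.0f} s to move {payload_gb} GB")
```

Under these simplified assumptions, a single 400-gigabit port moves about 50 gigabytes per second, which halves the idealized transfer time compared with a 200-gigabit port.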
Software is also a major focus of Nvidia’s partnership with Microsoft. As part of the collaboration, the companies will make their respective AI development tools more easily accessible in Azure.
Microsoft provides an open-source toolkit called DeepSpeed that developers use to reduce their neural networks’ infrastructure requirements. According to Microsoft, the toolkit can reduce the amount of hardware needed to train and run neural networks. Microsoft and Nvidia will optimize DeepSpeed to run on the chipmaker’s H100 graphics card.
The optimization effort will focus on helping developers speed up AI models that use the popular Transformer neural network architecture. The speed improvement will be provided with the help of a feature known as the Transformer Engine that is built into Nvidia’s H100 graphics card. According to the chipmaker, the Transformer Engine accelerates neural networks by reducing the amount of data that they must process to complete calculations.
Nvidia provides a software platform called Nvidia AI Enterprise to help companies more easily run AI applications on its chips. The platform reduces the amount of manual work required to build, deploy and manage neural networks. It also includes a set of preconfigured neural networks optimized for tasks such as generating shopping recommendations.
As part of an earlier collaboration with Microsoft, Nvidia certified Nvidia AI Enterprise to run on Azure instances powered by its A100 chip. The companies will now team up to provide support for Nvidia AI Enterprise on Microsoft’s upcoming H100-powered Azure instances.
“Our collaboration with NVIDIA unlocks the world’s most scalable supercomputer platform, which delivers state-of-the-art AI capabilities for every enterprise on Microsoft Azure,” said Scott Guthrie, executive vice president of Microsoft’s Cloud + AI Group.
Nvidia, in turn, plans to expand its use of Azure as part of the partnership. The chipmaker will use Azure instances to support its research efforts in the field of generative AI, a category of advanced neural networks that can perform tasks such as generating text, video and software code.
Nvidia is already making significant investments in this area. Last October, the company debuted MT-NLG, a generative AI system described at the time as the most powerful in its category. MT-NLG features 530 billion parameters, the configuration settings that determine how a neural network processes data.
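A rough calculation gives a sense of what a parameter count of that size means for hardware. The sketch below estimates how much memory is needed just to hold 530 billion parameters. The numeric formats and their byte sizes are illustrative assumptions, not details from the announcement, and training requires considerably more memory than the weights alone.

```python
# Back-of-the-envelope estimate of the memory needed to store model weights.
# Assumption: each parameter is stored as a single number in the given format;
# optimizer state, activations and other training overhead are ignored.

MT_NLG_PARAMETERS = 530e9  # 530 billion parameters (from the article)

# Illustrative storage formats; not taken from the announcement.
BYTES_PER_PARAMETER = {
    "32-bit float": 4,
    "16-bit float": 2,
    "8-bit float": 1,
}


def weights_gigabytes(num_parameters: float, bytes_per_parameter: int) -> float:
    """Memory required for the weights alone, in gigabytes (10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAMETER.items():
        gigabytes = weights_gigabytes(MT_NLG_PARAMETERS, size)
        print(f"{fmt}: about {gigabytes:,.0f} GB")
```

Even at 16 bits per parameter, the weights alone come to roughly a terabyte under these assumptions, which is why models of this scale are spread across many GPUs.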
Image: Nvidia