Microsoft debuts new AI-optimized Azure instances
Microsoft Corp. is extending its Azure cloud platform with a new instance family designed to run artificial intelligence models.
The instance family, known as the ND H100 v5 series, made its debut today.
“Delivering on the promise of advanced AI for our customers requires supercomputing infrastructure, services, and expertise to address the exponentially increasing size and complexity of the latest models,” Matt Vegas, a principal product manager in Azure’s high-performance computing and AI group, wrote in a blog post. “At Microsoft, we are meeting this challenge by applying a decade of experience in supercomputing and supporting the largest AI training workloads.”
Each ND H100 v5 instance features eight of Nvidia Corp.’s H100 graphics processing units. Introduced last March, the H100 is Nvidia’s most advanced data center GPU. According to Nvidia, it can train AI models up to nine times faster than the A100, the company’s previous flagship chip, and performs inference, the task of running trained models, up to 30 times faster.
The H100 features 80 billion transistors produced using a four-nanometer process. It includes a specialized module, known as the Transformer Engine, that is designed to speed up AI models based on the Transformer neural network architecture. The architecture powers many advanced AI models, including OpenAI LLC’s ChatGPT chatbot.
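Nvidia exposes the Transformer Engine to developers through its Transformer Engine library for PyTorch. The following is a minimal sketch, assuming the transformer-engine package and an H100-class GPU; the layer sizes are illustrative, not details from the announcement:

```python
# Minimal sketch: FP8 execution through Nvidia's Transformer Engine library.
# Assumes the transformer-engine package and an H100-class GPU; sizes are illustrative.
import torch
import transformer_engine.pytorch as te

layer = te.Linear(1024, 1024, bias=True).cuda()  # drop-in for torch.nn.Linear
x = torch.randn(8, 1024, device="cuda")

# Supported Transformer Engine modules run their matrix math in FP8
# inside this context, the precision format the module accelerates.
with te.fp8_autocast(enabled=True):
    y = layer(x)

print(y.shape)  # torch.Size([8, 1024])
```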
Nvidia has also equipped the H100 with other enhancements. Among other capabilities, the chip offers a built-in confidential computing feature that can isolate an AI model in a way that blocks unauthorized access attempts, including those from the operating system and hypervisor on which the model runs.
Advanced AI models are usually deployed not on one graphics card but on several. GPUs used in this manner must regularly exchange data with one another to coordinate their work. To speed up the flow of data between their GPUs, companies often link them together using high-speed network connections.
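As a rough illustration of that exchange, here is a minimal PyTorch sketch (the framework is an illustrative assumption, not something the announcement specifies) in which every GPU contributes a tensor and an all-reduce leaves the same summed result on all of them, the basic pattern behind gradient averaging in distributed training:

```python
# Minimal sketch of the GPU-to-GPU exchange with torch.distributed.
# Launch with: torchrun --nproc_per_node=8 allreduce_demo.py
import os
import torch
import torch.distributed as dist

dist.init_process_group("nccl")  # NCCL uses NVLink between local GPUs when available
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Each GPU starts with its own values; all_reduce sums them in place so
# every GPU ends up with the same aggregated tensor, as in gradient averaging.
t = torch.full((4,), float(dist.get_rank()), device="cuda")
dist.all_reduce(t, op=dist.ReduceOp.SUM)
print(f"rank {dist.get_rank()}: {t.tolist()}")

dist.destroy_process_group()
```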
The eight H100 chips in Microsoft’s new ND H100 v5 instances are connected to one another using an Nvidia interconnect technology called NVLink. According to Nvidia, the technology offers seven times the bandwidth of PCIe 5.0, the standard interface commonly used to attach GPUs to servers. Microsoft says NVLink provides 3.6 terabytes per second of bisectional bandwidth between the eight GPUs in its new instances.
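Whether two GPUs can use that direct path, rather than staging data through host memory, can be probed from PyTorch; a small sketch (again an illustrative framework choice):

```python
# Sketch: probing whether two GPUs in the same machine can exchange data
# directly (peer to peer) instead of staging through host memory.
import torch

if torch.cuda.device_count() >= 2:
    # True when device 0 can access device 1's memory directly,
    # for example over NVLink.
    print(torch.cuda.can_device_access_peer(0, 1))
    src = torch.randn(1024, 1024, device="cuda:0")
    dst = src.to("cuda:1")  # a device-to-device copy when peer access is enabled
```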
The instance series also supports another Nvidia networking technology called NVSwitch. Whereas NVLink links together the GPUs inside a single server, NVSwitch connects multiple GPU servers to one another. That makes it easier to run complex AI models that have to be distributed across multiple machines in a data center.
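In framework terms, the single-server sketch above scales out by launching one process per GPU on every machine. A hedged sketch using PyTorch's torchrun launcher and DistributedDataParallel, where the addresses and node counts are placeholders rather than details from the announcement:

```python
# Sketch: the same distributed setup stretched across two machines.
# torchrun assigns global ranks, so torch.distributed spans all 16 GPUs:
#
#   node 0: torchrun --nnodes=2 --node_rank=0 --nproc_per_node=8 \
#           --master_addr=10.0.0.4 --master_port=29500 train.py
#   node 1: torchrun --nnodes=2 --node_rank=1 --nproc_per_node=8 \
#           --master_addr=10.0.0.4 --master_port=29500 train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group("nccl")  # picks up rank and world size from torchrun's env vars
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# DDP replicates the model on every GPU and synchronizes gradients
# across all ranks on each backward pass, on one node or many.
model = DDP(torch.nn.Linear(1024, 1024).cuda(), device_ids=[local_rank])
```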
Microsoft’s ND H100 v5 instances combine the H100 graphics cards with Intel Corp. central processing units. The CPUs are sourced from Intel’s new 4th Gen Xeon Scalable Processor series. The chip series, which is also known as Sapphire Rapids, made its debut in January.
Sapphire Rapids is based on Intel 7, an enhanced version of Intel’s 10-nanometer process. Each CPU in the series includes multiple onboard accelerators: computing modules optimized for specific tasks. Thanks to the built-in accelerators, Intel says, Sapphire Rapids provides up to 10 times better performance for some AI applications than its previous-generation silicon.
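One such accelerator is AMX, Intel's Advanced Matrix Extensions for matrix math. As a hedged illustration (the framework choice and layer sizes are assumptions, not from the announcement), PyTorch routes bfloat16 matrix multiplications on the CPU through oneDNN, which can dispatch to AMX on processors that support it:

```python
# Sketch: bfloat16 inference on the CPU. PyTorch's CPU autocast runs
# matrix math through oneDNN, which uses AMX on processors that have it.
import torch

model = torch.nn.Linear(1024, 1024)  # illustrative stand-in for a real model
x = torch.randn(64, 1024)

with torch.inference_mode(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)

print(y.dtype)  # torch.bfloat16
```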
The ND H100 v5 instance series is currently available in preview.