UPDATED 02:00 EST / JUNE 01 2021

Nvidia announces more certified AI server systems at Computex 2021

Nvidia Corp. is expanding the number of Nvidia-certified systems available for enterprises that want to run artificial intelligence workloads at scale in their own data centers or in the cloud.

The announcement was made at the virtual Computex 2021 event today, where the company also launched Nvidia Base Command Platform, a pricey new cloud-hosted development service that gives customers a way to tap into Nvidia’s supercomputer resources.

Nvidia launched its first batch of certified systems earlier this year, working alongside server hardware makers to ensure their offerings meet the company’s design best practices and deliver optimal performance for AI workloads. The systems combine central processing units with Nvidia’s graphics processing units and Mellanox network adapters, giving companies lots of different hardware options for running AI in their corporate data centers or the cloud.

They’re certified to run Nvidia’s AI Enterprise suite of AI and data analytics tools and its new Omniverse Enterprise design platform. They also play nicely with VMware Inc.’s vSphere virtualization software and Red Hat Inc.’s OpenShift platform for AI development.

The new systems include x86-based rigs from high-profile server makers such as Dell Technologies Inc., Hewlett Packard Enterprise Co., Lenovo Group Ltd., AsusTek Computer Inc. and Super Micro Computer Inc., as well as smaller firms such as Advantech Co. Ltd., Altos Computing Inc., ASRock Rack Inc., Gigabyte Technology Co. Ltd. and Quanta Cloud Technology Inc. There’s a wide range of options for different price and performance levels, with the most powerful systems running Nvidia’s A100 Tensor Core GPUs and others featuring its A40, A30 and A10 Tensor Core GPUs.

For the most advanced AI training and cloud computing services, Nvidia said that Dell, HPE, Supermicro and Nettrix Corp. are offering some of the first servers based on its Nvidia HGX accelerated computing platform. These systems are available now, Nvidia said, powered by a choice of four or eight A100 GPUs with Nvidia NVLink GPU interconnects and InfiniBand networking.
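
These four- and eight-GPU configurations appear to software as ordinary CUDA devices, so frameworks such as PyTorch can address each accelerator directly. As a rough, hypothetical illustration rather than anything from Nvidia’s announcement, a developer on one of these HGX systems might run a quick check of how many A100s are visible and whether NVLink-style peer-to-peer access is enabled between them:

```python
# Hypothetical sanity check on a multi-GPU server; assumes PyTorch built with CUDA support.
import torch

count = torch.cuda.device_count()  # number of GPUs visible to this process
print(f"Visible GPUs: {count}")

for i in range(count):
    print(f"  GPU {i}: {torch.cuda.get_device_name(i)}")

# Check whether each pair of GPUs can access each other's memory directly,
# the kind of peer-to-peer access that NVLink-connected GPUs typically support.
for i in range(count):
    for j in range(count):
        if i != j and torch.cuda.can_device_access_peer(i, j):
            print(f"  GPU {i} <-> GPU {j}: peer access available")
```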

More efficient systems

Some of the servers announced today will be the first to integrate Nvidia’s new BlueField-2 data processing units, which are designed to optimize hardware performance. Announced last October, the DPUs take on many of the infrastructure administration tasks in data centers, such as scanning network traffic for malware and orchestrating storage, that would otherwise be handled by the CPUs.

Offloading this work to the DPU frees the CPU to focus solely on the compute tasks it has been given, improving overall performance. Nvidia claims that a single BlueField-2 DPU can handle data center infrastructure administration tasks that would otherwise have to be performed by as many as 125 CPU cores.

“Servers that primarily run software-defined networking (for example, a stateful load balancer or distributed firewall), software-defined storage or traditional enterprise applications will all benefit from the DPU’s ability to accelerate, offload and isolate infrastructure workloads for networking, security and storage,” said Nvidia’s director of storage marketing John Kim in a blog post. “Systems running VMware vSphere, Windows or hyperconverged infrastructure solutions also benefit from including a DPU, whether running AI and machine learning applications, graphics-intensive workloads or traditional business applications.”

The first systems featuring Nvidia BlueField-2 DPUs will be available later this year, Nvidia said.

First Arm-based certified systems on the way

Also coming soon will be the first Arm-based Nvidia-Certified servers from Gigabyte and Wiwynn Corp. These systems will be powered by Arm Neoverse CPUs and Nvidia’s Ampere GPUs, and some models will also feature BlueField-2 DPUs, Nvidia said. The systems will be submitted for Nvidia certification as soon as they’re ready and are expected to become available next year.

“Enterprises across every industry need to support their innovative work in AI on traditional data center infrastructure,” said Manuvir Das, head of Enterprise Computing at Nvidia. “The open, growing ecosystem of Nvidia-Certified Systems provides unprecedented customer choice in servers.”

Nvidia Base Command Platform

The new Base Command Platform, also announced at Computex, is designed for AI teams that need far greater resources than a few servers can provide. It’s meant for “large scale, multiuser and multiteam AI development workflows” and enables dozens of researchers and data scientists to use accelerated compute resources at the same time.

Base Command Platform, a joint service offered by Nvidia and NetApp Inc., provides AI teams with access to Nvidia DGX SuperPOD supercomputers paired with NetApp’s data management services, delivering enormous power for intensive workloads. The service is meant for serious AI developers only, with subscriptions said to start at $90,000 per month.

For that, customers will gain a single view of all of their AI development projects, from which they can easily allocate compute resources and collaborate. Plus, Nvidia will throw in a range of AI and data science tools, such as its NGC catalog, APIs for integration with MLOps tools, Jupyter notebooks and more.
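
To give a sense of that workflow, here is a minimal, hypothetical snippet of the kind a researcher might run in one of those hosted Jupyter notebooks to confirm an allocated GPU is usable before launching a larger job; it is an illustrative sketch, not taken from Nvidia’s documentation:

```python
# Illustrative notebook cell: confirm the allocated GPU is visible, then run a
# small matrix multiplication on it. Assumes a PyTorch environment, such as one
# of the containers distributed through the NGC catalog.
import torch

if torch.cuda.is_available():
    device = torch.device("cuda:0")
    print("Using GPU:", torch.cuda.get_device_name(0))
else:
    device = torch.device("cpu")
    print("No GPU visible; falling back to CPU")

# Tiny workload to exercise the device end to end.
a = torch.randn(1024, 1024, device=device)
b = torch.randn(1024, 1024, device=device)
c = a @ b
print("Result shape:", tuple(c.shape), "on", c.device)
```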

The service is available now for early-access customers and will be offered on Google Cloud’s marketplace later this year, Nvidia said. Alternatively, customers can have a DGX SuperPOD system installed in their own data centers.

“World-class AI development requires powerful computing infrastructure, and making these resources accessible and attainable is essential to bringing AI to every company and their customers,” Das said.

Constellation Research Inc. analyst Holger Mueller said that Nvidia looks to be successfully building out its ecosystem, with certified adoption of its new DPU across a large number of hardware partners, Dell and VMware being the most prominent. “All of this is of value to enterprises as it gives them more variety on the platform side and faster implementation times,” he said.
