UPDATED 11:00 EDT / AUGUST 23 2024

Nvidia to present AI and data center performance innovations at Hot Chips conference

Nvidia Corp. today revealed details about what it will discuss at the Hot Chips 2024 semiconductor technology conference in Cupertino, California, on Monday, including advancements to its Blackwell platform, research on liquid cooling for data centers and AI agents for chip design.

“Nvidia Blackwell is a platform, the GPU is just the beginning,” said Dave Salvator, director of accelerated computing products at Nvidia.

The platform comprises multiple Nvidia chips, including the Blackwell graphics processing unit, the Grace central processing unit, the BlueField data processing unit, the ConnectX network interface card, the NVLink Switch, the Spectrum Ethernet switch and the Quantum InfiniBand switch. All work together to power large language model inference and accelerated computing.

Nvidia unveiled the Blackwell GPU architecture in March at GTC 2024, when the company said it would be capable of running real-time generative AI on colossal, trillion-parameter LLMs. Nvidia also said it would handle them at up to 25 times lower cost and power consumption than its existing H100 GPUs, which are based on the older Hopper architecture.

“As we’ve seen, models grow in size over time and the fact that most generative AI applications are expected to run in real-time,” said Salvator. “The requirement for inference has gone up dramatically over the last several years. One of the things that real-time LLM inferencing needs is multiple GPUs and in the not-so-distant future multiple servers.”

An example of Blackwell as a platform is the multi-node GB200 NVL72 solution, which provides low-latency, high-throughput token generation for extremely large LLMs. It acts as a unified system capable of delivering inference for trillion-parameter LLMs, such as GPT-MoE-1.8T, at 30 times the speed of the HGX H100 system, along with four times the training speed of the H100.

In addition to the new hardware, Nvidia will showcase the Quasar Quantization System, a new piece of software that uses Blackwell’s Transformer Engine to maintain high accuracy on lower-precision models. It relies on FP4, a format that uses four bits of floating-point precision per value. FP4 is new to the Blackwell processor, as Hopper went no lower than eight bits. With it, models can take up less memory, perform better and still retain high accuracy.
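The memory claim can be checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate only: it counts the bytes needed to store model weights at 16, 8 and 4 bits per parameter, and it assumes a parameter count of 1.8 trillion, which is inferred from the GPT-MoE-1.8T name rather than stated by Nvidia. It ignores activations, caches and any per-block scaling factors a real quantization scheme would add, and the function name is hypothetical.

```python
# Rough weight-storage estimate at different numeric precisions.
# Assumption: weights dominate memory; activations, KV caches and
# quantization scale factors are deliberately left out.

BITS_PER_BYTE = 8


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return the storage needed for the weights alone, in GB (1e9 bytes)."""
    return num_params * bits_per_param / BITS_PER_BYTE / 1e9


# Assumption: 1.8 trillion parameters, inferred from the model's name.
PARAMS = 1.8e12

footprints = {
    name: weight_memory_gb(PARAMS, bits)
    for name, bits in [("FP16", 16), ("FP8", 8), ("FP4", 4)]
}

for name, gb in footprints.items():
    print(f"{name}: {gb:,.0f} GB")
```

Under these assumptions, each halving of precision halves the weight footprint, so moving from eight bits to four frees half the memory the weights occupied, which is the saving the FP4 format is aiming for.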

Liquid cooling in data centers

On Sunday, Ali Heydari, director of data center cooling and infrastructure at Nvidia, will present several designs for hybrid-cooled data centers. Although air cooling is the common way to move heat away from servers, liquid cooling used in combination with air is emerging as a more sustainable approach.

Liquid-cooling techniques can move heat away from hot components more efficiently than air, which can keep components from overheating and throttling themselves and can extend their lifespans. This is especially important given the bigger workloads that AI represents. Liquid-cooling systems also take up less space than air-cooling systems, Nvidia said.

One system that Nvidia will present is a warm-water direct-to-chip approach that can deliver up to a 28% reduction in data center facility power.

“As the name implies, this system does not use chillers, which makes water cold, which uses a compressor, like a refrigerator works for instance,” said Salvator. “By going with this solution of using warm water we don’t have to use chillers and that gets us some energy savings.”

AI agents for chip design

Chip design is another area where AI can improve quality and productivity, by helping engineers better understand the microscopic effects of the placement of tiny circuits and the field effects on silicon.

Mark Ren, director of design and automation at Nvidia, will lead a presentation on Sunday on AI models that can assist with answering questions, generating code and debugging design problems. Nvidia has even developed an LLM to accelerate the production of code in Verilog, a hardware description language used to model electronic systems, to assist engineers in building better chips.

Image: Nvidia
