UPDATED 15:37 EDT / JUNE 12 2025


AMD debuts new flagship MI350 data center graphics cards with 185B transistors

Advanced Micro Devices Inc. today introduced a new line of artificial intelligence chips that it says can outperform Nvidia Corp.’s Blackwell B200 at some tasks.

The Instinct MI350 series, as the product family is called, includes two graphics cards. There’s the top-end MI355X, which relies on liquid cooling to dissipate heat. It’s joined by a scaled-down chip called the Instinct MI350X that trades off some performance for lower operating temperatures. That allows it to use fans instead of liquid cooling, an often simpler arrangement for data center operators.

“With flexible air-cooled and direct liquid-cooled configurations, the Instinct MI350 Series is optimized for seamless deployment, supporting up to 64 GPUs in an air-cooled rack and up to 128 in a direct liquid-cooled and scaling up to 2.6 exaFLOPS of FP4 performance,” Vamsi Boppana, the senior vice president of AMD’s Artificial Intelligence Group, detailed in a blog post.

More memory, faster chiplets

The MI350 series is based on a three-dimensional, 10-chiplet design. Eight of the chiplets contain compute circuits made using Taiwan Semiconductor Manufacturing Co.’s latest three-nanometer process. They sit atop two six-nanometer I/O chiplets that function as the MI350’s base layer and also manage the flow of data inside the processor.

Both the MI355X and the MI350X ship with 288 gigabytes of HBM3E memory. That’s a variety of fast, high-capacity RAM widely used in AI chips. Like AMD’s new graphics cards, HBM3E devices feature a three-dimensional design in which layers of circuits are stacked atop one another.

HBM3E theoretically supports up to 16 vertically stacked layers of RAM. Some memory devices based on the technology also include additional features. Micron Technology Inc.’s latest HBM3E chips, for example, ship with a so-called Memory Built-In Self-Test module. It reduces the amount of specialized equipment needed to develop AI chips that include HBM3E memory.

According to AMD, the MI350 series features 60% more memory than Nvidia’s flagship Blackwell B200 graphics cards. The company is also promising faster performance for some workloads. AMD says that MI350 chips can process 8-bit floating point numbers 10% faster than the B200 and 4-bit floating point numbers more than twice as fast.

Floating point numbers are the basic units of data that AI models use to perform calculations. The largest such data units contain 64 bits, while the smallest have 4. The MI350’s support for four-bit floating point, or FP4, data is one of the improvements it introduces over earlier AMD graphics cards. 

The fewer bits there are in a floating point number, the quicker it can be processed. As a result, AI models often compress larger floating point numbers into smaller ones to speed up calculations. The MI350’s support for the smallest, 4-bit format makes it easier to apply that compression to accelerate AI workloads.
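The compression described above, commonly called quantization, maps each full-precision value onto the nearest value a low-bit format can represent. The following is a minimal sketch, assuming a simple E2M1-style grid of representable FP4 magnitudes and a single per-tensor scale factor; it illustrates the idea, not AMD’s actual hardware format.

```python
# Illustrative FP4 (E2M1-style) quantization sketch.
# The grid and per-tensor scaling here are assumptions for demonstration,
# not the exact encoding MI350 hardware uses.

FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # representable magnitudes

def quantize_fp4(values):
    """Map each 32-bit float to the nearest representable FP4 value."""
    # Scale so the largest input magnitude lands on the top of the grid.
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / FP4_GRID[-1]
    out = []
    for v in values:
        mag = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append((mag if v >= 0 else -mag) * scale)
    return out, scale

weights = [0.91, -0.43, 0.07, 1.20, -0.88]
q, scale = quantize_fp4(weights)
print(q)  # rough approximations of the originals, using only 16 distinct codes
```

Each compressed value occupies 4 bits instead of 32, which is why cutting bit width directly multiplies the number of values a chip can process per second.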

In practice, the new speed optimizations allow a single chip from the MI350 series to run an AI model with up to 520 billion parameters. AMD is also promising a 40% increase in tokens per dollar compared to competing products.

New AI servers

AMD will make the MI350 available in 8-chip server configurations. According to the company, the machines will provide up to 160 petaflops of performance for some FP4 workloads. One petaflop corresponds to 1,000 trillion computations per second.
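The vendor figures above can be sanity-checked with some quick arithmetic. This sketch works only from the numbers quoted in this article (AMD’s claims, not independent measurements): 160 petaflops across an eight-chip server, and the 128-GPU liquid-cooled rack from the blog post quote.

```python
# Back-of-the-envelope check of the quoted figures (vendor numbers).
PFLOP = 10**15               # 1 petaflop = 1,000 trillion operations per second
server_fp4 = 160 * PFLOP     # 8-chip server at FP4, per AMD
per_chip = server_fp4 / 8    # implied per-chip FP4 throughput
print(per_chip / PFLOP)      # → 20.0 petaflops per chip

rack_fp4 = 128 * per_chip    # 128-GPU direct liquid-cooled rack
print(rack_fp4 / 10**18)     # → 2.56 exaflops, in line with the ~2.6 AMD quotes
```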

Further down the line, AMD plans to launch a line of rack systems called Helios. The systems will combine chips from the upcoming Instinct MI400 chip series, the successor to the MI350, with the company’s central processing units. AMD will also add in its Pensando data processing units, which offload infrastructure management tasks from an AI cluster’s other chips.

On the software side, Helios will ship with the company’s ROCm platform. It’s a collection of developer tools, application programming interfaces and other components that can be used to program AMD graphics cards. The company debuted a new version of ROCm in conjunction with the debut of the MI350 and Helios.

ROCm 7.0, as the latest release is called, enables AI models to perform inference more than 3.5 times faster than before. It can also triple the performance of training workloads.

According to AMD, the speedup is partly the fruit of optimizations that allow ROCm 7.0 to manage data movement more efficiently. The software is also better at distributed inference. That’s the task of spreading an inference workload across multiple graphics cards to accelerate processing. 

“Over the past year, ROCm has rapidly matured, delivering leadership inference performance, expanding training capabilities, and deepening its integration with the open-source community,” Boppana wrote. 

Photo: AMD
