UPDATED 18:46 EST / JANUARY 05 2026

AI

Nvidia debuts Rubin chip with 336B transistors and 50 petaflops of AI performance

Nvidia Corp. today announced a new flagship graphics processing unit, Rubin, that provides five times the inference performance of Blackwell.

The GPU made its debut at CES alongside five other data center chips. Customers can deploy them together in a rack called the Vera Rubin NVL72 that Nvidia says ships with 220 trillion transistors, more bandwidth than the entire internet and real-time component health checks.

High-speed inference

Rubin includes 336 billion transistors that provide 50 petaflops of performance when processing NVFP4 data. Blackwell, Nvidia’s previous-generation GPU architecture, provided up to 10 petaflops. Rubin’s training speed, meanwhile, is 250% faster than Blackwell’s, at 35 petaflops.

Some of the chip’s computing power is provided by a module called the Transformer Engine that also shipped with Blackwell. According to Nvidia, Rubin’s Transformer Engine is based on a newer design with a performance-boosting feature called hardware-accelerated adaptive compression. Compressing data reduces the number of bits needed to represent it. That decreases the amount of data AI models have to crunch and thereby speeds up processing.
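The general idea can be illustrated with a few lines of code. The sketch below is purely illustrative: it uses general-purpose zlib compression on a hypothetical buffer of activation bytes, and says nothing about how Nvidia's hardware-accelerated adaptive compression actually works.

```python
# Illustrative only: demonstrates that compressing repetitive data shrinks
# the number of bytes to move. This is NOT Nvidia's compression scheme.
import zlib

# Hypothetical buffer of low-precision activation values. Such data often
# contains long runs of repeated patterns, which compress well.
data = bytes([0, 0, 1, 0, 0, 2] * 1000)

compressed = zlib.compress(data)

# Fewer bytes means less data for the chip to read, transfer and process.
assert len(compressed) < len(data)
```

The win depends entirely on how compressible the data is; an adaptive scheme, as the name suggests, would presumably vary its strategy with the data it sees.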

“Rubin arrives at exactly the right moment, as AI computing demand for both training and inference is going through the roof,” said Nvidia Chief Executive Officer Jensen Huang. “With our annual cadence of delivering a new generation of AI supercomputers — and extreme codesign across six new chips — Rubin takes a giant leap toward the next frontier of AI.”

Rack-scale AI systems

Nvidia plans to ship its new silicon as part of an appliance called the Vera Rubin NVL72. It will combine 72 Rubin chips with 36 of the company’s new Vera central processing units, which also made their debut at CES. Vera includes 88 cores based on a custom design called Olympus. They’re compatible with Armv9.2, a widely used version of Arm Holdings plc’s CPU instruction set architecture.

The Vera Rubin NVL72 keeps its chips in modules called trays. According to Nvidia, the trays have a cable-free design that cuts assembly and servicing times by a factor of up to 18 compared with Blackwell-based appliances. The RAS Engine, a subsystem that the company’s GPU racks use to automate certain maintenance tasks, has been upgraded as well. It provides fault tolerance features and performs real-time health checks to verify that the hardware is working as expected.

Nvidia says the Vera Rubin NVL72 provides 260 terabits per second of bandwidth, more than the traffic of the entire internet. The appliance processes AI models’ traffic with the help of three different chips called the NVLink 6 Switch, Spectrum-6 and ConnectX-9. All three were announced at CES today.

NVLink 6 Switch enables multiple GPUs inside a Vera Rubin NVL72 rack to exchange data with one another simultaneously. That data exchange is needed to coordinate the GPUs’ work while they’re running distributed AI models. The Spectrum-6, in turn, is an Ethernet switch that facilitates connections between GPUs installed in different racks.

Nvidia’s third new networking chip, the ConnectX-9, is what’s known as a SuperNIC. It’s a hardware interface that a server can use to access the network of the host data center. ConnectX-9 performs networking tasks that were historically carried out by a server’s CPU, which leaves more processing capacity for AI workloads.

Rounding out the list of chips that Nvidia debuted today is the BlueField-4. It’s a DPU, or data processing unit. A DPU offloads work from a server’s main processor much like a SuperNIC, but it does so across a broader range of tasks. The BlueField-4 can perform not only networking-related computations but also certain cybersecurity and storage management operations.

The BlueField-4 powers a new storage system that Nvidia calls the Inference Context Memory Storage Platform. According to the company, it will help optimize large language models’ key-value cache.

An LLM’s attention mechanism, the component it uses to determine which data points to use and how, often repeats the same calculations. A key-value cache allows an LLM to perform a frequently recurring calculation only once, save the results and then reuse them. That’s more hardware-efficient than calculating the same output from scratch every time it’s needed.
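The caching idea described above can be sketched in a few lines. This is a minimal, hypothetical illustration of key-value caching during token-by-token decoding, not Nvidia's storage platform or any production LLM framework; the function names and shapes are invented for the example.

```python
# Minimal sketch of a key-value cache for autoregressive attention.
# Hypothetical code for illustration; not Nvidia's implementation.
import numpy as np

def attention_step(query, k_cache, v_cache, new_key, new_value):
    """Append this token's key/value to the cache, then attend over all
    cached entries. Earlier tokens' keys/values are computed once,
    stored, and reused, instead of being recomputed every step."""
    k_cache.append(new_key)
    v_cache.append(new_value)
    keys = np.stack(k_cache)      # (seq_len, d)
    values = np.stack(v_cache)    # (seq_len, d)
    scores = keys @ query / np.sqrt(len(query))
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()      # softmax over all cached positions
    return weights @ values       # attention output for the newest token

# Decoding three tokens: each past key/value is computed exactly once.
rng = np.random.default_rng(0)
k_cache, v_cache = [], []
for _ in range(3):
    q, k, v = rng.standard_normal((3, 4))
    out = attention_step(q, k_cache, v_cache, k, v)

assert len(k_cache) == 3 and out.shape == (4,)
```

The cache trades memory for compute: it grows with sequence length, which is why systems that hold it in fast storage, as the Inference Context Memory Storage Platform is described as doing, matter for long-context inference.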

The Vera Rubin NVL72 will ship alongside a smaller appliance called the DGX Rubin NVL8 that includes eight Rubin GPUs instead of 72. The two systems form the basis of the DGX SuperPOD, a new reference architecture for building AI clusters. It combines Nvidia’s latest chips with a software platform called Mission Control that companies can use to manage their AI infrastructure.

Rubin-powered systems will start shipping in the second half of 2026.

Image: Nvidia
