UPDATED 01:30 EDT / JUNE 01 2026


Nvidia ramps up production of Vera Rubin, the foundation of the next generation of AI factories

Nvidia Corp. said early Monday at the Computex conference in Taipei that it’s ramping up production of its forthcoming Vera Rubin platform, which is set to become the foundation of a new generation of artificial intelligence factories that will dominate the enterprise infrastructure story for years to come.

The company unveiled Vera Rubin in March at its annual GTC developer conference, and today’s announcement that the systems are entering volume production brings the platform into closer view.

Vera Rubin is named after the pioneering astronomer whose observations provided key evidence for dark matter, and it’s much more than a simple refresh of Nvidia’s previous-generation graphics processing units. The company said it’s a complete architectural overhaul aimed at powering the enterprise shift toward “agentic AI,” a world where autonomous AI agents can reason, use third-party software tools and execute complex workloads on behalf of humans.

The Vera Rubin platform is anchored by Nvidia’s new Rubin GPU, which is the successor to the Blackwell GPU, but that’s not all. The platform also includes Nvidia’s new Vera central processing unit, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 data processing unit and Spectrum-6 Ethernet switch, plus the new Nvidia Groq 3 language processing unit, which is designed to support the deterministic, low-latency requirements of trillion-parameter model inference. Nvidia said these components combine into a fully integrated system that delivers 10 times as much “agentic AI” throughput at scale as the previous-generation Grace Blackwell platform.

Nvidia founder and Chief Executive Jensen Huang explained during a keynote at Computex that agentic AI is an entirely new kind of workload, so it requires a new foundation. “One prompt can launch a thousand-step journey of reasoning, retrieval, tool use and response generation,” he said. “Vera Rubin was built for this moment – an AI factory engine that delivers intelligence at scale, with the performance, efficiency and security needed to power the next industrial revolution.”

Production ramps up

Vera Rubin is the third generation of Nvidia’s MGX rack-scale systems, and it’s set to be mass-produced at an unprecedented scale, with more than 350 supply chain partners across 30 countries involved in the process. Top partners include Dell Technologies Inc., Hewlett Packard Enterprise Co., Super Micro Computer Inc. and Lenovo Group Ltd., all of which are manufacturing Vera Rubin servers that will ship to Nvidia’s cloud and enterprise customers later this year.

The new Vera Rubin NVL72 sits at the heart of the Vera Rubin platform. It’s a liquid-cooled rack-scale system made up of 72 Rubin GPUs and 36 Vera CPUs, connected over high-speed NVLink 6 interconnects to achieve what Nvidia calls “breakthrough efficiency.”

For instance, Nvidia said the Vera Rubin NVL72 platform can train large mixture-of-experts models using just one-fourth as many GPUs as would be required with its previous-generation Blackwell chips. In terms of inference, the company said Vera Rubin will deliver 10 times greater throughput at just one-tenth of the cost per token.
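To make those ratios concrete, here is a small arithmetic sketch. Only the one-fourth and one-tenth ratios and the 72-GPU rack size come from Nvidia’s claims; the baseline job size and baseline token price in the example are hypothetical placeholders chosen purely for illustration.

```python
import math

# Ratios and rack size as claimed by Nvidia for Vera Rubin versus Blackwell.
TRAINING_GPU_DIVISOR = 4      # one-fourth as many GPUs for MoE training
COST_PER_TOKEN_DIVISOR = 10   # one-tenth of the cost per token
GPUS_PER_NVL72_RACK = 72      # Rubin GPUs in one Vera Rubin NVL72 rack


def rubin_training_gpus(blackwell_gpus: int) -> int:
    """Rubin GPUs needed for a training job sized in Blackwell GPUs."""
    return math.ceil(blackwell_gpus / TRAINING_GPU_DIVISOR)


def nvl72_racks(gpus: int) -> int:
    """Whole NVL72 racks needed to house the given number of GPUs."""
    return math.ceil(gpus / GPUS_PER_NVL72_RACK)


def rubin_cost_per_million_tokens(blackwell_cost_usd: float) -> float:
    """Inference cost per million tokens after applying the claimed ratio."""
    return blackwell_cost_usd / COST_PER_TOKEN_DIVISOR


if __name__ == "__main__":
    baseline_gpus = 1024       # hypothetical Blackwell job size
    baseline_cost_usd = 2.00   # hypothetical cost per million tokens

    gpus = rubin_training_gpus(baseline_gpus)
    print(f"Rubin GPUs needed: {gpus}")
    print(f"NVL72 racks needed: {nvl72_racks(gpus)}")
    print(f"Cost per million tokens: ${rubin_cost_per_million_tokens(baseline_cost_usd):.2f}")
```

Under those assumed baselines, a job that needed 1,024 Blackwell GPUs would need 256 Rubin GPUs, which fit in four NVL72 racks. Real savings will depend on the model and workload.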

To support the kind of massive AI factory deployments it envisions, Nvidia is introducing the world’s first co-packaged optics-based network switches in the form of Nvidia Spectrum-X Ethernet Photonics. It’s a new generation of switching technology that’s said to deliver five times greater power efficiency, five times longer AI uptime and 1.3 times faster deployment than traditional transceiver-based networks.

The platform also integrates Nvidia’s new BlueField-4 data processing units, which boast software-defined networking speeds of up to 800 gigabits per second and built-in multitenant isolation to simplify network operations and enhance the efficiency of the underlying Rubin GPUs. The BlueField-4 STX storage rack is meant to act as a dedicated “context memory” tier, which AI agents can use to maintain coherence during massive, multi-turn interactions, Nvidia said. By offloading cache data to the BlueField-4 chips, companies can increase their inference throughput by up to five times.
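The general idea of a separate context tier can be sketched with a toy two-level cache. This is not Nvidia’s implementation: the class, its method names and the capacity figure are all hypothetical, and the “slow” tier here is just a dictionary standing in for a storage rack. The point is only that context evicted from the small, fast tier is spilled rather than discarded, so a returning session does not have to be recomputed from scratch.

```python
from collections import OrderedDict
from typing import Optional


class TieredContextCache:
    """Toy two-tier cache: a small fast tier that spills to a larger slow tier."""

    def __init__(self, fast_capacity: int) -> None:
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # session id -> context, least recently used first
        self.slow = {}             # stand-in for an external context-memory tier

    def put(self, session_id: str, context: bytes) -> None:
        """Store context in the fast tier, spilling the oldest entries if full."""
        self.slow.pop(session_id, None)   # avoid keeping a stale copy
        self.fast[session_id] = context
        self.fast.move_to_end(session_id)
        while len(self.fast) > self.fast_capacity:
            evicted_id, evicted_context = self.fast.popitem(last=False)
            self.slow[evicted_id] = evicted_context  # offload instead of dropping

    def get(self, session_id: str) -> Optional[bytes]:
        """Return cached context, promoting it from the slow tier if needed."""
        if session_id in self.fast:
            self.fast.move_to_end(session_id)
            return self.fast[session_id]
        if session_id in self.slow:
            context = self.slow.pop(session_id)
            self.put(session_id, context)
            return context
        return None  # a true miss: the caller would have to rebuild the context


if __name__ == "__main__":
    cache = TieredContextCache(fast_capacity=2)
    for sid in ("a", "b", "c"):
        cache.put(sid, sid.upper().encode())
    print(list(cache.fast), list(cache.slow))
    print(cache.get("a"))
```

In this sketch, a session that falls out of the fast tier is still retrievable on its next turn, which is the behavior a context-memory tier is meant to provide at far larger scale.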

Securing AI factories at rack scale

BlueField-4 STX also plays a vital role in helping to secure Nvidia’s AI factories, which are increasingly being tasked with processing sensitive and highly regulated data that cannot be exposed to third-party AI systems. Such workloads require enhanced security measures, which is why Vera Rubin has been designed for full-stack confidential computing at rack scale, with data encrypted as it travels between the GPUs and CPUs across high-speed interconnects.

The foundational security is provided by a new, programmable software layer that’s designed to enforce, orchestrate and adapt security policies across the entire system. This is powered by the new Nvidia DOCA security innovations in BlueField-4 STX, which enforce security policies at the silicon layer. DOCA is said to enable multitenant network isolation, zero-trust policy enforcement, runtime threat detection and encryption at speeds of up to 800 gigabits per second.

“Agentic AI turns enterprise data into a living, real-time system — and that system must be protected where data moves, where context is stored and where agents act,” Huang said. “With Vera BlueField-4 STX, Nvidia and its ecosystem is building secure-by-design storage infrastructure that enforces trust in silicon at the speed of AI.”

The agentic workhorse

Another key element of the Vera Rubin platform is the Vera CPU, which is a new class of processor that’s designed specifically for running agentic workloads at scale with greater speed and energy efficiency compared with standard x86-based chips.

The Vera CPU is the successor to Nvidia’s Grace CPU, and early benchmark tests suggest that it can deliver stellar performance across key agentic workloads, including code compilation and database processing. These kinds of workloads will be the bread and butter of most AI factories, so speeding them up paves the way for much higher throughput and more productive individual AI agents.

“AI agents will be the largest users of computing,” Huang explained. “Vera is the first CPU designed for that future — built to run agentic AI at hyperscale with extraordinary performance, efficiency and programmability.”

The Vera CPU will also help accelerate a shift in AI factory economics from cores per dollar to tokens per dollar, Nvidia believes. It’s based on a new custom CPU core called Olympus that’s engineered for tasks ranging from Python runtimes and sandboxed code execution to orchestration logic and analytics pipelines.

Olympus enables Vera to process more instructions, anticipate application behavior and shift data across large numbers of concurrent environments in real time, Nvidia said. Each CPU features 88 Olympus cores, spatial multithreading and an LPDDR5X memory subsystem that supports up to 1.2 terabytes per second of bandwidth. As a result, agents spend much less time waiting on CPU-bound steps, which enhances the overall efficiency of AI factories.
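The difference between the two economic yardsticks, cores per dollar and tokens per dollar, can be shown with a short calculation. Both systems below and every number attached to them are hypothetical, invented only to show how the two metrics can rank hardware differently; they do not describe Vera or any real processor, and the sketch ignores power and operating costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    cores: int
    price_usd: float          # hypothetical purchase price
    tokens_per_second: float  # hypothetical end-to-end agent throughput

    def cores_per_dollar(self) -> float:
        return self.cores / self.price_usd

    def tokens_per_dollar(self, lifetime_seconds: float) -> float:
        """Tokens produced over the service life per dollar of purchase price."""
        return self.tokens_per_second * lifetime_seconds / self.price_usd


# Two made-up systems: one maximizes core count, the other per-core speed.
SYSTEMS = [
    System("many-core", cores=128, price_usd=10_000, tokens_per_second=500),
    System("fewer-faster-cores", cores=64, price_usd=12_000, tokens_per_second=1_200),
]

THREE_YEARS_SECONDS = 3 * 365 * 24 * 3600  # assumed service life


def best_by_cores(systems: list) -> System:
    return max(systems, key=lambda s: s.cores_per_dollar())


def best_by_tokens(systems: list) -> System:
    return max(systems, key=lambda s: s.tokens_per_dollar(THREE_YEARS_SECONDS))


if __name__ == "__main__":
    print("Best cores per dollar: ", best_by_cores(SYSTEMS).name)
    print("Best tokens per dollar:", best_by_tokens(SYSTEMS).name)
```

With these invented figures, the many-core system wins on cores per dollar while the other wins on tokens per dollar, which is the kind of reordering the new metric is meant to capture.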

In addition, the Vera CPUs are closely integrated with the BlueField-4 STX processor to benefit from its embedded silicon security capabilities.

One final component of the Vera Rubin platform is Nvidia DSX, which is an architectural blueprint that provides the complete design and operational foundation for modern AI factories. It unifies reference designs, simulations, infrastructure software and ecosystem technologies to help server makers develop energy-efficient AI systems optimized for both performance and lower token costs.

By adopting DSX, Nvidia’s partners, including Dell, HPE, Lenovo, Supermicro and others, are accelerating production of their first Vera Rubin systems, and Nvidia expects the first complete systems to ship to customers in the fall.

Images: Nvidia
