Nvidia Corp.'s latest networking innovations address the demands of a new kind of network, one built to support artificial intelligence factories.
Ethernet is no longer a generic plumbing choice but an enabler of high-performance AI. With today’s unveiling of Multipath Reliable Connection, or MRC, on Spectrum-X Ethernet, Nvidia is pushing Ethernet even deeper into AI-native territory — and doing so in partnership with OpenAI Group PBC and Microsoft Corp.
On the surface, MRC is a new remote direct memory access, or RDMA, transport protocol, now open-sourced via the Open Compute Project. In reality, it's a production-proven way to keep tens or hundreds of thousands of graphics processing units fed and synchronized, using a single RDMA connection to stripe traffic across multiple paths and dynamically steer around congestion and failures. OpenAI has already used MRC on Spectrum-X to train recent frontier large language models powering ChatGPT and Codex, and Microsoft is deploying it in some of its largest AI factories built on GB200 systems. The important point is that MRC isn't a lab experiment but a set of algorithms that has already earned its place in some of the most demanding AI environments on the planet.
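To make the striping idea concrete, here's a minimal Python sketch of the concept rather than of the MRC wire protocol itself: a single logical connection spreading message chunks across several paths, with traffic weighted away from whichever paths look congested. The path names, congestion scores and weighting rule are illustrative assumptions, not anything from the spec.

```python
import random

# Illustrative sketch of multipath striping: one logical connection,
# many physical paths, traffic biased away from congestion. All names
# and metrics here are assumptions for illustration, not MRC itself.

class MultipathConnection:
    def __init__(self, paths):
        # One congestion score per physical path; lower = healthier.
        self.congestion = {p: 0.0 for p in paths}

    def pick_path(self):
        # Weight inversely by congestion so a hot path gets less new
        # traffic instead of being cut off entirely.
        weights = {p: 1.0 / (1.0 + c) for p, c in self.congestion.items()}
        r = random.uniform(0, sum(weights.values()))
        for path, weight in weights.items():
            r -= weight
            if r <= 0:
                return path
        return path  # floating-point edge case: fall back to last path

    def send(self, message, chunk_size=4096):
        # Stripe one message across paths; the receiver reassembles,
        # so no single slow path stalls the whole transfer.
        for offset in range(0, len(message), chunk_size):
            path = self.pick_path()
            print(f"chunk@{offset} -> {path}")

conn = MultipathConnection(["path-0", "path-1", "path-2", "path-3"])
conn.congestion["path-2"] = 5.0  # simulate one congested path
conn.send(b"x" * 16384)
```

The real protocol makes these decisions in SuperNIC hardware at line rate; the sketch only shows the shape of the decision.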
There are three intertwined elements to the announcement: the MRC transport itself, its release through the Open Compute Project as an open specification, and its production deployment at OpenAI and Microsoft.
That openness is important to scaling MRC. Nvidia has been adamant that everything in Spectrum-X is built on standard protocols, with no proprietary wire formats and no lock-in at the packet level. The “secret sauce” is in how Nvidia partitions control logic among NICs, switches and host software, not in a closed protocol. MRC follows that pattern: Anyone can implement the spec, but Nvidia believes its execution on Spectrum-X hardware, with deep telemetry and fabric control, will be hard to match.
When a frontier model is being trained across tens or hundreds of thousands of GPUs, the network is effectively part of the compute pipeline. If a link flaps for a few milliseconds or a path gets congested, a multimillion-dollar training run stalls, and stalls at that scale translate directly into wasted money.
MRC addresses that problem in several ways.
During a call, Nvidia Senior Vice President Gilad Shainer described MRC as extending the routing “brain” all the way to the host. The network interface card and the host-side management stack (in OpenAI’s case, its own software) can actively participate in routing decisions, thereby overriding or influencing what the switches do. That’s a major shift from classical Ethernet designs, where a hosted tenant has little or no control over the fabric.
In more traditional cloud models, a hosted customer has visibility and control at the virtual machine or server level, but the network fabric remains opaque. OpenAI wanted to change that, acting as a “smart tenant” with the ability to govern routing policy, congestion responses and failure behavior from the server edge. MRC is the mechanism that reconciles that desire with the realities of a shared, hyperscale fabric.
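As a rough illustration of that "smart tenant" role, the sketch below imagines a host-side agent that consumes per-path telemetry and tells the NIC which paths to keep striping over. The telemetry fields, thresholds and interface are assumptions made for the example; the real control surface is defined by the MRC stack and the tenant's own management software.

```python
from dataclasses import dataclass

# Hypothetical per-path telemetry a host agent might consume. The
# field names and thresholds below are illustrative assumptions.

@dataclass
class PathTelemetry:
    path_id: str
    rtt_us: float     # recent round-trip time, microseconds
    ecn_marks: int    # congestion marks seen in the last interval
    link_up: bool

def steer(telemetry, rtt_budget_us=50.0, mark_limit=100):
    """Return the paths the NIC should keep using for this tenant."""
    usable = [
        t.path_id for t in telemetry
        if t.link_up and t.ecn_marks <= mark_limit and t.rtt_us <= rtt_budget_us
    ]
    # Never steer down to zero paths; fall back to anything still up.
    return usable or [t.path_id for t in telemetry if t.link_up]

snapshot = [
    PathTelemetry("path-0", rtt_us=12.0, ecn_marks=3, link_up=True),
    PathTelemetry("path-1", rtt_us=80.0, ecn_marks=250, link_up=True),  # congested
    PathTelemetry("path-2", rtt_us=11.0, ecn_marks=0, link_up=False),   # failed
]
print(steer(snapshot))  # -> ['path-0']
```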
Another key piece is Spectrum-X multiplane support. Large AI factories are increasingly built as multiplane networks, in which each plane is a separate, independent network that provides a full path between GPUs. Think of it as multiple disjoint fabrics running in parallel, each serving as an alternative route for the same east-west traffic.
Spectrum-X is built for this design. Hardware-accelerated load balancing across planes keeps latency predictable while scaling to hundreds of thousands of GPUs, and failures or maintenance events can be absorbed by shifting traffic between planes without disrupting training jobs.
MRC sits on top of this, using multiplane awareness to exploit those parallel fabrics more intelligently. The result is a kind of AI-native Ethernet fabric where redundancy, performance and control are baked into the transport, not bolted on via box-by-box tinkering.
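A toy model makes the plane-level arithmetic plain: when a plane is drained for a failure or maintenance event, the surviving planes simply absorb its share of the same east-west traffic. The plane count and even-split rule here are illustrative assumptions.

```python
# Toy model of multiplane failover: traffic is respread evenly over
# whichever disjoint planes remain in service. Illustrative only.

def rebalance(planes, drained=frozenset()):
    """Spread 100% of traffic over the planes still in service."""
    active = [p for p in planes if p not in drained]
    if not active:
        raise RuntimeError("no network planes available")
    share = 100.0 / len(active)
    return {p: (share if p in active else 0.0) for p in planes}

planes = ["plane-0", "plane-1", "plane-2", "plane-3"]
print(rebalance(planes))                       # 25% per plane
print(rebalance(planes, drained={"plane-1"}))  # ~33.3% per surviving plane
```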
Nvidia is careful to present MRC as “another protocol” on Spectrum-X, not a replacement for everything else. Today, Spectrum-X supports at least two main Ethernet transports for AI: Spectrum-X with adaptive RDMA is a general-purpose AI Ethernet transport, with adaptive routing in the switches and NIC-level optimization, while Spectrum-X with MRC is an RDMA transport emphasizing multipath, host-driven routing and governance.
There is also the Ultra Ethernet Consortium, which is a multivendor effort to define a new Ethernet RDMA-based fabric. I asked Shainer about the long-term implications of these Ethernet variants and he gave a very pragmatic answer. He does not see the world collapsing onto a single “winner” like UEC. Instead, he expects more variety: Different hyperscalers and AI providers will tune their transport protocols to their own workloads and operational models.
In that context, MRC is a great example of a “custom Ethernet for AI” that’s already running in production, while UEC is another evolving effort. Technically, MRC builds on RoCEv2 as defined by the InfiniBand Trade Association, then extends it with multipath, host-governed routing and the multiplane integration.
Some concepts that surfaced in UEC discussions — such as enhanced congestion control — also show up in MRC, but wired into Nvidia’s hardware and host stack. From a user point of view, the important bit is that Spectrum-X gives you a choice: You can run adaptive RDMA, you can run MRC, and there are other undisclosed variants Spectrum-X can support that are specific to other large customers.
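In practice, that choice plays out as a per-workload decision. The selection rule below is a hypothetical sketch, not Spectrum-X configuration syntax; the workload labels and thresholds are assumptions chosen to mirror the trade-off described above.

```python
# Hypothetical transport selection for an AI fabric that offers both
# adaptive RDMA and MRC. Labels and thresholds are assumptions.

TRANSPORTS = {
    "adaptive_rdma": "switch-side adaptive routing, NIC-level optimization",
    "mrc": "multipath striping with host-governed routing",
}

def choose_transport(workload):
    # Huge synchronized training jobs get the most from MRC's
    # multipath striping; smaller or mixed workloads can stay on
    # general-purpose adaptive RDMA.
    if workload["gpus"] >= 10_000 and workload["pattern"] == "all-reduce":
        return "mrc"
    return "adaptive_rdma"

job = {"name": "frontier-pretrain", "gpus": 100_000, "pattern": "all-reduce"}
choice = choose_transport(job)
print(choice, "->", TRANSPORTS[choice])
```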
One of the more interesting subtexts in my conversation with Shainer is the distinction between “hosted users” and “infrastructure owners.” If you own the AI factory, you can program switches, NICs and hosts end-to-end; you can roll your own routing algorithms and congestion-control tweaks anywhere in the stack. If you’re a hosted customer — OpenAI on top of Microsoft, for example — you typically only control the host. The network underneath is someone else’s problem.
MRC exists largely to bridge that gap. By embedding new logic in the SuperNIC and exposing it to host-side management, a tenant can make meaningful routing decisions that the fabric will honor, without direct switch access. That allows OpenAI, or others with similar models, to optimize for their specific training jobs — changing routing strategies, reacting to congestion patterns, or tuning behavior per workload — without owning the whole data center.
That’s an important pattern to watch as AI ecosystems get more layered and multiparty. We’ll see more cases where a model provider wants near-owner-level control over routing and telemetry, even when they’re running on someone else’s iron. MRC is an early pattern for how that could be done safely over Ethernet.
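What might that near-owner-level control look like from the host? The sketch below imagines a per-job routing policy that NIC-resident logic could enforce without any switch access. Every field name and value is hypothetical, mirroring the knobs described above: routing strategy, congestion response and failure behavior.

```python
from dataclasses import dataclass

# Hypothetical tenant-declared policy, enforced at the NIC rather
# than the switches. All fields and values are assumptions.

@dataclass
class TenantRoutingPolicy:
    job_id: str
    routing: str              # e.g. "spray-all-planes" or "sticky-per-flow"
    congestion_response: str  # e.g. "reroute" or "pace"
    failover_ms: int          # how quickly to abandon a failing path

def apply_policy(policy):
    # In a real deployment this would program the SuperNIC through
    # the host management stack; here it only reports the intent.
    print(f"job {policy.job_id}: routing={policy.routing}, "
          f"congestion={policy.congestion_response}, "
          f"failover after {policy.failover_ms} ms")

apply_policy(TenantRoutingPolicy(
    job_id="train-0042",
    routing="spray-all-planes",
    congestion_response="reroute",
    failover_ms=5,
))
```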
From an industry perspective, MRC and Spectrum-X underscore three trends.
First, AI is forcing Ethernet to specialize. Ten years ago, you could plausibly talk about “one Ethernet” dominating the data center. Today, we have a spectrum: shallow-buffer vs. deep-buffer switches, DCB vs. ECN-driven fabrics, a variety of RDMA variants, and now AI-specific transports such as MRC. Shainer’s line that “there is Ethernet, and there is Ethernet, and there is another Ethernet” isn’t just a joke — it’s the reality of the role the network plays in AI.
Second, open specifications with proprietary implementations are becoming the norm. By pushing MRC into OCP, alongside contributions from AMD, Broadcom and Intel, Nvidia gains ecosystem credibility while still betting that its Spectrum-X implementation will perform best. It’s the same playbook Nvidia has used in InfiniBand: standards on the wire, differentiation in silicon, and software.
Third, UEC is now one of several options, not the ordained future. With MRC in production on GB200-based clusters at Microsoft and in OpenAI environments, Nvidia can point to a working, large-scale, open Ethernet transport that doesn’t depend on the UEC kitchen to finish its meal. That doesn’t kill UEC, but it does make the future feel more pluralistic — one where hyperscalers, silicon vendors and model providers define and adopt the flavors that best match their economics and risk tolerance.
For enterprise buyers and service providers, the practical takeaway is this: When evaluating “AI networking,” don’t stop at port speeds and buffer sizes. Ask which transport protocols the fabric supports, how they’re implemented in NICs and switches, what telemetry and host-side control you get, and how quickly the system can respond to failure and congestion. In other words, treat the network as part of the AI architecture, not just a line item.
Nvidia’s MRC announcement, backed by OpenAI and Microsoft, is a strong reminder that in gigascale AI, Ethernet must mature and function as an AI-native fabric. With Spectrum-X, Nvidia is betting that the winning networks won’t just be fast — they’ll be intelligent, programmable and tailored to the unique demands of AI factories.
Zeus Kerravala is a principal analyst at ZK Research, a division of Kerravala Consulting. He wrote this article for SiliconANGLE.