Nvidia Corp. and Google LLC used the search giant’s annual Cloud Next event to deepen their long-running partnership, creating a full-stack “artificial intelligence factory” that integrates Google’s AI Hypercomputer infrastructure with Nvidia’s latest solutions, including Blackwell, open models and agentic and physical AI tooling.
With this announcement, Google expands its distribution of Nvidia’s accelerated computing stack, while customers gain a faster, lower-risk path from AI experimentation to large-scale deployment.
What was announced at Next
- Google Cloud is extending its AI Hypercomputer architecture with new Nvidia-powered instances (including Grace Blackwell systems and the upcoming A5X instance based on the Nvidia Vera Rubin platform) to power large-scale “AI factories” for training and inference.
- Google introduced Virgo Networking, a data center network fabric designed for megascale AI. It serves as the backbone of Google’s AI Hypercomputer and will enable the Vera Rubin A5X instance to scale to 960,000 graphics processing units across multiple sites.
- Agentic AI and “physical AI” use cases were showcased: Nvidia Omniverse libraries and the open-source Nvidia Isaac Sim robotics simulation framework are available on Google Cloud Marketplace, enabling developers to build physically accurate digital twins and custom robotics simulation pipelines to train, simulate and validate robots before real-world deployment. In addition, Nvidia NIM microservices for models such as Nvidia Cosmos Reason 2 can be deployed on the Google Enterprise Agent Platform and Google Kubernetes Engine (a deployment sketch follows this list).
- The partnership spans cloud (Google Enterprise Agent Platform, GKE, DGX Cloud) as well as on-premises and edge environments via Google Distributed Cloud on Nvidia Blackwell, giving customers a consistent platform from lab to production.
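To make the GKE pathway concrete, here is a minimal sketch of what deploying a GPU-backed NIM microservice on GKE could look like, using the official Kubernetes Python client. It is an illustration under stated assumptions: the image name, labels and single-GPU request are placeholders rather than details from the announcement, and it presumes a GKE cluster with an Nvidia GPU node pool already provisioned.

```python
# Minimal sketch: run an Nvidia NIM container on GKE via the Kubernetes
# Python client. Image, names and GPU count are placeholder assumptions.
from kubernetes import client, config

config.load_kube_config()  # assumes kubectl already points at a GKE cluster

container = client.V1Container(
    name="nim-demo",
    image="nvcr.io/nim/example-model:latest",  # placeholder NIM image
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1"}  # one Nvidia GPU per replica
    ),
)

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="nim-demo"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "nim-demo"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "nim-demo"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(
    namespace="default", body=deployment
)
```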
A decade-long full‑stack collaboration
Nvidia and Google Cloud have been co-developing the accelerated cloud stack for about a decade, starting with early K80/P100 GPU instances and evolving into the AI Hypercomputer architecture.
That collaboration has been expanded to address the entire AI stack:
- Infrastructure: Nvidia GPUs (RTX PRO 6000, GB300, GB200, B200, H200, H100, L4 and A100 today, with Vera Rubin coming) power Google Compute Engine, GKE, Vertex AI, Batch, DGX Cloud and Distributed Cloud, all tied into Google’s custom networking, storage and schedulers.
- Libraries and software: Nvidia CUDA, cuDNN, Dynamo, NeMo, Nemotron and optimized JAX/PyTorch are integrated with Google Cloud services and reference architectures.
- Managed services integrations: Vertex AI, GKE, Cloud Run and Google’s AI Hypercomputer all offer Nvidia GPUs with autoscaling and native observability, so customers consume Nvidia as an on-demand cloud primitive rather than as a bespoke hardware project (see the sketch after this list).
- Models and agents: Gemini models on Vertex and the Gemini Agent Platform are now cross-linked with Nvidia’s open Nemotron models and NeMo tools, giving customers a choice of model families optimized for Nvidia hardware.
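As a small illustration of that “on-demand cloud primitive” idea, the sketch below deploys an already-registered model to a Vertex AI endpoint with an Nvidia GPU attached, using the google-cloud-aiplatform SDK. The project, region, model ID, GPU type and replica bounds are placeholder assumptions, not prescriptions from either company.

```python
# Minimal sketch: serve a registered model on Vertex AI with an Nvidia GPU.
# Project, region and model ID below are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model("projects/my-project/locations/us-central1/models/123")
endpoint = model.deploy(
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",  # any supported Nvidia GPU type
    accelerator_count=1,
    min_replica_count=1,
    max_replica_count=3,  # Vertex autoscales within this range
)
print(endpoint.resource_name)
```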
For customers, co-engineering means there is no need to stitch together GPUs, schedulers and frameworks, as the combined stack is designed to be turnkey and is approaching “utility” status.
Google’s million‑plus-GPU footprint
Google has quietly built out one of the world’s largest accelerated infrastructure deployments, with well over a million Nvidia GPUs deployed across its global fleet for internal products and Google Cloud services.
There are two implications of this scale. The first is that it shortens deployment times. Because the backbone, supply chain and data center footprint are already GPU‑centric, each new GPU generation (Hopper, Blackwell, Vera Rubin) can roll out faster, and those accelerators show up quickly in customer‑facing SKUs such as A3/A5X and DGX Cloud.
The second is that there should be plenty of capacity for AI factories. The technology footprint that underpins Google’s AI Hypercomputer concept — multitenant, massively scaled clusters where training, fine-tuning and inference share the same fabric — makes it realistic for enterprises to spin up large language model and agent workloads that run across tens of thousands of Nvidia GPUs without bespoke infrastructure engineering.
Information technology leaders no longer have to guess which region or instance type will still be available at scale in 18 months — Google is standardizing on Nvidia as the default accelerator fabric, alongside its tensor processing units.
Nvidia makes the move from general‑purpose to accelerated computing easier
Nvidia has rewritten the computing stack by shifting heavy compute workloads away from general-purpose central processing units toward GPU-accelerated architectures optimized for parallel workloads.
Key aspects of that shift:
- From instruction-driven to parallel data-driven: Traditional CPUs are optimized for serial workloads, whereas GPUs deliver massive parallelism that AI, HPC, graphics and data analytics exploit; CUDA and its ecosystem make that parallelism programmable at scale (a short illustration follows this list).
- From components to platforms: Though the media often positions Nvidia as a chip company, chips are only one part of its offerings. The company sells a full platform — GPUs, CPUs, interconnects (NVLink), networking, systems (DGX, GB300 NVL72) and extensive software stacks such as CUDA, cuDNN, Dynamo and NeMo.
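The contrast between the two styles is easiest to see in code. The toy example below is purely illustrative: it expresses the same arithmetic first as a serial, instruction-driven Python loop and then as a single data-parallel JAX operation that XLA can compile for an Nvidia GPU (falling back to the CPU when none is present).

```python
# Illustrative contrast: serial instruction-driven vs. parallel data-driven.
import jax
import jax.numpy as jnp

# CPU-style: one element at a time, one instruction stream.
def double_plus_one_serial(values):
    return [v * 2.0 + 1.0 for v in values]

# GPU-style: one fused operation over the whole array, compiled by XLA
# and executed across many GPU threads when an accelerator is present.
@jax.jit
def double_plus_one_parallel(values):
    return values * 2.0 + 1.0

print(double_plus_one_serial([1.0, 2.0, 3.0]))

big = jnp.arange(1_000_000, dtype=jnp.float32)
print(double_plus_one_parallel(big)[:3])
```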
That “accelerated computing” mindset is why Nvidia maps so cleanly onto Google’s AI Hypercomputer strategy: Both focus on building dense, software-defined supercomputers rather than generic cloud infrastructure as a service.
Why Nvidia’s reach beats any single TPU/ASIC
Specialized accelerators like TPUs and other application-specific integrated circuits are powerful and often positioned as a threat to Nvidia, but they are narrow. Nvidia’s bet has always been horizontally broad programmability. This has the following benefits:
- Ecosystem gravity: Virtually every major AI framework (PyTorch, JAX), along with a long tail of domain-specific frameworks and libraries, has first-class, production-hardened support for Nvidia GPUs because CUDA is the de facto standard for accelerated computing.
- Workload diversity: Nvidia accelerates not only LLMs but also recommendation systems, traditional ML, scientific HPC, data analytics, simulation and digital twins, media, gaming and graphics pipelines, all on a common platform.
- Portability across environments: The same CUDA binaries and container images can run on-prem, at the edge or on public clouds such as Google Cloud, AWS and Azure, giving independent software vendors and enterprises a broad distribution surface that no proprietary ASIC can match (see the sketch below).
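A short sketch of what that portability means in practice, assuming a PyTorch workload: the script below runs unchanged on a laptop, an on-prem server or a cloud GPU instance because the framework discovers the accelerator at runtime.

```python
# Illustrative only: device-agnostic PyTorch code that runs anywhere the
# CUDA stack is (or is not) present, without modification.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(1024, 1024).to(device)
batch = torch.randn(32, 1024, device=device)
output = model(batch)
print(f"Ran on {device}; output shape {tuple(output.shape)}")
```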
So although TPUs will remain strong within Google for specific workloads, Nvidia’s cross-industry, multicloud footprint makes it attractive to enterprises that need to ship software to any customer, anywhere.
For Google Cloud, aligning with Nvidia broadens the appeal of its AI infrastructure to customers who want a neutral, portable accelerated platform rather than a proprietary stack that locks them into a single cloud or architecture.
Why this matters to Google Cloud
- Differentiated yet open: Google can lead with TPUs for internal products and select Vertex AI offerings, but partnering with Nvidia lets it claim the broadest possible ecosystem support for enterprise AI, spanning open-source to proprietary models.
- Faster innovation cadence: Google inherits Nvidia’s rapid GPU roadmap (Hopper to Blackwell to whatever is next) and combines it with its own networking, storage and AI orchestration fabric — meaning customers see new capabilities sooner, with less integration pain.
Why this matters to Nvidia
- Distribution and visibility: Google Cloud becomes one of the most visible, multitenant showcases for Nvidia’s latest platforms, spanning training, inference, agents and physical AI, strengthening Nvidia’s position as the default AI hardware choice.
- Deeper stack integration: Tight integration with Vertex, GKE, Cloud Run and Distributed Cloud provides Nvidia privileged access to enterprise workloads and telemetry, which can feed back into its software and hardware optimization loops.
Why customers should care – and how it accelerates AI adoption
For customers, this partnership is about reducing risk and shortening time-to-value. Specifically:
- Lower platform risk: Building on Nvidia via Google Cloud enables customers to follow two strong roadmaps — Nvidia’s for accelerated computing and Google’s for hyperscale AI infrastructure — rather than betting on a single proprietary accelerator.
- Faster path from PoC to production: Much is said today about customers getting stuck in proofs of concept. With this partnership, customers can prototype with Gemini or Nemotron models on Vertex or GKE (a prototyping sketch follows this list), then scale to DGX Cloud or massive AI Hypercomputer clusters without changing hardware architectures or rewriting for a different accelerator.
- Operational maturity: Google wraps Nvidia GPUs in managed services with autoscaling, observability and MLOps patterns, so teams can focus on models and applications instead of driver versions, firmware, and schedulers.
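For the prototyping step, a first experiment can be only a few lines. The sketch below uses the Vertex AI Python SDK with a Gemini model; the project, region, model name and prompt are placeholder assumptions.

```python
# Minimal prototyping sketch on Vertex AI; all identifiers are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")

model = GenerativeModel("gemini-1.5-pro")
response = model.generate_content("Summarize our Q3 support tickets by theme.")
print(response.text)
```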
This combination lowers the organizational friction of adopting AI because infra teams, data scientists and app teams share a common, battle‑tested platform.
Nvidia-Google partnership can accelerate AI adoption
While customers want choice, too many variables in an equation can slow things down. The Google-Nvidia stack provides enterprises with a reference design for building AI factories — cloud-scale clusters for training, fine-tuning, inference and simulation — that they can consume as a service or emulate on-premises with similar building blocks. Two other benefits stand out:
- Support for agentic and physical AI: By integrating Nvidia’s NeMo, Nemotron and robotics and digital twin platforms, customers can move beyond chatbots to agents that plan, act and interact with the physical world, all on the same accelerated platform.
- Ecosystem leverage: Because “all kinds of frameworks and algorithms run on Nvidia,” enterprises can adopt best-of-breed open-source components, ISV solutions, and custom models without fighting the hardware; that flexibility encourages experimentation and shortens the iteration loop.
Google has spent a decade playing third fiddle to Amazon Web Services and Microsoft Azure, but its partnership with Nvidia gives it a first-fiddle story in AI: a co-designed AI Hypercomputer, tuned for agentic and physical AI, that turns Google’s Nvidia-powered supercomputers into a product enterprises and startups can actually buy. Google has a decade-long partnership with Nvidia and offers the widest range of Blackwell instances today. In the AI era, choice is important, and Google gives customers that.
Zeus Kerravala is a principal analyst at ZK Research, a division of Kerravala Consulting. He wrote this article for SiliconANGLE.