

There was a time when Kubernetes was mostly a curiosity — a developer tool built in the wilds of open source to solve problems not everyone knew they had yet. But a little more than a decade later, the open-source project has become the backbone of cloud-native computing and a cornerstone of how artificial intelligence workloads are deployed and scaled across enterprise environments.
As Kubernetes evolves into its second decade, Google LLC remains a central player in the container ecosystem it helped create. The company continues to expand the capabilities of Google Kubernetes Engine and Google Cloud Run to meet the demand for scalable, AI-powered infrastructure. Meanwhile, the broader Kubernetes community is refining the project’s flexibility and maturity, paving the way for enterprise innovation across industries.
TheCUBE Research’s Savannah Peterson shares her take on the “new stage of maturity” in IT.
“I think we’re reaching a new stage of maturity within the ecosystem as well,” theCUBE Research’s Savannah Peterson said. “It’s a lot less hype. Kubernetes is actually being deployed. I think the AI stack is actually driving a bit of that as well. I think we’re at a place where this isn’t just a project. People aren’t thinking about it; we’re actually implementing and seeing what that looks like.”
The experimental era of Kubernetes has given way to enterprise-scale deployment, and the implications are broader than infrastructure. As organizations operationalize AI, the Kubernetes ecosystem is being recast in real time, with Google Cloud helping lead a shift in what containerization can enable across industries.
This feature is part of the “Google Cloud: Passport to Containers” interview series, which explores how businesses use AI and containers to scale efficiently in the cloud. (* Disclosure below.)
As AI development accelerates, GKE has become essential scaffolding for training, serving and scaling machine learning models. Developers need infrastructure that can handle massive data loads, model versioning and compute-intensive tasks, all while staying flexible across dev, test and production. GKE combines the portability of containers with the orchestration muscle of Kubernetes, allowing teams to iterate quickly and serve models at scale, according to Brandon Royal (pictured, right), product manager of AI infrastructure at Google Cloud, and Bobby Allen (left), cloud therapist at Google.
“It could be training a very small model with a very specific set of information, or it could be all the way up to very large language models that are doing incredible text encoding, text generation or even image models,” Royal told theCUBE during an exclusive interview.
Google Cloud’s Bobby Allen and Brandon Royal talk with theCUBE about cloud-native AI technologies as a paradigm shift for data-driven businesses.
Inference is now just as critical as training, Royal added. With open-source models and pre-trained intelligence readily available, developers can deploy capabilities via application programming interface endpoints instead of retraining from scratch. This shift makes integration smoother, enabling faster application development without reinventing the wheel.
“A model is only [so] valuable until we can put it behind an API and make it available to do something interesting,” Royal said. “That’s really where the fun and interesting stuff happens. Inference is becoming more and more critical to businesses that are looking at deploying AI models in their platforms.”
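For illustration, here is a minimal sketch (not drawn from the interview) of what putting a model behind an API endpoint can look like, using Python and FastAPI; the inference function is a stand-in for a real model call, and the route and field names are placeholders.

```python
# Minimal sketch: expose a model behind an HTTP endpoint with FastAPI.
# run_inference() is a stand-in for a real model call (text generation, encoding, etc.).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Prompt(BaseModel):
    text: str
    max_tokens: int = 128

def run_inference(text: str, max_tokens: int) -> str:
    # Placeholder inference; a real service would call a loaded checkpoint here.
    return text[:max_tokens]

@app.post("/generate")
def generate(prompt: Prompt) -> dict:
    # Callers integrate with this endpoint instead of retraining or embedding the model.
    return {"completion": run_inference(prompt.text, prompt.max_tokens)}
```

Served with a standard ASGI server such as uvicorn, the same containerized service can run locally, on GKE or on Cloud Run.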
GKE also reduces the complexity of building and integrating AI systems by providing containerized environments that slot into existing app stacks and scale on demand. That flexibility is widening access to advanced capabilities, even for teams without deep machine learning experience, according to Allen.
“It’s futuristic, but it’s also bleeding into everything,” Allen told theCUBE. “I think people can feel the pace speeding up every day.”
The shift from monolithic servers to virtualized infrastructure marked the beginning of modern cloud architecture, but the real leap came with containers. As Docker pushed containerization into the mainstream, developers gained a way to package code and dependencies into portable units that sidestepped the platform conflicts of traditional environments. That shift redefined how teams build, test and ship software, according to Spencer Bischof, product manager of GKE at Google, and Gari Singh, product manager of Google Cloud at Google.
“If you start thinking about source containers from that development perspective, you can package up your entire app and all its dependencies independent of the host operating system,” Singh told theCUBE. “Containers have been around for a long time, but Docker popularized them by making them a lot easier to use.”
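As a rough illustration of that packaging model (a generic sketch, not something discussed in the interview), the Docker SDK for Python can build an image from a local Dockerfile and run it anywhere a container runtime is available; the image tag and port mapping below are placeholders.

```python
# Rough sketch: package an app and its dependencies into a portable image, then run it.
# Assumes the Docker SDK for Python (docker-py), a local Docker daemon and a Dockerfile
# in the current directory; "my-app:latest" and the port mapping are placeholders.
import docker

client = docker.from_env()

# Build the image: code plus dependencies, independent of the host operating system.
image, build_logs = client.images.build(path=".", tag="my-app:latest")

# Run the same image locally; an identical image can be pushed to a registry and run on GKE.
container = client.containers.run("my-app:latest", ports={"8080/tcp": 8080}, detach=True)
print("started container", container.short_id)
```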
Google Cloud’s Spencer Bischof and Gari Singh talk with theCUBE about GKE and the evolution of containers.
That simplification paved the way for Kubernetes to emerge as the orchestration standard for modern infrastructure. But it was Google’s work on GKE that helped scale the system for enterprise use. The idea of mini servers running in virtualized environments unlocked a new level of efficiency, according to Bischof.
“Traditionally in the past, we’d have things like large servers; then it matured, and we started virtualizing those machines because no one has all the space to have one single server,” he told theCUBE. “A couple of folks at Google, Red Hat and others said, ‘What happens if we made something smaller, compact and we could stuff thousands of these containers, mini servers, into a virtual environment?’ That’s what a container is.”
As Kubernetes adoption grew, so did the need for smarter defaults and easier onramps. Instead of forcing developers to configure every detail manually, Google introduced tools such as GKE Autopilot and compute classes to abstract away the infrastructure heavy lifting, according to Bischof.
“If you just want to get started with Kubernetes, something that’s based on Kubernetes, go start there,” he said. “Now, you’re not necessarily sure how you want to spin up a GKE cluster — we have complete walkthroughs and guides. Just follow the best practice built in using something like Autopilot. You don’t need to worry about understanding how the networking works, because of the complexity of the systems and the complexity of the storage.”
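As a hedged sketch of what that onramp can look like once an Autopilot cluster exists, a workload can be deployed with the official Kubernetes Python client, declaring only the replicas and resource requests it needs while GKE provisions and sizes the underlying nodes; the image name and resource figures below are placeholders.

```python
# Hedged sketch: deploy a containerized app to an existing GKE Autopilot cluster using
# the official Kubernetes Python client. The image and resource requests are placeholders;
# Autopilot handles node provisioning based on what the workload declares.
from kubernetes import client, config

config.load_kube_config()  # assumes kubectl credentials for the cluster are already set up

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="demo-app"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "demo-app"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "demo-app"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="demo-app",
                        image="gcr.io/my-project/demo-app:latest",  # placeholder image
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "500m", "memory": "512Mi"}
                        ),
                    )
                ]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```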
As AI adoption accelerates, the cost of inference — not training — has become the biggest challenge for organizations looking to scale. Google Cloud Run offers a flexible solution, combining container portability with serverless pricing and on-demand GPU access, according to Yunong Xiao, director of engineering at Google Cloud, and Steren Giannini, head of product for Google Cloud Run. That model is reshaping how businesses deploy AI in real time, without being locked into proprietary hardware or stuck waiting for scarce infrastructure.
“The container you deploy to Cloud Run has nothing proprietary about Cloud Run,” Giannini told theCUBE. “That’s a very unique value proposition. You can literally take it, you run it on your local machine, you run it on Kubernetes, you run it on another cloud, but hopefully you prefer to run it on Google Cloud Run because it’s more efficient and highly scalable.”
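Much of that portability comes down to a simple contract: the container listens on the port named in the PORT environment variable (Cloud Run defaults it to 8080) and carries no platform-specific dependencies. A minimal sketch using only the Python standard library:

```python
# Minimal sketch of a portable web server: no Cloud Run-specific code, it simply listens
# on the port given by the PORT environment variable (8080 by default), so the same
# container runs locally, on Kubernetes or on Cloud Run.
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"hello from a portable container\n")

if __name__ == "__main__":
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```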
Google Cloud’s Jago Macleod and John Belamaric talk with theCUBE about the future of Kubernetes for AI.
Cloud Run is already powering a range of high-demand applications, from L’Oréal’s online assistant to Shopify’s flash-sale infrastructure, according to Xiao and Giannini. L’Oréal uses AI to support high-volume customer interactions. Shopify relies on Cloud Run to handle unpredictable traffic spikes and latency surges during major retail events. In both cases, serverless inference on Kubernetes delivers the scale and agility required for enterprise-grade performance.
“The big problem that people are struggling with [is] inference … it’s the cost,” Xiao said. “There’s very expensive hardware that you have to buy, and there’s a capacity crunch. What we’re seeing with our customers … is all of them are struggling to even just get supply of the cards or the [tensor processing units] or [graphics processing units] to be able to run their inference applications. We are actually … providing on-demand access to GPUs today.”
As AI use cases stretch infrastructure in new ways, Kubernetes is evolving to meet them. Originally designed for microservices and stateless applications, it now supports inference-aware scheduling, high-performance hardware optimization and intelligent automation. These enhancements allow organizations to fine-tune deployments without micromanaging resources, according to Jago Macleod, director of engineering, Kubernetes, at Google Cloud, and John Belamaric, senior staff software engineer at Google.
“One of the things we’re doing within Kubernetes is what we call dynamic resource allocation, which is about helping Kubernetes understand the hardware better than it used to,” Belamaric told theCUBE. “It’s like trying to shift the work of making all of these decisions from the user, the human to the machine … so when you’re asleep at night, if some other job finishes, you can get the thing you wanted, and your workload will run by the time you wake up in the morning.”
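For context, the common way a workload asks for accelerator hardware today is a blunt resource limit on the container, as in the hedged sketch below using the Kubernetes Python client; dynamic resource allocation is the newer mechanism intended to give the scheduler a richer view of the devices behind that request. The image, GPU type and node label are placeholders drawn from typical GKE GPU setups.

```python
# Hedged sketch: the conventional way to request a GPU for an inference pod is a resource
# limit plus (on GKE) a node selector for the accelerator type. Dynamic resource allocation
# is the newer Kubernetes mechanism for describing such hardware in more detail.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="inference-worker"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        node_selector={"cloud.google.com/gke-accelerator": "nvidia-l4"},  # GKE-specific label
        containers=[
            client.V1Container(
                name="inference",
                image="gcr.io/my-project/inference:latest",  # placeholder image
                resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```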
The Kubernetes community started with open-source ideals, but its impact now extends far beyond code. As AI becomes a shared force across industries, the ecosystem continues to grow, not just through technical breakthroughs but also through a shared sense of purpose, according to Allen.
“I want everyone to feel like they can play a part,” Allen told theCUBE. “This is going to be something that touches all of mankind. Let me just find my part and play that role.”
(* Disclosure: TheCUBE is a paid media partner for the “Google Cloud: Passport to Containers” series. Neither Google Cloud, the sponsor of theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)