UPDATED 12:15 EDT / MARCH 19 2025

Steren Giannini, head of product for Google Cloud Run, and Yunong Xiao, director of engineering at Google Cloud, discuss why Cloud Run is a unique offering during the “Google Cloud: Passport to Containers” event.

Google unlocks scalable GPU access with Cloud Run’s serverless model

The pressure is on for companies to recoup their investments in artificial intelligence training through inference; Google Cloud Run hopes to alleviate that stress.

Cloud Run is a platform that marries serverless computing with containers, allowing users to run code directly on top of Google LLC’s scalable infrastructure. Because serverless platforms charge only while code is running, Cloud Run is a cost-efficient option for companies that don’t want to skimp on compute power.

“The big problem that people are struggling with inference … it’s the cost,” said Yunong Xiao (pictured, right), director of engineering at Google Cloud. “There’s very expensive hardware that you have to buy and there’s a capacity crunch. What we’re seeing with our customers and with customers generally speaking is all of them are struggling to even just get supply of the cards or the TPUs or GPUs to be able to run their inference applications. We are actually today providing on-demand access to GPUs.”

Xiao and Steren Giannini (left), head of product for Google Cloud Run, spoke with theCUBE’s Savannah Peterson for the “Google Cloud: Passport to Containers” interview series, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed Cloud Run’s unique offering and the democratization of GPUs. (* Disclosure below.)

How Cloud Run is busting myths about serverless

With Cloud Run, Google is pushing back against the myth that serverless doesn’t scale, according to Giannini. Instead of running on Kubernetes, Cloud Run sits atop Borg, Google’s highly scalable internal cluster-management system, while still keeping its containers portable to other platforms.

“The container you deploy to Cloud Run has nothing proprietary about Cloud Run,” Giannini said. “That’s a very unique value proposition. You can literally take it, you run it on your local machine, you run it on Kubernetes, you run it on another cloud, but hopefully you prefer to run it on Google Cloud Run because it’s more efficient and highly scalable.”
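That portability claim rests on Cloud Run’s minimal runtime contract: the container simply serves HTTP on whatever port the `PORT` environment variable names. A hedged sketch of such a service, using only the Python standard library (the handler and response text are illustrative, not from the interview):

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    """Minimal HTTP handler: responds to any GET with a plain-text body."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"hello from a portable container\n")


def make_server(port: int) -> HTTPServer:
    # Cloud Run injects PORT into the container; locally, default to 8080.
    return HTTPServer(("0.0.0.0", port), Handler)


if __name__ == "__main__":
    port = int(os.environ.get("PORT", "8080"))
    make_server(port).serve_forever()
```

Because the image contains nothing Cloud Run-specific, the same container runs unchanged under `docker run` on a laptop, on a Kubernetes cluster, or on Cloud Run itself, which is the “nothing proprietary” point Giannini makes.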

Cloud Run also offers a portable application programming interface, similar to the Kubernetes API. Giannini compares the needs of agentic AI to those of any other application: Customers have been drawn to Cloud Run for building AI agents because it autoscales on demand. By letting users harness serverless GPUs, Cloud Run plays a part in democratizing GPU access.
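In practice, the on-demand GPU access Xiao describes looks like an ordinary Cloud Run deploy with a GPU flag attached. A hedged sketch (the flag names and L4 GPU type reflect Cloud Run’s GPU preview as of early 2025 and may have changed; the project, service and image names are placeholders, not from the interview):

```shell
# Deploy a containerized inference service with one attached GPU.
# GPU services require CPU to stay allocated between requests.
gcloud beta run deploy inference-svc \
  --image=us-docker.pkg.dev/my-project/repo/inference:latest \
  --region=us-central1 \
  --gpu=1 --gpu-type=nvidia-l4 \
  --no-cpu-throttling
```

No quota negotiation or reserved capacity is needed up front; the service scales GPU instances with traffic, which is what “on-demand access to GPUs” means here.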

“Our premise for Cloud Run is we don’t want to win your business because we’re trying to lock you in,” Xiao said. “We want to win your business with an open product because it’s the best product for you. The APIs, native integrations with Terraform, the container standard, means that you can very easily take any containerized application, put it on Cloud Run and then just scale out of it as you go.”

Finding opportunity in agentic AI

Cloud Run has seen success supporting its customers’ large language models and hopes to do so with the coming wave of agentic AI. In one use case, L’Oreal S.A. built its website chatbot on top of the platform. The on-demand usage allowed the chatbot to handle peak times while saving money at night when very few employees or customers would be using it.
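The economics of that pattern are easy to sketch. With hypothetical numbers (the hourly rate and traffic window below are illustrative assumptions, not Google pricing or L’Oreal figures), billing only for busy hours beats keeping a GPU instance warm around the clock:

```python
# Illustrative only: hypothetical GPU rate and traffic pattern.
GPU_RATE_PER_HOUR = 0.70    # assumed on-demand GPU price, $/hour
PEAK_HOURS_PER_DAY = 10     # hours/day the chatbot actually sees traffic

# Dedicated instance: billed 24/7 whether or not anyone is chatting.
always_on_daily = 24 * GPU_RATE_PER_HOUR

# Serverless: scales to zero off-peak, so only peak hours are billed.
on_demand_daily = PEAK_HOURS_PER_DAY * GPU_RATE_PER_HOUR

savings = 1 - on_demand_daily / always_on_daily
print(f"always-on: ${always_on_daily:.2f}/day, "
      f"on-demand: ${on_demand_daily:.2f}/day, "
      f"savings: {savings:.0%}")
```

Under these assumptions the on-demand model cuts the daily bill by more than half; the real ratio depends entirely on how spiky the traffic is.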

“It’s … a rose by any other name — agentic is just a term meant to describe the use case of traditional inference workloads,” Xiao said. “I have these agents, they perform some functions for me, I can chain them together and it’s a term that’s helpful. It’s helpful for us to group them into a specific category in terms of use case. But, ultimately, I think the thing that we should realize at the end of the day is what is the value that they’re providing to the customer?”

Customers can look forward to hearing more about the future of Cloud Run at Google Cloud Next in April, but for now, Giannini wants the platform to be faster, better and more sustainable. He also founded Google Cloud Carbon Footprint, which tracks the carbon footprint of Google Cloud usage. Cloud Run is already more environmentally friendly because of its on-demand model. The next step is to reduce latency.

“Five seconds for us, it’s slow,” Giannini said. “We come from serverless, we are used to milliseconds, not seconds. That’s one thing that we are still working on improving that startup time. The GPUs we offer today, we want to offer bigger ones. Of course, there are a large set of GPU types. We are looking forward to adding bigger GPU types, which hopefully will have the same kind of performance as the ones we have today.”

Learn more about using GPUs on Cloud Run in this YouTube video.

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE Research’s coverage of the “Google Cloud: Passport to Containers” interview series:

(* Disclosure: TheCUBE is a paid media partner for the “Google Cloud: Passport to Containers” series. Neither Google Cloud, the sponsor of theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
