INFRA
As enterprises pivot from AI experimentation to production deployment, AI storage infrastructure is emerging as the critical bottleneck determining whether massive chip investments deliver real returns.
The central challenge for running large-scale AI workloads has shifted from building models to keeping the accelerators that run them fully fed. GPU and tensor processing unit utilization rates — not raw compute capacity — now define whether AI investments generate real returns, according to Alex Bouzari (pictured, right), co-founder and chief executive officer of DataDirect Networks Inc.
“Demand is skyrocketing. I think what’s happening is finally the world is evolving from experimenting with AI, trying to figure out what to do, how to do it, how to drive value, to production scale,” Bouzari said. “Agentic AI I think is making a huge difference. We have customers now who are utilizing internally trillions of tokens per month. The economics have to pencil out. And so you need a data engine that delivers a number of tokens per hour where the ROI works out.”
Bouzari and Asad Khan (left), senior director of Google Storage at Google LLC, spoke with Alison Kosik and John Furrier at Google Cloud Next, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how the DDN and Google Cloud partnership is raising TPU and GPU utilization, lowering AI storage infrastructure costs and powering production workloads across industries. (* Disclosure below.)
The case for AI storage infrastructure as a first-order concern is backed by hard numbers. Google Cloud’s data footprint has grown 4x in just three years as more enterprises bring more data into the cloud to power training and inference workloads, Khan noted. Enterprises that struggle to justify AI spending are often leaving their most expensive infrastructure underutilized, he added.
“How saturated your GPUs and TPUs are — those are super expensive, hard to find, and that drives a lot of the TCO,” Khan said. “We were working with [Harmonic Inc.] and they were saying, ‘We are not getting the ROI,’ then they started using the Managed Lustre and the saturation was 6X more, which is crazy.”
That co-designed Google Cloud Managed Lustre offering — built on DDN’s EXAScaler platform — is central to the performance gains. New capabilities announced at Google Cloud Next 2026 push throughput to 10 terabytes per second, representing a 10x increase from earlier tiers, and the two companies have together achieved 95% or more TPU utilization for joint customers, according to Bouzari. The partnership has also extended to KV cache optimization for inference workloads. By offloading key-value cache to Google Cloud Managed Lustre, the mean time to first token drops by more than 40% compared to host memory alone, Khan explained.
“It is co-designing. It is ensuring we are using the right network, we are using the right VMs,” Khan said. “If you look at the announcement, one of the announcements was that we are now delivering 10 terabytes per second — which is 80 terabits per second of throughput. That is four to 20X of any hyper cloud [provider’s] offering because we did not just pick some open-source software — we partnered with DDN who has been the primary company [behind] this project.”
The breadth of the customer base reflects how varied the demand has become. Salesforce Inc. is running large-scale enterprise workloads on Managed Lustre, while Sony Honda Mobility is using it as a multimodal platform to train the AFEELA Intelligent Drive autonomous driving system. Academic institutions are also adopting it to give researchers and doctoral students direct access to the infrastructure, creating the next generation of enterprise AI practitioners, Khan added. The throughline across all of it is agentic AI — and the urgency for enterprises to get on board.
“Agentic is not going away. If you don’t use agentic AI, you will not be differentiated, you will not be able to compete,” Bouzari said. “You need to embrace it, but you need to embrace it thoughtfully with the right technologies and the right partners. This partnership, DDN and Google together, are delivering that to enterprises, which is lowering the cost and increasing the value.”
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of Google Cloud Next:
(* Disclosure: DataDirect Networks sponsored this segment of theCUBE. Neither DataDirect Networks nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)
Support our mission to keep content open and free by engaging with theCUBE community. Join theCUBE’s Alumni Trust Network, where technology leaders connect, share intelligence and create opportunities.
Founded by tech visionaries John Furrier and Dave Vellante, SiliconANGLE Media has built a dynamic ecosystem of industry-leading digital media brands that reach 15+ million elite tech professionals. Our new proprietary theCUBE AI Video Cloud is breaking ground in audience interaction, leveraging theCUBEai.com neural network to help technology companies make data-driven decisions and stay at the forefront of industry conversations.