Coverage from SiliconANGLE's livestreaming video studio

UPDATED 14:47 EDT / JULY 10 2025

Prasad Kalyanaraman, vice president of AWS infrastructure service at AWS, talks with theCUBE about AI-first infrastructure at the AWS Mid-Year Leadership Summit 2025.

AI

AWS doubles down on AI-first infrastructure to meet soaring global demand

by Victor Dabrinze

Artificial intelligence is no longer just a use case layered onto cloud platforms — it’s the driving force behind a global shift to AI-first infrastructure, and Amazon Web Services Inc. is right in the middle of it. As businesses scramble to deploy generative models and agentic systems, AWS is retooling its stack to meet massive new demands.

Just a decade ago, cloud adoption was still a cautious conversation. Now, the conversation has flipped — and the pressure is on. AWS must deliver not just scale, but performance, resilience and data architecture fit for a world where AI is baked into every business process. The question heading into the second half of 2025 isn’t whether AWS can keep up with AI — it’s how it will lead.

AWS’ Prasad Kalyanaraman explores AI-first infrastructure with theCUBE.

“I think gen AI has created yet another opportunity for us, where I think we have the opportunity to build the equivalent of another AWS or probably even bigger,” said Prasad Kalyanaraman (pictured), vice president of AWS infrastructure service at AWS. “Every single service and every single foundational service is getting reinvented in terms of how they use gen AI. The scale, power requirements, network requirements and computer requirements from this are exponentially larger as well.”

Kalyanaraman spoke with theCUBE’s John Furrier at the AWS Mid-Year Leadership Summit, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed the technical and strategic muscle required to keep AWS at the forefront of the AI revolution.

Global AI-first infrastructure fueled by custom-built compute factories

AWS’ expansion is no longer about racking servers; it has evolved into building full-scale compute factories, according to Kalyanaraman. Because AWS owns every layer of the stack, from silicon to networking and higher-level services, that vertical integration lets it roll out new regions and AI-first infrastructure faster than ever. What once took years can now be deployed in months.

“It is not about just the length of time to go and build up a data center, it’s also the length of time it takes to build all the other services around it, the network around it, the database services, the storage needed for it and so on,” he said. “If you had to start from scratch right now, it’ll take anyone years. If you invested the time on this and built that experience that we’ve done over many years, then we can turn those up fairly quickly.”

Recent investments such as the $10 billion North Carolina region and Saudi Arabia’s Humain AI zone underscore AWS’ strategy: bring sovereign, AI-ready infrastructure closer to customers while meeting each country’s latency, control and regulatory needs.

Energy is another challenge in scaling AI deployment. Every layer of the stack draws power, and concerns are mounting globally that AI’s rising energy demand is outpacing sustainable levels. AWS’ response? Continuous innovation across chips, memory and interconnects for efficient performance. Its UltraCluster fabric delivers 10+ petabits per second with sub-10 microsecond latency, providing the speed and scale needed for distributed AI training and inference, according to Kalyanaraman.

“You’re reducing latency and increasing bandwidth for the collectives and so on,” he said. “At the same time, your chip capacity is improving as well as your network capacity increasing, while the models are getting more and more powerful. Now, that’s on the training side, but you also have the inference side of the equation, and the inference side of the equation, as the models start getting bigger, you need to start actually building UltraClusters to actually store the models because the amount of memory is not sufficient for a single chip.”

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the AWS Mid-Year Leadership Summit:

Photo: SiliconANGLE
