The era of artificial intelligence experimentation is giving way to the realities of production infrastructure. As enterprises move past early large language model deployments, the conversation in the cloud-native ecosystem is shifting from what these models can do to how they can run securely, affordably and at scale.
This shift to production-grade AI is exposing pressure points in the stack, particularly around GPU availability and data sovereignty. It’s also forcing a rethink of how Kubernetes supports stateful workloads and autonomous agents as the community prepares for KubeCon + CloudNativeCon EU, according to Rob Strechay, principal analyst at theCUBE Research.
“KubeCon + CloudNativeCon has evolved from a Kubernetes event into the control plane for modern infrastructure strategy,” he said. “What we’re seeing now is the convergence of cloud-native, platform engineering and AI workloads, where Kubernetes isn’t just orchestrating containers. It’s orchestrating how enterprises build, scale and govern intelligent applications.”
On March 24–26, theCUBE, SiliconANGLE Media’s livestreaming studio, will feature exclusive coverage of KubeCon + CloudNativeCon EU. Ahead of the event, we examine how major tech players, such as Red Hat, IBM, Google and others, are advancing Kubernetes-based platforms, AI workloads and the broader cloud-native ecosystem.
This feature is part of SiliconANGLE Media’s exploration of how the Kubernetes ecosystem continues to shape enterprise infrastructure strategies. (* Disclosure below.)
The transition from AI experimentation to production reality is meeting its most significant hurdle in the regulatory landscape. As the cloud-native ecosystem prepares for the EU AI Act to enter full effect on August 2, 2026, enterprises are realizing that data residency is no longer a sufficient defense against jurisdictional risk. According to a recent IDC Market Perspective, nearly 62% of North American organizations are actively adopting or planning for sovereign cloud solutions to maintain operational autonomy amid tightening global regulations, a shift that is already influencing enterprise spending and reinforcing hybrid cloud strategies.
IBM’s generative AI “book of business” surpassed $12.5 billion in the fourth quarter of 2025, while software now accounts for 45% of the company’s revenue, compared with 25% in 2018. Underpinning that growth is Red Hat’s OpenShift, the enterprise Kubernetes platform that anchors IBM’s hybrid cloud software business and is on track to reach $1.9 billion in annual recurring revenue.
“AI is now embedded across our business, from how we deliver services to our software portfolio to the capabilities we are adding to our infrastructure platforms and how we drive our own productivity,” said James Kavanaugh, senior vice president and chief financial officer of IBM.
For IBM, OpenShift’s reach converges with Sovereign Core, its newly designed software platform that gives organizations and governments more control over their AI and cloud workloads amid tightening compliance requirements. The platform’s jurisdictional controls reflect an architectural bet that organizations most exposed to regulatory pressure will require greater control over where and how their AI workloads run, according to IBM.
“AI is not an afterthought,” said Sachin Prasad, director of product management at IBM. “AI has to be baked into the fabric.”
Amazon Web Services is approaching the same pressure from a different architectural angle. In an exclusive interview with theCUBE Research’s John Furrier, AWS Chief Executive Officer Matt Garman described AI factories as “highly opinionated AWS-managed AI systems” deployed directly inside customer data centers — full-campus infrastructure purpose-built for sovereign nations, defense agencies and the largest global enterprises requiring cloud-consistent services entirely on their own turf. Garman was equally direct about the addressable market.
“99.99% of customers will never purchase an AI factory,” Garman noted. “They’d really just want to use the cloud in their environment and build inference into their applications.”
If the sovereign shift is rewriting where AI runs, the inference race is rewriting what it costs to run it. As enterprises move from training models to serving them at production scale, computational economics has become the central battleground, and major players in the cloud-native ecosystem are responding with purpose-built accelerators for the inference moment.
Google is betting on silicon depth. The company brought its custom Ironwood tensor processing units online for cloud customers while simultaneously upgrading vLLM to support inference switching between GPUs and TPUs, or a hybrid approach. The result is an inference stack designed to absorb volatility across model types and workload sizes without requiring enterprises to re-architect every time hardware generations evolve.
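The appeal of that abstraction is that serving code stays the same regardless of the accelerator underneath. Here is a minimal sketch using vLLM's offline inference API, assuming a vLLM build with the appropriate GPU or TPU backend installed; the model name is illustrative, not drawn from Google's announcement:

```python
# Minimal vLLM offline-inference sketch: the same serving code runs on
# GPUs or TPUs, because vLLM selects its hardware backend at startup
# based on the installed platform support (CUDA, TPU, etc.).
# Assumes: a vLLM install matching the host's accelerator.
from vllm import LLM, SamplingParams

# Illustrative model; any supported Hugging Face model ID works here.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain Kubernetes in one sentence."], params)

for out in outputs:
    print(out.outputs[0].text)
```

Because backend selection happens below this API, swapping GPU capacity for TPU capacity, or mixing the two behind a router, does not force changes to application code, which is the volatility-absorbing property described above.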
AWS is pursuing a different cost-performance path through specialized silicon partnerships. Red Hat recently announced that its AI Inference Server, running on AWS Trainium3 and Inferentia2, delivers 30% to 40% better price performance for scalable AI. The companies also built a Kubernetes-native Neuron operator for OpenShift, giving teams a supported path to target AWS accelerators within the Kubernetes operational model. AWS is also bringing Cerebras' wafer-scale WSE-3 chip to its cloud platform, lowering the entry cost for high-end inference.
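In practice, the Kubernetes operational model here means exposing accelerators as schedulable resources. As a rough illustration (not the Neuron operator's own manifests), AWS's Neuron device plugin advertises devices under the aws.amazon.com/neuron resource name, so a workload can request one the same way it would request a GPU. This sketch uses the official Kubernetes Python client; the container image is hypothetical:

```python
# Sketch: scheduling a pod onto a Neuron-equipped node by requesting the
# "aws.amazon.com/neuron" extended resource advertised by the AWS Neuron
# device plugin. Assumes cluster access and the device plugin installed.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="neuron-inference-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="example.com/inference-server:latest",  # hypothetical
                resources=client.V1ResourceRequirements(
                    limits={"aws.amazon.com/neuron": "1"}  # one Neuron device
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

The scheduler then places the pod only on nodes that expose the requested device, which is what lets teams treat AWS silicon as just another Kubernetes resource.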
Red Hat’s approach to the inference stack is less about competing with silicon vendors than about improving the performance of existing hardware at scale. The company’s partnership with Nvidia, including day-zero support for new GPU architectures, reflects a strategy to unify an increasingly fragmented hardware landscape into a repeatable, production-ready foundation, according to Stu Miniman, senior director of market insights for hybrid platforms at Red Hat.
“If we’re building an AI factory, it’s a stack, and no one vendor has all the pieces,” he told theCUBE. “The hardware, the software, we’re working really closely, not just with the hardware vendors themselves but some of the key systems integrators that are helping the customers with that last mile of deployment and really getting productivity of production workloads … that’s our history at Red Hat.”
That multi-vendor momentum is reshaping the cloud-native ecosystem at the platform level, particularly as AI infrastructure becomes more distributed and integrated, according to Paul Nashawaty, principal analyst at theCUBE Research and host of the AppDevANGLE podcast.
“Across the ecosystem, companies such as Red Hat, IBM and Google, alongside the Cloud Native Computing Foundation, are advancing platforms that support not only microservices but also data pipelines, AI inference and large-scale distributed applications,” he said. “Key trends shaping the market include the rise of platform engineering, the integration of AI workloads with cloud-native infrastructure, and expanding developer and automation tooling, all of which will be central themes at KubeCon as the community continues to define the next phase of cloud-native innovation.”
Open-source forces are increasingly serving as the primary defense against vendor lock-in as the AI stack grows more complex. The Linux Foundation and Cloud Native Computing Foundation are currently standardizing the agentic layer of the stack to ensure interoperability between autonomous systems. At the heart of this effort is Google’s Agent2Agent Protocol, which it contributed to the open ecosystem. A2A aims to ensure that software agents from different vendors can communicate securely without requiring custom integrations, according to Jonathan Bryce, executive director of the CNCF and Linux Foundation.
“Agents are capturing people’s attention,” he told theCUBE. “Where we are right now is the very early stage of coming up with the right frameworks and protocols. When I look at the agent space, I see that these frameworks and projects are open-source.”
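Mechanically, A2A is a JSON-RPC-style protocol over HTTP: an agent advertises its identity and capabilities in a public "agent card," which peers fetch before exchanging structured messages, removing the need for custom integrations. A rough sketch of that discovery step follows, assuming a hypothetical agent endpoint; RPC method names vary across A2A spec versions, so only the well-known agent-card lookup is shown:

```python
# Sketch of A2A-style agent discovery: an agent publishes a JSON "agent
# card" describing its skills at a well-known URL, which a peer fetches
# before sending any messages. The host below is hypothetical.
import requests

AGENT_HOST = "https://agent.example.com"  # hypothetical peer agent

# A2A agents advertise their capabilities at this well-known path.
card = requests.get(f"{AGENT_HOST}/.well-known/agent.json", timeout=10).json()

print(card.get("name"), "-", card.get("description"))
for skill in card.get("skills", []):
    print("skill:", skill.get("id"), skill.get("name"))
```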
An increasingly urgent search for return on investment drives the push for standardization. While open-source communities harden the infrastructure layer across the cloud-native ecosystem, vendors such as Red Hat are focusing on the agentic layer: the autonomous software that bridges the gap between raw models and existing enterprise workflows. For Red Hat, this means building in the open to identify which technologies gain organic traction before committing to long-term enterprise support, according to Jennifer Vargas, senior principal marketing manager, and James Harmison, senior principal technical marketing manager, both at Red Hat.
“Everyone is trying to look for the ROI,” Vargas told theCUBE. “It’s really expensive to deploy AI at the scale that everyone wants. On the enterprise, [they] need to demonstrate ROI very quickly. I think agentic AIs probably are key to that because it will help enterprises match their workflows. Enterprises already understand automation. What they need is to take it to the next level.”
(* Disclosure: TheCUBE is a paid media partner for the KubeCon + CloudNativeCon EU event. Sponsors of theCUBE’s event coverage do not have editorial control over content on theCUBE or SiliconANGLE.)
Support our mission to keep content open and free by engaging with theCUBE community. Join theCUBE’s Alumni Trust Network, where technology leaders connect, share intelligence and create opportunities.
Founded by tech visionaries John Furrier and Dave Vellante, SiliconANGLE Media has built a dynamic ecosystem of industry-leading digital media brands that reach 15+ million elite tech professionals. Our new proprietary theCUBE AI Video Cloud is breaking ground in audience interaction, leveraging theCUBEai.com neural network to help technology companies make data-driven decisions and stay at the forefront of industry conversations.