AI
Red Hat Inc. wants to be the one layer everything else builds around — the durable AI ecosystem of the open source era.
The company has staked its next decade of growth on the conviction that open source will underpin enterprise AI the same way Linux and Kubernetes defined the cloud era. With the launch of Red Hat AI 3.4, it is positioning itself as the platform of record for inference at scale, agentic deployment and token economics — the building blocks of any durable AI ecosystem, according to Brian Stevens (pictured, right), senior vice president and AI chief technology officer at Red Hat. Central to that pitch is the claim that private, on-premises inference can now match the unit economics of the big cloud providers.
“[Model providers] compete on two attributes: the value of the token — who has the best model — and the economics of the cost of that token,” Stevens said. “That’s what we’ve changed. We’re putting in the hands of our clients and an open source community the most efficient token economics possible.”
Stevens and Joe Fernandes (left), vice president and general manager of the AI business unit at Red Hat, spoke with theCUBE’s Rob Strechay and Rebecca Knight at Red Hat Summit 2026, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed inference economics, agent governance and the open source path to a durable AI ecosystem. (* Disclosure below.)
As enterprises move from cloud-based experimentation toward private, on-premises AI production, the pressure to manage both token costs and agent behavior is intensifying. Red Hat AI 3.4 adds a model-as-a-service layer on top of its vLLM inference engine and llm-d distributed inference framework, giving platform teams governed access to model endpoints without requiring them to manage the infrastructure underneath, Fernandes explained.
“As a service provider, your end users don’t care about the infrastructure underneath. They just want quick access to model endpoints and [to] secure the tokens to deploy those as well as being able to meter that consumption,” Fernandes said. “But now what’s going to be the biggest consumer of inference? It’s going to be these agents.”
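The metering side of that model-as-a-service pattern can be sketched in a few lines: a gateway issues per-team access tokens and records prompt and completion token counts on every request, so the platform team can report or charge back consumption. This is a minimal illustrative sketch, not Red Hat's implementation; the class and field names are hypothetical.

```python
from collections import defaultdict


class TokenMeter:
    """Hypothetical per-team usage ledger for a model-as-a-service gateway.

    Records prompt and completion tokens per API key so a platform team
    can meter inference consumption without exposing the infrastructure.
    """

    def __init__(self):
        self.usage = defaultdict(lambda: {"prompt": 0, "completion": 0})

    def record(self, api_key: str, prompt_tokens: int, completion_tokens: int) -> None:
        # Accumulate token counts reported by the inference server per request.
        entry = self.usage[api_key]
        entry["prompt"] += prompt_tokens
        entry["completion"] += completion_tokens

    def total_tokens(self, api_key: str) -> int:
        entry = self.usage[api_key]
        return entry["prompt"] + entry["completion"]


meter = TokenMeter()
meter.record("team-a", prompt_tokens=1200, completion_tokens=350)
meter.record("team-a", prompt_tokens=800, completion_tokens=150)
print(meter.total_tokens("team-a"))  # 2500
```

In practice these counts would come from the usage fields that OpenAI-compatible inference servers such as vLLM return with each response; the ledger here just shows where the accounting hooks in.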
Agent autonomy — the very property that makes agents powerful — is also what makes them risky. Red Hat AI 3.4 addresses this through a new AgentOps capability that bundles tracing, observability, identity management and lifecycle controls into a single operational layer, Fernandes noted. The goal is to give enterprises the guardrails that allow agents to run without requiring a leap of faith on governance.
“The key thing that distinguishes an agent is autonomy,” Fernandes said. “You want to give that agent an identity and make sure that it’s authorized to do the things you want it to do, but not authorized to do the things it [shouldn’t]. You have to figure out how to pass the credentials, what it should be able to access in your network, on your file system — and then ultimately be able to trace what it did to be able to debug.”
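The pattern Fernandes describes — give each agent an identity, authorize only specific actions, and trace everything it attempts — can be illustrated with a short sketch. This is a hypothetical toy, not Red Hat's AgentOps API; the names and action strings are invented for illustration.

```python
class AgentIdentity:
    """Hypothetical scoped credential for an autonomous agent.

    Every attempted action is checked against an allow-list and appended
    to an audit trace, so operators can later debug what the agent did.
    """

    def __init__(self, name: str, allowed_actions: set[str]):
        self.name = name
        self.allowed_actions = allowed_actions
        self.trace = []  # audit log of every attempted action

    def attempt(self, action: str, target: str) -> bool:
        allowed = action in self.allowed_actions
        self.trace.append((action, target, "allowed" if allowed else "denied"))
        return allowed


agent = AgentIdentity("invoice-bot", allowed_actions={"read:invoices", "write:reports"})
assert agent.attempt("read:invoices", "/finance/q3")        # authorized
assert not agent.attempt("delete:invoices", "/finance/q3")  # denied, but traced
print(agent.trace)
```

The point of the sketch is the pairing: authorization and tracing live on the same object, so a denied action is never silent — it shows up in the same log an operator would use to debug the agent's behavior.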
Looking ahead three to five years, the endgame is not a single dominant platform but a rich, open ecosystem — the same pattern that made Linux the default enterprise operating system, Stevens explained. With AI still fragmented across hardware, model and cloud choices, the opportunity lies in becoming the one durable AI ecosystem layer that everything else can build around.
“I think that’s where we are right now in AI. We’re in this siloed world,” Stevens said. “If we’re successful with this, we’ve just opened up this whole world by being that one durable thing that allows that ecosystem to build around it.”
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of Red Hat Summit 2026:
(* Disclosure: Red Hat sponsored this segment of theCUBE. Neither Red Hat nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)