UPDATED 20:15 EST / MAY 19 2025

Stephen Watt, vice president and distinguished engineer, Office of the CTO, at Red Hat Inc., talks with theCUBE about how the company is evolving its AI infrastructure – Red Hat Summit 2025.

Red Hat builds real-world AI infrastructure from upstream to inference

Enterprises looking to push AI infrastructure from lab experiments to production-ready solutions face a familiar set of bottlenecks: data access, system compatibility and performance at scale.

Red Hat Inc. is tackling this challenge head-on through collaborative efforts with hardware and chip partners to optimize artificial intelligence and memory technologies in real-world enterprise environments.



“I think it all starts with large language models,” said Stephen Watt (pictured), vice president and distinguished engineer, Office of the CTO, at Red Hat. “I think we had this sort of era of predictive AI, and now with generative AI, I think there’s a whole lot of … new applications … and … interesting new use cases in three different areas: training, fine-tuning and inference. Last year, we announced the InstructLab, which was democratizing fine-tuning models. With our Neural Magic acquisition, we’ve got a lot more into inference, and that’s about serving models and creating value for applications in the enterprise.”

Watt spoke with theCUBE Research’s Rob Strechay and theCUBE’s host Rebecca Knight at Red Hat Summit, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed Red Hat’s evolving AI infrastructure strategy and open-source innovation. (* Disclosure below.)

Scaling AI infrastructure with context and control

Red Hat is expanding its AI strategy by integrating open-source tools that enhance model context and task specificity. By combining retrieval-augmented generation with fine-tuning techniques and high-performance inference frameworks such as vLLM, the company aims to ground large language models in both data and real-world operations, according to Watt.

“I would say it’s all about context,” he said. “There’s retrieval augmentation, RAG, and then RAFT, which is applying RAG with fine-tuning. We’ve got an emerging story around that with the upstream Llama Stack project, where we’ve just done a lot of work upstream to enable all of that.”

As AI scales across edge, data center and cloud environments, Red Hat is leaning into its distributed systems pedigree to tame inference sprawl. The company is prioritizing engineering strategies that make institutional knowledge more accessible to AI models and exploring new architectural patterns for seamless model integration, according to Watt.

“I think there’s two specific areas going back into context again,” he said. “One is vector databases. You take those [extract, transform, load] pipelines and you chunk all your documents back into those vector databases. Once you do that, you’re able to basically take what you institutionally know within your organization [and] add it into something that’s accessible from the large language model. The second thing that’s really interesting is the evolution of service-oriented architectures … to hook those into the large language model — I think those two things together are really exciting.”
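The vector-database workflow Watt describes — chunk documents from an ETL pipeline, embed the chunks, then retrieve the closest ones as context for the model — can be sketched in plain Python. This is an illustrative toy only: the hashing "embedder" and in-memory list stand in for a real embedding model and vector database, and the document snippets are invented.

```python
import math
import re

def chunk(text, size=40):
    # Split a document into overlapping word chunks (a toy stand-in
    # for the chunking stage of an ETL pipeline).
    words = text.split()
    step = max(1, size // 2)  # 50% overlap between consecutive chunks
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def embed(chunk_text, dims=64):
    # Toy bag-of-words hashing embedder; a real pipeline would call a
    # trained embedding model instead.
    vec = [0.0] * dims
    for token in re.findall(r"\w+", chunk_text.lower()):
        vec[hash(token) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit-normalize for cosine similarity

def retrieve(query, index, k=2):
    # Return the k chunks most similar to the query; vectors are
    # unit-normalized, so a dot product equals cosine similarity.
    qv = embed(query)
    scored = sorted(index,
                    key=lambda item: -sum(a * b for a, b in zip(qv, item[1])))
    return [text for text, _ in scored[:k]]

# Hypothetical "institutional knowledge" documents.
docs = [
    "Our on-call rotation hands off every Monday at 09:00 UTC.",
    "Invoices are archived in the finance share under the 2024 folder.",
    "Production deploys require two approvals and a green CI run.",
]

# Index: every chunk of every document, paired with its embedding.
index = [(c, embed(c)) for doc in docs for c in chunk(doc)]

# Retrieval grounds the model's prompt in the organization's own data.
context = retrieve("When does the on-call rotation hand off?", index)
prompt = "Answer using this context:\n" + "\n".join(context)
```

The same shape applies at scale: swap the hashing function for a real embedding model and the list for a vector database, and the retrieved chunks become the grounding context Watt refers to.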

Open-source leadership remains foundational to Red Hat’s approach to AI infrastructure. As innovation accelerates across model development and AI deployment, the company continues to invest in upstream communities that support transparency, trust and long-term viability for applied AI solutions, according to Watt.

“The rate of innovation, new projects, the creative destruction that’s happening … our role is to basically create a steady pipeline that businesses can use to consume software where it’s stabilized and safe,” Watt added.

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of Red Hat Summit:

(* Disclosure: Red Hat Inc. sponsored this segment of theCUBE. Neither Red Hat nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
