UPDATED 12:02 EDT / JUNE 22 2026

AI computing for inference speed: HPE and Kamiwaza tackle GPU architecture challenges to deliver production-ready enterprise AI at scale.

HPE and Kamiwaza rethink AI infrastructure for the inference era

As AI factories evolve into “data centers of the future,” the infrastructure stack must also transform into a mix of CPU and GPU platforms that can deliver a full set of AI computing solutions.

Those solutions run the gamut from application hosting to intelligence generation and from static workflows to agentic orchestration systems. For key enterprise computing vendors such as Hewlett Packard Enterprise Co., the shift means customers increasingly expect production-ready enterprise AI with the governance, security and scale required to move efficiently from pilot to production.

The challenge confronting many organizations today is to get beyond the noise surrounding the IT stack and use AI infrastructure to improve inference speed, according to Robin Braun (pictured, left), vice president of AI business development, hybrid cloud, at HPE.

“People are trying to find the signal in the noise; they’re trying to use their data to improve their efficiency … to improve their business,” Braun said. “That’s where inference comes in — just trying to use that to get at the underlying understanding of your data is so important. That’s where I see so many customers are now really locking in and focusing on how they are solving some of the more mundane, messy data type issues.”

Braun spoke with theCUBE’s Rob Strechay for HPE’s “Unleash AI Momentum” series, during an exclusive interview on theCUBE, SiliconANGLE Media’s livestreaming studio. She was joined by Luke Norris (right), co-founder and chief executive officer of Kamiwaza Corp., and they discussed AI computing for inference speed and why architecture really matters. (* Disclosure below.)

A new approach to AI computing

In response to growing inference demands, HPE has worked with partners such as Kamiwaza and Nvidia Corp. to improve GPU performance and efficiency in the handling of larger and more complex AI workloads. This required a whole new approach to how systems are architected, according to Norris.

“The whole concept of architecting for inference is probably only two years old, and it’s got some pretty significant issues to maximize the most expensive part of the infrastructure, which is the GPU,” he told theCUBE. “You have to architect the environment so that when a user makes a request, the data and that request and the answers get loaded up into that GPU. When the user makes another request, it needs to be redirected back towards the same GPU that already has the cache. That’s extremely complex, and that’s extremely limiting because you’ve now locked that user’s session into the GPU. New architectures, new paradigms are needed.”
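The routing pattern Norris describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not HPE's or Kamiwaza's implementation: the `Gpu` and `AffinityRouter` names are hypothetical, and the "cache" is reduced to a set of session IDs. A session's first request lands on the GPU holding the fewest caches, and every later request from that session is sent back to the same GPU.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """Hypothetical stand-in for one GPU and the sessions it has cached."""
    name: str
    cached_sessions: set[str] = field(default_factory=set)


class AffinityRouter:
    """Send each session back to the GPU that already holds its cache."""

    def __init__(self, gpus: list[Gpu]) -> None:
        self.gpus = gpus
        self.session_to_gpu: dict[str, Gpu] = {}

    def route(self, session_id: str) -> Gpu:
        gpu = self.session_to_gpu.get(session_id)
        if gpu is None:
            # First request: pick the GPU currently holding the fewest caches.
            gpu = min(self.gpus, key=lambda g: len(g.cached_sessions))
            gpu.cached_sessions.add(session_id)
            self.session_to_gpu[session_id] = gpu
        # Later requests: the session stays pinned to the same GPU.
        return gpu


router = AffinityRouter([Gpu("gpu-0"), Gpu("gpu-1")])
first = router.route("user-a")   # new session, placed on a GPU
other = router.route("user-b")   # new session, placed on the emptier GPU
again = router.route("user-a")   # returning session, same GPU as before
```

The sketch also shows the limitation Norris points to: once `user-a` is pinned, the router cannot move that session to a less busy GPU without giving up the cache it has already built.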

Part of HPE’s solution for these challenging requirements is Unleash AI, a program to deliver production-ready enterprise AI on infrastructure that provides the necessary power, governance, security and scale. Unleash AI is focused on a curated set of vetted independent software vendor partners, such as Kamiwaza, that integrate industry-specific solutions with HPE’s offerings to enable enterprise-wide AI deployment.

“We are trying to deliver that outcome, that end-user value to our mutual customers, but the hardware and the architecture and the limitations of the data center typically prohibit our customers from moving forward,” Norris explained. “The HPE Unleash AI partnership really takes all of that away from a complexity standpoint, from an acceleration standpoint, and from a packaging standpoint, [and] allows us to continue to focus on what we want with our customers.”

This focus has allowed HPE to work more closely with its customers in the development of a clearer role for AI inference. The benefits include cost savings and a more environmentally sustainable platform, according to Braun.

“We’ve really changed the black box of inferencing — it’s now being able to truly explore how you architect your business for inferencing and make that investment wisely,” Braun said. “The real magic this can deliver is that you can dramatically increase the performance without having to dramatically invest in more servers and without having to invest in a larger power bill.”

One element of this solution involves an AI-ready, cloud-native data storage foundation to support intensive inference workloads. In May, HPE expanded its hybrid cloud and data platform portfolio with new private cloud and storage offerings designed for artificial intelligence workloads. This included the fourth generation of HPE Private Cloud, along with expanded file and object storage support in the HPE Alletra Storage MP X10000 platform.

“We’ve been very much on the cutting edge of bringing together the technology to drive the customer benefit and to be able to really start to look at and simplify the inference architecture,” Braun told theCUBE. “Are there ways we can do it faster, better and more economically just by improving how you store unstructured data? What we found is the answer is yes. [Customers] don’t have to massage all their messy data; they just need to put it on an Alletra X 10K, and we can do all the heavy lifting for them.”


(* Disclosure: TheCUBE is a paid media partner for HPE’s “Unleash AI Momentum” interview series. Neither HPE, the sponsor of theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
