As artificial intelligence scales, AI infrastructure is becoming the deciding factor in whether enterprise AI delivers real value or stalls.
AI is shifting from experimentation to unified systems where infrastructure, data and software work together. Partnerships such as the one between Dell Technologies Inc. and Nvidia Corp. are aligning these layers to simplify deployment and support scalable, long-term platforms, according to Varun Chhabra (pictured, left), senior vice president of ISG and telecom marketing at Dell Technologies.
“The word that I think is top of mind for everybody is agentic,” Chhabra said. “With OpenClaw and the announcements Nvidia made around NemoClaw, it does feel like we’re at that ChatGPT moment for agentic. Everybody’s asking us about how to adopt agentic faster than ever before.”
Chhabra and Anne Hecht (right), senior director of product marketing, enterprise, at Nvidia, spoke with theCUBE’s John Furrier for theCUBE + NYSE Wired: AI Factories – Data Centers of the Future interview series, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how enterprises are rethinking AI systems, costs and deployment as infrastructure becomes central to scaling real-world AI.
Enterprises are increasingly recognizing that speed alone is not the constraint; the real challenge is managing complexity across rapidly evolving technologies. As models, tools and architectures change, organizations are prioritizing systems that can absorb that change without constant redesign. That means building around flexibility, scalability and long-term viability rather than short-term performance gains, according to Hecht.
“It’s changing so fast, which I think is one of the challenges that enterprises also have,” she said. “Last year, we were talking about reasoning models because DeepSeek had just dropped. Now, we’re talking about agents that evolve and create other agents because OpenClaw dropped. With every one of these waves, we’re working with Dell and the ecosystem, evaluating these technologies and then bringing it to an enterprise in a way that they can leverage these advancements, but safely and securely, so they can always get the best advantages from AI and use these technologies that are coming out so quickly.”
This shift is also redefining where AI workloads live. The assumption that everything will run in the public cloud is giving way to more distributed approaches that span on-premises environments, edge locations and developer workstations. The goal is not centralization, but control — particularly around cost, governance and performance.
“Technologies like confidential computing, that’s where you see the ecosystem coming together where frontier models like Google’s Gemini model can now run on-prem on a Dell server because they’re building a confidential computing stack,” Hecht added. “Enterprises are going to have their AI workloads running across a very distributed architecture, and as an ecosystem, we need to build the systems to make that possible.”
As AI moves deeper into operations, token consumption is emerging as a new economic pressure point. Early use cases such as coding assistance are already exposing how quickly demand can scale, forcing enterprises to rethink whether consumption-based models are sustainable long term. Infrastructure decisions are now directly tied to how efficiently organizations can generate and manage that demand, Hecht explained.
“You can set an agent to do a task and then come back in the morning and it’s done it,” she said. “It’s done research. It’s a report. It can take actions on your part with your approval, and it’s generated and burned through a bunch of tokens to do that. You need systems that actually can support that level of inference, like almost always-on inference at scale.”
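To make that scale concrete, here is a rough back-of-the-envelope sketch in Python of how an overnight agent run burns through tokens. Every figure in it — steps per task, tokens per step, the blended price — is an illustrative assumption, not vendor pricing.

    # Back-of-the-envelope agentic token burn, illustrating the
    # "always-on inference" pressure Hecht describes. All figures
    # are illustrative assumptions, not vendor pricing.
    STEPS_PER_TASK = 40          # assumed reasoning/tool-call steps per overnight task
    TOKENS_PER_STEP = 6_000      # assumed prompt + completion tokens per step
    PRICE_PER_1M_TOKENS = 5.00   # assumed blended dollar price per million tokens

    def overnight_cost(tasks: int) -> float:
        """Estimated dollar cost of running a number of agent tasks overnight."""
        total_tokens = tasks * STEPS_PER_TASK * TOKENS_PER_STEP
        return total_tokens / 1_000_000 * PRICE_PER_1M_TOKENS

    for tasks in (1, 100, 10_000):
        print(f"{tasks:>6} tasks/night -> ~${overnight_cost(tasks):,.2f}/night")

At these assumed rates, a single task costs about a dollar, but a fleet of 10,000 nightly tasks runs to five figures per night — the kind of demand curve that pushes infrastructure decisions to the fore.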
That dynamic is reinforcing the case for owning more of the infrastructure stack, especially for organizations running high-volume inference workloads. Instead of relying entirely on external providers, many are starting to evaluate capital investment as a way to stabilize costs and ensure availability.
“I think enterprises have to look at the variable costs of their AI that they have right now,” she said. “[Enterprises] can manage the availability and prioritize those workloads, which sometimes is harder depending on your contract relationship and where you’re renting your AI factory if you’re using a third party and not owning your own infrastructure.”
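In the same spirit, a minimal rent-versus-own sketch: assuming a metered third-party token price and an amortized monthly cost for owned infrastructure — both numbers hypothetical — the break-even volume falls out directly.

    # Hypothetical rent-vs-own break-even for high-volume inference.
    # Both inputs are assumptions for illustration only.
    API_PRICE_PER_1M = 5.00         # assumed dollars per million tokens, third party
    OWNED_MONTHLY_COST = 250_000.0  # assumed amortized hardware + power + ops per month

    def breakeven_tokens_per_month() -> float:
        """Monthly token volume above which owned infrastructure is cheaper."""
        return OWNED_MONTHLY_COST / API_PRICE_PER_1M * 1_000_000

    print(f"Break-even: ~{breakeven_tokens_per_month():,.0f} tokens/month")
    # -> ~50,000,000,000 tokens/month at these assumed numbers; the real
    #    decision also hinges on utilization, model quality and contract terms.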
At the same time, the rise of agentic systems is raising new questions about governance and control. As AI systems become more autonomous, enterprises are being forced to define boundaries around how much access and decision-making authority those systems should have. That tension between productivity and oversight is quickly becoming a central design consideration, Chhabra emphasized.
“We’re doing a lot of work with the Dell Automation Platform,” he said. “[We’re] working closely with Nvidia to actually create blueprints that can help accelerate the deployment of the entire AI stack, whether it’s infrastructure, the software that Nvidia is delivering, models, as well as capabilities on top of that, all of those things.”
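One way to picture that governance boundary is an approval gate: low-risk agent actions execute autonomously, while anything above a risk threshold is held for human sign-off. The sketch below is a hypothetical illustration of the pattern, not a feature of the Dell Automation Platform or any Nvidia product.

    # Minimal sketch of a human-in-the-loop governance boundary for
    # agentic systems. Risk tiers, threshold and actions are hypothetical.
    from dataclasses import dataclass

    APPROVAL_THRESHOLD = 2  # risk tier at or above which a human must sign off

    @dataclass
    class AgentAction:
        name: str
        risk_tier: int  # 0 = read-only ... 3 = irreversible/external

    def execute(action: AgentAction, human_approved: bool = False) -> str:
        """Run low-risk actions autonomously; gate high-risk ones on approval."""
        if action.risk_tier >= APPROVAL_THRESHOLD and not human_approved:
            return f"HELD for approval: {action.name}"
        return f"EXECUTED: {action.name}"

    print(execute(AgentAction("summarize internal docs", risk_tier=0)))
    print(execute(AgentAction("email a customer on my behalf", risk_tier=2)))
    print(execute(AgentAction("email a customer on my behalf", risk_tier=2), human_approved=True))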
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of theCUBE + NYSE Wired: AI Factories – Data Centers of the Future interview series: