The initial phase of the artificial intelligence gold rush was defined by “The Build.” Hyperscalers and model builders raced to secure every available Nvidia Corp. H100 GPU, constructing massive, centralized cathedrals of compute.
But as the industry descends from the peak of inflated expectations toward real-world utility, the conversation is shifting. AI is moving from the lab to the factory floor, the retail aisle and the telco edge.
At Nvidia’s annual GTC today in San Jose, Cisco Systems Inc. laid out its blueprint for this transition. Cisco’s message: For AI to work in the enterprise, it requires more than just raw GPU power. It needs a “Secure AI Factory” — a full-stack, validated architecture that treats AI not as a science project, but as a high-value production line.
For decades, Cisco’s role in the data center was to provide the “plumbing” — the reliable, invisible pipes that moved data from point A to point B. But in an analyst briefing, Kevin Wollenweber, Cisco’s senior vice president and general manager of data center and internet infrastructure, explained that Cisco’s role has fundamentally changed.
“The network has gone from just plumbing and infrastructure to really a critical component to what enables these models to learn and think,” he said. “Whether it’s connecting GPUs in a massive network efficiently to allow training workloads to run across tens of thousands of GPUs, or as we pivot more into inference, it’s about how we actually get low latency and high bandwidth access to storage.”
This shift is critical for Cisco and Nvidia customers alike. As workloads move from training (learning) to inference (doing), the bottleneck isn’t just the processor; it’s the ability to feed that processor data at the speed of thought. By integrating Nvidia’s Spectrum-X Ethernet platform with Cisco’s UCS compute and Nexus management, the two companies are attempting to standardize and simplify the AI stack. This is similar to the approach Cisco took with private cloud when it entered into the VCE joint venture with VMware and EMC, which created a turnkey, engineered solution for cloud.
Perhaps the most significant point mentioned in the briefing was the focus on “tokenomics.” In the enterprise, the value of AI is increasingly measured by the cost and speed of the output — the tokens. Wollenweber argued that the competitive moat for modern businesses will be built on how efficiently they can generate these tokens.
“The competitiveness for a lot of our customers is going to be around: how do we drive efficient token generation?” Wollenweber explained. “You’re going to have OpEx and engineering resources, but you have to look at actually how you can either leverage tokens efficiently or generate tokens efficiently to be able to grow in this ecosystem.”
This is why Cisco is pushing the “AI factory” concept. If an enterprise tries to “DIY” its AI infrastructure, it faces a “complexity tax” that drains token efficiency. By providing a validated “Secure AI Factory” stack, Cisco and Nvidia are offering a way to bypass the architectural heavy lifting, allowing customers to focus on the workloads that drive return on investment.
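The economics behind this argument can be made concrete with a back-of-envelope calculation. The sketch below is purely illustrative — the dollar figures and throughput numbers are assumptions for the sake of the math, not Cisco or Nvidia data — but it shows how a throughput loss from integration overhead (the “complexity tax”) translates directly into a higher cost per token.

```python
def cost_per_million_tokens(hourly_infra_cost: float, tokens_per_second: float) -> float:
    """Infrastructure cost to generate one million tokens.

    hourly_infra_cost: fully loaded cost of the cluster per hour (power,
    depreciation, staff), in dollars. tokens_per_second: sustained
    inference throughput of the whole deployment.
    """
    tokens_per_hour = tokens_per_second * 3600
    return hourly_infra_cost / tokens_per_hour * 1_000_000

# Hypothetical scenarios: the same hardware spend, but a validated stack
# sustains higher throughput than a DIY build that loses cycles to
# integration and tuning overhead. All numbers are invented for illustration.
validated = cost_per_million_tokens(hourly_infra_cost=98.0, tokens_per_second=50_000)
diy = cost_per_million_tokens(hourly_infra_cost=98.0, tokens_per_second=30_000)

print(f"validated stack: ${validated:.2f} per 1M tokens")
print(f"DIY stack:       ${diy:.2f} per 1M tokens")
```

Under these assumed numbers the DIY build pays roughly two-thirds more per token for the same hardware bill — which is the point Wollenweber is making: the hardware cost is fixed, so token efficiency is the competitive lever.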
The briefing also touched on a massive looming shift in AI architecture: the move from human-led prompts to agentic AI. We are moving into an era where autonomous agents communicate with other agents to execute complex workflows. Wollenweber shared how this is already changing his own work habits: “I think the agentic era that we’re in is going to drive a lot more of that [on-premises demand] than people probably realized. I go into a meeting, a closed-laptop type of meeting with my executive team, and I make sure I kick off six agents before I leave to go generate work and do work for me while I’m sitting in a meeting.”
This “agentic” workflow creates a massive security headache. How do you secure a conversation between two autonomous agents? Cisco’s answer is to fuse security into the fabric itself. By extending its Hybrid Mesh Firewall into the Nvidia BlueField Data Processing Unit ecosystem, Cisco is placing a security guard at every single GPU entrance.
The implication for customers is greatly simplified threat protection: Security is no longer a “bolt-on” that adds latency; it is an offloaded process that happens on the DPU, ensuring that the “security tax” doesn’t slow down token generation.
One of the most ambitious parts of Cisco’s GTC announcement is the expansion into the telco edge. Through a partnership with AT&T, Cisco is taking these AI factory concepts and pushing them into the mobility network.
The goal is to solve the “Mobile Edge Compute Hangover.” For years, telcos built edge compute sites that struggled to find a clear revenue stream. Wollenweber believes distributed inferencing — running AI tasks such as video analytics or real-time sensor processing close to the source — is the “killer app” the edge has been waiting for.
By bringing Nvidia RTX Pro GPUs into the Cisco UCS edge portfolio, the companies are enabling what Wollenweber calls “distributed intelligence.” This isn’t just about big H100 clusters; it’s about putting the right amount of compute in the right place to make a decision in milliseconds.
This could solve the age-old problem of how telcos can make more money. Historically, they spend more on each new technology, which often reduces costs but rarely generates new revenue. The telcos now have a great opportunity to offer both the network and the token generation, reversing the declining revenue curve that has plagued them for years.
Finally, the briefing addressed the elephant in the room: the staggering cost and rapid obsolescence of AI hardware. For a chief financial officer, spending tens of millions on GPUs is terrifying when the next generation is always six months away. Cisco is countering this fear with a focus on “Time to First Intelligence.” Through new service offerings, Cisco is aiming to get massive clusters up and running in days rather than months.
“We all know that this equipment has a very, very short half-life,” Wollenweber noted. “The longer it sits on the shelf, the less value you get out of it before next generations are released. The faster we can get things up and running and generating tokens, the better it is for customers.”
In one Asia-Pacific deployment, Cisco managed to get a 1,000-GPU cluster fully validated and running workloads in less than a week. This operational speed is the true value proposition of the Cisco-Nvidia partnership. It’s not just about the silicon; it’s about the “velocity of AI.”
For information technology leaders, the takeaway from Cisco’s GTC announcements is that the era of AI experimentation is closing, and the era of AI industrialization is beginning. Cisco is no longer content to be the plumber. By integrating Nvidia’s accelerated computing with its own security, networking and observability tools, including Splunk, Cisco is positioning itself as the operating system for the AI factory.
As Wollenweber concluded, the goal is simple: “Enable our customers to build everything end-to-end required: to manage, monitor and react to anything that we see.” For the enterprise, the “Secure AI Factory” isn’t just a new product — it’s the infrastructure required to capitalize on the token-driven economy.
Zeus Kerravala is a principal analyst at ZK Research, a division of Kerravala Consulting. He wrote this article for SiliconANGLE.