UPDATED 15:41 EST / JANUARY 08 2026

At CES 2026, Nvidia's Vera Rubin system redefines AI with scalable context storage, faster reasoning and next-gen networking innovations.

Vast Data weighs in on Nvidia’s Vera Rubin and the future of AI context storage

Nvidia Corp.’s CES 2026 keynote delivered more than a hardware reveal — it signaled a fundamental rethinking of how AI systems are built, scaled and fed with data. At the center of the conversation was Vera Rubin, Nvidia’s next-generation AI system, and a critical but often overlooked component: context storage.

Vera Rubin isn’t a single GPU; it’s an entirely re-architected system composed of multiple chips working in concert, according to John Mao (pictured, left), vice president of global technology alliances at Vast Data Inc.

“I think Jensen [Huang] did a great job explaining how they had to reinvent the entire system,” Mao said. “It’s not just a new GPU, but six different chips. I think a lot of that also spills into the rest of the stack. The rest of the stack in this context is reinventing how you do things like KV cache and how do we evolve when models get bigger, when longer reasoning starts to happen, when more turns are happening on inferencing, which means that different paradigms require for storing KV cache.”

Mao spoke with theCUBE’s Rob Strechay (right) at CES 2026, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how Nvidia and Vast are redefining the way AI handles memory, reasoning and scale in the era of long-context, multi-turn inference.

Vera Rubin overcomes the limits of local KV caching

Traditionally, KV cache has been tightly bound to the GPU, or extended modestly using local NVMe SSDs inside GPU servers. While this approach helps, it hits scalability limits quickly. Long-running, multi-turn inference and reasoning-heavy workloads require far more capacity than any single server can provide, according to Mao.

“KV cache used to be very local to the GPU and the high bandwidth memory,” he said. “But, obviously, that’s not good enough if you’re trying to store very long conversations. If you’re trying to grow that context over time, you need a different method. A lot of that development that Vast has been doing with Nvidia is in how do we build and re-architect that part of the stack for these new systems that are going into deployment?”
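To put rough numbers on why long conversations outgrow GPU memory, the standard back-of-the-envelope estimate for KV cache size is two tensors (keys and values) per layer, per KV head, per token. The sketch below applies that formula; the model dimensions are illustrative assumptions chosen for the arithmetic, not figures from Nvidia, Vast or the interview.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens, bytes_per_value=2):
    """Estimate KV cache size for one sequence.

    The factor of 2 accounts for storing one key tensor and one value
    tensor per layer. bytes_per_value=2 assumes 16-bit precision.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


# Hypothetical model shape, used only for illustration.
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128

per_token = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, num_tokens=1)
long_context = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, num_tokens=128 * 1024)

print(f"per token: {per_token / 1024:.0f} KiB")
print(f"128K-token context: {long_context / 2**30:.0f} GiB")
```

Under these assumed dimensions, a single 128K-token conversation needs about 40 GiB of cache, so a handful of concurrent long sessions can exceed the memory of one GPU server.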

Nvidia’s advances in networking — particularly Spectrum-X and the new BlueField-4 DPU — open the door to a radically different approach. Instead of confining context to local storage, KV cache can now “spill” across the network into a highly scalable, shared pool of NVMe storage, Mao added.

“Yes, local NVMe is good, but imagine a world where we had an infinitely scalable pool of NVMe across a very fast fabric, being able to do that. And that’s part of the announcement today,” he said.
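The "spill" idea can be pictured as a two-tier cache: a small, fast local tier that evicts its least recently used sessions into a larger shared pool, and pulls them back when a conversation resumes. The following is a minimal sketch of that general pattern only. The class, its method names and the in-memory dictionary standing in for networked NVMe are all hypothetical, and nothing here reflects how Vast or Nvidia actually implement it.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier cache: a bounded local tier plus an unbounded shared pool."""

    def __init__(self, local_capacity):
        self.local_capacity = local_capacity  # max sessions held locally
        self.local = OrderedDict()            # stand-in for GPU memory / local NVMe
        self.shared_pool = {}                 # stand-in for a networked NVMe pool

    def put(self, session_id, kv_blocks):
        # A session lives in exactly one tier at a time.
        self.shared_pool.pop(session_id, None)
        self.local[session_id] = kv_blocks
        self.local.move_to_end(session_id)    # mark as most recently used
        self._spill()

    def _spill(self):
        # Evict least recently used sessions to the shared pool when over capacity.
        while len(self.local) > self.local_capacity:
            victim, blocks = self.local.popitem(last=False)
            self.shared_pool[victim] = blocks

    def get(self, session_id):
        if session_id in self.local:
            self.local.move_to_end(session_id)
            return self.local[session_id]
        if session_id in self.shared_pool:
            # Promote back to the local tier instead of recomputing the context.
            blocks = self.shared_pool.pop(session_id)
            self.put(session_id, blocks)
            return blocks
        return None  # cache miss: the context would have to be recomputed


cache = TieredKVCache(local_capacity=2)
for session in ("a", "b", "c"):
    cache.put(session, kv_blocks=[f"{session}-block"])

resumed = cache.get("a")  # "a" was spilled earlier and is promoted back
```

In this toy version a spilled session is never lost, only slower to reach, which is the trade the shared pool makes in exchange for capacity beyond a single server.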


Photo: SiliconANGLE
