

As generative models grow more complex and agentic systems push computational limits, a robust AI infrastructure isn’t just important — it’s essential. Today’s enterprises need hardware that can keep pace with surging token counts, growing user concurrency and ever-tightening power constraints.
Enter SambaNova Systems Inc. and its latest leap forward. The company's SN40L answers that need with a reconfigurable dataflow architecture tailored for AI workloads. Designed to accelerate training and inference while slashing energy consumption, it represents a new approach to scalable, power-efficient infrastructure, one purpose-built for the hybrid AI era, according to Rodrigo Liang (pictured), chief executive officer at SambaNova.
SambaNova's Rodrigo Liang talks with theCUBE about the impact of the SN40L on AI infrastructure.
“Our Generation Four is an RDU, a reconfigurable dataflow unit,” he said. “We decided to create silicon that matches the way that these neural nets want to run. The neural nets are dataflow by construct, so this was an idea that was really researched by my co-founders at Stanford. We've now commercialized it. These models are getting bigger; we're able to run this 10 times faster than an Nvidia chip at one-tenth the power. I do think that as AI grows, that's going to be the constraint for how you scale AI.”
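To make the dataflow idea concrete, here is a toy sketch — to be clear, not SambaNova's compiler or silicon — showing how a neural net is naturally a graph of operators whose edges carry tensors. A dataflow machine schedules work by data availability rather than by an instruction stream, which is what "dataflow by construct" means in practice.

```python
# Toy illustration only: a neural-net layer expressed as a dataflow graph.
# This is not SambaNova's software stack; it sketches the general idea that
# a model is a DAG of operators whose edges carry tensors, so execution can
# be driven by data readiness rather than by an instruction pointer.
from graphlib import TopologicalSorter
import numpy as np

# Each node: name -> (operation, list of input node names)
graph = {
    "x":      (lambda: np.ones((4, 8)), []),        # input activation
    "w":      (lambda: np.full((8, 2), 0.5), []),   # weights
    "matmul": (lambda x, w: x @ w, ["x", "w"]),
    "relu":   (lambda a: np.maximum(a, 0.0), ["matmul"]),
}

# Fire each node once all of its inputs are ready (topological order);
# a dataflow machine does this spatially, streaming results between units.
order = TopologicalSorter({k: set(v[1]) for k, v in graph.items()}).static_order()
values = {}
for name in order:
    op, inputs = graph[name]
    values[name] = op(*(values[i] for i in inputs))

print(values["relu"].shape)  # (4, 2)
```

The appeal of mapping such a graph spatially, consistent with the data-movement point Liang makes below, is that intermediate results can stream directly between compute units instead of round-tripping through memory.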
Liang spoke with theCUBE’s John Furrier and Dave Vellante at theCUBE + NYSE Wired: Robotics & AI Infrastructure Leaders 2025 event, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how SN40L provides a robust AI infrastructure.
SambaNova Cloud delivers a robust AI infrastructure through a full-stack, purpose-built solution that seamlessly integrates hardware, software and AI models to optimize performance for enterprise-scale workloads. This comprehensive approach enables efficient deployment of AI agents, according to Liang.
“The first thing most people will do is they’ll come to the SambaNova Cloud,” he said. “You go on there and you can see all the open-source models. You’ve trained your private models. Now you have them. Next thing they do is they use the SambaNova Cloud and start experimenting. Let me string these agents together in this way because I want to make sure that this workflow actually does what I want it to do, because it’s calling various different models at any given prompt.”
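As a sketch of that experimentation loop, the snippet below strings two model calls into a simple workflow, so a single user request ends up calling two different models. It assumes SambaNova Cloud exposes an OpenAI-compatible chat endpoint; the base URL, environment variable and model names are assumptions to verify against your own account.

```python
# A minimal two-step "agent" chain, assuming an OpenAI-compatible
# SambaNova Cloud endpoint. Base URL, env-var name and model names
# are assumptions; check your account for what is actually available.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["SAMBANOVA_API_KEY"],   # hypothetical env-var name
    base_url="https://api.sambanova.ai/v1",    # assumed endpoint
)

def ask(model: str, system: str, user: str) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
    )
    return resp.choices[0].message.content

# Step 1: a small, fast model drafts an outline.
outline = ask("Meta-Llama-3.1-8B-Instruct", "You outline documents tersely.",
              "Outline a data-center power budget memo.")

# Step 2: a larger model expands it, so one prompt ends up calling
# two different models, as Liang describes.
memo = ask("Meta-Llama-3.1-405B-Instruct", "You write concise memos.",
           f"Expand this outline into a one-page memo:\n{outline}")
print(memo)
```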
SambaNova’s Gen 4 RDU enhances hybrid AI solutions by combining hardware flexibility, high performance and smooth software integration. The SN40L, built on a dataflow-based architecture, can dynamically reconfigure to match workload demands — making it highly adaptable for hybrid AI environments, according to Liang.
“What's public is public, but for what's private, you have three problems,” he said. “One, you need a large model, and SambaNova, we're number one on the very big models. Second, their data centers don't have enough power. If you look at on-prem, they don't have a gigawatt data center. Even the largest banks don't, and so with SambaNova deploying a rack at 10 kilowatts, compared to 140 kilowatts for Nvidia, 10 kilowatts goes into most of your existing data centers. The third thing is that you then have this ability to do multi-tenancy, which allows a large enterprise, with a very small footprint, to host hundreds if not thousands of users concurrently.”
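The power argument reduces to simple arithmetic. Using only the per-rack figures from the quote, 10 kilowatts for a SambaNova rack versus 140 kilowatts for an Nvidia rack, the back-of-the-envelope sketch below shows how many racks fit under a fixed facility budget; the 1-megawatt envelope is an assumed example, not a number from the interview.

```python
# Back-of-the-envelope rack math using the per-rack figures from the quote.
# The 1 MW facility budget is an illustrative assumption.
FACILITY_BUDGET_KW = 1_000   # assumed existing data-center envelope
SAMBANOVA_RACK_KW = 10       # per Liang
NVIDIA_RACK_KW = 140         # per Liang

sn_racks = FACILITY_BUDGET_KW // SAMBANOVA_RACK_KW   # 100 racks
nv_racks = FACILITY_BUDGET_KW // NVIDIA_RACK_KW      # 7 racks

print(f"SambaNova racks under budget: {sn_racks}")
print(f"Nvidia racks under budget:    {nv_racks}")
print(f"Ratio: {sn_racks / nv_racks:.0f}x more racks in the same envelope")
```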
The SN40L enhances energy savings primarily through its architecture-level efficiency and workload-optimized design. By reducing redundant computation, minimizing data movement and tightly aligning compute resources with AI workloads, it significantly cuts down energy consumption — all while sustaining high performance, Liang added.
“When Chat[GPT] started, you put in a prompt, and it responds … about 3,000 tokens per [prompt] on average,” he said. “Once you get to reasoning, it's not 10x more, it's 100x more. People are generating 20- to 30-page documents, and what's scarier is agentic systems: another 10 to 100x more per prompt. We've created this new architecture to handle a very large number of users concurrently at very, very high throughput, because as these token counts increase, you need to do it really, really efficiently and then drive the power way, way down, because ultimately we're going to see that the data centers, and the power at data centers, are going to be the constraint.”
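Taking the quote's figures at face value, roughly 3,000 tokens per chat prompt, about 100x that for reasoning, and another 10x to 100x on top for agentic systems, the sketch below computes the implied per-prompt volumes and the aggregate throughput needed at a given concurrency; the user count and latency target are illustrative assumptions.

```python
# Implied token volumes from the figures in the quote; the concurrency
# and latency targets below are illustrative assumptions.
CHAT_TOKENS = 3_000    # per prompt, per Liang
REASONING_MULT = 100   # "it's not 10x more, it's 100x more"
AGENTIC_MULT_LOW, AGENTIC_MULT_HIGH = 10, 100

reasoning_tokens = CHAT_TOKENS * REASONING_MULT        # 300,000
agentic_low = reasoning_tokens * AGENTIC_MULT_LOW      # 3 million
agentic_high = reasoning_tokens * AGENTIC_MULT_HIGH    # 30 million

# Assumed serving target: 1,000 concurrent users, 60-second responses.
USERS, SECONDS = 1_000, 60
for label, per_prompt in [("chat", CHAT_TOKENS),
                          ("reasoning", reasoning_tokens),
                          ("agentic (low)", agentic_low),
                          ("agentic (high)", agentic_high)]:
    tps = USERS * per_prompt / SECONDS
    print(f"{label:15s} {per_prompt:>12,} tokens/prompt -> "
          f"{tps:>12,.0f} tokens/s aggregate")
```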
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of theCUBE + NYSE Wired: Robotics & AI Infrastructure Leaders 2025 event: