Real-time voice artificial intelligence startup Deepgram Inc. has reasons to think the new year could be a good one after raising $130 million in funding at a $1.3 billion valuation.
It also unveiled its latest acquisition, buying an AI-native voice startup called OfOne Inc. that’s focused on restaurants and the quick-service drive-through market.
The Series C round announced today was led by AVP and saw the return of all major existing investors, including Alkeon, In-Q-Tel, Madrona, Tiger, Wing and Y Combinator, plus new backers such as Alumni Ventures, Columbia University, Twilio Inc., SAP SE and Princeville Capital.
Deepgram won over those investors with its highly sophisticated speech recognition engine, which enables AI models to understand what humans are saying to them with impressive accuracy, no matter how accented someone’s voice might be. Speech recognition is key to enabling humanlike conversations with AI, and that means Deepgram is poised to have a significant impact on the industry as it gravitates to voice-based interactions.
Deepgram’s technology has been praised for the realism it adds to human-AI interactions. It teaches conversational models to wait for the most appropriate moment to break into a conversation, similar to how a person might wait until someone else pauses before interjecting. It’s also “interruptible,” meaning that humans can interject while it’s talking, and it will immediately pause and reconfigure its response based on what the person says.
In an interview with SiliconANGLE, Deepgram Chief Executive and co-founder Scott Stephenson said real-time voice has a different bar than most AI experiences: “The response has to be generated in 500 milliseconds or less.”
Stephenson argued the voice AI market shifted meaningfully in 2025 as customers crossed a belief threshold. Having once dismissed voice because of clunky implementations such as Siri and legacy IVR systems, they moved, in his words, from “skeptical to inevitable.”
The startup aims to position itself as the application programming interface platform for the communications economy, in the same way that Stripe Inc. became the API foundation of the payments industry and Twilio Inc. did for application communications. Deepgram has developed a number of APIs that it says will serve as the foundational infrastructure layer for voice AI.
For instance, Aura-2 is a specialized text-to-speech system focused on clarity, realism and ultra-low latency, while Nova-3 is a real-time, enterprise-grade speech-to-text model built for accuracy. Flux is the world’s first real-time conversational speech recognition model developed specifically for chatbots. Deepgram has also created the Voice Agent API, which serves as a unified foundation for cost-effective and enterprise-ready voice agents.
Its APIs are currently used by more than 1,300 enterprise clients across a wide selection of industries. Each of its models can be customized to support nuanced, domain-specific terminology and deployed via the cloud or on-premises servers.
The acquisition of OfOne is aimed at helping boost Deepgram’s presence in the customer service industry. The startup’s team and technology will anchor the new Deepgram for Restaurants service, which is a specialized voice model that’s trained to take orders from human customers and support restaurant staff with real-time AI assistance. “The impact of AI for restaurants and drive-throughs is enormous, and together we can deliver on that opportunity with the accuracy, speed and reliability operators need at international scale,” said OfOne CEO Will Edwards, who will take on the role of general manager of Deepgram for Restaurants.
According to Stephenson, retail represents the first mass touchpoint for everyday voice AI, and the environment is the real test: “Retail is extremely important. For many people, their first interaction with voice AI is going to be in a retail setting.”
He argued that the complexity of drive-thru acoustics is exactly why Deepgram wanted to go deeper than its usual API-first approach. “This is [an] extremely challenging acoustic environment,” he said.
In recent years, multiple retail voice AI pilot programs have been scaled back or abandoned, including Taco Bell pulling back its drive-thru program and McDonald’s putting the brakes on its AI voice options. Stephenson said, “the gap is getting both the model layer and infrastructure layer right, and it’s very hard to do that unless you’re sort of boots on the ground… figuring it out.”
In addition to the funding and acquisition, Deepgram is also opening a new Voice AI Collaboration Hub in San Francisco to give the voice AI community a physical space to interact. The hub will be designed for hands-on working sessions, live demonstrations, executive briefings, community meetups and developer hackathons.
Stephenson framed voice as the next major computing surface: “It’s the new interface,” he said, adding, “I think it’s a trillion-dollar market easily.”
With reporting from Kyt Dotson