Audio artificial intelligence startup Gradium is launching today after closing on an impressive $70 million seed funding round, just three months after it was founded.
The startup is backed by investors that include FirstMark Capital and Eurazeo, which led the funding round, as well as DST Global Partners, Korelya Capital and Amplify Partners, and high-profile angels such as former Google LLC Chief Executive Eric Schmidt.
Gradium’s mission is to commercialize audio language models, or ALMs, which are specialized AI systems designed to process, understand and generate natural language using paired audio-text data. Natural language is used as a “supervision signal,” allowing ALMs to perform tasks such as audio classification and speech synthesis more effectively than general-purpose large language models.
The startup says ALMs are the “audio-native counterpart” to LLMs and are meant to support more natural and expressive voice interactions with dramatically lower latency, making conversations with AI feel more realistic.
The concept was first developed by Gradium’s founders during their time at Kyutai, a nonprofit AI research lab. ALMs are trained on datasets that pair audio with descriptive text, enabling them to learn the complex relationships between sound and language. The natural language supervision technique replaces traditional labeling, using natural speech as a guiding signal to teach the models how to understand and say specific words.
Co-founder and Chief Executive Neil Zeghidour explained that his company wants to help ALMs unlock the true potential of “voice AI,” which he says still relies on subpar systems. “Existing systems are brittle, costly and unable to deliver truly natural interactions,” he said. “Our goal is to make voice the primary interface between humans and machines.”
According to Zeghidour, ALMs can outperform LLMs in any kind of voice AI task, including areas such as speech recognition, where spoken language is transformed into written text, as well as audio generation, such as creating original speech, and audio classification, which refers to identifying and categorizing different audio signals. Ultimately, Gradium wants to transform the capabilities of AI assistants and agents, making conversational interactions between them and humans feel more natural and realistic.
“To achieve this, we’re eliminating the longstanding tradeoff between quality and scalability: combining ultra-realistic expressivity, accurate transcription and ultra-low-latency interactions at a price point that finally makes high-quality voice ubiquitous,” Zeghidour said.
To make good on this promise, Zeghidour has assembled a team of researchers and engineers who previously worked at Google’s DeepMind, Meta Platforms Inc.’s FAIR research team and Jane Street Capital LLC. He said the company possesses one of the industry’s highest concentrations of generative audio expertise, and it has already developed a number of production-ready systems that are being used by early adopters and generating revenue. Those early adopters include companies in gaming, customer care, language learning, healthcare and AI agents.
Gradium is launching its platform and enabling access to its first models today, with support for English, French, German, Spanish and Portuguese. It says it offers flexible plans catering to the smallest developer teams all the way up to the largest enterprises.
The startup said it will continue its collaboration with Kyutai, ensuring it has access to frontier research in generative audio so it can remain at the forefront of innovation in ALMs.