UPDATED 09:00 EDT / APRIL 07 2025

PyannoteAI raises $9M for its speech processing AI

French artificial intelligence startup pyannoteAI SAS today announced that it has raised $9 million in funding to enhance its technology.

Crane Venture Partners and Serena led the seed investment. They were joined by Alexis Conneau, chief executive of venture-backed AI startup WaveForms Inc., and Hugging Face Inc. Chief Technology Officer Julien Chaumond.

Founded last year, pyannoteAI offers an open-source AI toolkit of the same name for transcribing speech. The software supports multiple languages and can automatically perform speaker diarization. That’s the process of attributing each section of a transcript to the relevant speaker, a task AI models usually struggle to perform reliably.
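As a rough illustration of what diarization adds to a transcript, the Python sketch below assigns each transcript segment to whichever speaker turn overlaps it the most. It is a generic example and not pyannoteAI’s method or API: the `Turn` and `Segment` types, the `attribute_speakers` helper and the sample timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A stretch of audio attributed to one speaker (hypothetical type)."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """A transcribed span of audio with no speaker label yet (hypothetical type)."""
    start: float
    end: float
    text: str


def overlap(a_start, a_end, b_start, b_end):
    """Return the number of seconds two time intervals share."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute_speakers(segments, turns):
    """Label each segment with the speaker whose turns overlap it the most."""
    labeled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        speaker = max(totals, key=totals.get) if totals else "UNKNOWN"
        labeled.append((speaker, seg.text))
    return labeled


# Made-up sample data for illustration only.
turns = [Turn(0.0, 4.0, "SPEAKER_00"), Turn(4.0, 9.0, "SPEAKER_01")]
segments = [
    Segment(0.5, 3.5, "Thanks for joining."),
    Segment(3.8, 8.0, "Happy to be here."),
]
labeled = attribute_speakers(segments, turns)
for speaker, text in labeled:
    print(f"{speaker}: {text}")
```

The second segment starts slightly before the second speaker’s turn begins, which is the kind of boundary case that makes reliable attribution difficult.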

Under the hood, pyannoteAI’s AI toolkit runs on multiple internally developed neural networks. It also features pipelines, software workflows that help prepare audio data before it’s processed by the models. Companies can fine-tune the toolkit’s individual components on their internal datasets to improve their performance.

On the occasion of today’s funding announcement, pyannoteAI disclosed that its open-source software is downloaded more than 45 million times per month. The company’s installed base includes more than 100,000 developers. It generates revenue with a paid version of its open-source AI toolkit that includes more advanced capabilities.

According to pyannoteAI, its commercial offering is twice as fast as the open-source edition. The software also provides a 20% accuracy increase, which allows it to more reliably distinguish speakers in audio recordings. The model can tell voices apart even if several people speak at the same time.

Customers can upload files with up to 24 hours of audio to the commercial version of pyannoteAI’s software. According to the company, its platform automatically identifies recurring speakers across files to reduce the need for manual transcript editing.

To mitigate the impact of potential accuracy issues, pyannoteAI’s software assigns a confidence score to each transcript segment that it generates. The lower the confidence score, the greater the risk that the AI made a mistake. This feature allows customers to quickly spot errors in lengthy transcripts without a time-consuming manual review.
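A minimal Python sketch of how a customer might use such scores appears below. The segment fields (`speaker`, `text`, `confidence`), the 0-to-1 score range and the 0.8 threshold are assumptions made for illustration; the article does not describe the actual output format of pyannoteAI’s platform.

```python
def flag_for_review(segments, threshold=0.8):
    """Return the segments whose confidence falls below the threshold.

    The 0.8 default is an arbitrary illustrative value, not a vendor recommendation.
    """
    return [seg for seg in segments if seg["confidence"] < threshold]


# Made-up transcript segments; the field names are hypothetical.
transcript = [
    {"speaker": "SPEAKER_00", "text": "Let's review the quarterly numbers.", "confidence": 0.97},
    {"speaker": "SPEAKER_01", "text": "Sure, one moment.", "confidence": 0.62},
    {"speaker": "SPEAKER_00", "text": "Take your time.", "confidence": 0.91},
]

needs_review = flag_for_review(transcript)
for seg in needs_review:
    print(f"Check {seg['speaker']} ({seg['confidence']:.2f}): {seg['text']}")
```

Only the flagged segments would then need a human check, rather than the full transcript.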

Organizations can access pyannoteAI’s platform through an application programming interface or deploy it on their own infrastructure. According to the company, the software supports the major public clouds and bare-metal servers.

“We’re bringing enterprise-grade speaker intelligence AI to businesses that depend on voice data,” said pyannoteAI co-founder and CEO Vincent Molina. “Our goal is to make speaker-aware AI as seamless and universal as speech itself.”

The company plans to invest its newly raised capital in product development initiatives. It’s building features that will make it possible to split an audio file into multiple files that each feature only a single speaker. Additionally, pyannoteAI will enable customers to run its AI models on a broader range of devices.

Photo: Unsplash
