BIG DATA
Deepgram Inc., developer of a voice-recognition engine that it delivers as a service via application programming interfaces, announced today that it has added 23 new language and dialect models to its original U.S. English model.
The company promotes its service as being the fastest and most accurate on the market, capable of recognizing and transcribing speech in less than one-third of a second with accuracy rates better than 90%. Founded in 2015, the company has raised more than $56 million. It says it has transcribed over 100 billion words from audio into text.
Deepgram wrote in a post on its blog today that the new suite is a “significant step toward delivering a global language experience that is on par with the success we’ve seen from our U.S. English model.” In addition to its cloud service, Deepgram is available for on-premises deployment in software containers, with pre-built virtual machine images that can be deployed on most clouds.
Application programming interface integration enables developers to add voice recognition to their applications without requiring significant revisions. “Developers can embed the API into their software so that the integration is seamless,” said Chief Operating Officer Shadi Baqleh. “For example, you can talk into a microphone on a software app, and text appears on the user’s screen in less than one second.”
The 23 new language and dialect models are Dutch; versions of English from Australia, Great Britain, New Zealand and India; French; French Canadian; German; Hindi; Indonesian; Italian; Japanese; Korean; traditional and simplified Mandarin; Portuguese; Brazilian Portuguese; Russian; Spanish; Latin American Spanish; Swedish; Turkish; and Ukrainian. The company is making several of those language models free to use for a limited time.
Deepgram uses transfer learning backed by a proprietary architecture and training with real-world audio datasets that it says yields accuracy rates of up to 98% in optimal conditions. It also has linguists on its staff to perform quality control checks. “We have tested our new languages against the big tech providers and beaten their models every time,” Baqleh said.
In addition to real-time voice recognition, the service provides batch transcription at the rate of one hour every three seconds. “A large call center can transcribe 10,000 hours of daily calls in less than 10 hours to find customers who may churn, ones they can upsell, or products with issues,” Baqleh said. Deepgram said its transcription speed is the same across all languages.
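The call center figure follows from the quoted batch rate. As a rough check, the sketch below does the arithmetic, assuming audio is processed one stream at a time at exactly three seconds per hour of audio; the function name and the sequential-processing assumption are illustrative and do not come from Deepgram.

```python
# Back-of-the-envelope check of the quoted batch transcription rate.
# Assumption: audio is processed sequentially at a constant rate,
# with no parallelism and no queueing or upload overhead.

SECONDS_PER_AUDIO_HOUR = 3  # quoted rate: one hour of audio every three seconds


def batch_processing_hours(audio_hours, seconds_per_audio_hour=SECONDS_PER_AUDIO_HOUR):
    """Return wall-clock hours needed to transcribe `audio_hours` of audio."""
    total_seconds = audio_hours * seconds_per_audio_hour
    return total_seconds / 3600


if __name__ == "__main__":
    hours = batch_processing_hours(10_000)
    print(f"10,000 hours of calls: about {hours:.2f} hours of processing")
```

Under those assumptions, 10,000 hours of audio takes 30,000 seconds, or roughly 8.3 hours, which is consistent with the "less than 10 hours" claim.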
Other features include automatic punctuation and capitalization, the ability to identify up to 10 different speakers at one time, acoustic pattern-matching for search, profanity filtering and automatic redaction of sensitive data. Support for representational state transfer APIs enables the engine to be connected to any audio data source and to deliver results in a wide variety of output formats.
Deepgram offers a free tier and also standard and premium editions beginning at 1.25 cents per minute for batch transcription.