UPDATED 11:00 EST / MARCH 05 2020


Google Cloud extends its Speech-to-Text service to 7 new languages

Google LLC today upgraded Speech-to-Text, its artificial intelligence transcription service for enterprises, with expanded language support and better dialect recognition that will broaden the offering’s addressable market.

The flagship enhancement is support for seven new languages: Burmese, Estonian, Uzbek, Punjabi, Albanian, Macedonian and Mongolian. Speech-to-Text can now transcribe speech in a total of 71 languages.

“These advancements bring our speech technology to over 200 million speakers for the first time and unlock additional features and improves accuracy for more than 3 billion speakers globally,” Calum Barnes, Google Cloud’s product manager for speech, wrote in the announcement. 

Speech-to-Text, which is delivered as an application programming interface in Google Cloud, doesn’t just turn audio files into text but also provides advanced features that allow enterprises to customize the transcription process. Three of those features are being enhanced as part of today’s update.

The first capability, dubbed speech adaptation, allows enterprises to tweak the text that the service generates. Retailers can train Speech-to-Text to recognize hard-to-transcribe product names when they’re mentioned during customer service calls, while an analytics software provider might want to convert times of day such as “quarter to nine” into a numerical form to simplify processing. Speech adaptation is now available in 68 new languages and dialects including French, German, Spanish, Japanese and Mandarin.
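To make the retailer example concrete, here is a minimal sketch of what a speech adaptation request might look like. It only builds the JSON body for the service's REST `speech:recognize` method and does not call the API. The field names (`languageCode`, `speechContexts`, `phrases`, `audio.uri`) follow the public v1 REST reference as best understood here; the bucket path, product names and the helper `build_adaptation_request` are made up for illustration.

```python
import json


def build_adaptation_request(gcs_uri, language_code, phrases):
    """Build a JSON-serializable request body that carries phrase hints.

    Hypothetical helper: it assembles the request only and sends nothing.
    """
    return {
        "config": {
            "languageCode": language_code,
            # Phrase hints nudge the recognizer toward these terms.
            "speechContexts": [{"phrases": list(phrases)}],
        },
        "audio": {"uri": gcs_uri},
    }


if __name__ == "__main__":
    # Assumed values: a made-up bucket and invented product names.
    body = build_adaptation_request(
        "gs://example-bucket/support-call.flac",
        "fr-FR",
        ["Zentrix Pro", "AquaLume 3000"],
    )
    print(json.dumps(body, indent=2, ensure_ascii=False))
```

Sending the body would additionally require credentials and an HTTP client, which are left out of this sketch.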

Diarization, a feature that makes it possible to attribute words to a specific speaker, is getting support for 10 new languages and dialects. And Google’s “enhanced telephony model” has been extended to British English, Russian and U.S. Spanish. The model, which launched in 2018 with support for American English, can improve transcript accuracy by 62% for low-quality audio from calls.

Lastly, the new version of Speech-to-Text automatically adds punctuation marks to transcripts in 18 languages that weren’t supported before, among them German, French, Japanese and Swedish.
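As a rough illustration of how diarization, the enhanced telephony model and automatic punctuation might be switched on together, the sketch below builds a recognition config and then groups diarized words into speaker turns. The config field names (`diarizationConfig`, `useEnhanced`, `model`, `enableAutomaticPunctuation`) and the per-word `speakerTag` field are taken from the public v1 REST reference as best understood here. Whether all three features can be combined for a given language is an assumption, and the sample words are invented rather than real API output.

```python
def build_call_config(language_code, speakers=2):
    """Build a recognition config for a phone call (hypothetical helper)."""
    return {
        "languageCode": language_code,
        "model": "phone_call",   # telephony-tuned model
        "useEnhanced": True,     # request the enhanced variant
        "enableAutomaticPunctuation": True,
        "diarizationConfig": {
            "enableSpeakerDiarization": True,
            "minSpeakerCount": speakers,
            "maxSpeakerCount": speakers,
        },
    }


def speaker_turns(words):
    """Merge consecutive words with the same speakerTag into turns."""
    turns = []
    for w in words:
        tag = w["speakerTag"]
        if turns and turns[-1][0] == tag:
            turns[-1][1].append(w["word"])
        else:
            turns.append((tag, [w["word"]]))
    return [(tag, " ".join(ws)) for tag, ws in turns]


if __name__ == "__main__":
    # Invented sample in the shape of the per-word diarization output.
    sample = [
        {"word": "Hello,", "speakerTag": 1},
        {"word": "how", "speakerTag": 1},
        {"word": "can", "speakerTag": 1},
        {"word": "I", "speakerTag": 1},
        {"word": "help?", "speakerTag": 1},
        {"word": "My", "speakerTag": 2},
        {"word": "order", "speakerTag": 2},
        {"word": "is", "speakerTag": 2},
        {"word": "late.", "speakerTag": 2},
    ]
    print(build_call_config("en-GB"))
    for tag, text in speaker_turns(sample):
        print(f"Speaker {tag}: {text}")
```

Grouping by consecutive tag, rather than collecting all words per speaker, preserves the back-and-forth order of the conversation, which is usually what a call transcript needs.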

“These new languages and features will help billions of speakers across the world use our voice-based interfaces and high-quality speech recognition,” Barnes wrote.

The update should prove particularly handy for companies that offer services in multiple languages. A firm using Speech-to-Text to, say, power a real-time video captioning service or a tool for analyzing customer sentiment during support calls can now sell its product in more markets.

