UPDATED 11:00 EDT / MARCH 05 2020

CLOUD

Google Cloud extends its Speech-to-Text service to 7 new languages

Google LLC today upgraded Speech-to-Text, its artificial intelligence transcription service for enterprises, with expanded language support and better dialect recognition that will broaden the offering’s addressable market.

The flagship enhancement is support for seven new languages: Burmese, Estonian, Uzbek, Punjabi, Albanian, Macedonian and Mongolian. Speech-to-Text can now transcribe speech in a total of 71 languages.

“These advancements bring our speech technology to over 200 million speakers for the first time and unlock additional features and improves accuracy for more than 3 billion speakers globally,” Calum Barnes, Google Cloud’s product manager for speech, wrote in the announcement. 

Speech-to-Text, which is delivered as an application programming interface in Google Cloud, doesn’t just turn audio files into text but also provides advanced features that allow enterprises to customize the transcription process. Three of those features are being enhanced as part of today’s update.

The first capability, dubbed speech adaptation, allows enterprises to tweak the text that the service generates. Retailers can train Speech-to-Text to recognize hard-to-transcribe product names when they’re mentioned during customer service calls, while an analytics software provider might want to convert times of day such as “quarter to nine” into a numerical form to simplify processing. Speech adaptation is now available in 68 new languages and dialects including French, German, Spanish, Japanese and Mandarin.

Diarization, a feature that makes it possible to attribute words to a specific speaker, is getting support for 10 new languages and dialects. And Google’s “enhanced telephony model” has been extended to British English, Russian and U.S. Spanish. The model, which launched in 2018 with support for American English, can improve transcript accuracy by 62% for low-quality audio from calls.

Lastly, the new version of Speech-to-Text automatically adds punctuation marks to transcripts in 18 languages that weren’t supported before, among them German, French, Japanese and Swedish.
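
For developers, the features described above are typically switched on through the configuration object sent with each transcription request. The Python sketch below is illustrative only and is not taken from Google’s announcement: it assembles a request body using field names from the public Speech-to-Text v1 REST API as we understand them, while the helper function, bucket path and product name are hypothetical. It only builds and prints the JSON; it does not call the service, and whether a given feature is available for a given language should be checked against Google’s documentation.

```python
import json


def build_recognize_request(audio_uri, language_code, phrases):
    """Build a JSON-serializable body for a Speech-to-Text recognize request.

    Hypothetical helper for illustration; the field names are assumed to
    match the v1 REST API's RecognitionConfig.
    """
    config = {
        "languageCode": language_code,
        # Speech adaptation: hint at hard-to-transcribe terms such as product names.
        "speechContexts": [{"phrases": list(phrases)}],
        # Diarization: attribute words to individual speakers.
        "diarizationConfig": {"enableSpeakerDiarization": True},
        # Automatic punctuation in the returned transcript.
        "enableAutomaticPunctuation": True,
        # Enhanced telephony model for low-quality call audio.
        "useEnhanced": True,
        "model": "phone_call",
    }
    return {"config": config, "audio": {"uri": audio_uri}}


if __name__ == "__main__":
    body = build_recognize_request(
        "gs://example-bucket/support-call.wav",  # hypothetical Cloud Storage path
        "en-GB",
        ["Acme TurboVac 3000"],  # hypothetical product name
    )
    print(json.dumps(body, indent=2))
```

Keeping the request as a plain dictionary makes it easy to see which option maps to which feature before handing it to whatever HTTP client or client library a team already uses.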

“These new languages and features will help billions of speakers across the world use our voice-based interfaces and high-quality speech recognition,” Barnes wrote.

The update should prove particularly handy for companies that offer services in multiple languages. A firm using Speech-to-Text to power, say, a real-time video captioning service or a tool for analyzing customer sentiment during support calls can now sell its product in more markets.

Photo: Google
