

Google Inc.’s Cloud Speech Application Programming Interface has been updated with more features for developers who wish to integrate speech recognition capabilities into their applications.
The update increases the number of languages for which speech recognition is available, and also adds support for “long-form” audio clips. The idea is to provide more functionality and control for developers, Google’s product manager Dan Aharon said in a blog post Monday.
The Google Cloud Speech API is a machine learning-powered tool that developers can use to add capabilities such as voice and audio file transcription, voice-enabled commands and call center routing to the applications and services they build. Google said the API relies on deep learning algorithms to keep improving its speech recognition accuracy with repeated use. Recognition can also be tailored to particular settings or content by supplying the API with the specific words and phrases used in those situations.
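As a rough illustration of that kind of customization, the sketch below builds a JSON request body that passes a list of phrases alongside an audio reference. The field names (`languageCode`, `speechContexts`, `phrases`, `uri`) follow the public v1 REST reference as best understood here and should be checked against the current documentation; the helper name, the bucket path and the sample phrases are hypothetical, and nothing is sent to Google.

```python
import json


def build_recognition_request(gcs_uri, phrases, language_code="en-US"):
    """Return a JSON-serializable request body with phrase hints.

    Hypothetical helper: it only assembles a dictionary and makes no
    network call. Field names are assumed from the v1 REST reference.
    """
    return {
        "config": {
            "languageCode": language_code,
            # Words and phrases the recognizer should favor.
            "speechContexts": [{"phrases": list(phrases)}],
        },
        # Long recordings are referenced by a Cloud Storage URI.
        "audio": {"uri": gcs_uri},
    }


# Hypothetical bucket path and phrases, for illustration only.
body = build_recognition_request(
    "gs://example-bucket/support-call.flac",
    ["quota extension", "Cloud Support"],
)
print(json.dumps(body, indent=2))
```

The phrase list works as a set of hints for one request rather than as permanent retraining, so an application would typically send it with each request that needs the specialized vocabulary.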
Monday’s update to the API extends its long-form audio capabilities from 80 minutes up to 180 minutes. In addition, the Cloud Speech API can now support files longer than three hours, but only on a case-by-case basis, Aharon said. Developers who want to take advantage of this need to apply for a quota extension through Google’s Cloud Support.
Google has also added word-level timestamps at the request of developers, Aharon said. The timestamps let developers jump to the exact point in an audio clip where a given piece of text was spoken. They can also be used to display the relevant text while an audio clip is being played, which can dramatically reduce the time it takes to proofread transcripts.
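Both uses of word-level timestamps can be sketched in a few lines. The example below is a minimal illustration, not Google’s implementation: the `WordTiming` record, the two helper functions and the sample timings are all hypothetical, and it assumes the transcript is already available as a list of words with start and end times in seconds, sorted by start time.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WordTiming:
    """Hypothetical record: one transcribed word and its time span in seconds."""
    word: str
    start: float
    end: float


def find_phrase_start(words: List[WordTiming], phrase: str) -> Optional[float]:
    """Return the start time of the first occurrence of phrase, or None."""
    target = phrase.lower().split()
    if not target:
        return None
    tokens = [w.word.lower() for w in words]
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            return words[i].start
    return None


def word_at(words: List[WordTiming], t: float) -> Optional[WordTiming]:
    """Return the word being spoken at playback time t, or None during silence."""
    starts = [w.start for w in words]
    i = bisect_right(starts, t) - 1  # last word starting at or before t
    if i >= 0 and t < words[i].end:
        return words[i]
    return None


# Made-up timings for illustration only.
words = [
    WordTiming("speech", 0.0, 0.4),
    WordTiming("recognition", 0.4, 1.1),
    WordTiming("is", 1.3, 1.4),
    WordTiming("hard", 1.4, 1.9),
]

seek_to = find_phrase_start(words, "is hard")  # where a player would jump to
current = word_at(words, 0.5)                  # word to highlight at 0.5 seconds
```

A player could pass `seek_to` to its seek control when a user clicks on a phrase in the transcript, and call `word_at` on each playback tick to highlight the word currently being spoken.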
Finally, Google added support for an additional 30 languages, which means the Cloud Speech API now supports 119 in total. The new languages include Bengali, Latvian and Swahili, covering almost a billion speakers across the world, Google said.