UPDATED 22:59 EDT / AUGUST 14 2017

Google updates Cloud Speech API with support for long-form audio clips

Google Inc.’s Cloud Speech Application Programming Interface has been updated with more features for developers who wish to integrate speech recognition capabilities into their applications.

The update increases the number of languages for which speech recognition is available and adds support for longer “long-form” audio clips. The idea is to give developers more functionality and control, Google product manager Dan Aharon said in a blog post Monday.

The Cloud Speech API is a machine learning-powered tool that developers can use to add capabilities such as voice and audio file transcription, voice-enabled commands and call center routing to the applications and services they build. Google said the API relies on deep learning algorithms to keep improving its speech recognition capabilities with repeated use. Speech recognition can also be customized to particular settings or content by training the API with specific words and phrases used in those situations.
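
As a rough illustration of the customization described above, the sketch below builds a JSON request body that passes a list of words and phrases along with the audio reference. The field names (`languageCode`, `speechContexts`, `phrases`, `uri`) are assumed to follow the public v1 REST request format and should be checked against Google's current reference. The bucket path, the phrases and the helper name `build_recognition_request` are hypothetical.

```python
import json


def build_recognition_request(gcs_uri, language_code, phrases):
    """Return a request body (plain dict) carrying phrase hints.

    Assumption: the field names mirror the v1 REST request format.
    Nothing is sent over the network here.
    """
    return {
        "config": {
            "languageCode": language_code,
            # Words and phrases the recognizer should favor (assumed field name).
            "speechContexts": [{"phrases": list(phrases)}],
        },
        # Hypothetical Cloud Storage location of the audio file.
        "audio": {"uri": gcs_uri},
    }


request = build_recognition_request(
    "gs://example-bucket/support-call.flac",
    "en-US",
    ["quota extension", "Cloud Speech API"],
)
print(json.dumps(request, indent=2))
```

Keeping the body as a plain dictionary makes it easy to inspect before handing it to whichever HTTP or client library an application already uses.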

Monday’s update raises the maximum length of long-form audio the API can handle from 80 minutes to 180 minutes. The Cloud Speech API can also support files longer than three hours, but only on a case-by-case basis, Aharon said. Developers who want to take advantage of this need to apply for a quota extension through Google’s Cloud Support.
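
The limits above can be expressed as a simple pre-flight check an application might run before submitting a file. This is only a sketch: the 80-minute and 180-minute figures come from the article, while the function name `long_form_status` and the returned labels are made up for illustration.

```python
PREVIOUS_LIMIT_MINUTES = 80   # long-form limit before the update
NEW_LIMIT_MINUTES = 180       # long-form limit after the update (three hours)


def long_form_status(duration_minutes: float) -> str:
    """Classify an audio clip by length against the published limits."""
    if duration_minutes <= 0:
        raise ValueError("duration must be positive")
    if duration_minutes <= PREVIOUS_LIMIT_MINUTES:
        return "supported before and after the update"
    if duration_minutes <= NEW_LIMIT_MINUTES:
        return "supported after the update"
    # Anything longer is handled case by case via a quota extension.
    return "needs a quota extension"


for minutes in (45, 120, 200):
    print(minutes, "minutes:", long_form_status(minutes))
```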

Google has also added word-level timestamps at the request of developers, Aharon said. The timestamps let developers jump to the exact moment in an audio clip where a given piece of text was spoken. They can also be used to display the relevant text while an audio clip is playing, which can dramatically reduce the time it takes to proofread transcripts.
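
A minimal sketch of both uses follows, assuming each recognized word arrives as a record with `word`, `startTime` and `endTime` fields whose times are strings such as "1.100s". That shape is an assumption about the JSON response, and the sample words and helper names are invented for illustration.

```python
from bisect import bisect_right


def parse_offset(value: str) -> float:
    """Convert a duration string such as '1.100s' to seconds."""
    return float(value.rstrip("s"))


def word_at(words, playback_seconds):
    """Return the word being spoken at a playback position, or None."""
    starts = [parse_offset(w["startTime"]) for w in words]
    # Index of the last word that starts at or before the position.
    i = bisect_right(starts, playback_seconds) - 1
    if i < 0:
        return None
    if playback_seconds < parse_offset(words[i]["endTime"]):
        return words[i]["word"]
    return None  # position falls in a pause between words


def seek_time_for(words, target):
    """Return the start time in seconds of the first match, or None."""
    for w in words:
        if w["word"].lower() == target.lower():
            return parse_offset(w["startTime"])
    return None


# Invented sample data in the assumed response shape.
words = [
    {"word": "hello", "startTime": "0s", "endTime": "0.400s"},
    {"word": "cloud", "startTime": "0.400s", "endTime": "0.900s"},
    {"word": "speech", "startTime": "1.100s", "endTime": "1.600s"},
]

print(word_at(words, 0.5))
print(seek_time_for(words, "speech"))
```

`word_at` covers the highlight-while-playing case, and `seek_time_for` covers jumping from a word in the transcript to its position in the audio.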

Finally, Google added support for 30 more languages, bringing the total supported by the Cloud Speech API to 119. The new languages include Bengali, Latvian and Swahili, covering almost a billion speakers across the world, Google said.

Image: Google
