UPDATED 22:59 EDT / AUGUST 14 2017

Google updates Cloud Speech API with support for long-form audio clips

Google Inc.’s Cloud Speech Application Programming Interface has been updated with more features for developers who wish to integrate speech recognition capabilities into their applications.

The update increases the number of languages for which speech recognition is available, and also adds support for “long-form” audio clips. The idea is to provide more functionality and control for developers, Google’s product manager Dan Aharon said in a blog post Monday.

The Cloud Speech API is a machine learning-powered tool that developers can use to add capabilities such as voice and audio file transcription, voice-enabled commands and call center routing to the applications and services they build. Google said the API relies on deep learning algorithms to keep improving its speech recognition capabilities with repeated use. Recognition can also be tailored to particular settings or content by supplying the API with specific words and phrases used in those situations.
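As a rough illustration of that customization, the sketch below builds a recognition configuration that carries phrase hints. It assumes the `speechContexts` and `phrases` fields of the v1 REST API's recognition config; the helper name `build_config`, the sample phrases and the audio encoding values are illustrative and do not come from Google's announcement.

```python
import json


def build_config(language_code, phrases, sample_rate_hertz=16000):
    """Return a recognition config dict with phrase hints (hypothetical helper)."""
    return {
        "encoding": "LINEAR16",               # assumed audio format for this sketch
        "sampleRateHertz": sample_rate_hertz,
        "languageCode": language_code,
        # Words and phrases the recognizer should favor in this setting.
        "speechContexts": [{"phrases": list(phrases)}],
    }


config = build_config("en-US", ["quota extension", "Cloud Speech"])
print(json.dumps(config, indent=2))
```

The resulting dictionary would be sent as the `config` part of a recognition request, alongside the audio itself.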

Monday’s update raises the maximum length of long-form audio the API can handle from 80 minutes to 180 minutes. In addition, the Cloud Speech API can now support files longer than three hours, but only on a case-by-case basis, Aharon said. Developers who want to take advantage of this need to apply for a quota extension through Google’s Cloud Support.
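To make those limits concrete, here is a minimal sketch of how an application might decide which call to use for a given clip. The 180-minute ceiling is the figure reported above; the 60-second cutoff for the synchronous call, the function names and the `gs://` path are assumptions made for illustration. The method names `speech:recognize` and `speech:longrunningrecognize` are those of the v1 REST API.

```python
SYNC_LIMIT_SECONDS = 60              # assumption: ceiling for the synchronous call
LONG_FORM_LIMIT_SECONDS = 180 * 60   # 180-minute ceiling reported in the article


def pick_method(duration_seconds):
    """Return the REST method suited to a clip of this length, or None."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        return "speech:recognize"
    if duration_seconds <= LONG_FORM_LIMIT_SECONDS:
        return "speech:longrunningrecognize"
    # Longer files are handled case by case through a quota extension.
    return None


def build_request(gcs_uri, language_code="en-US"):
    """Build a request body for audio stored in Cloud Storage (hypothetical helper)."""
    return {
        "config": {"encoding": "FLAC", "languageCode": language_code},
        "audio": {"uri": gcs_uri},
    }


for minutes in (0.5, 90, 200):
    print(minutes, "min ->", pick_method(minutes * 60))

body = build_request("gs://example-bucket/interview.flac")  # hypothetical path
```

A return value of `None` signals that the clip exceeds three hours and would need the quota extension described above.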

Google has also added word-level timestamps at the request of developers, Aharon said. The timestamps let users jump to the exact moment in an audio clip where a given piece of text was spoken. They can also be used to display the relevant text while the clip is playing, which Google said can dramatically reduce the time it takes to proofread transcripts.
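The sketch below shows one way an application might use those timestamps to find where a word was spoken. It assumes the response shape of the v1 REST API when `enableWordTimeOffsets` is set in the request, with each word carrying `startTime` and `endTime` strings such as `"0.500s"`. The sample response is invented for illustration, and `find_word_times` is a hypothetical helper.

```python
def parse_offset(value):
    """Convert a duration string such as '1.200s' into float seconds."""
    return float(value.rstrip("s"))


def find_word_times(response, target):
    """Return the start time, in seconds, of every occurrence of target."""
    target = target.lower()
    hits = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        # Only the first (top-ranked) alternative is inspected here.
        for info in alternatives[0].get("words", []):
            if info["word"].lower() == target:
                hits.append(parse_offset(info["startTime"]))
    return hits


# Invented sample data shaped like a word-timestamp response.
sample_response = {
    "results": [{
        "alternatives": [{
            "transcript": "cloud speech update",
            "words": [
                {"word": "cloud", "startTime": "0s", "endTime": "0.500s"},
                {"word": "speech", "startTime": "0.500s", "endTime": "1.200s"},
                {"word": "update", "startTime": "1.200s", "endTime": "1.800s"},
            ],
        }]
    }]
}

print(find_word_times(sample_response, "Speech"))
```

A media player could seek to the returned offsets, or highlight each word as playback passes its start time.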

Finally, Google added support for 30 additional languages, which means the Cloud Speech API now supports 119 languages in total. The new languages include Bengali, Latvian and Swahili, covering almost a billion speakers across the world, Google said.
