UPDATED 17:12 EST / AUGUST 25 2017

EMERGING TECH

Google open-sources speech command dataset for TensorFlow

The teams behind Alphabet Inc.’s TensorFlow machine learning framework and its AIY do-it-yourself artificial intelligence initiative have released a dataset of more than 65,000 utterances of 30 different speech commands, giving developers a ready-made resource for implementing their own simple voice controls without having to build everything from scratch.

Pete Warden, a software engineer on the Google Brain Team, said in a blog post Thursday that although open-source speech recognition systems such as Kaldi can use neural networks to build powerful voice features, they can also be more complex than necessary for developers who only need basic voice functionality for their programs. According to Warden, the new speech dataset offers a quick way to implement simple voice commands.

“The dataset is designed to let you build basic but useful voice interfaces for applications, with common words like ‘Yes,’ ‘No,’ digits, and directions included,” said Warden. “The infrastructure we used to create the data has been open-sourced too, and we hope to see it used by the wider community to create their own versions, especially to cover underserved languages and applications.”

According to Warden, the results of using the new dataset will depend on whether the speech patterns a program needs are included in the set, but the dataset will become more versatile “as the community contributes improved models to TensorFlow.” He added that Google hopes the community will contribute more accents and dialects to the dataset.

While Google has plenty of its own AI projects, the company has also been looking to put AI capabilities in the hands of more people. For example, Google launched its do-it-yourself AI initiative, AIY Projects, back in May when it shipped a free AIY voice kit with the physical edition of The MagPi, the official magazine for the Raspberry Pi minicomputer. The goal behind AIY Projects is to make AI more accessible to developers and tech hobbyists, and the team plans on releasing other kits in the future.

The team behind TensorFlow, Google’s open-source machine learning framework, has also released a series of tutorials to help developers get started with AI, including one specifically for AI-powered audio recognition. Warden said that with the latest version of TensorFlow, developers can download the speech dataset and train a voice model in just a few hours.
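For developers curious what a first step with such a dataset might look like, here is a minimal Python sketch that counts the recordings available for each command before any training begins. It assumes the downloaded archive has been unpacked into a directory with one sub-folder per command word, each holding WAV files, and that folders whose names start with an underscore hold non-command audio. That layout, and the helper names `index_commands` and `summarize`, are assumptions made for illustration; they are not details given in Warden's post.

```python
from collections import Counter
from pathlib import Path


def index_commands(root):
    """Map each command word (sub-folder name) to a sorted list of its WAV files.

    Assumed layout: <root>/<word>/<clip>.wav. Folders whose names start with
    "_" are skipped on the assumption that they hold non-command audio.
    """
    index = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir() or folder.name.startswith("_"):
            continue
        wavs = sorted(folder.glob("*.wav"))
        if wavs:
            index[folder.name] = wavs
    return index


def summarize(index):
    """Return a Counter of how many clips exist for each command word."""
    return Counter({word: len(files) for word, files in index.items()})
```

Calling `summarize(index_commands("path/to/unpacked/dataset"))` would give a per-word clip count, a quick way to check whether the words an application needs are actually covered before spending hours on training.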

Photo: Google
