![](https://d15shllkswkct0.cloudfront.net/wp-content/blogs.dir/1/files/2017/02/Microsoft-custom-speech-service.png)
Developers have a new machine learning tool for improving speech recognition, courtesy of Microsoft Corp., which launched the public beta of its Custom Speech Service on Tuesday.
Custom Speech Service is designed to overcome some of the most common problems in speech recognition systems, such as people’s different accents and vocabulary, and background noise. The system allows developers to build custom language models that adapt to each user’s way of speaking and to the specific vocabulary of each application. Developers can also adapt acoustic models to specific environments or to the number of people using an application, Microsoft said.
“Beneath the hood, the Custom Speech Service leverages an algorithm that shifts Microsoft’s existing speech recognizer to the developer-supplied data,” Microsoft Research’s John Roach said in a blog post. “By starting from models that have been trained on massive troves of data, the amount of application-specific data required is greatly reduced. In cases where the developer’s data is insufficient, the recognizer falls back on the existing models.”
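Microsoft has not published the details of that algorithm, but the general idea in the quote, starting from a model trained on a large corpus and shifting it toward a small amount of application data, can be illustrated with a textbook technique. The following is a minimal sketch using Dirichlet-prior (MAP-style) smoothing over single-word probabilities. It is not Microsoft’s method, and the names `adapted_prob`, `base_probs`, `custom_counts` and `tau` are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def adapted_prob(word, base_probs, custom_counts, tau=5.0):
    """Blend a base word probability with counts from developer-supplied text.

    Illustrative assumption: the base model acts as a prior worth `tau`
    pseudo-observations. With no custom data the result equals the base
    probability; as custom data grows, it moves toward the custom
    relative frequency.
    """
    custom_total = sum(custom_counts.values())
    count = custom_counts.get(word, 0)
    prior = base_probs.get(word, 0.0)
    return (count + tau * prior) / (custom_total + tau)


# Hypothetical base probabilities standing in for a large general-purpose model.
base_probs = {"the": 0.05, "valve": 0.0001, "torque": 0.0001}

# Hypothetical developer-supplied text from a factory-floor application.
custom_counts = Counter("open the valve check the valve torque".split())

no_data = adapted_prob("valve", base_probs, Counter())
with_data = adapted_prob("valve", base_probs, custom_counts)
print(f"valve without custom data: {no_data:.6f}")
print(f"valve with custom data:    {with_data:.6f}")
```

In this sketch, a smaller `tau` lets a handful of domain sentences move the estimate quickly, while a larger `tau` keeps the result close to the base model until more application data is available.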
The acoustic modeling capabilities, meanwhile, are designed to enable speech recognition in some of the noisiest environments, such as a factory floor. The algorithm picks out users’ speech amid the background noise while prioritizing jargon associated with a specific industry.
Alongside Custom Speech Service, Microsoft announced two other cognitive tools: the Bing Speech API and Content Moderator, both of which will be available in March.
The Bing Speech API is designed to transcribe live or recorded speech into text and to convert text into speech, paving the way for apps that can talk back to users. In addition, the API can be used to create voice-enabled applications that wake up when users speak a certain command.
Content Moderator, for its part, detects profanity in text in more than 100 languages. The service can also spot phishing URLs, personally identifiable information and malware, and it can analyze images and videos for offensive or unwanted content, including pornographic material.