UPDATED 09:00 EDT / APRIL 21 2022

Speechmatics adds entity formatting to improve its speech recognition software

U.K.-based startup Speechmatics said today it’s taking a big step in advancing speech recognition with the addition of “entity formatting” to its Autonomous Speech Recognition software.

The startup, officially named Cantab Research Ltd., says it’s tackling one of the major challenges in machine learning today: interpreting spoken numbers, currencies, percentages, addresses, dates and times and correctly transcribing them in written form.

Speechmatics sells artificial intelligence-based transcription software that works by understanding spoken words and transcribing them as text. It said that formatting numbers correctly in text has long been a big problem for transcription software because the way these entities are spoken in conversation varies, even between countries that speak the same language.

For instance, some English speakers might use the word “oh” instead of “zero” when saying a telephone number. They may also use double or triple digits, such as “triple three.”

Speechmatics said it has improved the accuracy of its transcription software using a technique called Inverse Text Normalization, which recognizes spoken formats and numbers and converts them to the correct written form. Correct entity formatting makes its transcripts more readable and reduces the need to post-process them for accuracy.
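To make the idea concrete, here is a toy sketch of inverse text normalization for spoken digit strings such as telephone numbers. It is not Speechmatics' implementation, which the company has not described in detail. The function name, the word tables and the rule that "double" or "triple" repeats the next digit are all assumptions made for illustration.

```python
# Toy inverse text normalization for spoken digit sequences.
# Illustrative only: the tables and rules below are assumptions,
# not Speechmatics' actual method.

DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

# Words that tell the listener to repeat the digit that follows.
REPEATS = {"double": 2, "triple": 3}


def normalize_spoken_digits(phrase: str) -> str:
    """Convert a phrase like 'oh seven triple three' to '07333'."""
    out = []
    repeat = 1
    for word in phrase.lower().split():
        if word in REPEATS:
            # Remember the multiplier and apply it to the next digit.
            repeat = REPEATS[word]
        elif word in DIGITS:
            out.append(DIGITS[word] * repeat)
            repeat = 1
        else:
            raise ValueError(f"unrecognized token: {word!r}")
    if repeat != 1:
        raise ValueError("repeat word was not followed by a digit")
    return "".join(out)


if __name__ == "__main__":
    print(normalize_spoken_digits("oh seven triple three"))
```

A production system would also have to handle currencies, dates and the contextual ambiguities described in the article, which a fixed lookup table like this cannot resolve.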

Speechmatics Chief Executive Katy Wigdahl said that creating more professional, correctly formatted transcripts will speed up customers’ workflows by reducing the need for human editing across all of the languages it supports.

“Context is also critical – there are so many nuances and ambiguities that need to be accounted for in language, such as whether ‘pounds’ is a reference to weight or currency, and whether ‘venti’ is being used as the Italian word for 20 or winds,” she said.

Speechmatics said entity formatting will have a big impact in “numerically intensive industries” that need to transcribe lots of spoken content accurately.

“Entity formatting has always been a notoriously challenging task for speech recognition, but with this latest update we are delivering best-in-market functionality and bringing significant value to our customers operating in industries where getting numbers right for speech-to-text tasks is mission-critical,” Wigdahl added.

Speechmatics claims to have made previous advances in speech recognition. It says its Autonomous Speech Recognition platform is one of the first to be trained on huge amounts of unlabeled data without any human intervention. The company said this method allows it to understand a range of different voices and accents more comprehensively, reducing both AI bias and errors in speech recognition.

Image: Speechmatics
