

Google may have found a way to use machine learning to help millions of deaf and hearing-impaired people better understand what others are saying to them.
Researchers from Google Inc.’s DeepMind artificial intelligence unit, which built AlphaGo, the system that defeated one of the world’s top Go players, have teamed up with peers at the University of Oxford to create an AI system that outperforms professional lip-readers after training on thousands of hours of BBC video.
New Scientist reports that in tests, a human lip-reader who provides services for U.K. courts correctly deciphered only about a quarter of the words spoken in a random sample of 200 BBC video broadcasts. DeepMind’s AI system, by contrast, deciphered almost half of the words in the same videos, and annotated 46 percent of them without error, compared with just 12 percent for the human lip-reader.
The researchers hope that the technology could one day be used on phones, either as a new way to instruct a voice assistant like Siri, or as a way to enhance speech recognition.
“A machine that can lip-read opens up a host of applications: ‘dictating’ instructions or messages to a phone in a noisy environment; transcribing and redubbing archival silent films; resolving multi-talker simultaneous speech; and improving the performance of automated speech recognition in general,” the researchers said in their research paper.
Machine learning involves training AI systems on massive data sets. In this case, the researchers trained their lip-reading system, called “Watch, Listen, Attend and Spell,” on almost 5,000 hours of footage of talking faces from six BBC shows, including BBC Breakfast, Newsnight and Question Time. In all, the system was fed 118,000 sentences containing 17,500 unique words.
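To make that design concrete, here is a minimal sketch of what a “Watch, Listen, Attend and Spell”-style model can look like: a convolutional network encodes each video frame of the speaker’s mouth, a recurrent encoder summarizes the frames over time, and an attention-based recurrent decoder spells out the transcript one character at a time. This is an illustrative approximation in PyTorch, not DeepMind’s code; all layer sizes and names are invented assumptions, and the real system also attends to an audio stream (the “Listen” branch), which is omitted here for brevity.

```python
# A minimal, assumption-laden sketch of a "Watch, Attend and Spell"-style
# lip-reader. Not DeepMind's implementation; sizes and names are illustrative.
import torch
import torch.nn as nn

class LipReader(nn.Module):
    def __init__(self, num_chars=40, feat_dim=512, hidden=256):
        super().__init__()
        # "Watch": a small CNN turns each mouth-region frame into a feature vector.
        self.frame_cnn = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # An LSTM summarizes the per-frame features over time.
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        # "Attend and Spell": a character-level decoder that attends to the video.
        self.embed = nn.Embedding(num_chars, hidden)
        self.attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)
        self.decoder = nn.LSTM(hidden * 2, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_chars)

    def forward(self, frames, prev_chars):
        # frames: (batch, time, 1, H, W); prev_chars: (batch, out_len)
        b, t = frames.shape[:2]
        feats = self.frame_cnn(frames.flatten(0, 1)).view(b, t, -1)
        enc, _ = self.encoder(feats)               # (b, t, hidden)
        emb = self.embed(prev_chars)               # (b, out_len, hidden)
        ctx, _ = self.attn(emb, enc, enc)          # attend over video frames
        dec, _ = self.decoder(torch.cat([emb, ctx], -1))
        return self.out(dec)                       # per-step character logits
```

Training such a model would minimize cross-entropy between the predicted character logits and the ground-truth transcript characters, which is where a corpus on the scale of those 118,000 BBC sentences comes in.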
The researchers explained that, unlike other lip-reading systems, theirs focused on interpreting “unconstrained natural language sentences” and “in-the-wild videos.” Previous systems, such as the University of Oxford’s LipNet, focused on recognizing only a much more limited set of words and phrases, a difference the short sketch below illustrates.
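As a hedged illustration of that difference (with invented function names, not either project’s actual code): a closed-vocabulary recognizer can only choose among a fixed word list, whereas a character-level decoder like the one sketched above has an output space covering any sentence that can be spelled, which is what makes an open, 17,500-word setting tractable.

```python
# Illustrative contrast between closed- and open-vocabulary decoding.
# These helpers are hypothetical, for exposition only.
CHARS = "abcdefghijklmnopqrstuvwxyz '"

def closed_vocab_decode(word_scores, vocab):
    # One class per known word: anything outside `vocab` can never be output.
    best = max(range(len(vocab)), key=lambda i: word_scores[i])
    return vocab[best]

def open_vocab_decode(char_logits):
    # Greedy character-by-character decoding: char_logits holds one list of
    # len(CHARS) scores per output step, so even unseen words can be spelled.
    return "".join(
        CHARS[max(range(len(CHARS)), key=lambda i: step[i])]
        for step in char_logits
    )
```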
DeepMind and the University of Oxford say they’ll make their data set publicly available as a training resource for other researchers and projects.