Meta AI researchers develop ways to read speech from people’s brains
When people speak, they think about how they are going to form the words with their mouths. In fact, it’s not necessary to talk at all for the brain to produce the activity involved in speech.
That’s important because the parts of the brain that control the mouth and the parts involved in understanding and forming language are separate. Researchers at Meta Platforms Inc.’s Facebook AI Research Labs have been combining this knowledge with artificial intelligence to learn how to assist people who have suffered traumatic neurological injuries that have left them unable to communicate through speech, typing or gestures.
“We’ve developed an AI model that can decode speech from noninvasive recordings of brain activity,” said Jean-Rémi King, a research scientist with FAIR Labs. “Decoding speech from brain activity has been a longstanding goal of neuroscientists and clinicians, but most of the progress has relied on invasive brain-recording techniques.”
Most people may be familiar with the common types of brain scans such as magnetic resonance imaging, or MRI, and computerized tomography, or CT, both of which produce detailed images of the brain. However, they show structures rather than activity. The best ways to date to get clear ongoing activity have been invasive — meaning opening up the skull and placing electrodes directly onto the brain itself.
However, noninvasive techniques such as electroencephalography, or EEG, and magnetoencephalography, or MEG, can scan the brain from the outside and watch activity without any surgery. Both EEG and MEG can take millisecond-level snapshots of brain activity, which makes them well suited to providing a continuous view of what’s happening in a person’s brain while they’re listening to speech.
The problem is that they don’t get a very clear picture of what’s happening, since the recordings from EEG and MEG sessions can be extremely noisy. Although they’re useful for the diagnosis of injuries, the noise makes them problematic for determining specific, nuanced brain activities, such as whether the person is thinking of saying the word “cat.”
“Noninvasive recordings are notoriously noisy and can greatly vary across recording sessions and individuals for a variety of reasons, including differences in each person’s brain and where the sensors are placed,” King said.
To address this problem, FAIR researchers turned to machine learning algorithms to help “clean up” the noise. The model they used is called wav2vec 2.0, an open-source AI tool developed by the FAIR team in 2020 that can be used to identify correct speech from noisy audio.
They then applied the tool to four open-source EEG and MEG datasets comprising 150 hours of recordings from 169 healthy volunteers listening to audiobooks and isolated sentences in English and Dutch. These recordings then served as the training set for the wav2vec 2.0 model, improving its ability to pick out potential words that an individual heard.
“Given a snippet of brain activity, it can determine from a large pool of new audio clips which one the person actually heard,” said King. “From there, the algorithm infers the words the person has most likely heard.”
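The matching step King describes can be pictured as a ranking problem: compare a vector derived from the brain recording against a vector for each candidate audio clip and pick the closest one. The sketch below is only an illustration under loud assumptions. The three-number vectors are made-up toy values, the function names are hypothetical, and cosine similarity is an assumed choice of score rather than a confirmed detail of the FAIR model. In a real system the vectors would come from trained models, not hand-written lists.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_clips(brain_embedding, clip_embeddings):
    """Return clip names ordered from most to least similar to the brain embedding."""
    scores = {
        name: cosine_similarity(brain_embedding, vector)
        for name, vector in clip_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy candidate pool (hypothetical values standing in for audio-clip embeddings).
clip_embeddings = {
    "clip_a": [1.0, 0.0, 0.0],
    "clip_b": [0.0, 1.0, 0.0],
    "clip_c": [0.0, 0.0, 1.0],
}

# Toy vector standing in for an embedding of a noisy brain recording.
brain_embedding = [0.1, 0.9, 0.2]

ranking = rank_clips(brain_embedding, clip_embeddings)
print("Best match:", ranking[0])
```

With these toy values the brain vector points mostly along the second axis, so `clip_b` ranks first. The words associated with the top-ranked clip would then be the decoder's best guess at what the person heard.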
The researchers found this encouraging because it shows that the AI can be trained to decode noisy and variable recordings of brain activity from perceived speech. The next step is to see whether that can be extended to brain activity without the pool of audio clips. That would result in a much more versatile decoder that wouldn’t need a pre-set vocabulary.
It’s only a first step, though, King cautioned, because the study focused only on decoding perceived speech, whereas the ultimate goal is to enable patients to communicate by decoding the speech they intend to produce. The work could even lead to further technological advancements, such as new ways to control computers just by thinking of the words or task at hand.
“More generally, our work is a part of the broader effort by the scientific community to use AI to better understand the human brain,” King said.