To combat COVID-19 ‘infodemic’ on Twitter with machine learning, Novetta taps AWS
As the COVID-19 pandemic spreads globally, misinformation about the virus is spreading online through word of mouth on Twitter. Today, Novetta, an advanced analytics and technical solutions company, announced it used Amazon Web Services to better understand how to curtail these harmful narratives.
Decision-makers and activists combatting misinformation about COVID-19, which Novetta calls the “infodemic,” need better information about what people believe about the pandemic. Twitter has become an excellent resource to harvest such information, as users produce millions of posts a day about such beliefs, but the scale of this activity can be daunting.
To tackle this huge amount of information, Novetta turned to Amazon Web Services and created a technology called Rapid Narrative Analysis or, with tongue firmly planted in cheek, “RNA,” a nod to the fact that the virus behind COVID-19 is a ribonucleic acid, or RNA, virus.
Because Novetta needs to ingest as much data as possible, the company uses Amazon’s AWS cloud for compute and storage. This sort of power pulled out of the cloud, with the ability to spin up virtual servers automatically and let them vanish when not needed, has become a mainstay of the machine learning and artificial intelligence ecosystem.
“An abundance of data and access to powerful compute and storage for all that data continues to usher in the golden age of machine learning,” Sandy Carter, a vice president of Amazon Web Services, told SiliconANGLE. “In the case of Novetta, accessing the right kind of compute for training models and making predictions was no different, and they took advantage of Amazon EC2 Spot instances, which are spare compute capacity in the AWS cloud available to customers at steep discounts of up to 90% compared to on-demand prices.”
Virulence over virality
RNA uses machine learning to analyze the sentiment and severity of key belief narratives spreading on social media at the speeds needed to stay ahead of the curve. The goal is to prepare marketing strategies to combat them quickly and take action.
And RNA goes one step further. People are familiar with viruses, and social media popularized the idea of memes (popular ideas such as cat videos and funny pictures with silly captions) that go “viral” and get viewed by thousands to millions of people. Novetta’s analysis adds an extra dimension, “virulence,” that examines the harm a narrative does to its audience.
Some examples of potentially harmful narratives include “COVID-19 is a biological weapon,” “5G is responsible for COVID-19” and “certain unproven remedies cure COVID-19.” These narratives run the gamut from the cinematic and conspiratorial to the outright dangerous.
“In the context of junk information, adopting a virulence-over-virality focus has allowed us to determine which strands of information are particularly potent and deserve the most attention versus which are just a flash in the pan,” Elliot Stewart, senior analyst at Novetta, told SiliconANGLE in an interview. “For example, with our analysis of online discussion of the 5G assertion, a story that received a huge amount of attention in both traditional and social media, we were able to show that this idea was not actually very convincing to people online, despite all the concern.”
The harmful side of the narrative, of course, can be a different matter, Stewart said. “Importantly, the same was not true for other unproven narratives surrounding COVID-19, such as the discussion of the multitude of unproven remedies circulating online or the assertion that the virus was a man-made, biological weapon – both of which were considerably more convincing to online users,” he said.
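The distinction Stewart draws can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that virality is approximated by tweet volume and that the convincing-ness he describes is approximated by the share of tweets labeled as expressing belief. The `NarrativeStats` class, both ranking functions and all of the counts are hypothetical; none of it comes from Novetta’s system or data.

```python
from dataclasses import dataclass


@dataclass
class NarrativeStats:
    """Hypothetical per-narrative counts; not Novetta's schema."""
    name: str
    tweets: int      # tweets mentioning the narrative (virality proxy)
    believing: int   # tweets labeled as expressing belief in it

    @property
    def belief_rate(self) -> float:
        # Share of tweets expressing belief (assumed proxy for how convincing it is).
        return self.believing / self.tweets if self.tweets else 0.0


def rank_by_virality(stats):
    """Order narratives by raw tweet volume, largest first."""
    return sorted(stats, key=lambda s: s.tweets, reverse=True)


def rank_by_belief(stats):
    """Order narratives by share of believing tweets, largest first."""
    return sorted(stats, key=lambda s: s.belief_rate, reverse=True)


# Made-up numbers chosen only to show how the two rankings can diverge.
narratives = [
    NarrativeStats("5G causes COVID-19", tweets=100_000, believing=5_000),
    NarrativeStats("unproven remedies", tweets=40_000, believing=16_000),
    NarrativeStats("biological weapon", tweets=30_000, believing=9_000),
]

most_viral = [s.name for s in rank_by_virality(narratives)]
most_convincing = [s.name for s in rank_by_belief(narratives)]
```

With these invented counts, the 5G narrative tops the volume ranking but falls to the bottom once narratives are ordered by belief rate, which is the kind of gap between attention and persuasion the analysts describe.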
RNA examines hundreds of thousands of tweets a day to determine not just the spread of a narrative but how people are reacting to those tweets. It does this by comparing the sentiment of those tweets with that of experts.
“What we’re accomplishing with RNA isn’t determining the correctness/incorrectness of information in a tweet, but whether or not people are expressing belief in certain information,” said Dr. Shauna Revay, lead machine learning engineer at Novetta. “We label our data to make this determination in the best way possible, with smart humans evaluating the full text of a sample of tweets from our target conversation.”
RNA is taught up front by a small number of experts, who train the machine learning model to understand what it’s looking for. The model doesn’t need to know that it’s looking for misinformation specifically. Since Novetta already knows the narratives it is examining are ones that people have wrong, the company is interested in how people feel about the information.
“One benefit of using transfer learning in our machine learning models is that we only need to have our human experts label a relatively small number of tweets to train a model that can then label hundreds of thousands of tweets accurately and rapidly,” Revay said. “In the case of COVID-19 misinformation, this is important because high volumes of data such as tweets are being created in a short amount of time. So this allows us to make inferences on large amounts of data quickly. Then we can have fine-grained insight into the COVID-19 misinformation landscape in the timeframe that matters most.”
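The workflow Revay describes, in which a small expert-labeled sample trains a model that then labels a much larger stream, can be sketched roughly as follows. This is a toy illustration of the general shape of transfer learning, a frozen feature extractor reused as-is with only a small classification head trained on top, and not Novetta’s implementation. The `encode` function is a stand-in that merely counts cue words where a real system would use a pretrained language model, and the cue lists, example tweets and function names are all hypothetical.

```python
import math

# Stand-in for a frozen, pretrained text encoder (assumption: a real system
# would use a pretrained language model; this toy version only counts cues).
BELIEF_CUES = ("true", "proof", "exposed", "wake up")
DOUBT_CUES = ("hoax", "debunked", "false", "no evidence")


def encode(text):
    """Map a tweet to two features: belief-cue count and doubt-cue count."""
    lowered = text.lower()
    return [
        float(sum(lowered.count(cue) for cue in BELIEF_CUES)),
        float(sum(lowered.count(cue) for cue in DOUBT_CUES)),
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, epochs=200, lr=0.5):
    """Fit a logistic-regression head on (text, label) pairs; encode() stays fixed."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = encode(text)
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - label  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def expresses_belief(head, text):
    """Return True if the trained head scores the tweet as expressing belief."""
    weights, bias = head
    x = encode(text)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5


# A handful of hypothetical expert-labeled tweets (1 = expresses belief).
labeled = [
    ("This is true, here is the proof", 1),
    ("Wake up people, it was exposed", 1),
    ("That claim was debunked, total hoax", 0),
    ("There is no evidence, it is false", 0),
]
head = train_head(labeled)

# The trained head can then label a much larger unlabeled stream.
unlabeled = ["The proof is out, it's all true", "Debunked again, no evidence"]
predictions = [expresses_belief(head, tweet) for tweet in unlabeled]
```

Only the two weights and the bias are learned from the labeled sample. Everything the model knows about language comes from the fixed encoder, which is why, in a real system built on a pretrained model, relatively few labels can go a long way.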
Novetta is currently working with Africa CDC to combat a significant amount of misinformation surrounding vaccines for COVID-19 and other diseases on the continent. Using the technology, the company is helping local governments and other groups identify hotspots and direct campaigns against that misinformation, so they can make the most of their resources.