Facebook aims to advance machine learning with two new video AI projects
Facebook Inc. today shared details about two internal artificial intelligence projects, Learning from Videos and TimeSformer, that are aimed at facilitating the development of more powerful machine learning models.
With Learning from Videos, the first project, the company will train the machine learning systems powering its social network on clips uploaded by users. Facebook relies on AI to perform tasks ranging from content recommendation to policy enforcement. The company hopes that training its machine learning systems on user-created videos will enhance their capabilities.
Normally, researchers develop AI models with handmade training datasets in which the individual files are tagged by experts with descriptive labels. Those labels help guide the model in the right direction as it learns.
But there’s a tradeoff: The requirement to annotate files manually limits the size of the training dataset that researchers can realistically assemble. That, in turn, limits the amount of learning that AI models can do during training.
By enabling its models to train on unlabeled user-created videos, Facebook will allow them to learn from a far larger amount of information than is available in traditional handmade training datasets. “By learning from global streams of publicly available videos spanning nearly every country and hundreds of languages, our AI systems will not just improve accuracy but also adapt to our fast moving world and recognize the nuances and visual cues across different cultures and regions,” Facebook researchers wrote in a blog post today.
The researchers stressed that the initiative will emphasize privacy. “We’re building and maintaining a strong privacy foundation that uses automated solutions to enforce privacy at scale,” they wrote. “By embedding this work at the infrastructure level, we can consistently apply privacy requirements across our systems.”
To facilitate training on user videos, Facebook is using an approach known as self-supervised learning that removes the need for labeled datasets. It has already started implementing the approach in production. Instagram’s Reels feature, the social network disclosed, uses a self-supervised AI model to display recommended videos that are similar to clips users have viewed recently.
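The article doesn't describe how Reels' recommendation model works internally, but the general idea of recommending "similar" clips can be sketched as a nearest-neighbor lookup over learned clip embeddings. The sketch below is purely illustrative: the embedding vectors are hand-written toy data standing in for the output of a self-supervised video model, and the function names are hypothetical, not Facebook's API.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend(watched_embedding, candidate_embeddings):
    """Return indices of candidate clips ranked by similarity to the
    clip the user just watched (most similar first)."""
    scores = [cosine_similarity(watched_embedding, c) for c in candidate_embeddings]
    return sorted(range(len(scores)), key=lambda i: -scores[i])

# Toy 4-dimensional "clip embeddings"; in production these would come
# from a self-supervised model trained on unlabeled video, not by hand.
watched = np.array([1.0, 0.0, 1.0, 0.0])
candidates = [
    np.array([0.9, 0.1, 1.1, 0.0]),   # very similar clip
    np.array([0.0, 1.0, 0.0, 1.0]),   # unrelated clip
    np.array([1.0, 0.5, 0.8, 0.2]),   # somewhat similar clip
]
print(recommend(watched, candidates))  # → [0, 2, 1]
```

The self-supervision happens upstream, in how the embeddings are learned without labels; once clips are embedded, recommendation reduces to a similarity search like this one.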
Alongside its work on self-supervised learning, the company today detailed a separate AI project it dubs TimeSformer. It’s described as the first video processing AI based entirely on so-called Transformers, highly efficient machine learning models originally created to analyze text. Thanks to its use of the technology, Facebook says, TimeSformer processes data using less than one-tenth of the computing resources required by traditional models and can be trained three times as fast.
Facebook says its approach improves the training process in other ways as well. “The best 3D CNNs [a type of AI] today can only use video segments that are a few seconds long,” the company’s researchers explained. “With TimeSformer, we are able to train on far longer video clips — up to several minutes long. This may dramatically advance research to teach machines to understand complex long-form actions in video.”
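TimeSformer's efficiency comes from applying self-attention separately along the time axis and the space axis ("divided space-time attention") rather than jointly over every patch of every frame. The real model uses learned query/key/value projections and multiple attention heads; the NumPy sketch below strips all of that away to show only the divided attention pattern and why it is cheaper, with illustrative tensor shapes chosen here, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(x):
    """Minimal self-attention over the first axis of x: (tokens, dim).
    Real Transformers project x into queries, keys, and values first."""
    scores = softmax(x @ x.T / np.sqrt(x.shape[-1]))
    return scores @ x

def divided_space_time_attention(clip):
    """clip: (T, S, D) — T frames, S patches per frame, D channels.
    Attend across frames for each patch position, then across patches
    within each frame, instead of over all T*S tokens jointly."""
    T, S, D = clip.shape
    # Temporal attention: each spatial position attends across frames.
    out = np.stack([attention(clip[:, s, :]) for s in range(S)], axis=1)
    # Spatial attention: each frame attends across its patches.
    out = np.stack([attention(out[t]) for t in range(T)], axis=0)
    return out

T, S, D = 8, 16, 32  # toy sizes: 8 frames, 16 patches, 32 channels
clip = np.random.default_rng(0).normal(size=(T, S, D))
out = divided_space_time_attention(clip)
print(out.shape)  # (8, 16, 32)

# Pairwise-score counts: joint attention scales with (T*S)^2, while
# divided attention scales with T*S*(T + S) — far cheaper as T grows,
# which is what makes training on minutes-long clips feasible.
print((T * S) ** 2, T * S * (T + S))  # → 16384 3072
```

Because the temporal cost grows linearly rather than quadratically in the number of frames per spatial position's attention, lengthening the clip inflates the divided variant's cost far more slowly than the joint one's, consistent with the researchers' claim of training on clips several minutes long.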