With Ego4D, Facebook wants AI to understand the world from a first-person perspective
Facebook Inc. announced today a long-term project aimed at solving research challenges in artificial intelligence and first-person human perception.
The research produced by the project, called Ego4D, would be useful for numerous applications, including augmented reality, virtual reality and robotics. For example, AI capable of understanding the world from a first-person perspective could provide instructions for technicians, guide people through recipes, help people locate lost items and so on.
Facebook AI calls this “egocentric perception,” which differs from the way most of today’s computer vision systems learn: from photos and video captured from a third-person perspective, where the camera is a spectator to the action.
“Next-generation AI systems will need to learn from an entirely different kind of data — videos that show the world from the center of the action, rather than the sidelines,” said Kristen Grauman, lead research scientist at Facebook.
To conduct this research, Facebook AI recruited a consortium of 13 universities and labs across nine countries, which collected more than 2,200 hours of first-person video in the wild, featuring more than 700 participants going about their daily lives. Most importantly, for the AI to be useful it must be able to provide familiar, in-context assistance for day-to-day activities, so the data needed to be captured in that context.
“Equally important as data collection is defining the right research benchmarks or tasks,” Grauman said. “A major milestone for this project has been to distill what it means to have intelligent egocentric perception, where we recall the past, anticipate the future, and interact with people and objects.”
Using the Ego4D data set, Facebook AI chose five benchmarks for egocentric perception: episodic memory, forecasting, hand-object interaction, audio-visual memory and social interaction.
Episodic memory tasks involve the AI answering questions about past experience from egocentric video capture, such as locating misplaced keys. Even if the wearer has forgotten where the keys were mislaid, an AI could quickly scan back through the video recorded by a camera on wearable glasses and find them lying on a table (or left in the fridge).
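As an illustration only, the toy Python sketch below shows the general shape of such an episodic memory query: given hypothetical timestamped object detections extracted from wearable-camera video, it returns the last place an item was seen. The Detection record and last_seen helper are invented for this example and are not part of the Ego4D benchmark code.

```python
from dataclasses import dataclass

# Hypothetical detection record: object label, video timestamp (seconds)
# and a rough scene label inferred from the frame.
@dataclass
class Detection:
    label: str
    timestamp: float
    scene: str

def last_seen(detections: list[Detection], query: str) -> Detection | None:
    """Return the most recent sighting of the queried object, if any."""
    matches = [d for d in detections if d.label == query]
    return max(matches, key=lambda d: d.timestamp) if matches else None

# Example: simplified detections from a day of wearable-camera video.
log = [
    Detection("keys", 8_130.0, "kitchen table"),
    Detection("mug", 8_200.0, "kitchen counter"),
    Detection("keys", 31_450.0, "hallway shelf"),
]

hit = last_seen(log, "keys")
if hit:
    print(f"Last saw your keys at {hit.timestamp:.0f}s, near the {hit.scene}.")
```

A real episodic memory system would, of course, have to produce those detections from raw egocentric video in the first place; the sketch only illustrates the query-over-the-past idea the benchmark targets.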
With forecasting, AI could provide helpful guidance during task-oriented activities such as cooking, construction, repair work or other technical jobs. An AI could use the wearer’s camera to understand what has already happened and predict what’s likely to happen next. Combined with hand-object interaction, it could recognize that the wearer had already added salt to the food and warn them when they reach for the salt yet again.
AI could also be used to augment audio-visual memory and social interaction. For example, if someone misses something important during a class because of a distraction, they could ask their assistant for a summary. An AI with social intelligence could understand eye contact and who is talking to whom, so an AI assistant could make it easier to focus during a noisy dinner party.
The objective of Ego4D is to allow AI to gain a deeper understanding of how people go about their day-to-day lives as they normally would so that it can better contextualize and personalize experiences. As a result, AI assistants could have a positive impact on how people live, work and play.
The data sets will be available in November of this year for researchers who sign Ego4D’s data use agreement.
Image: Facebook AI