Google’s new AVA dataset aims to help AI better understand humans
A week after Facebook Inc. introduced two datasets aimed at helping developers train their computer vision models, Google LLC has upped the ante with a contribution of its own.
The company on Thursday released AVA, a vast collection of video content that can be used to hone an artificial intelligence’s observation skills. Both Google and Facebook have picked clips that portray everyday actions such as walking. At first glance, it’s a rather specific area to be focusing on given the many other areas in which computer vision is being applied these days. What makes the datasets significant is that the ability to interpret human actions is a key requirement for some of the most cutting-edge applications of AI.
Google created AVA to address the lack of computer vision datasets that feature complex scenes with multiple people performing different activities. As part of the project, the search giant’s researchers extracted 15-minute videos from long-form content on YouTube, mainly old films, and split each up into 300 three-second segments. Those short clips were in turn manually tagged with labels describing the actions shown on the screen.
AVA contains a total of 57,600 segments with some 210,000 action labels, according to Google. The breadth of the dataset could be useful for helping computer vision models pick up on the differences that exist in how people perform a given action. Because of the variety that exists in human behavior, activities historically have been harder to categorize than objects.
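The figures above can be checked with a little arithmetic. The short Python sketch below assumes the three-second segments sit back to back without overlapping, which the 300-per-video count implies. The number of source videos and the average labels per segment are derived from the reported totals rather than stated by Google, and the function name is purely illustrative.

```python
# Figures reported in the article.
CLIP_SECONDS = 15 * 60      # each extracted video is 15 minutes long
SEGMENT_SECONDS = 3         # each segment is three seconds long
TOTAL_SEGMENTS = 57_600
TOTAL_LABELS = 210_000      # approximate ("some 210,000")


def segment_bounds(clip_seconds=CLIP_SECONDS, segment_seconds=SEGMENT_SECONDS):
    """Return (start, end) offsets in seconds for consecutive segments.

    Assumption: segments are non-overlapping and cover the whole clip.
    """
    return [(start, start + segment_seconds)
            for start in range(0, clip_seconds, segment_seconds)]


segments = segment_bounds()
segments_per_clip = len(segments)

# Derived values, not stated in the article.
implied_clips = TOTAL_SEGMENTS // segments_per_clip
labels_per_segment = TOTAL_LABELS / TOTAL_SEGMENTS

print(f"segments per 15-minute video: {segments_per_clip}")
print(f"implied number of source videos: {implied_clips}")
print(f"average labels per segment: {labels_per_segment:.2f}")
```

Under these assumptions, the totals work out to 192 source videos and between three and four action labels per segment on average.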
AVA might also help AI systems learn to detect certain patterns better. For example, Google’s researchers noted in a blog post that, according to the dataset, actors who sing during a scene often play an instrument at the same time. The fact that the individual segments form continuous 15-minute videos could potentially let computer vision models look for much deeper patterns as well.
Enabling AI to better identify human actions could prove useful in a variety of areas. A drone maker, for example, may benefit from the ability to customize flight patterns based on what users are doing. The technology has the potential to be even more valuable in industrial environments such as factories where robots operate alongside human workers.
Image: Unsplash