Zuckerberg unveils new advances in computer vision AI at SIGGRAPH
Meta Platforms Inc.’s artificial intelligence research team has released a sequel to the popular Segment Anything machine learning model that debuted last year.
Segment Anything 2 was announced by Meta Chief Executive Mark Zuckerberg during a wide-ranging fireside chat with Nvidia Corp. CEO Jensen Huang at the SIGGRAPH 2024 event today. It improves significantly on the original model, which was designed to identify specific objects within an image, by bringing that same capability to videos.
SA2, as it’s known, is a “segmentation model,” which is a special kind of computer vision model that can look at an image and work out which pixels belong to which object. So it can pick out a dog that’s partially obscured by a tree, or a bucket that’s collecting rainwater from a leaky roof, for example.
The difference between SA1 and SA2 is that the latter can be applied to videos, not only images, representing a significant step forward for the computer vision realm.
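To make the idea concrete, here is a minimal illustrative sketch in Python. It is not Meta's code and does not use the SA2 API. It assumes a segmentation result can be modeled as a grid of booleans marking which pixels belong to an object, and that a video result is simply one such grid per frame. The names `mask_area` and `track_areas` and the toy frames are hypothetical.

```python
from typing import List

# Assumption for illustration: one boolean per pixel, True = pixel belongs to the object.
Mask = List[List[bool]]

def mask_area(mask: Mask) -> int:
    """Count the pixels labeled as part of the object in one image."""
    return sum(cell for row in mask for cell in row)

def track_areas(video_masks: List[Mask]) -> List[int]:
    """For a video, report the object's visible area frame by frame."""
    return [mask_area(mask) for mask in video_masks]

# Toy 3x4 "frames" (hypothetical data): the object is partly hidden in the second frame.
frame1 = [[False, True,  True,  False],
          [False, True,  True,  False],
          [False, False, False, False]]
frame2 = [[False, True,  False, False],
          [False, True,  False, False],
          [False, False, False, False]]

areas = track_areas([frame1, frame2])
print(areas)
```

In this toy picture, segmenting an image means producing one mask, while segmenting a video means producing a mask for the same object in every frame, which is the step from SA1 to SA2 that the article describes.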
Zuckerberg said scientists often use these kinds of models to study things like coral reefs and natural habitats. “But being able to do this in video and have it be zero shot and tell it what you want, it’s pretty cool,” he said.
The fact SA2 can do this for videos is a testament to the advances in the AI industry, particularly in terms of processing power. Just one year ago, applying image segmentation to video wouldn’t have been possible, Zuckerberg said.
The SA2 model is being open-sourced and can be downloaded from GitHub, and a free demo is also available.
Zuckerberg said the model was trained on a massive amount of data, and the company has released an annotated database of about 50,000 videos that was created specifically to train SA2. However, he said the model was also trained on a second database of more than 100,000 videos, which is not being made public. Zuckerberg didn’t say why, but it’s reasonable to assume those videos are user-generated content from Facebook and Instagram.
In the chat, Zuckerberg admitted to Huang that, although most of the company’s AI research is made open-source, it still has commercial interests at heart.
“We’re not doing this because we’re altruistic people, even though I think that this is going to be helpful for the ecosystem — we’re doing it because we think that this is going to make the thing that we’re building the best,” he said.
Holger Mueller of Constellation Research Inc. told SiliconANGLE that the release of SA2 is a timely reminder of the impressive gains generative AI has delivered in terms of image generation, editing and understanding.
“All too often people get lost in the query and text generation applications of AI, but it is also making good progress in being able to understand what the pixels in an image or video are,” he said. “This has big implications in video editing, as it can cut down on the time it takes to edit video content.”
Digital twins for influencers
During the discussion, Zuckerberg also talked about his vision of a future where Facebook and Instagram might be able to generate AI doubles of social media influencers and content creators that act like “an agent or assistant that their community can interact with.”
He explained that some creators simply don’t have enough hours in the day to engage with their followers in the way they would like to. By using a digital twin of themselves, influencers could engage in direct messaging with their followers, he said.
Rather than talk to their followers directly, “the next best thing is to enable people to build digital agents trained on material that represents them in the way they want,” Zuckerberg said.
Meta’s ultimate goal with this is to be able to pull all of a user’s content and quickly stand up a kind of business agent in order to “interact with your customers and do sales and customer support,” he added.
Images: Meta Platforms