Google’s VideoBERT algorithm predicts the future one cooking video at a time
Google LLC today debuted VideoBERT, an artificial intelligence that can watch part of a video and extrapolate what will happen in the next few seconds, much as a human can.
Equipping a computer with the ability to understand and draw correct conclusions from a visual scene requires an incredibly sophisticated algorithm. For Google’s researchers, however, the challenge wasn’t building the algorithm but finding enough data with which to train it. Machine learning models must ingest enormous amounts of information to understand even basic concepts, and that information typically must be prepared by hand.
That wasn’t feasible for VideoBERT, since teaching the model how to predict future events required more sample videos than Google’s researchers could have assembled by hand. They would additionally have had to write descriptions for each individual frame of every clip just so the AI could follow what’s happening. So the team came up with an alternative: freely available instructional videos.
In a video that shows how to cook an omelette or fill a tire, the person demonstrating the task will often explain each step as they perform it, narration that the researchers used as a substitute for the frame-by-frame descriptions they would have had to create for the AI otherwise. The team compiled over a million clips spanning categories such as cooking and gardening. They then fed them to VideoBERT to teach the model how to trace the progress of common activities.
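The core trick behind this kind of training is turning both modalities into one stream of discrete tokens: each video frame’s feature vector is quantized into a “visual word,” and those visual tokens are concatenated with the narration’s text tokens so a BERT-style model can learn over the combined sequence. The sketch below is a simplified illustration of that idea, assuming toy centroids and a made-up vocabulary; it is not Google’s actual pipeline.

```python
import numpy as np

def quantize_frames(frame_features, centroids):
    """Map each frame feature vector to the index of its nearest centroid,
    producing discrete 'visual word' tokens (a stand-in for the clustering
    step VideoBERT-style models use)."""
    # Broadcast to pairwise distances: (num_frames, num_clusters)
    dists = np.linalg.norm(
        frame_features[:, None, :] - centroids[None, :, :], axis=-1
    )
    return dists.argmin(axis=1)

def build_sequence(visual_tokens, text_tokens, text_vocab_size):
    """Concatenate modalities as [CLS] text... [SEP] visual... [SEP].
    Visual token ids are offset past the text vocabulary so both
    modalities can share a single embedding table."""
    CLS, SEP = 0, 1  # hypothetical special-token ids
    offset = text_vocab_size
    return (
        [CLS] + list(text_tokens) + [SEP]
        + [offset + int(v) for v in visual_tokens] + [SEP]
    )

# Toy example: 4 frames with 3-dim features, 2 visual clusters
centroids = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
frames = np.array([
    [0.1, 0.0, 0.1],
    [0.9, 1.0, 1.1],
    [0.0, 0.2, 0.1],
    [1.1, 0.9, 1.0],
])
visual = quantize_frames(frames, centroids)
seq = build_sequence(visual, text_tokens=[5, 8, 3], text_vocab_size=100)
print(list(visual))  # → [0, 1, 0, 1]
print(seq)           # → [0, 5, 8, 3, 1, 100, 101, 100, 101, 1]
```

A masked-prediction objective over sequences like this is what lets the model learn correlations between what is said and what is shown, without any hand-written frame labels.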
After the training, the model was set loose on a collection of cooking videos it had never seen before. When presented with a video fragment showing a bowl of flour and cocoa powder, VideoBERT astutely predicted that the ingredients would be placed in an oven and become a brownie or a cupcake. The researchers also managed to harness the algorithm’s observation skills to extract a recipe from a video in which a chef explained how to cook a steak.
The methods Google developed to train VideoBERT could eventually find use in far more serious applications. Self-driving cars, for instance, might become safer if they gained the ability to predict accurately where nearby vehicles will be a few seconds into the future. Such foresight can also be a big asset for drones and industrial robots that operate in close proximity to human workers.