UPDATED 23:01 EDT / OCTOBER 05 2022

AI

Google follows Meta in introducing text-to-video AI

Researchers at Google LLC’s AI lab, Google Brain, today unveiled Imagen Video, a program that can create high-quality videos from text, similar to what Meta Platforms Inc. introduced last week.

Google calls Imagen Video a “text-conditional video generation system based on a cascade of video diffusion models.” With just a text prompt, it says, the system can generate high-definition videos “using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models.”
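
For readers curious how such a cascade fits together, the sketch below is a minimal, illustrative Python mock-up of the idea: a base model produces a short, low-resolution clip, and a chain of spatial and temporal super-resolution stages progressively upscales it. The class names, scale factors and placeholder random output are assumptions for illustration only, not Google's actual architecture or code.

```python
# Illustrative sketch of a cascaded text-to-video pipeline: base generator,
# then interleaved spatial and temporal super-resolution stages.
# All names and numbers here are hypothetical placeholders.
import numpy as np

class BaseVideoModel:
    """Generates a short, low-resolution clip conditioned on a text prompt."""
    def generate(self, prompt: str, frames: int = 16, height: int = 40, width: int = 24) -> np.ndarray:
        rng = np.random.default_rng(abs(hash(prompt)) % (2**32))  # stand-in for a diffusion sampler
        return rng.random((frames, height, width, 3), dtype=np.float32)

class SpatialSuperResolution:
    """Upsamples each frame spatially (nearest-neighbor as a stand-in)."""
    def __init__(self, scale: int):
        self.scale = scale
    def apply(self, video: np.ndarray) -> np.ndarray:
        return video.repeat(self.scale, axis=1).repeat(self.scale, axis=2)

class TemporalSuperResolution:
    """Raises the frame rate by interpolating between adjacent frames."""
    def apply(self, video: np.ndarray) -> np.ndarray:
        mid = (video[:-1] + video[1:]) / 2.0
        out = np.empty((video.shape[0] + mid.shape[0],) + video.shape[1:], dtype=video.dtype)
        out[0::2] = video
        out[1::2] = mid
        return out

def generate_video(prompt: str) -> np.ndarray:
    # Cascade: base model first, then alternating spatial and temporal upsamplers.
    video = BaseVideoModel().generate(prompt)
    for stage in (SpatialSuperResolution(2), TemporalSuperResolution(),
                  SpatialSuperResolution(2), TemporalSuperResolution()):
        video = stage.apply(video)
    return video

if __name__ == "__main__":
    clip = generate_video("an astronaut riding a horse through space")
    print(clip.shape)  # (frames, height, width, channels) after all upsampling stages
```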

The generator produces 1280×768 HD video at 24 frames per second. It’s still in the development stage, but it’s already quite the step up from Google’s text-to-image model Imagen, which debuted earlier this year. With Imagen, you could ask for a still frame of an astronaut riding a horse; now, it seems, you can have that astronaut-horse team galloping through space.

To train the video generator, Google let it look at a vast range of videos and still images, each labeled with text. When a prompt is later entered, the generator synthesizes new footage from the patterns it learned in that data. The training set comprised 14 million videos and 60 million still images, along with the 400 million images in the open LAION-400M dataset. Google showed some examples, including a panda eating and a teddy bear performing various activities.
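
One way to picture that mixed training diet, under the assumption that still images are simply treated as one-frame clips so both data types share the same text-plus-frames format, is the short sketch below. The dataset contents, names and sampling ratio are purely illustrative, not details from Google.

```python
# Hypothetical sketch of mixing text-labeled videos and still images into one
# training stream; a still image is handled as a single-frame clip.
import random
from dataclasses import dataclass

@dataclass
class Example:
    text: str
    frames: list  # list of frames; a still image is a one-frame clip

video_pairs = [Example("a panda eating", frames=["frame0", "frame1", "frame2"])]
image_pairs = [Example("a teddy bear", frames=["image"])]

def sample_batch(batch_size: int = 4, image_fraction: float = 0.5) -> list:
    """Draw a batch that blends video-text and image-text examples."""
    batch = []
    for _ in range(batch_size):
        pool = image_pairs if random.random() < image_fraction else video_pairs
        batch.append(random.choice(pool))
    return batch

print([ex.text for ex in sample_batch()])
```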

Google said it recognizes that there are always dangers in video manipulation technology, such as when people create what have come to be known as deepfakes. Such abuse is already a problem, and as these systems advance, society may have an even bigger one on its hands.

“Video generative models can be used to positively impact society, for example, by amplifying and augmenting human creativity,” the company said. “However, these generative models may also be misused, for example, to generate fake, hateful, explicit or harmful content. We have taken multiple steps to minimize these concerns, for example, in internal trials, we apply input text prompt filtering, and output video content filtering.”
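
The two safeguards Google mentions, filtering the text prompt on the way in and the generated video on the way out, can be pictured with the minimal sketch below. The blocklist and pass-through classifier are stand-in assumptions for illustration; Google has not published its actual filters.

```python
# Hedged sketch of a two-stage safeguard: filter the input prompt, then filter
# the generated output before release. The blocklist and classifier are placeholders.
BLOCKED_TERMS = {"explicit", "hateful"}  # placeholder blocklist, not Google's

def prompt_is_allowed(prompt: str) -> bool:
    """Reject prompts containing blocked terms before any generation happens."""
    return not any(term in prompt.lower() for term in BLOCKED_TERMS)

def video_is_allowed(video) -> bool:
    """Stand-in for an output content classifier run on the generated video."""
    return True  # a real system would score frames with a trained safety model

def safe_generate(prompt: str, generate_fn):
    if not prompt_is_allowed(prompt):
        return None  # blocked at the input stage
    video = generate_fn(prompt)
    return video if video_is_allowed(video) else None
```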

Image: Google
