UPDATED 13:29 EST / MAY 24 2022

Google details new cutting-edge image generation AI

Google LLC today detailed Imagen, an artificial intelligence system that can automatically generate images based on text prompts provided by a user.

Over the past few years, researchers have developed multiple neural networks capable of automatically generating images. One of the most sophisticated entries in the category is DALL-E 2, an AI system that OpenAI LLC detailed earlier this year. According to Google, its newly announced Imagen system can outperform DALL-E 2 and other AI models in the category.

Imagen includes two separate neural networks. The first takes as input a piece of text that describes what image should be drawn. The neural network turns this description into a form that can be understood by Imagen’s second neural network, which is responsible for drawing the image.

To build Imagen, Google drew on a number of key advances in AI research that were made over the past decade.

The first neural network in Imagen, which is responsible for translating a text description into a form that the system can understand, is a so-called Transformer model. Transformer models are a type of natural language processing algorithm that was invented by Google in 2017. They can understand the meaning of text more accurately than earlier algorithms.

A Transformer model relies on context to understand the meaning of the words in a sentence. It analyzes the text that surrounds a word, determines which specific pieces of text influence the word’s meaning the most and uses them to make a decision. Google’s new Imagen system uses a Transformer model to turn an image description provided by a user into an embedding, a mathematical representation of data that neural networks can understand.
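The context-weighting idea described above can be illustrated with a minimal sketch of self-attention, the core operation in a Transformer. This is a toy under stated assumptions, not Imagen's text encoder: the function names and the input vectors are hypothetical, and a real Transformer adds learned projections, multiple attention heads and many stacked layers.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Return one context-aware vector per input word vector.

    Each output is a weighted average of all input vectors, where the
    weights reflect how strongly each other word relates to the current one.
    """
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score every word against the current word (scaled dot product).
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        weights = softmax(scores)
        # Mix the word vectors according to those weights.
        mixed = [
            sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
    return outputs


# Hypothetical two-dimensional vectors standing in for three words.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(words)
```

Each entry of `contextual` blends information from every word in the prompt, which is the sense in which the resulting embedding depends on context.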

After the image description is turned into an embedding, a second AI integrated into Imagen uses it to draw the corresponding image. This second AI is a so-called diffusion model, a type of neural network that was first developed in 2015.

Such neural networks differ from other image generation algorithms in the way they are trained. To train a diffusion model, engineers first supply it with images to which a type of random distortion known as Gaussian noise has been added. The diffusion model is then given the task of finding a way to remove the Gaussian noise. Once trained, it can generate a new image by starting from noise and removing it step by step.
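The training setup can be sketched in a few lines. The example below assumes a common formulation in which the model is asked to predict the noise that was added, and the article does not say which objective Imagen uses. All names, the `keep` parameter and the five-pixel "image" are hypothetical, and a real diffusion model would be a neural network rather than the stand-in used here.

```python
import math
import random


def add_gaussian_noise(pixels, keep, rng):
    """Blend clean pixels with Gaussian noise.

    keep is the fraction of signal retained, in (0, 1]; lower values
    mean a noisier image. Returns the noisy pixels and the noise used.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(keep) * p + math.sqrt(1.0 - keep) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise


def denoising_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and actual noise."""
    return sum(
        (p - t) ** 2 for p, t in zip(predicted_noise, true_noise)
    ) / len(true_noise)


def remove_noise(noisy, predicted_noise, keep):
    """Undo the blend using the model's noise estimate."""
    return [
        (x - math.sqrt(1.0 - keep) * n) / math.sqrt(keep)
        for x, n in zip(noisy, predicted_noise)
    ]


rng = random.Random(0)
clean = [0.0, 0.25, 0.5, 0.75, 1.0]  # toy stand-in for an image
noisy, noise = add_gaussian_noise(clean, keep=0.5, rng=rng)

# An untrained model that predicts "no noise" incurs a positive loss.
untrained_loss = denoising_loss([0.0] * len(noise), noise)

# A perfect noise prediction recovers the clean image.
recovered = remove_noise(noisy, noise, keep=0.5)
```

Training amounts to adjusting the model until its noise predictions drive this loss toward zero across many images and noise levels.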

AI researchers commonly use a dataset called COCO to compare the effectiveness of image generation algorithms. Google says that Imagen significantly outperformed competing AI systems, including OpenAI’s cutting-edge DALL-E 2 system, in an internal test that used COCO. Imagen also managed to outperform the competition in a separate test based on DrawBench, a new benchmark developed by Google.

Google’s announcement of Imagen comes a few weeks after the search giant debuted PaLM, another cutting-edge AI developed by its researchers. It’s designed for natural language processing tasks and features 540 billion parameters, the values a neural network learns during training that determine how it makes decisions. According to Google, PaLM can outperform OpenAI’s sophisticated GPT-3 neural network when performing certain tasks.

Image: Google


