UPDATED 13:28 EDT / JANUARY 06 2021

OpenAI’s newest AI models draw and recognize objects more efficiently

Researchers at OpenAI have developed two neural networks that can draw objects based on natural-language user prompts and describe images with a high degree of accuracy.

The projects, detailed Tuesday, expand the range of tasks to which artificial intelligence can be applied. They also advance the AI research community’s goal of creating more versatile models that require less manual fine-tuning by engineers to produce accurate results.

DALL·E, the first new neural network, is a miniaturized version of the GPT-3 natural-language processing model that OpenAI debuted in 2020. GPT-3, one of the most complex neural networks created to date, can generate text and even software code from simple descriptions. DALL·E applies the same capability to drawing images based on user prompts.

The model’s standout capability is that it can produce images even in response to descriptions that it’s encountering for the first time and that are normally difficult for an AI to interpret. During testing performed by OpenAI researchers, the model successfully generated drawings in response to descriptions such as “an armchair in the shape of an avocado” and “a snail made of harp.” Moreover, the model is capable of generating images in several different styles.

The researchers decided to test exactly how versatile the AI is by having it tackle several additional tasks of varying difficulty. In one series of experiments, the model demonstrated an ability to generate the same image from multiple angles and with different levels of resolution. Yet another test showed that the model is sophisticated enough to customize individual details of the image it’s asked to generate.

“Simultaneously controlling multiple objects, their attributes, and their spatial relationships presents a new challenge,” OpenAI’s researchers wrote in a blog post. “For example, consider the phrase ‘a hedgehog wearing a red hat, yellow gloves, blue shirt, and green pants.’ To correctly interpret this sentence, DALL·E must not only correctly compose each piece of apparel with the animal, but also form the associations (hat, red), (gloves, yellow), (shirt, blue), and (pants, green) without mixing them up.”

OpenAI’s other newly detailed neural network, CLIP, focuses on recognizing objects in existing images rather than drawing new ones.

There are already computer vision models that classify images in such a manner. However, most of them can identify only a narrow set of objects for which they are specifically trained. An AI that classifies animals in wildlife photos, for example, has to be trained on a large number of wildlife photos to produce accurate results. What sets OpenAI’s CLIP apart is that it’s capable of creating a description of an object it hasn’t encountered before.

CLIP’s versatility is the fruit of a new training approach the lab developed to build the model. For the training process, OpenAI used not a manually crafted image dataset but rather images sourced from the public web and their attached text captions. The captions enabled CLIP to build a broad lexicon of words associated with different types of objects, associations it can then use to describe objects it hasn’t seen before.

“Deep learning needs a lot of data, and vision models have traditionally been trained on manually labeled datasets that are expensive to construct and only provide supervision for a limited number of predetermined visual concepts,” wrote the researchers behind CLIP. “In contrast, CLIP learns from text-image pairs that are already publicly available on the internet.”
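
One way to picture how a model trained on image-caption pairs can label an object it was never explicitly trained to recognize: place the image and several candidate captions in the same vector space, then pick the caption whose vector lies closest to the image's. The Python sketch below illustrates only that comparison step, under loud assumptions. The three-number vectors are invented for illustration, and `cosine` and `zero_shot_classify` are hypothetical helper names rather than anything from OpenAI's code. A real model would produce much larger vectors from trained image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(image_embedding, caption_embeddings):
    """Return the best-matching caption and the score for every caption."""
    scores = {
        caption: cosine(image_embedding, embedding)
        for caption, embedding in caption_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Made-up embeddings (assumption): stand-ins for a text encoder's output.
caption_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a zebra": [0.0, 0.1, 0.9],
}

# Made-up embedding (assumption): stand-in for an image encoder's output.
image_embedding = [0.1, 0.2, 0.8]

best, scores = zero_shot_classify(image_embedding, caption_embeddings)
print(best)
```

Because the candidate captions are ordinary text, changing what the classifier can recognize in this sketch means editing the caption list rather than collecting and labeling a new image dataset.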

Image: OpenAI
