UPDATED 16:24 EDT / AUGUST 10 2021

AI

OpenAI’s new Codex neural network writes software in response to text prompts

OpenAI LLC today launched the private beta program for an artificial intelligence system dubbed Codex that can write software and perform data science tasks in response to natural-language prompts.

Codex will be available to participants of the private beta program through an application programming interface. The API will enable participants to build their own custom software services atop the AI system.

Codex is an improved version of an AI of the same name that OpenAI had developed for GitHub Copilot, a developer productivity tool introduced last month. Both algorithms are based on GPT-3, a groundbreaking neural network that OpenAI researchers first detailed last year. GPT-3 can write essays on business topics, create web pages and perform other tasks based on simple natural-language instructions provided by the user.

OpenAI’s Codex system applies the same approach to coding. A developer can provide a natural-language prompt such as “make a user interface with 10 buttons” and Codex will automatically generate source code that performs the requested task. It’s capable of writing much more sophisticated software as well.

In one project detailed by OpenAI, researchers had Codex analyze weather data from the U.S. National Oceanic and Atmospheric Administration to create a graph of peak daily temperatures in San Francisco. In another experiment, OpenAI staffers used Codex to create a video game. The AI system on one occasion even managed to take a piece of code written in one programming language, Python, and rewrite it in the Ruby language.

Codex supports more than a dozen programming languages on launch, though it’s most proficient in Python. Python is widely used for data science tasks such as creating visualizations of business data. The language is also popular among AI developers.

Codex “has a memory of 14KB for Python code, compared to GPT-3 which has only 4KB — so it can take into account over 3x as much contextual information while performing any task,” OpenAI’s researchers detailed.

“Once a programmer knows what to build, the act of writing code can be thought of as (1) breaking a problem down into simpler problems, and (2) mapping those simple problems to existing code (libraries, APIs, or functions) that already exist,” the researchers added. “The latter activity is probably the least fun part of programming (and the highest barrier to entry), and it’s where OpenAI Codex excels most.”

Another version of Codex is available as part of GitHub Copilot, a developer tool that Microsoft’s GitHub subsidiary launched last month. It’s a sophisticated autocomplete engine that allows programmers to start writing a snippet of code and have the next few lines written automatically in some cases. However, GitHub Copilot doesn’t have a natural-language interface like the new edition of Codex that OpenAI debuted today.

Both neural networks are particularly adept at integrating external software components into a developer’s code. A sizable portion of software teams’ work consists of integrating external components such as databases into their applications.

The task is relatively simple in many cases, but it requires developers to spend a considerable amount of time reading technical guides. By automating the process, GitHub Copilot can spare developers the hassle of browsing through guides and thus potentially save them hours of work in some cases.

GitHub parent Microsoft is a major backer of OpenAI, having invested $1 billion in the AI research lab two years ago to support its work. The technology giant is also a leading provider of developer tools. Microsoft not only owns GitHub, the industry’s go-to platform for hosting code, but also provides several other products that are widely used in software development projects, including the Visual Studio code editor.

Microsoft may eventually make Codex available as a cloud service to enterprise customers. The company previously bought an exclusive license to GPT-3, the AI on which Codex is based, from OpenAI, and plans to make the technology available via Azure.

Tools that can improve developer productivity are big business. Sourcegraph Inc., a startup that saves time for software teams by helping developers explore application code faster, recently raised $125 million from investors. A theoretical future Azure service based on Codex that will streamline not one set of development tasks but many, and across multiple programming languages, may have the potential to generate significant revenue for Microsoft. 

That’s especially true if OpenAI continues to improve Codex by increasing the number of tasks the AI can automate. To support OpenAI’s research, Microsoft has built a dedicated supercomputer for the lab and deployed it in Azure. The system includes no fewer than 10,000 graphics processing units and 285,000 central processing unit cores that the lab’s scientists use to train new AI models.

Image: OpenAI

A message from John Furrier, co-founder of SiliconANGLE:

Your vote of support is important to us and it helps us keep the content FREE.

One click below supports our mission to provide free, deep, and relevant content.  

Join our community on YouTube

Join the community that includes more than 15,000 #CubeAlumni experts, including Amazon.com CEO Andy Jassy, Dell Technologies founder and CEO Michael Dell, Intel CEO Pat Gelsinger, and many more luminaries and experts.

“TheCUBE is an important partner to the industry. You guys really are a part of our events and we really appreciate you coming and I know people appreciate the content you create as well” – Andy Jassy

THANK YOU