Meta’s new Code Llama large language model is optimized for programming tasks
Meta Platforms Inc. today introduced Code Llama, an open-source large language model that can automatically generate code snippets and explain how they work.
The model is free for commercial use.
Code Llama is based on another open-source language model, Llama 2, that Meta released last month. The latter model is more general-purpose in nature. It can not only perform coding tasks but also summarize documents, translate text and answer trivia questions.
Llama 2 is one of the most advanced language models in the open-source ecosystem. In a series of benchmark tests carried out by Meta researchers, it outperformed several other freely available neural networks. Code Llama, the language model Meta introduced today, is a specialized version of Llama 2 with significantly enhanced programming capabilities.
Meta developed Code Llama by training the original Llama 2 neural network on a large dataset of code samples and “code-related” files. According to the company, that training dataset comprised 500 billion tokens. A token is the basic unit of data that a language model processes and usually comprises a few letters or numbers.
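To make the idea of a token concrete, here is a minimal sketch of how a line of code might be broken into short pieces. It is a toy illustration only: the function name `toy_tokenize` and the four-character piece limit are assumptions made for this example, and the splitting rule is not the tokenizer that Llama 2 or Code Llama actually uses.

```python
import re
from typing import List


def toy_tokenize(text: str, max_piece: int = 4) -> List[str]:
    """Split text into short pieces. Toy stand-in, NOT a real model tokenizer."""
    pieces = []
    # Find runs of letters, runs of digits, or single non-space symbols.
    for chunk in re.findall(r"[A-Za-z_]+|\d+|\S", text):
        # Break long runs into pieces of at most max_piece characters.
        for i in range(0, len(chunk), max_piece):
            pieces.append(chunk[i:i + max_piece])
    return pieces


tokens = toy_tokenize("def add(a, b): return a + b")
print(tokens)
print(len(tokens), "tokens")
```

Even in this simplified form, a short function signature turns into more than a dozen tokens, which gives a sense of how a 500-billion-token dataset relates to the volume of source code behind it.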
Code Llama is available in three flavors: a standard edition and two specialized versions.
The first specialized version is designed to generate software in the Python programming language. It was trained on a dataset that included 100 billion tokens’ worth of Python code.
The other specialized version of Code Llama is called Code Llama – Instruct. It’s optimized to generate code based on natural language instructions from the user. Furthermore, the model can explain how the code it generates works.
Each of the three editions of Code Llama is available in three sizes, with 7 billion, 13 billion and 34 billion parameters, respectively. Parameters are the configuration settings that influence how an AI turns data into decisions.
According to Meta, the versions of Code Llama that have 7 billion and 13 billion parameters are faster than the 34-billion edition. This speed advantage makes them more suitable for latency-sensitive tasks. A company could, for example, use them to build a development tool that generates real-time code autocomplete suggestions for programmers.
The 34-billion edition of Code Llama trades off speed for increased accuracy. As a result, it should prove more useful in cases where the priority is to maximize the quality of responses.
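One reason the smaller editions are easier to run quickly is that they have fewer numbers to store and process. The sketch below does the back-of-the-envelope arithmetic for the memory needed by the weights alone. The figure of 2 bytes per parameter (16-bit precision) is an assumption made for this example, not something Meta states in the announcement, and the helper name `weight_memory_gb` is hypothetical.

```python
# Parameter counts named in the article.
MODEL_SIZES = {
    "7B": 7_000_000_000,
    "13B": 13_000_000_000,
    "34B": 34_000_000_000,
}


def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes).

    bytes_per_param=2 assumes 16-bit weights; that is an assumption of this
    sketch, and it ignores activations and other runtime overhead.
    """
    return num_params * bytes_per_param / 1e9


for name, params in MODEL_SIZES.items():
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights at 16-bit precision")
```

Under that assumption the 34-billion edition needs roughly five times as much memory for its weights as the 7-billion edition, which is consistent with the speed-versus-accuracy tradeoff described above.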
One key feature that sets Code Llama apart from Llama 2, the general-purpose language model on which it’s based, is its context window.
An AI’s context window determines the amount of data that users can include in a single prompt. That amount of data is 4,096 tokens in the case of Llama 2. Code Llama, in contrast, has a maximum context window of 100,000 tokens.
The larger context window will enable the model to perform some programming tasks more effectively than its namesake. According to Meta, Code Llama will be better at debugging software errors. The company also believes the feature can help developers increase the quality of AI-generated code.
“For example, users can provide the model with more context from their codebase to make the generations more relevant,” Meta’s researchers wrote in a blog post.
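The practical effect of the larger window can be sketched as a simple budget check: estimate how many tokens a prompt contains and compare that with each model's limit. The two window sizes come from the figures above. The ratio of about four characters per token and the helper names `estimate_tokens` and `fits_in_window` are assumptions made for illustration; a real application would count tokens with the model's own tokenizer.

```python
import math

LLAMA_2_WINDOW = 4_096       # tokens, per the article
CODE_LLAMA_WINDOW = 100_000  # maximum tokens, per the article


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. The chars-per-token ratio is an assumption."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_window(text: str, window: int, reserved_for_output: int = 0) -> bool:
    """True if the estimated prompt size plus room for the reply fits the window."""
    return estimate_tokens(text) + reserved_for_output <= window


# 30,000 characters of stand-in source code.
codebase_excerpt = "x = 1\n" * 5_000

print("Estimated tokens:", estimate_tokens(codebase_excerpt))
print("Fits Llama 2:", fits_in_window(codebase_excerpt, LLAMA_2_WINDOW))
print("Fits Code Llama:", fits_in_window(codebase_excerpt, CODE_LLAMA_WINDOW))
```

In this example the excerpt is too large for a 4,096-token window but fits comfortably within 100,000 tokens, which is the kind of extra codebase context the researchers describe.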
Meta evaluated Code Llama’s capabilities using two popular coding benchmarks known as HumanEval and Mostly Basic Python Programming, or MBPP. According to the company, the model outperformed several state-of-the-art alternatives from the open-source ecosystem. Furthermore, it carried out some tasks better than GPT-3.5, a recent predecessor to OpenAI LP’s flagship GPT-4 language model.
“Our benchmark testing showed that Code Llama performed better than open-source, code-specific LLMs and outperformed Llama 2,” Meta’s researchers detailed. “Code Llama 34B, for example, scored 53.7% on HumanEval and 56.2% on MBPP, the highest compared with other state-of-the-art open solutions.”
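Percentage scores on benchmarks of this kind reflect how often generated code passes a set of unit tests. The sketch below shows the idea in a heavily simplified form: three hand-written stand-ins for model output, one of them deliberately buggy, are each run against a single test. The problems, the `passes` helper and the scoring loop are all hypothetical and are not Meta's evaluation harness; real benchmarks run generated code inside a sandbox rather than calling `exec` directly.

```python
def passes(candidate_source: str, test_source: str) -> bool:
    """Run a candidate solution against its test; True if nothing raises."""
    namespace = {}
    try:
        exec(candidate_source, namespace)  # define the candidate function
        exec(test_source, namespace)       # a failed assertion raises an error
        return True
    except Exception:
        return False


# Hypothetical (candidate, test) pairs standing in for model output.
problems = [
    ("def add(a, b):\n    return a + b\n", "assert add(2, 3) == 5\n"),
    # Deliberately buggy: returns True for odd numbers instead of even ones.
    ("def is_even(n):\n    return n % 2 == 1\n", "assert is_even(4)\n"),
    ("def last(xs):\n    return xs[-1]\n", "assert last([1, 2, 3]) == 3\n"),
]

solved = sum(passes(candidate, test) for candidate, test in problems)
score = 100 * solved / len(problems)
print(f"{solved}/{len(problems)} problems solved ({score:.1f}%)")
```

A score such as 53.7% on HumanEval is read the same way: the share of benchmark problems for which the model's code passed the tests.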
Image: Meta