UPDATED 06:30 EDT / FEBRUARY 01 2024

Allen Institute for AI launches open and transparent OLMo large language model

The Allen Institute for AI, a nonprofit organization created by the late Microsoft Corp. co-founder Paul Allen, said today it’s bringing more transparency into the generative artificial intelligence industry with what it says is the world’s most open large language model.

By open-sourcing the OLMo framework, AI2 says, it’s giving AI researchers and developers the tools they need to better understand how LLMs work, and why they generate the responses they do.

AI2, which says it conducts AI research and engineering in service of the common good, argues that the current generative AI landscape lacks transparency because of the closed nature of popular LLMs such as OpenAI’s GPT-3.5 and GPT-4 and Google LLC’s Gemini. As a result, the institute says, the capabilities of many AI applications have outpaced their developers’ ability to understand exactly what they have created.

OLMo 7B is described as a “truly open, state-of-the-art” LLM. It’s being made available alongside both its pretraining data and its training code, marking an industry first. OLMo and its framework are meant to aid researchers and developers in training and experimenting with LLMs by providing full transparency into how they work. It’s being made available on the Hugging Face platform and on GitHub, where anyone can access it.
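For developers who want to try it, a minimal sketch of loading the model through the Hugging Face transformers library might look like the following; the repository ID "allenai/OLMo-7B" and the trust_remote_code flag reflect the launch-era packaging and are assumptions that may change as the project evolves.

    # Minimal sketch: load OLMo 7B from Hugging Face and generate text.
    # Assumes the launch-era repo ID "allenai/OLMo-7B"; the custom model
    # code shipped in the repo requires trust_remote_code=True (and may
    # also need `pip install ai2-olmo`).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "allenai/OLMo-7B"
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    inputs = tokenizer("Language models are", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))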

Within the framework is a suite of completely open AI development tools, including AI2’s Dolma dataset, an open pretraining corpus of more than 3 trillion tokens, along with the code that produces the training data. There are full model weights for each of the four model varieties, each trained on at least 2 trillion tokens. The framework also includes the inference code, training metrics and training logs, as well as the evaluation suite used in its development, complete with more than 500 checkpoints per model.
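Those intermediate checkpoints are published alongside the final weights, so a researcher can in principle load a specific point in training for analysis. Here is a hedged sketch, assuming the checkpoints are exposed as Hugging Face revisions with branch names along the lines of "step1000-tokens4B" (the exact naming scheme is an assumption, not confirmed by AI2):

    # Hedged sketch: load an intermediate OLMo training checkpoint by
    # Hugging Face revision. The branch name "step1000-tokens4B" is an
    # assumed example of AI2's checkpoint naming, not a verified tag.
    from transformers import AutoModelForCausalLM

    checkpoint = AutoModelForCausalLM.from_pretrained(
        "allenai/OLMo-7B",
        revision="step1000-tokens4B",  # pick any published training step
        trust_remote_code=True,
    )

Comparing such checkpoints against the released training logs and metrics is exactly the kind of study the open release is meant to enable.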

OLMo Project Lead Hanna Hajishirzi, who is the senior director of natural language processing research at AI2, said the nonprofit is aiming to counter the industry’s limited transparency and provide greater insight into how LLMs work. She explained that this is necessary because working with an LLM without knowing how it was trained is like conducting drug discovery without performing any clinical trials.

“With our new framework, researchers will finally be able to study the science of LLMs, which is critical to building the next generation of safe and trustworthy AI,” she said.

The completely open nature of OLMo is the result of a collaboration between AI2 and partners such as Databricks Inc., Advanced Micro Devices Inc. and the Paul G. Allen School of Computer Science & Engineering at the University of Washington. By making OLMo and its training data fully open source, AI2 says, it can work with the AI research community to build the most powerful open LLM in the world.

In other words, OLMo is intended to be an ongoing initiative, and AI2 and its partners will continue to iterate on it, adding different model sizes, modalities, datasets and capabilities on a regular basis.

For researchers and developers, the benefits of using OLMo to power their generative AI applications include greater precision, since it reduces the need to rely on qualitative assumptions about how the model is performing. Instead, its open nature means the model can be tested scientifically to better understand its performance.

In addition, AI2 says, OLMo is more energy-efficient: by opening up its full training and evaluation ecosystem, it cuts redundant development work, which the institute says is critical for the decarbonization of AI.

AI2’s decision to make OLMo available with its training dataset and training code, plus full model weights, is a sign that open-source AI models are becoming truly open, complete with a fairly open licensing model that lets people use them as they please, said Andy Thurai, vice president and principal analyst of Constellation Research Inc. However, he pointed out that the open nature of OLMo could cause headaches for companies that want to use it.

“The production of the model is free because of the collaboration and the potentially free-to-use resources provided by partners such as AMD and Databricks, but it remains to be seen if these companies are fully committed to continually training and updating OLMo, or if they did it simply as a showcase of their abilities,” Thurai said. “If that turns out to be the case, then the model training costs and maintenance might prove to be unaffordable for many users. I’d like to see a few more versions released to show their commitment. Because there are many examples of open-source models that are never updated. It’s not easy to maintain such a costly endeavor without some big pocket backing.”

Image: Allen Institute for AI
