Meta AI shares its large language model with AI research community
Meta Platforms Inc.’s artificial intelligence research team said today it’s making large language models more accessible to the scientific and academic research community by releasing its Open Pretrained Transformer (OPT-175B) model.
So-called large language models, or LLMs, are natural language processing systems that contain more than 100 billion parameters. LLMs have proven transformational in NLP and AI research in recent years, enabling algorithms that can generate creative text, solve basic math problems, answer reading comprehension questions and more.
Meta AI says the potential of these LLMs is clear, but that it is constrained by the fact that the models remain largely inaccessible to the wider research community.
“While in some cases the public can interact with these models through paid APIs, full research access is still limited to only a few highly resourced labs,” Meta AI’s researchers wrote in a blog post. “This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues such as bias and toxicity.”
Meta AI said it’s aiming to democratize access to LLMs by sharing OPT-175B, an exceptionally large model with 175 billion parameters trained on publicly available data sets. It said OPT-175B is the first model of this size to be released with both the pretrained weights and the code needed to train and use it.
With the release of OPT-175B, Meta AI said, it’s hoping researchers will gain a better understanding of the limitations and risks of LLMs that are not yet well understood, and improve detection and mitigation strategies for their potentially harmful repercussions.
“We hope that OPT-175B will bring more voices to the frontier of large language model creation, help the community collectively design responsible release strategies, and add an unprecedented level of transparency and openness to the development of large language models in the field,” Meta AI said.
Although the release will undoubtedly excite the AI research community, Meta AI made it known that it’s putting some restrictions on its use. OPT-175B is being shared under a noncommercial license in order to focus on research use cases only. That means access will be granted only to academic researchers, those affiliated with organizations in government, civil society and academia, and some industry research laboratories.
Along with OPT-175B, Meta AI said it’s sharing the codebase used to train and deploy the model using only 16 Nvidia Corp. V100 graphics processing units, making it more accessible to researchers with limited computing resources. The company said the goal is to provide a foundation for analyzing potential harms rooted in quantifiable metrics on a common, shared model.
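To put that hardware figure in context, a rough back-of-the-envelope calculation (our illustration, not from Meta AI's post, and ignoring activations, optimizer state and framework overhead) shows why a model of this size only just fits on 16 V100s when its weights are sharded across all of them:

```python
# Rough memory estimate for serving OPT-175B in half precision (fp16).
# Assumptions, not from Meta AI's post: 2 bytes per parameter, 32-GB
# V100 cards, and we ignore activations, caches and framework overhead.
params = 175e9                 # 175 billion parameters
bytes_per_param = 2            # fp16
weights_gb = params * bytes_per_param / 1e9

gpus = 16
v100_gb = 32                   # 32-GB V100 variant
total_gpu_gb = gpus * v100_gb

print(f"Model weights:   ~{weights_gb:.0f} GB")       # ~350 GB
print(f"16x V100 memory:  {total_gpu_gb} GB")          # 512 GB
print(f"Per-GPU share:   ~{weights_gb / gpus:.0f} GB") # ~22 GB per GPU
```

Under those assumptions the weights alone consume roughly 350 GB, so the model fits only when split across all 16 cards, which is presumably why the released codebase bundles the parallelism machinery alongside the checkpoints.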
In addition, Meta AI is releasing a suite of smaller-scale baseline models trained on the same dataset and with similar settings as OPT-175B, so researchers can also study the effects of scale. These additional models come with parameter counts of 125 million, 350 million, 1.3 billion, 2.7 billion, 6.7 billion, 13 billion and 30 billion, the researchers said.
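For researchers who want to experiment at the small end of that range, the baselines can be loaded like any other causal language model. The sketch below is illustrative rather than part of Meta AI's release notes; it assumes the Hugging Face `transformers` library and the checkpoints published on the Hugging Face Hub under names such as `facebook/opt-125m`:

```python
# Minimal sketch: generate text with the smallest OPT baseline.
# Assumes `transformers` is installed and the facebook/opt-125m
# checkpoint is available on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same code scales up the family simply by swapping in a larger checkpoint name, which is what makes the matched-training-setup baselines useful for controlled studies of scale.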
Researchers interested in accessing OPT-175B can submit a request through Meta AI’s access request form.