Stability AI releases StableLM, an open-source language model
Artificial intelligence startup Stability AI Ltd. today released StableLM, an open-source language model that can generate text and code.
StableLM is the first in a series of language models that the startup plans to build. Future additions to the series are set to feature more complex architectures.
London-based Stability AI is backed by $101 million in funding. It’s best known as the developer of Stable Diffusion, an open-source neural network that can generate images based on text prompts. Today’s introduction of the StableLM language model comes a few days after the startup rolled out a major update to Stable Diffusion.
StableLM is available in two versions at launch. The first features 3 billion parameters, the configuration settings that determine how a neural network processes data. The second includes 7 billion such settings.
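For readers who want to try the release, here is a minimal sketch of loading the smaller variant with the Hugging Face Transformers library and checking its parameter count. The model identifier "stabilityai/stablelm-base-alpha-3b" is an assumption based on the startup's public repositories; consult the official release notes for the exact name.

```python
# A minimal sketch of loading the smaller StableLM variant and checking its
# parameter count. The Hub ID below is an assumption; check Stability AI's
# official model card for the exact name.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-base-alpha-3b")

# num_parameters() returns the total count of trainable weights, the
# "configuration settings" described above; expect roughly 3 billion here.
print(f"{model.num_parameters():,} parameters")
```

The 7 billion-parameter variant would load the same way under its own model identifier.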
Generally, the more parameters a neural network has, the wider the range of tasks it can perform. PaLM, a large language model Google LLC detailed last year, can be configured with more than 500 billion parameters. It has demonstrated the ability to solve relatively advanced mathematical problems, as well as generate code and text.
Stability AI’s new StableLM model can perform a similar set of tasks. However, the startup has not yet released detailed information about the model’s capabilities. Stability AI plans to publish a technical overview of StableLM further down the line.
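Though the model's capabilities remain undocumented, a hedged example of basic text and code generation might look like the following, reusing the assumed Hub identifier from the sketch above.

```python
# A short example of generating text with StableLM through the Transformers
# text-generation pipeline. The model ID is the same assumed Hub name as
# above; the generation settings are illustrative defaults.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="stabilityai/stablelm-base-alpha-3b",  # assumed Hub ID
)

# Ask the base model to continue a code-flavored prompt.
prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
output = generator(prompt, max_new_tokens=64, do_sample=True, temperature=0.7)
print(output[0]["generated_text"])
```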
Although the startup didn’t share detailed data about StableLM today, it did divulge how the model was trained. Stability AI built it using an enhanced version of an open-source training dataset called The Pile. The enhanced dataset comprises 1.5 trillion tokens, units of data that each include a few letters, roughly three times as many as the standard edition.
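To illustrate what a token is, the short sketch below runs a sentence through the GPT-NeoX tokenizer, which the StableLM alpha models are reported to build on (an assumption; any Hugging Face tokenizer demonstrates the same idea).

```python
# A brief illustration of what a "token" is. The GPT-NeoX tokenizer is used
# here on the assumption that StableLM builds on that architecture; any
# tokenizer would show the same behavior.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

text = "Stability AI released StableLM today."
tokens = tokenizer.tokenize(text)

# Each token covers only a few characters, so a 1.5-trillion-token corpus
# is far larger than 1.5 trillion words.
print(tokens)
print(len(tokens), "tokens for", len(text), "characters")
```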
StableLM is available under an open-source CC BY-SA 4.0 license. Developers can use the model in both research and commercial projects, as well as modify it if necessary.
“We open-source our models to promote transparency and foster trust,” Stability AI stated in a blog post today. “Researchers can ‘look under the hood’ to verify performance, work on interpretability techniques, identify potential risks, and help develop safeguards. Organizations across the public and private sectors can adapt (‘fine-tune’) these open-source models for their own applications.”
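As a rough illustration of the fine-tuning workflow the blog post describes, the sketch below adapts the model to a two-example toy corpus with Hugging Face Transformers. The model identifier is the assumed Hub name used earlier, and the corpus is a hypothetical stand-in for an organization's own data.

```python
# A compressed sketch of the "adapt ('fine-tune')" step. The model ID is an
# assumed Hub name; the two-example corpus is hypothetical.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_ID = "stabilityai/stablelm-base-alpha-3b"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # GPT-style models often lack one
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

texts = [
    "Q: How do I reset my password?\nA: Open Settings and choose Reset.",
    "Q: Where are my invoices?\nA: They are listed under Billing.",
]
enc = tokenizer(texts, truncation=True, padding=True, max_length=128,
                return_tensors="pt")

class TinyDataset(torch.utils.data.Dataset):
    """Wraps the tokenized examples; labels mirror input_ids, the standard
    setup for causal-language-model fine-tuning."""
    def __len__(self):
        return enc["input_ids"].shape[0]
    def __getitem__(self, i):
        item = {k: v[i] for k, v in enc.items()}
        item["labels"] = item["input_ids"].clone()
        return item

Trainer(
    model=model,
    args=TrainingArguments(output_dir="stablelm-tuned", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=TinyDataset(),
).train()
```

In practice, organizations would substitute their own domain data and tune the training arguments; the structure of the loop stays the same.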
Alongside the core edition of StableLM, Stability AI has released fine-tuned versions of the model that were trained on five additional datasets besides The Pile. Training an AI model on additional data enables it to incorporate more information into its replies and perform new tasks. The fine-tuned versions of StableLM may only be used for research purposes.
Among the datasets that Stability AI used to train the specialized versions of StableLM is Dolly, a collection of 15,000 chatbot prompts and replies. Dolly was released by Databricks Inc. earlier this month. Databricks used the dataset to train an advanced language model that, like StableLM, is available under an open-source license.
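For reference, the snippet below shows a quick way to inspect Dolly, assuming it is published on the Hugging Face Hub as "databricks/databricks-dolly-15k" (an assumption based on Databricks' public release; field names may differ).

```python
# A quick look at the Dolly dataset; the Hub ID and field names are
# assumptions based on Databricks' public release.
from datasets import load_dataset

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
print(len(dolly))  # roughly 15,000 prompt/response records

record = dolly[0]
print(record["instruction"])  # the chatbot prompt
print(record["response"])     # the human-written reply
```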
StableLM is currently in alpha. As part of its development roadmap, Stability AI intends to build more advanced versions of the model that will feature between 15 billion and 65 billion parameters.
Image: Stability AI