Databricks open-sources an AI it says is as good as ChatGPT, but much easier to train
Big-data analytics firm Databricks Inc. has emerged as an unlikely player in the generative artificial intelligence space, open-sourcing a new AI model that it claims is “as magical as ChatGPT,” despite being trained on far less data in less than three hours using a single machine.
Databricks announced in a blog post today that it’s making what it calls Dolly available for anyone to use, for any purpose, as an open-source model, together with all of its training code and instructions on how to recreate it. The company said the release is aimed at democratizing large language models, so that instead of being something only the biggest technology companies can afford, millions of smaller firms will be able to build and use their own customized generative AI models.
In its blog post, Databricks explains that ChatGPT was trained on trillions of words drawn from across the web, and that this training required thousands of powerful GPUs. OpenAI LP’s creation took the world by storm with its ability to compose coherent responses to almost any kind of question and chat about virtually any topic.
Responding to ChatGPT, Facebook’s parent company Meta Platforms Inc. released its own partially open-source model, called LLaMA, which was likely also trained on trillions of words. Earlier this month, a group of Stanford University researchers took Meta’s LLaMA and created an AI called Alpaca, which was fine-tuned on a very small dataset of around 50,000 questions and answers and could exhibit ChatGPT-like qualities.
Although Alpaca is encouraging, it’s not available under a fully open-source license, meaning it cannot be used commercially. However, it provided the inspiration for Databricks to come up with its own model.
Instead of creating its own model from scratch or using LLaMA, Databricks turned to a much older open-source LLM called GPT-J, created by the research collective EleutherAI roughly two years earlier, and used it as the foundation on which Dolly was built. The model, Databricks said, “has not made a huge splash, presumably because it does not exhibit magical instruction-following capabilities.”
Databricks said it was able to take the EleutherAI model and make it “highly approachable” simply by fine-tuning it on a small dataset of about 50,000 question-and-answer records, in less than three hours on a single machine. Despite the much smaller model, with only 6 billion parameters versus ChatGPT’s 175 billion, and the far smaller dataset and training time, Databricks said, Dolly still exhibits the same “magical human interaction ability” demonstrated by ChatGPT.
“This shows that the magic of instruction following does not lie in training models on gigantic datasets using massive hardware,” Databricks explained. “Rather, the magic lies in showing these powerful open-source models specific examples of how to talk to humans, something anybody can do for a hundred dollars using this small 50K dataset of Q&A examples.”
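For readers curious what that kind of lightweight instruction tuning looks like in practice, the following is a minimal, hypothetical sketch using the Hugging Face Transformers library and the published EleutherAI GPT-J checkpoint. It is not Databricks’ released Dolly training code; the dataset file name and prompt format here are illustrative assumptions.

# A minimal sketch of the instruction fine-tuning described above, using the
# Hugging Face Transformers library and the EleutherAI/gpt-j-6b checkpoint.
# This is NOT Databricks' released training code; "qa_examples.json" and the
# prompt format are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6b")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6b")

# Roughly 50,000 {"instruction": ..., "response": ...} records.
dataset = load_dataset("json", data_files="qa_examples.json")["train"]

def tokenize(record):
    # Fold each instruction and its response into one training string.
    text = (f"### Instruction:\n{record['instruction']}\n\n"
            f"### Response:\n{record['response']}")
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dolly-sketch",
                           per_device_train_batch_size=4,
                           num_train_epochs=1,
                           fp16=True),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

On a single multi-GPU machine, a run like this over roughly 50,000 short examples could plausibly finish within the few-hour window Databricks describes.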
Databricks said it named the model Dolly in homage to Dolly the sheep, the first cloned mammal, because it’s really just a very cheap clone of Alpaca and GPT-J. The company claims the release is still a momentous achievement, because by open-sourcing Dolly and its training data, it enables anyone to train and operate a genuinely humanlike AI without investing millions of dollars.
“This is AI’s ‘waking up’ moment,” the company said. “We haven’t fundamentally changed anything and we haven’t done anything miraculous from an R&D perspective, but we realized that all that’s required to unlock the potential of these widely-available tools is to show them just a few thousand examples of how you want them to behave.”
Databricks said this is the first of a series of announcements it’s making on large language models. Those who want to try out Dolly can contact the company at hello-dolly@databricks.com.
Image: Freepik