Japanese startup Sakana releases AI models created through ‘evolutionary’ processes
Sakana AI, a Tokyo-based artificial intelligence startup that takes inspiration from nature, today announced a new method for creating generative AI models by merging existing models together, in a process the company likens to evolution.
“Our method can automatically create a new underlying model with the capabilities specified by the user,” the research team said. “Models can be created very efficiently because they leverage the vast collective intelligence of existing open models.”
Typical evolutionary algorithms randomly pick traits from two parents. In the case of AI models, Sakana’s researchers found ways to select and reorder layers from parent models and pull them into a brand-new model. A second approach is to mix the parameters of the parent models, meaning their trainable (learned) and fixed components, numerically. By combining these methods, the process can produce successive generations of models to be bred and selected from, using criteria such as efficiency and performance.
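The two merging approaches can be illustrated with a toy sketch. It is not Sakana’s implementation: each model is represented as a plain list of layers, each layer as a list of numbers, and the helper names `stack_layers` and `mix_weights`, the layer plan and the mixing ratio `alpha` are all assumptions made for illustration.

```python
# Toy "model": a list of layers, each layer a list of float weights.
# Real models hold large tensors; this layout is an illustrative assumption.

def stack_layers(parents, plan):
    """Layer-level merge: build a child by selecting and reordering layers.

    `plan` is a list of (parent_index, layer_index) pairs, read top to bottom.
    """
    return [list(parents[p][layer]) for p, layer in plan]


def mix_weights(parent_a, parent_b, alpha):
    """Parameter-level merge: blend matching weights numerically.

    alpha = 1.0 reproduces parent_a, alpha = 0.0 reproduces parent_b.
    """
    return [
        [alpha * wa + (1.0 - alpha) * wb for wa, wb in zip(layer_a, layer_b)]
        for layer_a, layer_b in zip(parent_a, parent_b)
    ]


parent_a = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
parent_b = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

# Hypothetical plan: layer 0 of A, then layer 2 of A, then layer 1 of B.
child_stacked = stack_layers([parent_a, parent_b], [(0, 0), (0, 2), (1, 1)])

# Hypothetical ratio: 25% of parent A, 75% of parent B.
child_mixed = mix_weights(parent_a, parent_b, alpha=0.25)

print(child_stacked)
print(child_mixed)
```

In a setup like this, the layer plan and the mixing ratio are the “traits” an evolutionary search would vary from one generation to the next.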
In development, Sakana took three open-source AI models and “bred” them together to create more than 100 offspring, which were then benchmarked to determine which ones performed best. Those were then used to create a second generation of offspring. This process was repeated for several hundred generations until a final model was chosen.
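The breed, benchmark and select loop described above can be sketched in a few lines. This is a simplified illustration rather than Sakana’s method: each candidate is reduced to three mixing weights (one per parent model), the `benchmark` function is a hypothetical stand-in for real evaluation, and values such as `KEEP` and the mutation size are assumptions.

```python
import random

NUM_PARENTS = 3    # the article mentions three open-source parent models
POPULATION = 100   # roughly the number of offspring per generation
GENERATIONS = 50   # kept small here; the article reports several hundred
KEEP = 10          # hypothetical number of top performers kept as breeders

# Hypothetical stand-in for a benchmark. A real run would score each merged
# model on an evaluation set; here the score is closeness to a made-up target.
TARGET = [0.5, 0.3, 0.2]


def benchmark(genome):
    return -sum((g - t) ** 2 for g, t in zip(genome, TARGET))


def random_genome(rng):
    return [rng.random() for _ in range(NUM_PARENTS)]


def breed(mother, father, rng):
    # Crossover: each gene comes from one of the two breeders at random.
    child = [rng.choice(pair) for pair in zip(mother, father)]
    # Mutation: add a little Gaussian noise to every gene.
    return [g + rng.gauss(0.0, 0.05) for g in child]


def evolve(seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(POPULATION)]
    history = []
    for _ in range(GENERATIONS):
        ranked = sorted(population, key=benchmark, reverse=True)
        history.append(benchmark(ranked[0]))
        breeders = ranked[:KEEP]
        # Breeders are carried over unchanged, so the best score never drops.
        population = breeders + [
            breed(rng.choice(breeders), rng.choice(breeders), rng)
            for _ in range(POPULATION - KEEP)
        ]
    best = max(population, key=benchmark)
    history.append(benchmark(best))
    return best, history


best_genome, best_scores = evolve()
print(best_genome, best_scores[-1])
```

Carrying the top performers into the next generation unchanged is a design choice made for this sketch; it guarantees the best benchmark score never gets worse from one generation to the next.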
“We found that our approach can automatically find ways to merge models from completely different domains, such as ‘non-English language and mathematics’ and ‘non-English language and images,’ which were previously considered difficult,” the researchers said. “This method of fusion, which is automatically discovered by our algorithms, is often novel and can be difficult for experts to discover by trial and error.”
The first model, EvoLLM-JP, was produced by using evolutionary model merging to combine a Japanese-language chatbot LLM with a mathematics LLM. The researchers said that not only was it very good at math, but it also showed good general Japanese language ability. Even as a 7B-parameter model, they said, it achieved excellent performance compared with other models of the same size and exceeded even some models with 70B parameters.
The next, EvoVLM-JP, a vision-language model that answers questions about images in Japanese, also posted good benchmark results and handled Japanese cultural knowledge using images and Japanese text. Finally, EvoSDXL-JP, a high-speed image generation model created through the merge technique, proved capable of swiftly generating vivid images in only four inference steps.
“We believe that evolutionary algorithms, an approach inspired by the mechanisms of natural selection, are the key to opening the door to more effective model merging,” the research team said. “Evolutionary algorithms can automatically explore a vast space of possibilities and uncover sometimes surprising answers that are often overlooked by traditional methods and human intuition.”
Sakana raised $30 million in seed funding earlier this year to build smaller, more agile AI models that collaborate using “collective intelligence” similar to that of fish (sakana is the Japanese word for fish), bees and ants. The company said it intends to engineer smaller AI models that could collaborate in a swarm, adapting using biomimicry. That’s in contrast to the ever-increasing sizes of LLMs, which have scaled up in complexity to handle tasks such as holding conversations, answering questions and generating images.
The release of these three new models built with evolutionary algorithms continues the company’s vision of taking inspiration from the natural world in its research toward building and releasing new AI models and processes.
Photo: Pixabay