Mistral AI releases open-source language model with 7B parameters
Mistral AI, a well-funded artificial intelligence startup that launched five months ago, today released an open-source language model with 7 billion parameters.
The model is called Mistral 7B in a nod to its parameter count. It’s available on GitHub under an Apache 2.0 license. According to the company, the model may be used for both research and commercial purposes.
Paris-based Mistral AI was founded in May by former Meta Platforms Inc. and Google LLC researchers. Chief Executive Officer Arthur Mensch worked at the search giant’s DeepMind machine learning unit before launching the company. Chief Science Officer Guillaume Lample led the development of Meta’s open-source Llama language model.
Four weeks after launching in May, Mistral AI closed a €105 million funding round at a €240 million valuation. The investment included contributions from Lightspeed Venture Partners, Index Ventures, Redpoint Ventures and more than a half dozen other backers. Mistral AI said at the time that it was planning to introduce its first language models in 2024.
The release of the Mistral 7B language model today suggests that the development effort is advancing faster than expected. In a blog post, the company detailed that the model took three months to develop. In that time frame, Mistral AI’s founders assembled an engineering team and built a so-called MLOps stack, a collection of specialized software tools used for neural network development.
The company says Mistral 7B can generate prose, summarize documents and perform other text processing tasks. It’s also capable of autocompleting software code written by developers. The model has a context window of 8,000 tokens, which means it can process prompts containing up to 8,000 tokens of text at a time.
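To illustrate what an 8,000-token context window means in practice, here is a minimal sketch of clamping a prompt to that budget before sending it to the model. This is an assumption-laden illustration, not Mistral AI's code: real deployments would count tokens with the model's own tokenizer, whereas whitespace splitting here is only a crude stand-in.

```python
# Sketch: enforce an 8,000-token context budget on a prompt.
# NOTE: whitespace splitting is a rough approximation of tokenization;
# an actual integration would use the model's tokenizer instead.

CONTEXT_LIMIT = 8000  # Mistral 7B's stated context length in tokens


def truncate_prompt(prompt: str, limit: int = CONTEXT_LIMIT) -> str:
    """Keep at most `limit` (approximate) tokens of the prompt."""
    tokens = prompt.split()  # crude token count
    if len(tokens) <= limit:
        return prompt
    return " ".join(tokens[:limit])


# A short prompt passes through unchanged; an oversized one is trimmed.
short = truncate_prompt("Summarize this document.")
trimmed = truncate_prompt("word " * 10_000)
print(len(trimmed.split()))  # 8000
```

In practice the exact token count depends on the tokenizer, so a production client would leave headroom below the limit for the model's response as well.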
At the architectural level, Mistral 7B features 7 billion parameters. Those are the configuration settings that determine how a neural network goes about processing data. The most advanced AI systems on the market today have hundreds of billions of such settings.
The company claims that Mistral 7B “outperforms all currently available open models up to 13B parameters on all standard English and code benchmarks.” That includes the 13-billion-parameter version of Llama 2, an advanced language model Meta released earlier this year. Moreover, Mistral 7B achieved performance “on par” with a 34-billion-parameter version of Meta’s original Llama model, the predecessor to Llama 2.
Mistral AI says its model can match the performance of larger neural networks while using less hardware. Lowering an AI’s hardware requirements not only decreases the cost of running it but also reduces latency. As a result, the company sees Mistral 7B as particularly well-suited for latency-sensitive use cases.
Mistral 7B is the first in a series of large language models that the company plans to release. The upcoming additions to the lineup are expected to be better at reasoning tasks and support more languages. In the long term, Mistral AI also plans to offer hosted neural networks to enterprises.
Image: Mistral AI