Mistral AI debuts Mixtral 8x22B, one of the most powerful open-source AI models yet
The Paris-based open-source generative artificial intelligence startup Mistral AI today released another large language model in an effort to keep pace with the industry’s biggest players.
The new Mixtral 8x22B model is expected to outperform the company’s previous model, Mixtral 8x7B, which many experts considered an extremely worthy competitor to better-known contenders such as OpenAI’s GPT-3.5 and Meta Platforms Inc.’s Llama 2.
According to the startup, which raised $415 million in December and is valued at north of $2 billion, the new model is its most powerful yet, boasting a 65,000-token context window, which refers to the amount of text it can process and reference at the same time. In addition, Mixtral 8x22B features up to 176 billion parameters, the internal variables it uses to make decisions and predictions.
Mistral was founded by AI researchers from Google LLC and Meta, and is one of several AI startups focused on building open-source models that anyone can use. The company took the somewhat unusual approach of making the new model available via a torrent link posted to the social media platform X. It later made Mixtral 8x22B available on the Hugging Face and Together AI platforms, where users can retrain and refine it to handle more specialized tasks.
The startup released Mixtral 8x22B just days after its rivals delivered their own latest models. On Tuesday, OpenAI debuted GPT-4 Turbo with Vision, the latest in its series of GPT-4 Turbo models, which features vision capabilities that enable it to work with photos, drawings and other images uploaded by users. Later that day, Google made its most advanced LLM, Gemini 1.5 Pro, generally available, giving developers access to a free version that allows up to 50 requests per day.
Not to be outdone, Meta also said this week that it’s planning to launch Llama 3 later this month.
Mixtral 8x22B is widely expected to outperform Mistral AI’s previous Mixtral 8x7B model, which was able to beat GPT-3.5 and Llama 2 in a number of key benchmarks.
The model leverages an advanced, sparse “mixture-of-experts” architecture that enables it to perform efficient computation and deliver high performance across a wide range of tasks. The sparse MoE approach aims to provide users with a combination of different models, with each one specialized in a different category of tasks, as a way to optimize performance and costs.
“At every layer, for every token, a router network chooses two of these groups (the ‘experts’) to process the token and combine their output additively,” Mistral AI says on its website. “This technique increases the number of parameters of a model while controlling cost and latency, as the model only uses a fraction of the total set of parameters per token.”
The unique architecture means that, even though Mixtral 8x22B is enormous, it only requires about 44 billion active parameters per forward pass, which makes it faster and more cost-effective to use than similar-sized models.
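To make the routing idea concrete, here is a minimal, illustrative sketch of a top-2 sparse mixture-of-experts layer in PyTorch. The layer sizes, module names and router design are assumptions for illustration only, not Mistral AI’s actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Illustrative top-2 sparse mixture-of-experts feed-forward layer.

    Dimensions and structure are hypothetical, chosen only to show the
    routing mechanic described above.
    """
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each "expert" is an independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.router(x)                        # (n_tokens, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)           # normalize the 2 winners
        out = torch.zeros_like(x)
        # Combine the chosen experts' outputs additively, weighted by the router.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = SparseMoELayer()
tokens = torch.randn(10, 512)   # 10 tokens of width 512
print(layer(tokens).shape)      # torch.Size([10, 512])
```

Because each token flows through only two of the eight expert networks, the compute per token is a fraction of what running every expert would cost, which is how a model with 176 billion total parameters can behave more like a 44-billion-parameter model at inference time.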
The launch of Mixtral 8x22B is therefore a key milestone for open-source generative AI, giving researchers, developers and other enthusiasts the opportunity to play with some of the most advanced models without barriers such as limited access and huge costs. It’s available to use under a permissive Apache 2.0 license.
The reaction from the AI community on social media has been mostly positive, with enthusiasts voicing hope that it will deliver significant capabilities for tasks such as customer service, drug discovery and climate modeling.
Despite earning substantial praise for its open-source approach, Mistral AI has also attracted criticism. The company’s models are known as “frontier models,” meaning there is potential for misuse. Moreover, because anyone can download and build on the company’s AI models, the startup has no way to prevent its technology from being used for harmful purposes.
Image: SiliconANGLE/Microsoft Designer