UPDATED 15:04 EDT / JANUARY 30 2025

Mistral, Ai2 release new open-source LLMs

Mistral AI and the Allen Institute for AI today released new large language models that they claim are among the most advanced in their respective categories.

Mistral’s model is called Mistral Small 3. The new LLM from the Allen Institute for AI, or Ai2 as it’s commonly referred to, is called Tülu 3 405B. Both are available under an open-source license.

Mistral Small 3 includes 24 billion parameters, significantly fewer than the most advanced LLMs on the market. That makes it small enough to run on certain MacBooks when quantization is enabled. Quantization is a technique that stores a model’s weights at lower numerical precision, trading some output quality for lower memory and compute requirements.
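
To make that trade-off concrete, the Python sketch below estimates how much memory the weights of a 24-billion-parameter model occupy at different precisions, then runs a simple symmetric 8-bit quantization round trip on a few values. The function names and the 8-bit scheme are illustrative assumptions. The article does not say which quantization method is used, and a real deployment also needs memory for activations and caches.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def quantize_int8(weights):
    """Symmetric 8-bit quantization: map floats to integers in [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    This is a toy per-tensor scheme, assumed here for illustration only.
    """
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale


def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_gb(24e9, bits):.0f} GB")

    original = [0.5, -1.27, 0.013]
    codes, scale = quantize_int8(original)
    restored = dequantize(codes, scale)
    # The gap between original and restored values is the quality cost.
    print(codes, restored)
```

Under these assumptions, the weights need about 48 GB at 16-bit precision and about 12 GB at 4-bit precision, which is why a quantized 24-billion-parameter model can fit in the memory of a high-end laptop.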

In an internal evaluation, Mistral compared Mistral Small 3 against Llama 3.3 70B Instruct, an open-source LLM from Meta Platforms Inc. that has nearly three times as many parameters. Mistral Small 3 delivered comparable output quality with significantly faster response times. In another test, the new LLM delivered higher output quality and lower latency than OpenAI’s GPT-4o mini.

Developers usually build LLMs by creating a base model, then refining its output quality using several different training methods. While building Mistral Small 3, the company developed the base model but skipped the subsequent refinement process. This allows users to carry out their own fine-tuning to align Mistral Small 3 with their project requirements.

The company sees developers applying the LLM to a range of tasks. According to Mistral, the model is useful for powering AI automation tools that require the ability to carry out tasks in external applications with low latency. The company says that several of its customers are also harnessing Mistral Small 3 for industry-specific use cases in segments such as robotics, financial services and manufacturing.

“Mistral Small 3 is a pre-trained and instructed model catered to the ‘80%’ of generative AI tasks — those that require robust language and instruction following performance, with very low latency,” Mistral researchers wrote in a blog post.

The debut of Mistral Small 3 today coincided with a new LLM release from Ai2, a nonprofit AI institute. Tülu 3 405B is a customized version of the open-source Llama 3.1 405B model that Meta rolled out last July. In testing carried out by Ai2, Tülu 3 405B achieved better performance than the original Llama model across more than a half-dozen benchmarks.

The research group created the LLM using a development process that it first detailed in November. The workflow incorporates multiple LLM training methods, including one that Ai2 invented in-house.

The first step of the workflow is dedicated to supervised fine-tuning. This is a training method that involves providing an LLM with sample prompts and the corresponding answers, which helps it learn how it should respond to user queries. Next, Ai2 used another training technique called direct preference optimization, or DPO, to align Tülu 3 405B’s output with a set of user preferences.
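
The preference step can be illustrated with the commonly published form of the DPO objective, computed here for a single pair of responses. This is a minimal sketch, not Ai2’s training code: the function name, the example log-probabilities and the value of `beta` are assumptions chosen for illustration.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair.

    Each argument is the summed log-probability a model assigns to a full
    response. "policy" is the model being trained; "ref" is a frozen copy
    taken before preference training. beta (assumed value) controls how
    strongly the policy is pushed away from the reference.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(logits)), written in a numerically stable form
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


if __name__ == "__main__":
    # Policy identical to the reference: no preference learned yet.
    print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
    # Policy now favors the preferred response more than the reference does.
    print(dpo_loss(-4.0, -9.0, -5.0, -7.0))
```

The loss falls as the trained model raises the likelihood of the preferred response relative to the rejected one, measured against the frozen reference model.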

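Ai2 further honed the model’s capabilities using an internally developed training method called reinforcement learning with verifiable rewards, or RLVR. It’s a variation of reinforcement learning, a widely used AI training technique, in which the model is rewarded only when its answer can be checked and confirmed as correct. Ai2 says that RLVR makes AI models better at tasks such as solving math problems.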
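
The core idea of a verifiable reward can be sketched in a few lines of Python. The example below is a hypothetical checker for math problems, not Ai2’s implementation: the function name, the rule of taking the last number in the response as the final answer, and the reward value are all assumptions.

```python
import re


def verifiable_reward(completion: str, expected_answer: str,
                      reward: float = 1.0) -> float:
    """Return `reward` if the model's final numeric answer matches the
    known correct answer, otherwise 0.0.

    Assumption: the last number in the completion is the final answer.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0  # nothing to verify, so no reward
    return reward if float(numbers[-1]) == float(expected_answer) else 0.0


if __name__ == "__main__":
    print(verifiable_reward("17 + 25 = 42", "42"))
    print(verifiable_reward("The answer is 41.", "42"))
```

Because the reward comes from a deterministic check rather than from a learned reward model, the training signal in this sketch cannot be earned by a response that merely sounds convincing.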

Tülu 3 405B represents “the first application of fully open post-training recipes to the largest open-weight models,” Ai2 researchers wrote in a blog post. “With this release, we demonstrate the scalability and effectiveness of our post-training recipe applied at 405B parameter scale.”

Image: Unsplash