UPDATED 17:16 EDT / AUGUST 21 2024

Nvidia, Microsoft release new small language models

Nvidia Corp. today released a lightweight language model, Mistral-NeMo-Minitron 8B, that can outperform comparably sized neural networks across a range of tasks.

The code for the model is available on Hugging Face under an open-source license. Its debut comes a day after Microsoft Corp. rolled out several open-source language models of its own. Like Nvidia’s new algorithm, the new models are designed to run on devices with limited processing capacity.
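For developers who want to experiment with the release, it can be loaded with the Hugging Face transformers library like any other checkpoint. The snippet below is a minimal sketch that assumes the repository name nvidia/Mistral-NeMo-Minitron-8B-Base and a GPU with enough memory; check the model card on Hugging Face for the exact identifier and recommended settings.

```python
# Minimal sketch: loading a small language model with the transformers library.
# The repo id below is an assumption for illustration; consult the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Mistral-NeMo-Minitron-8B-Base"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit on a single workstation GPU
    device_map="auto",           # place weights on available hardware automatically
)

inputs = tokenizer(
    "Summarize the benefits of small language models:", return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```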

Mistral-NeMo-Minitron 8B is a scaled-down version of a language model called Mistral NeMo 12B that Nvidia debuted last month. The latter algorithm was developed through a collaboration with Mistral AI SAS, a heavily funded artificial intelligence startup. Nvidia created Mistral-NeMo-Minitron 8B using two machine learning techniques known as pruning and distillation.

Pruning is a way of reducing a model’s hardware requirements by removing unnecessary components from the neural network itself. A neural network is made up of numerous artificial neurons, simple computational units that each perform one relatively small set of calculations. Some of those neurons play a less active role in processing user requests than others, which means they can be removed without significantly decreasing the AI’s output quality.
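In its simplest form, pruning ranks a layer’s weights by magnitude and zeroes out the least important ones. The sketch below uses PyTorch’s built-in pruning utilities to illustrate that basic idea; it is not Nvidia’s production pipeline, which relies on more sophisticated importance estimates.

```python
# Minimal sketch of pruning: zero out the least important weights in a layer.
# Magnitude-based pruning shown here only illustrates the general concept.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(1024, 1024)

# Remove the 30% of weights with the smallest absolute values (L1 criterion).
prune.l1_unstructured(layer, name="weight", amount=0.3)

# Fold the pruning mask into the weight tensor to make the change permanent.
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"Fraction of weights removed: {sparsity:.0%}")
```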

After pruning Mistral NeMo 12B, Nvidia moved on to the so-called distillation phase of the project. Distillation is a process whereby engineers transfer an AI’s knowledge to a second, more hardware-efficient neural network. In this case, the second model was the Mistral-NeMo-Minitron 8B that debuted today, which has 4 billion fewer parameters than the original.
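Conceptually, distillation trains the smaller “student” network to reproduce the output distribution of the larger “teacher.” The sketch below shows a standard temperature-scaled distillation loss in PyTorch; the logits, vocabulary size and temperature are placeholders rather than the configuration Nvidia used.

```python
# Minimal sketch of knowledge distillation: the student is trained to match
# the teacher's softened output distribution.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student predictions."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature**2

# Toy example: logits over a 32,000-token vocabulary for a batch of 4 tokens.
teacher_logits = torch.randn(4, 32000)
student_logits = torch.randn(4, 32000, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()  # gradients flow only into the student
```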

Developers can also reduce an AI project’s hardware requirements by simply training a smaller model from scratch rather than shrinking a larger one. Distillation offers several benefits over that approach, notably better AI output quality. Distilling a large model into a smaller one also costs less because the task doesn’t require as much training data.

According to Nvidia, its approach of combining pruning and distillation techniques during development significantly improved the efficiency of Mistral-NeMo-Minitron 8B. The new model “is small enough to run on an Nvidia RTX-powered workstation while still excelling across multiple benchmarks for AI-powered chatbots, virtual assistants, content generators and educational tools,” Nvidia executive Kari Briski detailed in a blog post.

The release of Mistral-NeMo-Minitron 8B comes a day after Microsoft open-sourced three language models of its own. Like Nvidia’s new algorithm, they were developed with hardware efficiency in mind. 

The most compact model in the lineup is called Phi-3.5-mini-instruct. It features 3.8 billion parameters and can process prompts with up to 128,000 tokens’ worth of data, which allows it to ingest lengthy business documents. A benchmark test conducted by Microsoft determined that Phi-3.5-mini-instruct can perform some tasks better than Llama 3.1 8B and Mistral 7B, which feature about twice as many parameters.

Microsoft also open-sourced two other language models on Tuesday. The first, Phi-3.5-vision-instruct, is a version of Phi-3.5-mini-instruct that can perform image analysis tasks such as explaining a chart uploaded by the user. It rolled out alongside Phi-3.5-MoE-instruct, a significantly larger model that features 60.8 billion parameters. Only a 10th of those parameters activate when a user enters a prompt, which significantly reduces the amount of hardware necessary for inference. 
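That sparse activation pattern comes from the mixture-of-experts design: a small routing network selects a handful of expert sub-networks for each token and leaves the rest idle. The sketch below shows a simplified top-2 router in PyTorch; the layer sizes and expert count are illustrative, not Phi-3.5-MoE-instruct’s actual architecture.

```python
# Minimal sketch of mixture-of-experts routing: a gating network picks the
# top-k experts per token, so only a fraction of the parameters run per input.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, dim=512, num_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                              # x: (tokens, dim)
        scores = self.gate(x)                          # (tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)              # mix the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e           # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(8, 512)
print(TopKMoE()(tokens).shape)  # only 2 of 16 experts run for each token
```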

Image: Nvidia
