OpenAI Group PBC and Mistral AI SAS today introduced new artificial intelligence models optimized for cost-sensitive use cases.
OpenAI is rolling out two models called GPT-5.4 mini and GPT-5.4 nano to its cloud services. Mistral, in turn, has released a new addition to its Mistral Small series of open-source models.
GPT-5.4 mini and GPT-5.4 nano are lower-cost versions of GPT-5.4, OpenAI’s flagship large language model. The former is the more capable of the two. It came within 5% of the scores set by GPT-5.4 across two benchmarks, SWE-Bench Pro and OSWorld-Verified, which measure AI models’ programming skills and computer use capabilities, respectively.
OpenAI also compared GPT-5.4 mini against its predecessor, another cost-optimized model called GPT-5 mini. The company determined that the new model can complete some tasks more than twice as fast.
GPT-5.4 mini has a context window of 400,000 tokens. The prompts that users send to the model can include not only text but also images. For example, a developer could upload a screenshot of an application’s interface and ask GPT-5.4 mini to suggest usability improvements.
The model is available in ChatGPT, the Codex programming assistant and OpenAI’s application programming interface. The API version of GPT-5.4 mini is priced at 75 cents per 1 million input tokens and $4.50 per 1 million output tokens.
GPT-5.4 nano, the other cost-optimized model that OpenAI debuted today, is significantly more cost-efficient. The company is charging 20 cents per 1 million input tokens and $1.25 per 1 million output tokens. It’s available solely through OpenAI’s API.
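The published rates make the cost difference easy to quantify. The sketch below computes per-request costs from the per-million-token prices cited above; the model identifier strings are placeholders, not OpenAI's official API names.

```python
# Per-million-token prices cited in the article (dollars).
# The dictionary keys are illustrative labels, not official API model IDs.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one API call for the given token counts."""
    rates = PRICES[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

# Example: a 10,000-token prompt that produces a 2,000-token response.
mini_cost = request_cost("gpt-5.4-mini", 10_000, 2_000)  # → $0.0165
nano_cost = request_cost("gpt-5.4-nano", 10_000, 2_000)  # → $0.0045
```

For this sample workload, nano comes out at roughly a quarter of mini's cost, consistent with its positioning for high-volume, simpler tasks.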
This model is designed for a more limited range of tasks than GPT-5.4 mini. OpenAI sees developers using it to power data extraction, classification and ranking software. The model can also perform certain simple coding tasks, particularly when it’s used in conjunction with other LLMs. A programming assistant could use a more advanced model such as GPT-5.4 to tackle complex coding challenges and route supporting tasks to GPT-5.4 nano.
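The routing pattern described above can be sketched as a simple dispatcher. The task taxonomy and function below are illustrative assumptions; only the division of labor between models comes from the article.

```python
# Hypothetical sketch of the model-routing pattern described above:
# a programming assistant sends hard coding work to the flagship model
# and routes cheap supporting tasks (extraction, classification, ranking)
# to GPT-5.4 nano. The task categories here are illustrative.

COMPLEX_TASKS = {"refactor", "debug", "architecture"}
SIMPLE_TASKS = {"extract", "classify", "rank"}

def pick_model(task_type: str) -> str:
    """Choose a model tier based on a coarse task label."""
    if task_type in COMPLEX_TASKS:
        return "gpt-5.4"       # flagship model for complex coding challenges
    if task_type in SIMPLE_TASKS:
        return "gpt-5.4-nano"  # low-cost model for supporting tasks
    return "gpt-5.4-mini"      # middle tier as a reasonable default
```

A production router would likely classify tasks with a model rather than fixed labels, but the cost logic is the same: reserve the expensive model for the work that needs it.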
“Mini delivers strong reasoning, while nano is responsive and efficient for live conversational workflows,” said Perplexity AI Inc. Deputy Chief Technology Officer Jerry Ma.
Mistral’s new open-source model, Mistral Small 4, includes 119 billion parameters. Those parameters are organized into 128 experts, or miniature neural networks, that each focus on a different set of tasks. Mistral Small 4 activates four experts with 6 billion combined parameters to answer prompts.
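Some back-of-the-envelope arithmetic on those figures shows why this mixture-of-experts design lowers inference costs: only a small fraction of the network runs per prompt.

```python
# Arithmetic on the mixture-of-experts figures cited above:
# 119B total parameters split across 128 experts, with 4 experts
# (about 6B parameters combined) activated to answer a prompt.
total_params = 119e9
num_experts = 128
active_experts = 4
active_params = 6e9

# Fraction of the model's weights actually exercised per prompt.
active_fraction = active_params / total_params  # roughly 5%
```

In other words, the model carries the knowledge capacity of a 119-billion-parameter network while paying roughly the compute bill of a 6-billion-parameter one on each request.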
Like OpenAI’s latest models, Mistral Small 4 can process multimodal files. It also supports a range of other use cases. Mistral says the model can automate reasoning tasks such as code generation, analyze documents and power general-purpose AI assistants.
A setting called reasoning_effort enables developers to adjust the amount of time that Mistral Small 4 spends on tasks. The simpler the task, the smaller the time investment required to complete it. Avoiding unnecessary reasoning calculations helps reduce inference costs and response latency. Mistral says Small 4 can reduce the “end-to-end completion time” of requests by 40% in a latency-optimized configuration.
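A request using that setting might look like the sketch below. Only the `reasoning_effort` field name comes from Mistral's description; the payload structure, model identifier and "low"/"medium"/"high" values are assumptions for illustration, not Mistral's documented API.

```python
import json

# Illustrative sketch of the reasoning_effort setting described above.
# Everything except the reasoning_effort field name is an assumption:
# the model ID, payload shape and effort values are not from Mistral's docs.
def build_request(prompt: str, effort: str = "low") -> str:
    """Build a chat-completion payload with a reasoning-effort cap."""
    assert effort in {"low", "medium", "high"}
    payload = {
        "model": "mistral-small-4",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,  # lower effort → lower latency and cost
    }
    return json.dumps(payload)

request_body = build_request("Summarize this contract clause.", effort="low")
```

The operative trade-off is the one the article describes: dialing effort down for simple tasks avoids unnecessary reasoning computation, which is where the claimed latency savings come from.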
The model requires at least four of Nvidia Corp.’s HGX H100 graphics cards or equivalent hardware to run. Mistral Small 4 is available under the Apache 2.0 license, which enables organizations to create custom versions of the algorithm at no charge. A company could, for example, fine-tune the model on a proprietary e-commerce dataset to make it better at predicting product demand trends.
In addition to the new model, Mistral today introduced Forge, a system that enables enterprises to build frontier-grade AI models grounded in their own data and knowledge. It said Forge can “understand their internal context embedded within systems, workflows, and policies, aligning AI with their unique operations.”
In December, Amazon Web Services Inc. introduced a similar service called Nova Forge, allowing enterprises to build customized models.