![](https://d15shllkswkct0.cloudfront.net/wp-content/blogs.dir/1/files/2025/01/Qwen2.5-max-banner.png)
Alibaba Cloud, the cloud computing arm of China’s Alibaba Group Ltd., has released its latest breakthrough artificial intelligence large language model just in time for the Chinese New Year: Qwen 2.5-Max, which it claims surpasses today’s most powerful AI models.
The Tuesday release of Qwen 2.5-Max is the second big LLM release out of China in the past two weeks, alongside DeepSeek's R1 reasoning model. DeepSeek, a Chinese AI research startup, made waves with claims that R1 could rival the performance of the most capable models built by U.S. companies while being trained at a small fraction of the cost.
“We developed Qwen 2.5-Max, a large-scale mixture of experts LLM model that has been pretrained on over 20 trillion tokens and further post-trained with curated Supervised Fine-Tuning and Reinforcement Learning from Human Feedback methodologies,” the company said in a blog post.
Mixture of experts, or MoE, is an LLM architecture in which multiple specialized sub-models, each covering a particular area of expertise, work in concert to handle complex tasks more efficiently. It's essentially a team of AI models, each trained to excel in a specific subcategory of knowledge, combining their training to answer questions and complete tasks.
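The routing idea behind MoE can be illustrated with a toy sketch: a gating function scores each expert for a given input, and only the top-scoring experts actually run, with their outputs blended by softmax weight. This is purely illustrative; the expert names, scores and routing details below are invented, and Qwen 2.5-Max's actual internals are not public.

```python
import math

# Toy "experts": each is a stand-in for a specialized sub-network.
# (Hypothetical names and functions, for illustration only.)
EXPERTS = {
    "math":    lambda x: x * 2.0,
    "code":    lambda x: x + 10.0,
    "general": lambda x: x * 0.5,
    "science": lambda x: x - 1.0,
}

def route(scores: dict, x: float, top_k: int = 2) -> float:
    """Run only the top_k highest-scoring experts and blend their outputs."""
    top = sorted(scores, key=scores.get, reverse=True)[:top_k]
    z = sum(math.exp(scores[name]) for name in top)
    weights = {name: math.exp(scores[name]) / z for name in top}  # softmax over top-k
    # Only top_k experts execute, so compute cost scales with k,
    # not with the total number of experts in the model.
    return sum(weights[name] * EXPERTS[name](x) for name in top)

# Pretend a learned gating network produced these scores for one input.
scores = {"math": 2.0, "code": 1.0, "general": 0.1, "science": -1.0}
result = route(scores, 3.0)
print(result)
```

In a real MoE transformer the gating scores come from a learned router and the experts are feed-forward blocks, but the efficiency argument is the same: a large total parameter count with only a small active slice per token.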
According to Alibaba, using this technique the new Qwen model outperformed DeepSeek-V3, the startup's latest non-reasoning model released in late December, on key benchmarks including Arena-Hard, LiveBench and MMLU-Pro. The company also claimed it outperformed Anthropic PBC's Claude 3.5 Sonnet, OpenAI's GPT-4o and Meta Platforms Inc.'s Llama 3.1-405B.
The MoE architecture also gives the model a smaller effective footprint: although it was pretrained on some 20 trillion tokens, only a subset of its expert networks is activated for any given input. That allows it to use fewer resources when deployed and run at higher efficiency.
“The scaling of data and model size not only showcases advancements in model intelligence but also reflects our unwavering commitment to pioneering research,” the company said. “We are dedicated to enhancing the thinking and reasoning capabilities of large language models through the innovative application of scaled reinforcement learning.”
Unlike the other Qwen models, which have been released as open source so that developers can experiment and extend them freely, Qwen 2.5-Max remains closed source. Alibaba has made the model available via an application programming interface through Alibaba Cloud that is compatible with OpenAI's API, making it easy for developers to integrate. It's also accessible through a ChatGPT-like chatbot interface on Qwen Chat.
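Because the API follows OpenAI's chat-completions format, a request body looks like any other OpenAI-style call; only the base URL and model name change. The endpoint and the `qwen-max` model identifier below are assumptions based on Alibaba Cloud's compatible-mode documentation, so check your account's docs before use. This sketch just builds the request payload rather than sending it.

```python
import json

# Assumed Alibaba Cloud OpenAI-compatible endpoint and model name --
# verify both against your own account's documentation.
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
API_KEY = "YOUR_API_KEY"  # placeholder, not a real credential

# Standard OpenAI chat-completions request shape, reused as-is:
payload = {
    "model": "qwen-max",  # assumed identifier for Qwen 2.5-Max
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize mixture of experts in one sentence."},
    ],
}
body = json.dumps(payload, indent=2)
print(body)
```

In practice developers can point an existing OpenAI client library at the compatible base URL instead of constructing requests by hand, which is the integration convenience the compatibility buys.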
In August, Alibaba released Qwen2-VL, the company's vision-language model. The model is capable of advanced visual comprehension of video: it can ingest a high-quality video up to 20 minutes long and answer questions about its contents.