UPDATED 15:00 EDT / FEBRUARY 27 2025

Tencent releases new AI model it says is faster than DeepSeek-R1

Chinese technology giant Tencent Holdings Ltd. today released a new artificial intelligence model named Hunyuan Turbo S, designed as a “fast-thinking model,” a rival to so-called “slow-thinking” models such as DeepSeek-R1.

The company said the model can deliver near-“instant replies” within a second by doubling output speed and cutting the delay before the first word by almost 44%.

Unlike DeepSeek-R1 and other “reasoning models,” which the company said “think a little and answer,” the new Turbo S model can begin answering immediately by using a short thinking chain that is more akin to human intuition. This is fused with a slow-thinking chain that provides reasoning capabilities for scientific, mathematical and logical problems.

The company said Hunyuan Turbo S demonstrated performance comparable to leading models on the market such as DeepSeek-V3, OpenAI’s GPT-4o and Anthropic PBC’s Claude in benchmarks for math, reasoning and knowledge.

The notable success of China-based DeepSeek’s AI models, such as R1 and V3, has made headlines in recent months, prompting many AI model developers to produce rival models rapidly. Competition has been hot from companies such as China’s Alibaba Group Ltd. with its Qwen 2.5-Max model, which it claims outperforms V3.

To create the model, Tencent used a Hybrid-Mamba-Transformer fusion to reduce the computational complexity and KV-Cache of the model’s Transformer architecture. The resulting model is a hybrid that can use the Mamba deep learning architecture, which excels at handling long sequences, while still maintaining the Transformer’s capability to understand the context behind complex ideas and statements in data.
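To give a rough sense of why replacing attention layers shrinks the KV-Cache, here is a back-of-the-envelope sketch. Every number in it (layer counts, head counts, head size, sequence length) is a hypothetical placeholder, not a figure Tencent has published. It relies only on the standard observation that each attention layer stores one key and one value vector per token, so the cache grows with sequence length, while Mamba-style layers keep a fixed-size state and are left out of the count.

```python
def kv_cache_bytes(attn_layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Approximate KV-cache size for one sequence.

    The factor of 2 covers keys plus values; bytes_per_value=2 assumes
    16-bit storage. All arguments are illustrative assumptions.
    """
    return 2 * attn_layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical pure-Transformer stack: all 64 layers use attention.
full = kv_cache_bytes(attn_layers=64, kv_heads=8, head_dim=128, seq_len=32768)

# Hypothetical hybrid stack: only 8 of the 64 layers keep attention;
# the remaining layers are assumed to be Mamba-style and hold no KV-cache.
hybrid = kv_cache_bytes(attn_layers=8, kv_heads=8, head_dim=128, seq_len=32768)

print(f"full:   {full / 2**30:.1f} GiB")
print(f"hybrid: {hybrid / 2**30:.1f} GiB")
```

Under these assumed settings, the cache scales directly with the number of attention layers that remain, which is the saving a hybrid design aims for on long inputs.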

“This is also the first time that industry has successfully applied the Mamba architecture to the ultra-large Mixture of Experts model without damage,” Tencent said in its announcement.

Mixture of experts is a machine learning technique in which a model is divided into multiple specialized subnetworks, or “experts,” and a routing mechanism activates only a few of them for each input, so they work together to solve problems.
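For readers who want to see the routing idea concretely, below is a minimal toy sketch of top-k expert selection. It is not Tencent’s implementation: the experts are simple stand-in functions, the `gate` function is a made-up scoring rule, and the names `route_top_k` and `moe_forward` are hypothetical. In a real model the experts are neural network layers and the gate is learned.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate, k=2):
    """Run only the selected experts and blend their outputs."""
    routes = route_top_k(gate(x), k)
    return sum(w * experts[i](x) for i, w in routes)


# Toy stand-ins for experts (assumptions for illustration only).
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]


def gate(x):
    # Hypothetical scoring rule; a real gate is a learned layer.
    return [0.1 * x, 0.5 * x, -0.2 * x, 0.3 * x]


print(route_top_k(gate(2.0), k=2))
print(moe_forward(2.0, experts, gate, k=2))
```

Only two of the four experts run for each input here, which is how the technique lets a very large model keep its per-query compute modest.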

The company also said the new architecture significantly reduces training and deployment costs. The release of competitive models at low prices by rivals such as DeepSeek has driven Tencent and other companies to slash service prices and research more efficient AI training and inference.

Tencent added that, as a flagship model, Turbo S will serve as the core foundation for its future models for inference, text and code generation.

Image: SiliconANGLE/Microsoft Designer
