UPDATED 05:00 EDT / JUNE 12 2025

Multiverse Computing bags $215M for its quantum-inspired AI model compression tech

Multiverse Computing S.L. said today it has raised $215 million in funding to accelerate the deployment of its quantum computing-inspired artificial intelligence model compression technology, which promises to cut the cost of AI inference dramatically without degrading performance.

The Series B investment was led by Bullhound Capital and saw the participation of numerous others, including HP Inc.’s venture arm HP Tech Ventures, SETT, Forgepoint Capital International, CDP Venture Capital, Santander Climate VC, Quantonation, Toshiba Corp. and Capital Riesgo de Euskadi – Grupo SPRI.

The round represents a significant capital injection for the startup, which last raised $25 million through a Series A investment in March 2024. The big boost illustrates the potential of its technology, which the company says can reduce the size of large language models by up to 95% without any performance hit. That has dramatic implications for the cost of AI inference, or running those models in production.

When AI applications scale up, the cost of running them can quickly climb into the millions of dollars. Those costs are prohibitive for many companies, hence the search for ways to run models more affordably, and that’s what Multiverse aims to provide.

The challenge is that LLMs require powerful hardware, with the most advanced applications utilizing enormous clusters of Nvidia Corp.’s graphics processing units, which cost thousands of dollars each and consume massive amounts of energy. Multiverse compresses those LLMs so they can run on much smaller clusters.

Its CompactifAI technology does this through the use of “quantum-inspired algorithms,” which are advanced tensor networks based on the principles of quantum computing. These algorithms identify which parts of an AI model matter most and which matter least. By stripping out the least relevant parts, the company claims, it can significantly reduce a model’s size without any noticeable performance impact.
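
Multiverse hasn’t published the details of those algorithms, but the flavor of the idea can be sketched with a simple low-rank factorization, a distant cousin of tensor-network decomposition. The matrix size, retained rank and use of a truncated SVD below are all illustrative assumptions, not the company’s method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one layer's weights: a 1,024 x 1,024 matrix whose
# singular values decay, as is common in trained networks.
W = rng.standard_normal((1024, 1024)) * (1.0 / np.arange(1, 1025))

# Factor the matrix and keep only the top-r singular directions.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 64                       # retained rank: the compression knob
A = U[:, :r] * s[:r]         # 1,024 x 64
B = Vt[:r, :]                # 64 x 1,024

kept = A.size + B.size
print(f"parameters: {W.size:,} -> {kept:,} ({1 - kept / W.size:.0%} fewer)")

# The factored layer approximates the original on a random input.
x = rng.standard_normal(1024)
rel_err = np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x)
print(f"relative output error: {rel_err:.3f}")
```

A tensor network applies the same principle at larger scale, factoring big weight tensors into chains of small ones and discarding the weakly correlated directions.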

Multiverse co-founder and Chief Technology Officer Román Orús, a pioneer of tensor network methods, said the algorithms work by profiling the inner workings of the neural networks that power LLMs. “We can eliminate billions of spurious correlations to truly optimize all sorts of AI models,” he said.

Besides offering the underlying technology, Multiverse has also created a library of CompactifAI models, which are highly compressed versions of leading open-source LLMs such as Llama, Mistral and DeepSeek that retain their original accuracy. According to Multiverse, these compacted models are anywhere from four to 12 times faster than the originals, allowing inference costs to be cut by 50% to 80%. It says the CompactifAI models can run in the cloud, in private on-premises data centers or, in the case of its “ultra-compressed LLMs,” even on edge devices such as personal computers, smartphones, cars and single-board computers like the Raspberry Pi.
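
Some rough arithmetic shows how a throughput speedup of that size translates into serving costs. The GPU price and baseline throughput below are invented for illustration, not figures from Multiverse or any cloud provider:

```python
# All numbers below are illustrative assumptions.
gpu_hour_usd = 4.00          # assumed price of one cloud GPU-hour
tokens_per_sec = 1_000       # assumed throughput of the uncompressed model

base_cost = gpu_hour_usd / (tokens_per_sec * 3600) * 1_000_000
print(f"baseline: ${base_cost:.2f} per million tokens")

for speedup in (4, 12):      # the 4x-12x range cited above
    print(f"{speedup}x faster -> ${base_cost / speedup:.2f} per million "
          f"tokens ({1 - 1 / speedup:.0%} cheaper)")
```

Pure throughput arithmetic implies even larger savings than the 50% to 80% the company cites, which is consistent with real deployments carrying overheads beyond raw compute.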

The company insists that CompactifAI is much more effective than existing model compression techniques such as quantization and pruning, which it says significantly hamper the accuracy and performance of LLMs. It adds that the technology can also be applied to AI training, speeding up model training and fine-tuning by as much as 1,000 times, with correspondingly lower costs.
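
For comparison, here is what those two standard techniques look like in miniature. The array size, bit width and sparsity level are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal(1_000_000).astype(np.float32)  # toy weight vector

# Post-training quantization: store weights as int8 plus one scale factor.
scale = np.abs(w).max() / 127.0
w_int8 = np.round(w / scale).astype(np.int8)
w_dequant = w_int8.astype(np.float32) * scale
print(f"size: {w.nbytes:,} -> {w_int8.nbytes:,} bytes (4x smaller)")
print(f"mean quantization error: {np.abs(w - w_dequant).mean():.4f}")

# Magnitude pruning: zero out the 90% of weights closest to zero.
threshold = np.quantile(np.abs(w), 0.90)
w_pruned = np.where(np.abs(w) >= threshold, w, 0.0)
print(f"remaining density: {(w_pruned != 0).mean():.0%}")
```

Both techniques shrink the model by coarsening or discarding individual weights, which is where the accuracy loss the company points to tends to come from.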

Co-founder and Chief Executive Enrique Lizaso Olmos said he’s trying to change the prevailing wisdom that shrinking LLMs comes at the cost of performance. “What started as a breakthrough in model compression quickly proved transformative,” he said. “We’re unlocking new efficiencies in AI deployment and earning rapid adoption for our ability to radically reduce the hardware requirements for running AI models.”

Holger Mueller of Constellation Research Inc. said the relentless growth of AI models has spurred extensive research into making them smaller and more efficient, and he believes Multiverse’s approach is one of the most promising.

“It’s pursuing an approach based on its experience with quantum software, and claims to be able to reduce LLM sizes by almost magical numbers,” the analyst said. “It says it can do this with very little loss in terms of quality, so it’s no wonder it managed to get the backing of some prominent investors here. Now it has to demonstrate that its approach really does work, and not only for open source models.”

The startup has already convinced some very big enterprises of the advantages of its quantum-inspired algorithms.

“By making AI applications more accessible at the edge, Multiverse’s innovative approach has the potential to bring AI benefits of enhanced performance, personalization, privacy and cost efficiency to life for companies of any size,” said HP President of Technology and Innovation Tuan Tran.

Bullhound Capital co-founder and Managing Partner Per Roman said he’s backing Multiverse because there’s a “global need” for more efficiency in AI models. “Román Orús has convinced us that he and his team of engineers are developing truly world-class solutions in this highly complex and compute-intensive field,” he said.

Image: SiliconANGLE/Dreamina
