Multiverse Computing: Revolutionizing AI Compression

In the past, we compressed files with ZIP. Now, what we increasingly need is to compress AI models to make them smaller and more efficient. This is exactly the idea behind Multiverse Computing, a Spanish startup that is becoming the new crown jewel of Spain's AI industry. Its founders Román Orús, Enrique Lizaso Olmos, and Samuel Mugel (pictured, from left to right), alongside Alfonso Rubio, have a lot to celebrate.

Investment Round Success

Multiverse Computing has recently closed an investment round of 189 million euros (215 million dollars). The round, labeled Series B, was led by Bullhound Capital, with participation from major players such as HP Tech Ventures, SETT, Forgepoint Capital International, CDP Venture Capital, Santander Climate VC, Quantonation, Toshiba, and Capital Riesgo de Euskadi – Grupo SPRI. Earlier in March, the company received an investment of 67 million euros from the Government of Spain.

AI Inference at Its Core

While the spotlight currently shines on big tech companies investing billions of dollars in data centers to train large language models (LLMs), there is increasing focus on the other side: the part users engage with when they put questions to AI systems like ChatGPT. This is known as AI inference, and the value of this market is estimated to reach 106 billion dollars by 2025. Multiverse Computing aims to claim a significant share of it with its technology.

Introducing CompactifAI

CompactifAI is the name of the AI model compression technology developed by Multiverse Computing. It converts monolithic AI models, which are costly to run, into far smaller and more efficient ones that are easier to manage and save substantial resources (and time) during inference.

How to Compress an AI Model

Román Orús, the company’s scientific director, led a study in May 2024 explaining how tensor networks, a technique inspired by quantum principles, can compress these models. The process decomposes the weight matrices of a neural network and then “truncates” the result, retaining only the most significant values. In essence, the less relevant information is discarded so that the model keeps only what truly matters.
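The idea of decomposing a weight matrix and truncating it can be illustrated with a much simpler tool than a tensor network. The sketch below uses a plain truncated singular value decomposition (SVD) in NumPy as a stand-in; it is not CompactifAI's method, and the function names, the toy matrix, and the chosen ranks are all assumptions made for illustration.

```python
import numpy as np


def truncate_weights(weight, rank):
    """Split `weight` into two thin factors whose product approximates it.

    Illustrative stand-in: a truncated SVD, not a tensor-network decomposition.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Keep only the `rank` largest singular values and their vectors.
    a = u[:, :rank] * s[:rank]   # shape (rows, rank)
    b = vt[:rank, :]             # shape (rank, cols)
    return a, b


def compression_report(weight, rank):
    """Report how much smaller the factors are and how much accuracy is lost."""
    a, b = truncate_weights(weight, rank)
    stored = a.size + b.size
    rel_error = np.linalg.norm(weight - a @ b) / np.linalg.norm(weight)
    return {
        "size_reduction": 1 - stored / weight.size,
        "relative_error": float(rel_error),
    }


# Toy "weight matrix" (hypothetical): mostly rank-16 structure plus a little noise.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 16)) @ rng.standard_normal((16, 256))
w += 0.01 * rng.standard_normal((256, 256))

for rank in (4, 16, 64):
    print(rank, compression_report(w, rank))
```

Lowering `rank` stores fewer numbers but increases the reconstruction error, which is the same compression-versus-precision dial the article describes. Real models are not guaranteed to be as compressible as this toy matrix.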

Does This Compromise Model Accuracy?

Indeed, it can, but the degree of truncation is controllable, so compression can be balanced against precision loss. Multiverse Computing asserts that the drop in accuracy after compression is only between 2% and 3%.

Same Performance at 95% Smaller Size

To counteract potential accuracy declines, the system includes a rapid retraining phase known as “curation,” which can be repeated multiple times to bring accuracy even closer to that of the original model. Ultimately, the company claims it can compress an AI model by up to 95% while maintaining performance.

Making AI More Affordable

According to the company's data, a model like Llama 3.1 405B incurs an operational cost of around 390,000 dollars when running locally (needing 13 H100 GPUs and drawing 9,100 W). With CompactifAI, that cost can be cut to about 60,000 dollars (requiring only 2 H100 GPUs and drawing 1,400 W).
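To put those figures side by side, the short calculation below computes the relative reduction in cost, GPU count, and power draw. It uses only the numbers quoted above; the dictionary keys and the `reduction` helper are hypothetical names chosen for this sketch.

```python
# Figures quoted by the company for Llama 3.1 405B, before and after compression.
baseline = {"cost_usd": 390_000, "gpus": 13, "power_w": 9_100}
compressed = {"cost_usd": 60_000, "gpus": 2, "power_w": 1_400}


def reduction(before, after):
    """Fractional saving when going from `before` to `after`."""
    return 1 - after / before


for key in baseline:
    saving = reduction(baseline[key], compressed[key])
    print(f"{key}: {saving:.1%} lower")
```

All three quantities shrink by the same factor (2/13 of the original), so each line reports a saving of 84.6%. In other words, the quoted cost and power figures scale directly with the number of GPUs required.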


Slim AI Models

The “slim” models provided by the company, derived from Llama 3.3 70B or Llama 4 Scout, are compressed versions that theoretically maintain accuracy. They can be run through the AWS platform or under licenses that also allow on-premises use, meaning on local infrastructure. According to the company's metrics, these models run between 4 and 12 times faster than their uncompressed counterparts, which translates into an inference cost that is 50% to 80% lower.

Image | Multiverse Computing

As Multiverse Computing continues to innovate in AI, the implications for businesses and consumers alike could be significant. By reducing operational costs and making AI applications more efficient, the company is paving the way for future advances and enabling broader, more accessible use of AI across sectors.


