Multiverse Computing: Revolutionizing AI Compression
In the past, we compressed files with ZIP. Now, what we increasingly need is to compress AI models to make them smaller and more efficient. This is exactly the idea behind Multiverse Computing, a Spanish startup that is becoming the new crown jewel of Spain's AI industry. Its founders, Román Orús, Enrique Lizaso Olmos, and Samuel Mugel, along with Alfonso Rubio, have a lot to celebrate.
Investment Round Success
Multiverse Computing has recently closed an investment round of 189 million euros (215 million dollars). The round, labeled Series B, was led by Bullhound Capital, with participation from major players such as HP Tech Ventures, SETT, Forgepoint Capital International, CDP Venture Capital, Santander Climate VC, Quantonation, Toshiba, and Capital Riesgo de Euskadi – Grupo SPRI. Earlier, in March, the company received an investment of 67 million euros from the Government of Spain.
AI Inference at Its Core
While the spotlight currently shines on big tech companies investing billions of dollars in data centers to train large language models (LLMs), there is an increasing focus on the other side: the part we users engage with when we put questions to AI systems like ChatGPT. This is known as AI inference, and it is estimated that by 2025 the value of this industry will soar to 106 billion dollars. Multiverse Computing aims to claim a significant share of this market with its own technology.
Introducing CompactifAI
CompactifAI is the name of the AI model compression technology developed by Multiverse Computing. It converts large, monolithic AI models, which are costly to “run,” into far smaller and more efficient ones, making them more manageable and saving substantial resources (and time) during inference.
How to Compress an AI Model
Román Orús, the company’s scientific director, led a study published in May 2024 that explained how tensor networks, a technique inspired by quantum principles, allow these models to be compressed. The process involves decomposing the weight matrices of the neural network and “truncating” the result so that only the most significant values are retained. In essence, less relevant information is discarded so that the model keeps only what truly matters.
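The idea of decomposing a weight matrix and truncating it can be illustrated with a much simpler stand-in. The sketch below uses a truncated singular value decomposition (SVD) with NumPy, which is not the tensor-network method the company describes, only the most basic case of the same "keep the largest values" principle. The matrix `W`, its size, and the chosen rank are hypothetical values picked for illustration.

```python
import numpy as np

def truncate_weight(W, rank):
    """Factor W into two thin matrices, keeping only the `rank` largest singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # shape (m, rank): left factors scaled by singular values
    B = Vt[:rank, :]             # shape (rank, n)
    return A, B

def fraction_saved(shape, rank):
    """Share of parameters removed by storing A and B instead of the full matrix."""
    m, n = shape
    return 1.0 - rank * (m + n) / (m * n)

# Hypothetical "layer": a 256x256 matrix that is approximately rank 32, plus small noise.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 32)) @ rng.standard_normal((32, 256))
W += 0.01 * rng.standard_normal((256, 256))

A, B = truncate_weight(W, rank=32)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)  # relative reconstruction error
saved = fraction_saved(W.shape, 32)                       # 0.75 for these sizes

print(f"parameters saved: {saved:.0%}, relative error: {rel_err:.4f}")
```

Here the two factors hold a quarter of the original parameters. The error stays small only because this toy matrix was built to be nearly low rank; real model weights may behave differently.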
Does This Compromise Model Accuracy?
It can, but the degree of truncation is controllable, which makes it possible to strike a balance between compression and loss of precision. Multiverse Computing asserts that, even after compression, the drop in accuracy is only between 2% and 3%.
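The trade-off between compression and precision can be made concrete with a small experiment. The sketch below is an illustration only: it uses a plain SVD rather than the company's tensor networks, and a random matrix rather than real model weights. It relies on the fact that the error of the best rank-r approximation is determined by the singular values that are discarded.

```python
import numpy as np

def truncation_tradeoff(W, ranks):
    """For each rank, return (rank, fraction of parameters saved, relative error)."""
    m, n = W.shape
    s = np.linalg.svd(W, compute_uv=False)   # singular values, largest first
    total = np.sum(s ** 2)
    rows = []
    for r in ranks:
        # Relative Frobenius error of the best rank-r approximation:
        # the energy of the discarded singular values over the total energy.
        err = float(np.sqrt(np.sum(s[r:] ** 2) / total))
        saved = 1.0 - r * (m + n) / (m * n)
        rows.append((r, saved, err))
    return rows

# Hypothetical 128x128 weight matrix with random entries (a hard case for compression).
rng = np.random.default_rng(1)
W = rng.standard_normal((128, 128))

rows = truncation_tradeoff(W, [8, 16, 32, 64])
for r, saved, err in rows:
    print(f"rank {r:3d}: saved {saved:6.1%}, relative error {err:.3f}")
```

Lower ranks save more parameters but leave a larger error, so the rank acts as the dial between size and fidelity.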
Same Performance at 95% Smaller Size
To counteract potential accuracy declines, the system includes a rapid retraining phase known as “healing,” which can be repeated multiple times to bring accuracy even closer to that of the original model. Ultimately, the company claims it can compress an AI model by up to 95% while maintaining performance.
Making AI More Affordable
According to the company's data, a model like Llama 3.1 405B incurs an operational cost of around 390,000 dollars when run locally (needing 13 H100 GPUs and drawing 9,100 W). With CompactifAI, that cost can be cut to about 60,000 dollars (requiring only 2 H100 GPUs and drawing 1,400 W).
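As a quick check on these figures, the short calculation below works out the relative reduction in cost, GPU count, and power draw. It uses only the numbers quoted above; the dictionary keys and the helper name `reduction` are hypothetical labels chosen for the example.

```python
# Figures quoted in the article for Llama 3.1 405B, before and after compression.
baseline = {"cost_usd": 390_000, "gpus": 13, "power_w": 9_100}
compressed = {"cost_usd": 60_000, "gpus": 2, "power_w": 1_400}

def reduction(before, after):
    """Fractional reduction going from `before` to `after`."""
    return 1 - after / before

reductions = {key: reduction(baseline[key], compressed[key]) for key in baseline}

# Per-GPU figures implied by the baseline configuration.
cost_per_gpu = baseline["cost_usd"] / baseline["gpus"]
power_per_gpu = baseline["power_w"] / baseline["gpus"]

for key, value in reductions.items():
    print(f"{key}: {value:.1%} lower")
print(f"implied per-GPU cost: {cost_per_gpu:,.0f} USD, power: {power_per_gpu:.0f} W")
```

All three reductions come out at the same value, 11/13 or about 84.6%, which suggests the quoted cost and power figures scale directly with the number of GPUs (30,000 dollars and 700 W per GPU).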

Slim AI Models
The “slim” models provided by the company, derived from Llama 3.3 70B or Llama 4 Scout, are compressed versions that in theory maintain accuracy. They can be run through the AWS platform or under licenses that also allow on-premises use, that is, on local infrastructure. According to the company's metrics, these models run between 4 and 12 times faster than their uncompressed counterparts, which translates to an inference cost that is 50% to 80% lower.
As Multiverse Computing continues to innovate in AI, the implications for businesses and consumers alike could be significant. By reducing operational costs and improving the efficiency of AI applications, the company is not only paving the way for future advances but also enabling broader and more accessible use of AI across sectors.

