A Basque AI startup has just raised 189 million euros with a brilliant idea: compressing AI.

Multiverse Computing: Revolutionizing AI Compression

In the past, we compressed files with ZIP. Now, what we increasingly need is to compress AI to make it smaller and more efficient. This is exactly the idea behind Multiverse Computing, a Spanish startup that is becoming the new crown jewel of our AI industry. Its founders (pictured, from left to right, Román Orús, Enrique Lizaso Olmos, and Samuel Mugel), alongside Alfonso Rubio, have a lot to celebrate.

Investment Round Success

Multiverse Computing has recently closed an investment round of 189 million euros (215 million dollars). The round, a Series B, was led by Bullhound Capital, with participation from major players such as HP Tech Ventures, SETT, Forgepoint Capital International, CDP Venture Capital, Santander Climate VC, Quantonation, Toshiba, and Capital Riesgo de Euskadi – Grupo SPRI. Earlier, in March, the company received an investment of 67 million euros from the Government of Spain.

AI Inference at Its Core

While the spotlight currently shines on big tech companies investing billions of dollars in data centers to train large language models (LLMs), there is an increasing focus on the other side: the part we users engage with when we put questions to AI systems like ChatGPT. This is known as AI inference, and the value of this industry is estimated to reach 106 billion dollars in 2025. Multiverse Computing aims to claim a significant share of this market with its technology.

Introducing CompactifAI

CompactifAI is the name of the AI model compression technology developed by Multiverse Computing. This technology enables the conversion of monolithic AI models—those that are costly to “run”—into far smaller and more efficient models, making them more manageable and saving substantial resources (and time) during inference.

How to Compress an AI Model

Román Orús, the company’s scientific director, led a study in May 2024 that explained how tensor networks, a technique inspired by quantum principles, allow these models to be compressed. The process involves decomposing the weight matrices of neural networks and “truncating” them, retaining only the most significant values. Essentially, the idea is to discard the less relevant information and keep what truly matters within the model.
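The truncation idea can be illustrated with a toy example. The sketch below is not CompactifAI's tensor network method: it assumes a plain truncated singular value decomposition (SVD) as a simpler stand-in, applied to a made-up 256×256 matrix. The function names `truncate_weight_matrix` and `compression_ratio` are hypothetical and exist only for this illustration.

```python
import numpy as np

def truncate_weight_matrix(W, rank):
    """Return factors (A, B) such that A @ B is the best rank-`rank`
    approximation of W. Truncated SVD is used here as a simple stand-in
    for the tensor network decomposition described in the article."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # keep the `rank` largest singular values
    B = Vt[:rank, :]
    return A, B

def compression_ratio(W, rank):
    """Fraction of parameters saved by storing A and B instead of W."""
    m, n = W.shape
    return 1 - rank * (m + n) / (m * n)

rng = np.random.default_rng(0)
# Toy "weight matrix" (an assumption): nearly rank 16, plus a little noise.
W = rng.normal(size=(256, 16)) @ rng.normal(size=(16, 256))
W += 0.01 * rng.normal(size=(256, 256))

A, B = truncate_weight_matrix(W, rank=16)
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
saved = compression_ratio(W, rank=16)

print(f"parameters saved: {saved:.1%}")
print(f"relative reconstruction error: {rel_error:.4f}")
```

Lowering `rank` saves more parameters but raises the reconstruction error. Real model weights are not as conveniently low rank as this toy matrix, so the error figures here say nothing about actual LLMs.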

Does This Compromise Model Accuracy?

Indeed, it can, but the degree of truncation is controllable, which makes it possible to strike a balance between compression and precision loss. Even after compressing these models, Multiverse Computing asserts that the drop in accuracy is only between 2% and 3%.

Same Performance at 95% Smaller Size

To counteract potential accuracy declines, the system includes a rapid retraining phase known as “healing,” which can be repeated multiple times to bring accuracy even closer to that of the original model. Ultimately, the company claims it can compress an AI model by up to 95% while maintaining performance.

Making AI More Affordable

According to the company's data, a model like Llama 3.1 405B incurs an operational cost of around 390,000 dollars when running locally (needing 13 H100 GPUs and drawing 9,100 W). With CompactifAI, that cost can be slashed to just 60,000 dollars (requiring only 2 H100 GPUs and consuming 1,400 W).
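As a quick check on those figures, the short sketch below computes the percentage reduction for each quantity, using only the numbers quoted above. The helper `percent_reduction` is a hypothetical name introduced for this illustration.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return 100 * (before - after) / before

# Figures quoted in the article for Llama 3.1 405B, before and after compression.
cost = percent_reduction(390_000, 60_000)  # dollars
gpus = percent_reduction(13, 2)            # H100 GPUs
power = percent_reduction(9_100, 1_400)    # watts

# Per-GPU values implied by the quoted totals.
cost_per_gpu = (390_000 / 13, 60_000 / 2)
watts_per_gpu = (9_100 / 13, 1_400 / 2)

print(f"cost: -{cost:.1f}%, GPUs: -{gpus:.1f}%, power: -{power:.1f}%")
print(f"dollars per GPU (before, after): {cost_per_gpu}")
print(f"watts per GPU (before, after): {watts_per_gpu}")
```

All three reductions come out at about 84.6%, and the per-GPU cost and power are the same before and after (30,000 dollars and 700 W). In these figures, the savings track the drop from 13 GPUs to 2.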


Slim AI Models

The “slim” models provided by the company, derived from Llama 3.3 70B or Llama 4 Scout, are compressed versions that in theory maintain accuracy. They can be run through the AWS platform or under licenses that also allow on-premise use, meaning on local infrastructure. According to the company's metrics, these models run between 4 and 12 times faster than their non-compressed counterparts, which translates to an inference cost that is 50% to 80% lower.

Image | Multiverse Computing

As Multiverse Computing continues to innovate in AI, the implications for businesses and consumers alike could be significant. By reducing operational costs and improving the efficiency of AI applications, the company is not only paving the way for future advancements but also enabling broader and more accessible use of AI across sectors.
