TechCrunch AI · March 19, 2026, 17:00 · approx. 1 min read

Multiverse Computing pushes compressed AI models into the mainstream

#ModelCompression #InferenceOptimization #AIEfficiency #Multiverse Computing #APIAccess #EdgeAI

TL;DR

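After compressing models from major AI labs such as OpenAI and Meta, Multiverse Computing has launched an app that demonstrates the capabilities of its compressed models and an API that makes them widely available.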

AI In-Depth Analysis · March 19, 2026, 18:40

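Key Points

Track record of compressing major AI models

Multiverse Computing has compressed models from major AI labs, including OpenAI, Meta, DeepSeek, and Mistral AI.

Product and service launch

It launched both an app that demonstrates the capabilities of its compressed models and an API that makes those models widely available.

Move into the mainstream market

Drawing on the compression technology it has built up so far, the company has begun reaching a broader range of users through its products and services.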

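Impact Analysis

This announcement reflects an important trend centered on making AI models more efficient and more practical. As model compression technology is commercialized, AI adoption in compute-constrained environments may accelerate.

Editorial Comment

Making AI models more efficient is a key challenge for sustainable AI development, and the market entry of a company with a track record of compressing major models is worth watching. However, the article offers no detailed evaluation of how much performance is lost through compression, which is a concern.

Multiverse Computing, having compressed models from major AI labs such as OpenAI, Meta, DeepSeek, and Mistral AI, has released both an application that demonstrates the performance of its compressed models and an API that makes them more widely available.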

With private company defaults running at upward of 9.2% — the highest rate in years — VC firm Lux Capital recently advised companies relying on AI to get their compute capacity commitments confirmed in writing. With financial instability rippling through the AI supply chain, Lux warned, a handshake agreement isn’t enough.

But there’s another option entirely, which is to stop relying on external compute infrastructure altogether. Smaller AI models that run directly on a user’s own device — no data center, no cloud provider, no counterparty risk — are getting good enough to be worth considering. And Multiverse Computing is raising its hand.

The Spanish startup has so far kept a lower profile than some of its peers, but as demand for AI efficiency grows, this is changing. After compressing models from major AI labs, including OpenAI, Meta, DeepSeek, and Mistral AI, it has launched both an app that showcases the capabilities of its compressed models and an API portal, a gateway that lets developers access and build with those models and that makes them more widely available.

The CompactifAI app, which shares its name with Multiverse’s quantum-inspired compression technology, is an AI chat tool in the vein of ChatGPT or Mistral’s Le Chat. Ask a question, and the model answers. The difference is that Multiverse embedded Gilda, a model so small that it can run locally and offline, according to the company.

Image Credits: Multiverse Computing

For end users, this is a taste of AI on the edge, with data that doesn’t leave their devices and doesn’t require a connection. But there’s a caveat: Their mobile devices must have enough RAM and storage. If they don’t — and many older iPhones won’t — the app switches back to cloud-based models via API. The routing between local and cloud processing is handled automatically by a system Multiverse has named Ash Nazg, whose name will ring a bell for Tolkien fans as it references the One Ring inscription in “The Lord of the Rings.” But when the app routes to the cloud, it loses its main privacy edge in the process.
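The fallback described above can be pictured with a minimal sketch. It is not Multiverse's actual Ash Nazg implementation, which the company has not detailed here. The names DeviceProfile and choose_backend and both thresholds are hypothetical, chosen only to illustrate the idea of preferring the on-device model and falling back to the cloud.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceProfile:
    """Hypothetical snapshot of the resources a phone can offer."""
    free_ram_gb: float
    free_storage_gb: float
    online: bool


# Assumed thresholds for illustration only; the article does not state
# how much RAM or storage the local model actually needs.
MIN_RAM_GB = 4.0
MIN_STORAGE_GB = 2.0


def choose_backend(device: DeviceProfile) -> str:
    """Return "local", "cloud", or "unavailable" for a chat request."""
    can_run_locally = (
        device.free_ram_gb >= MIN_RAM_GB
        and device.free_storage_gb >= MIN_STORAGE_GB
    )
    if can_run_locally:
        # Preferred path: data stays on the device and no connection is needed.
        return "local"
    if device.online:
        # Fallback path: the request leaves the device, so the privacy edge is lost.
        return "cloud"
    # Neither enough resources nor a connection.
    return "unavailable"
```

Under these assumed thresholds, a phone with plenty of free memory stays local even when offline, while a constrained phone only works when it can reach the cloud, which is the trade-off the article points out.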

These limitations mean that CompactifAI is not quite ready for mass customer adoption yet, although that may never have been the goal. According to data from Sensor Tower, the app had fewer than 5,000 downloads in the past month.

The real target is businesses. Today, Multiverse is launching a self-serve API portal that gives developers and enterprises direct access to its compressed models — no AWS Marketplace required.

“The CompactifAI API portal [now] gives developers direct access to compressed models with the transparency and control needed to run them in production,” CEO Enrique Lizaso said in a statement.

Real-time usage monitoring is one of the key features of the API, and that’s no accident. Alongside the potential advantages of deploying on the edge, lower compute costs are one of the main reasons why enterprises are considering smaller models as an alternative to large language models (LLMs).

It also helps that small models are less limited than they used to be. Earlier this week, Mistral updated its small model family with the launch of Mistral Small 4, which it says is simultaneously optimized for general chat, coding, agentic tasks, and reasoning. The French company also released Forge, a system that lets enterprises build custom models, including small models for which they can pick the trade-offs their use cases can best tolerate.

Multiverse’s recent results also suggest the gap with LLMs is narrowing. Its latest compressed model, HyperNova 60B 2602, is built on gpt-oss-120b, an OpenAI model whose weights are publicly available. The company claims it now delivers faster responses at lower cost than the original it was derived from, an advantage that matters particularly for agentic coding workflows, where AI autonomously completes complex, multistep programming tasks.

Making models small enough to operate on mobile devices while still remaining useful is a big challenge. Apple Intelligence sidestepped that issue by combining an on-device model and a cloud model. Multiverse’s CompactifAI app can also route requests to gpt-oss-120b via API, but its main goal is to showcase that local models like Gilda and its future replacements have advantages that go beyond cost savings.

For workers in critical fields, a model that can run locally and without connecting to the cloud offers more privacy and resilience. But the bigger value is in the business use cases this can unlock — for instance, embedding AI in drones, satellites, and other settings where connectivity can’t be taken for granted.

The company already serves more than 100 global customers, including the Bank of Canada, Bosch, and Iberdrola, but expanding its customer base could help it unlock more funding. After raising a $215 million Series B last year, it is now rumored to be raising a fresh €500 million funding round at a valuation of more than €1.5 billion.

Anna Heim is a writer and editorial consultant.

You can contact or verify outreach from Anna by emailing annatechcrunch [at] gmail.com.

As a freelance reporter at TechCrunch since 2021, she has covered a large range of startup-related topics including AI, fintech & insurtech, SaaS & pricing, and global venture capital trends.

As of May 2025, her reporting for TechCrunch focuses on Europe’s most interesting startup stories.

Anna has moderated panels and conducted onstage interviews at industry events of all sizes, including major tech conferences such as TechCrunch Disrupt, 4YFN, South Summit, TNW Conference, VivaTech, and many more.

A former LATAM & Media Editor at The Next Web, startup founder and Sciences Po Paris alum, she’s fluent in multiple languages, including French, English, Spanish and Brazilian Portuguese.
