Hugging Face Blog · April 2, 2026, 01:36 · about a 6-minute read

Holo3: Breaking the Computer Use Frontier

#Computer Use Agent #Mixture of Experts (MoE) #Synthetic Data Training #Hugging Face #Open Source AI

TL;DR

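Hcompany has announced Holo3, an AI model specialized for computer use that scores 78.85% on OSWorld-Verified. It uses a Mixture of Experts (MoE) architecture with 10B active parameters out of 122B total, together with a training pipeline built on synthetic environments, and it targets production use.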

AI in-depth analysis · April 2, 2026, 02:40

Importance: / 5

Depth: 40%

Key points

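Benchmark state of the art and architecture design

Holo3 sets a new top score of 78.85% on OSWorld-Verified. Its MoE design activates only 10B of its 122B total parameters at inference time, balancing cost and performance.

Agentic learning flywheel and synthetic environments

A continuous feedback loop combines synthetic navigation data, out-of-domain augmentation, and reinforcement learning. Together with the "Synthetic Environment Factory," which builds enterprise systems automatically, it strengthens the model's ability to adapt to real business workflows.

Open-source release and inference infrastructure

The weights of the 35B version are published on Hugging Face under the Apache 2.0 license, and an inference API with a free tier is offered to encourage adoption across the developer ecosystem.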

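Impact analysis

Holo3 is an important case study showing that high performance and low cost can coexist in computer use agents. In particular, the combination of an MoE architecture with a synthetic-environment training pipeline sets a new reference point for building AI agents intended for production use. Going forward, community uptake of the open-weight version and production benchmark comparisons against proprietary models will shape where the field goes.

Editor's comment

The focus on parameter efficiency and on production use through synthetic-environment training deserves credit. A gap remains, however, between benchmark scores and verified behavior on real enterprise systems, so future real-world results are worth watching.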

Original article

Holo3: Breaking the Computer Use Frontier

By Ramzi De Coster (ramzidecoster) and Pierre-Louis Cedoz (plcedoz38), Hcompany

We are proud to unveil Holo3, the latest evolution of our vision for the Autonomous Enterprise. With a score of 78.85% on the OSWorld-Verified benchmark, Holo3-122B-A10B establishes a new state of the art for the industry on the leading desktop computer use benchmark.

Holo3 is more than a benchmark leader; it is engineered for production. Built using our agentic flywheel, it has been trained to execute real-world workflows within synthetic enterprise environments. This not only ensures that Holo3 excels in today's business scenarios, but also establishes the foundation for a future where our agents can autonomously navigate virtually any digital landscape.

Best of all, Holo3 achieves this with only 10B active parameters (122B total), at a fraction of the cost of large-scale proprietary models such as GPT 5.4 or Opus 4.6. All models are available through our Inference API. The Holo3-35B-A3B weights are openly available on Hugging Face under the Apache 2.0 license and can be used at no charge through the free tier of our Inference API.
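To make the parameter figures concrete, the short sketch below computes the share of parameters that are active per token, reading the total and active counts off the model names (122B-A10B and 35B-A3B). It assumes the names give rounded figures in billions; the helper name `active_fraction` is ours and is not part of any Hcompany API.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters used per token in an MoE model."""
    if total_b <= 0 or not 0 < active_b <= total_b:
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


# Figures taken from the model names; treated here as rounded values in billions.
MODELS = {
    "Holo3-122B-A10B": (122.0, 10.0),
    "Holo3-35B-A3B": (35.0, 3.0),
}

for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.1%} of parameters active per token")
```

Under these assumptions both models activate roughly 8 to 9 percent of their weights per token. Per-token compute tracks the active count, while serving memory generally still has to hold all of the weights.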

The Agentic Learning Flywheel

What sets Holo3 apart is its specialized training pipeline—a continuous feedback loop designed to sharpen two core agentic pillars: perception and decision-making.

Our training flywheel is about teaching our model from annotated examples how to execute specific tasks, all while developing generalist skills across a virtually infinite variety of user interfaces. Here is how we build world-class computer use models:

Synthetic Navigation Data: using human and generated instructions, we generate scenario-specific navigation examples.

Out-of-Domain Augmentation: we programmatically extend the scenarios and augment the data to ensure Holo3 can handle the unexpected.

Curated Reinforcement Learning: every data sample is carefully curated and ingested through a pipeline that leverages advanced data filtering and reinforcement learning to maximize performance.
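The three stages above can be read as a generate, augment, and filter loop. The sketch below is purely illustrative: the `Task` type, the UI variant labels, the stand-in scorer, and the threshold are hypothetical, and the reinforcement learning step itself is not shown. It only shows how samples might flow through such a pipeline, not how Hcompany implements it.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Task:
    instruction: str  # human-written or generated instruction
    ui_variant: str   # hypothetical label for the interface the task runs on


def augment(tasks: Iterable[Task], variants: list[str]) -> list[Task]:
    """Out-of-domain augmentation: restate each task on every other UI variant."""
    out = []
    for t in tasks:
        out.append(t)
        out.extend(Task(t.instruction, v) for v in variants if v != t.ui_variant)
    return out


def curate(tasks: Iterable[Task], score: Callable[[Task], float],
           min_score: float = 0.5) -> list[tuple[Task, float]]:
    """Data filtering: keep only tasks whose score clears the threshold."""
    scored = [(t, score(t)) for t in tasks]
    return [(t, s) for t, s in scored if s >= min_score]


# Toy run with a stand-in scorer; a real system would score agent rollouts.
seed = [Task("Export last month's invoices as CSV", "light-theme")]
pool = augment(seed, ["light-theme", "dark-theme", "compact"])
kept = curate(pool, score=lambda t: 0.0 if t.ui_variant == "compact" else 1.0)
print(len(pool), len(kept))
```

Keeping augmentation and curation as separate functions means the filter can be swapped without touching how scenarios are extended.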

Beyond the raw scores, the OSWorld results serve as a definitive proof of concept for our learning flywheel. To validate its transferability to real-world business applications, we created the Synthetic Environment Factory.

The Synthetic Environment Factory & H Corporate Benchmarks

This proprietary factory reproduces the reality of enterprise systems and is one of the training gyms Holo3 was forged in. Our environments are automatically built using coding agents that program websites from scratch based on scenario specifications, producing verifiable tasks of varying difficulty that are validated end-to-end with verification scripts.
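A verifiable task of the kind described here can be pictured as an instruction paired with a check on the environment's final state. The following is a minimal sketch under that assumption; `VerifiableTask`, the dictionary-based state, and the refund example are all hypothetical and are not taken from the Synthetic Environment Factory.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict[str, Any]  # hypothetical snapshot of the environment's backend state


@dataclass(frozen=True)
class VerifiableTask:
    instruction: str
    difficulty: str                  # e.g. "easy", "medium", "hard"
    verify: Callable[[State], bool]  # verification script: inspects final state only


def order_was_refunded(state: State) -> bool:
    """Pass only if order 1042 ended up refunded and no other order changed."""
    orders = state.get("orders", {})
    others_untouched = all(
        o["status"] == o["initial_status"]
        for oid, o in orders.items()
        if oid != 1042
    )
    return orders.get(1042, {}).get("status") == "refunded" and others_untouched


task = VerifiableTask("Refund order 1042 in the admin panel", "easy", order_was_refunded)

final_state = {
    "orders": {
        1042: {"status": "refunded", "initial_status": "paid"},
        1043: {"status": "paid", "initial_status": "paid"},
    }
}
print(task.verify(final_state))
```

Checking the end state rather than the exact click path lets any valid route through the interface count as success, and the side-effect check catches an agent that refunds the wrong order along the way.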

To measure real-world readiness, we also designed H Corporate Benchmarks, a dedicated evaluation suite of 486 multi-step realistic tasks spanning 4 categories: E-commerce, Business software, Collaboration, and various Multi-App setups.
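A suite organised this way is typically reported as a success rate per category. The sketch below shows one way to aggregate such results; the four category names come from the text above, while the function name and the demo outcomes are placeholders and are not real benchmark results.

```python
from collections import defaultdict

CATEGORIES = ("E-commerce", "Business software", "Collaboration", "Multi-App")


def success_rates(results):
    """Aggregate (category, passed) pairs into a per-category success rate."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for category, ok in results:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        total[category] += 1
        passed[category] += int(ok)
    # Categories with no attempted tasks are omitted rather than reported as 0.
    return {c: passed[c] / total[c] for c in CATEGORIES if total[c]}


# Placeholder outcomes for illustration only.
demo = [
    ("E-commerce", True),
    ("E-commerce", False),
    ("Multi-App", False),
    ("Collaboration", True),
]
print(success_rates(demo))
```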

The benchmark spans the full complexity spectrum: from focused, single-application tasks to long-horizon, multi-application workflows that mirror how work actually gets done. At the harder end of the scale (Multi-Apps), tasks require the agent to coordinate information across multiple systems simultaneously—for example, retrieving equipment prices from a PDF, cross-referencing them against each employee's remaining budget, and autonomously sending personalised approval or rejection emails to every individual. This kind of task demands not only accurate calculation and document parsing, but sustained multi-step reasoning across applications without losing state or intent.
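The decision logic at the core of that example is simple once the data has been extracted; the difficulty lies in getting it out of the PDF and the budget system and keeping it consistent across applications. The sketch below shows only the final cross-referencing step. The names, prices, and budgets are invented for illustration, and PDF parsing and email sending are deliberately left out.

```python
from decimal import Decimal

# Invented inputs standing in for values parsed from the PDF and the budget tool.
prices = {"laptop": Decimal("1450.00"), "monitor": Decimal("320.00")}
requests = [
    {"employee": "alice@example.com", "item": "laptop",
     "remaining_budget": Decimal("1500.00")},
    {"employee": "bob@example.com", "item": "laptop",
     "remaining_budget": Decimal("900.00")},
    {"employee": "carol@example.com", "item": "monitor",
     "remaining_budget": Decimal("320.00")},
]


def decide(request: dict, prices: dict) -> dict:
    """Approve when the item's price fits within the employee's remaining budget."""
    price = prices[request["item"]]
    approved = price <= request["remaining_budget"]
    verdict = "approved" if approved else "rejected"
    body = (
        f"Your request for a {request['item']} ({price}) was {verdict}; "
        f"remaining budget before purchase: {request['remaining_budget']}."
    )
    return {"to": request["employee"], "approved": approved, "body": body}


# One personalised draft per employee; sending is out of scope for this sketch.
drafts = [decide(r, prices) for r in requests]
for d in drafts:
    print(d["to"], "->", "approve" if d["approved"] else "reject")
```

`Decimal` is used so that the boundary case, a price exactly equal to the remaining budget, is compared without floating-point rounding.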

Examples of synthetic environments created for training Holo3

In our results, Holo3 surpasses its competitors on single-application benchmarks. The performance difference between Holo3 and the base Qwen3.5 models reflects the impact of our agentic learning flywheel. By achieving higher success rates than models with significantly larger parameter counts, while maintaining the same localization and grounding standards, Holo3 illustrates the true magnitude of the gains from this specialized training.

Towards Universal Agency

Holo3 is a milestone, but it is not the destination. By building a system that can see, reason, and act within our clients' digital platforms, we are making the Autonomous Enterprise a reality.

As our "Synthetic Environment Factory" continues to evolve, our agents are learning to handle increasingly intricate tasks. While Holo3 today masters the interface, we are already at work on the next frontier: Adaptive Agency, where our models will not only use the tools they know but autonomously learn to navigate entirely new, bespoke enterprise software in real time.
