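MarkTechPost · June 23, 2026, 03:42 · About a 14-minute read

Sakana AI announces Sakana Fugu, an orchestration model that flexibly routes across frontier LLMs

#LLM Orchestration #Multi-Agent Systems #Sakana AI #Fugu #Model Routing

TL;DR

Sakana AI has announced Fugu, an orchestration model that dynamically selects and coordinates multiple frontier LLMs behind a single endpoint, aiming to avoid vendor lock-in while still handling complex tasks.

AI in-depth analysis · June 23, 2026, 04:03

Importance: / 5

Depth: 40%

Key points

Multi-agent control through a single endpoint

From the outside, Fugu behaves like one LLM. Internally, it decides automatically, based on task complexity, whether to answer directly or to coordinate multiple specialist agents.

Avoiding vendor lock-in and meeting compliance needs

To reduce dependence on any one model or provider, Fugu routes dynamically across a pool of external LLMs. Specific agents can be opted out to meet data-regulation and privacy requirements.

Two tiers: Fugu and Fugu Ultra

Two variants are offered: the low-latency, efficient Fugu, and the higher-performance Fugu Ultra, which targets complex multi-step problems and uses a fixed expert pool. Both sit behind an OpenAI-compatible API.

Learned, autonomous orchestration

Rather than relying on hand-designed workflows, Fugu builds on the Trinity and Conductor research and acquires task-specific role assignment and coordination strategies through reinforcement learning and evolutionary computation.

Strong benchmark results

Across 11 benchmarks covering coding, reasoning, and Humanity's Last Exam, Fugu or Fugu Ultra posts the top score on 10, with Fugu Ultra leading most of them. The gains over single models are clearest on complex multi-step tasks.

Autonomous task execution in real-world settings

In beta testing, Fugu outperformed existing frontier models on complex multi-step tasks such as automated optimization of experiment plans, blindfold chess, and online trading.

Immediate use through an OpenAI-compatible API

Existing clients work without SDK changes or migration, and token usage and cost are reported per request so that spend can be monitored in real time.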


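Impact analysis

This announcement points to a shift away from the conventional paradigm, in which users manually choose and manage which model to use, toward an architecture in which the AI allocates resources on its own. In particular, addressing resilience against geopolitical risk and vendor-dependent service outages at the technical level is an important infrastructure development for enterprise AI adoption.

Editorial comment

Beyond improvements in single-model performance, this news suggests that orchestration, the intelligent combination of multiple models, is entering practical use. Its technical answer to the business problem of vendor lock-in is especially notable.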


Today, Sakana AI launched Sakana Fugu. It is a multi-agent orchestration system that behaves like one model. You send a request to a single endpoint. Fugu decides how to handle it internally. It solves a task directly when that is enough. It also assembles and coordinates a team of expert models when needed. The complexity of a multi-agent system never reaches your code.

TL;DR

Fugu delivers a multi-agent system behind one OpenAI-compatible API.

Fugu Ultra leads most published coding and reasoning benchmarks.

The orchestrator beats the individual models it coordinates.

Opt-out and provider routing target compliance and single-vendor risk.

Routing is proprietary, so per-query model selection stays hidden.

What is Sakana Fugu

Fugu is itself a language model. It is trained to call other LLMs in an agent pool. That pool includes instances of itself, called recursively. Fugu manages model selection, delegation, verification, and synthesis internally.

Instead of hard-coded roles or workflows, Fugu learns how to coordinate. It decides when to delegate and how agents should communicate. It then combines their work into one answer. From the outside, you call a single model. Inside, a coordinated system of experts does the work.
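The direct-versus-delegate decision described above can be pictured with a toy sketch. Everything here is hypothetical: `ToyOrchestrator`, the semicolon-based `is_simple` heuristic, and the string-joining synthesis step are illustrative stand-ins, not Sakana AI's learned policy, which the article says is proprietary.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical stand-in: in a real system each agent would be an LLM call.
Agent = Callable[[str], str]


@dataclass
class ToyOrchestrator:
    agents: Dict[str, Agent]  # name -> callable expert
    max_depth: int = 2        # cap on recursive self-calls

    def is_simple(self, task: str) -> bool:
        # Assumed heuristic: a task with no ";" has a single step.
        return ";" not in task

    def solve(self, task: str, depth: int = 0) -> str:
        if self.is_simple(task) or depth >= self.max_depth:
            # Answer directly with the first agent in the pool.
            first = next(iter(self.agents.values()))
            return first(task)
        # Delegate: split into sub-tasks and recurse on each one,
        # mimicking an orchestrator that calls instances of itself.
        parts = [p.strip() for p in task.split(";") if p.strip()]
        partials = [self.solve(p, depth + 1) for p in parts]
        # Synthesize: merge the partial results into one answer.
        return " | ".join(partials)


orch = ToyOrchestrator({"echo": lambda t: f"done({t})"})
print(orch.solve("a; b"))  # done(a) | done(b)
```

The caller only ever sees `solve`, which mirrors the single-endpoint idea: the recursion and merging stay inside the orchestrator.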

Sakana AI frames this as a hedge against single-vendor dependency. If one provider restricts access, Fugu routes around the disruption. The research team cites recent export controls on Anthropic’s Fable and Mythos models as motivation. Over time, newer models can be folded into the pool.

Fugu and Fugu Ultra: Two Models, One API

Fugu ships in two variants, both behind one OpenAI-compatible API:

Fugu balances strong performance with low latency. It is a default for everyday coding, code review, and chatbots. It also fits tools like Codex. You can opt specific agents out of its pool. That helps teams meet data, privacy, and compliance requirements.

Fugu Ultra is tuned for maximum answer quality on hard, multi-step problems. It coordinates a deeper pool of expert agents. Its pool is fixed, so opt-out is not available. The current model ID is fugu-ultra-20260615.
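As a mental model of the opt-out difference between the two variants, the pool can be thought of as a list that is filtered before routing. This is a sketch under assumptions: the agent names, the `eligible_agents` helper, and the `allow_opt_out` flag are invented for illustration, and the article does not describe how opt-out is actually configured.

```python
from typing import Iterable, List, Set


def eligible_agents(
    pool: Iterable[str],
    opted_out: Set[str],
    unavailable: Set[str],
    allow_opt_out: bool = True,
) -> List[str]:
    """Return pool members that are neither opted out nor unavailable."""
    if opted_out and not allow_opt_out:
        # Models the fixed-pool variant, where opt-out is not offered.
        raise ValueError("this variant uses a fixed pool; opt-out is not available")
    blocked = opted_out | unavailable
    return [name for name in pool if name not in blocked]


pool = ["agent-a", "agent-b", "agent-c"]  # hypothetical agent names
print(eligible_agents(pool, opted_out={"agent-b"}, unavailable=set()))
```

The same filter also covers the vendor-risk case: a provider that restricts access simply lands in `unavailable`, and routing continues over what remains.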

The Research Behind the Orchestrator

Fugu builds on two ICLR 2026 papers on learned orchestration: Trinity and Conductor.

Trinity uses a lightweight evolved coordinator that runs across several turns. On each turn it assigns a Thinker, Worker, or Verifier role, delegating work adaptively. Conductor is trained with reinforcement learning. It discovers natural-language coordination strategies and focused prompts for diverse LLM pools.

Together, they show systems can learn to assemble and route agents per task. That replaces hand-designed workflows.
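The turn-by-turn role assignment can be sketched as a small loop. This is an illustration only: `run_turns`, `fixed_cycle`, and `stub_agent` are hypothetical names, and the fixed Thinker, Worker, Verifier cycle stands in for the learned coordinator described in the papers.

```python
from typing import Callable, Dict, List, Tuple

Transcript = List[Tuple[str, str, str]]                     # (agent, role, message)
Agent = Callable[[str, str, Transcript], str]               # (role, task, transcript) -> message
Coordinator = Callable[[str, Transcript], Tuple[str, str]]  # -> (agent name, role)


def run_turns(task: str, coordinator: Coordinator,
              agents: Dict[str, Agent], max_turns: int = 6) -> Transcript:
    """Ask the coordinator who acts next, in which role, until a Verifier accepts."""
    transcript: Transcript = []
    for _ in range(max_turns):
        name, role = coordinator(task, transcript)
        message = agents[name](role, task, transcript)
        transcript.append((name, role, message))
        if role == "Verifier" and message.startswith("ACCEPT"):
            break
    return transcript


def fixed_cycle(task: str, transcript: Transcript) -> Tuple[str, str]:
    # Stand-in policy: cycle through the three roles on a single agent.
    role = ("Thinker", "Worker", "Verifier")[len(transcript) % 3]
    return "model-x", role


def stub_agent(role: str, task: str, transcript: Transcript) -> str:
    # Canned replies so the sketch runs without any model calls.
    replies = {"Thinker": "plan: " + task, "Worker": "draft answer", "Verifier": "ACCEPT"}
    return replies[role]


for entry in run_turns("sum 2+2", fixed_cycle, {"model-x": stub_agent}):
    print(entry)
```

Swapping `fixed_cycle` for a trained policy is the part the research addresses; the loop around it stays the same.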


Benchmark

Sakana AI compares Fugu against the foundation models it orchestrates. Baselines use provider-reported scores. SWE Bench Pro uses the mini-swe-agent as scaffolding.

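Benchmark              | Fugu | Fugu Ultra | Opus 4.8 | Gemini 3.1 Pro | GPT 5.5
-----------------------|------|------------|----------|----------------|--------
SWE Bench Pro*         | 59.0 | 73.7       | 69.2     | 54.2           | 58.6
TerminalBench 2.1      | 80.2 | 82.1       | 74.6     | 70.3           | 78.2
LiveCodeBench          | 92.9 | 93.2       | 87.8     | 88.5           | 85.3
LiveCodeBench Pro      | 87.8 | 90.8       | 84.8     | 82.9           | 88.4
Humanity’s Last Exam   | 47.2 | 50.0       | 49.8     | 44.4           | 41.4
CharXiv Reasoning      | 85.1 | 86.6       | 84.2     | 83.3           | 84.1
GPQA-D                 | 95.5 | 95.5       | 92.0     | 94.3           | 93.6
SciCode                | 60.1 | 58.7       | 53.5     | 58.9           | 56.1
τ³ Banking             | 21.7 | 20.6       | 20.6     | 8.4            | 20.6
Long Context Reasoning | 74.7 | 73.3       | 67.7     | 72.7           | 74.3
MRCRv2                 | 86.6 | 93.6       | 87.9     | 84.9           | 94.8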

The orchestrator posts the top score on 10 of 11 rows. Fugu Ultra tops the four coding benchmarks, CharXiv Reasoning, and Humanity’s Last Exam. It ties regular Fugu on GPQA-D. Regular Fugu leads SciCode, τ³ Banking, and Long Context Reasoning. GPT 5.5 wins MRCRv2, the only baseline win here.
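The row-by-row tally can be checked mechanically. The short script below copies the scores from the table above and counts the rows where Fugu or Fugu Ultra holds, or shares, the top score; the helper name `leaders` is just for illustration.

```python
MODELS = ("Fugu", "Fugu Ultra", "Opus 4.8", "Gemini 3.1 Pro", "GPT 5.5")
ORCHESTRATORS = {"Fugu", "Fugu Ultra"}

# Scores copied from the benchmark table, in MODELS order.
SCORES = {
    "SWE Bench Pro": (59.0, 73.7, 69.2, 54.2, 58.6),
    "TerminalBench 2.1": (80.2, 82.1, 74.6, 70.3, 78.2),
    "LiveCodeBench": (92.9, 93.2, 87.8, 88.5, 85.3),
    "LiveCodeBench Pro": (87.8, 90.8, 84.8, 82.9, 88.4),
    "Humanity's Last Exam": (47.2, 50.0, 49.8, 44.4, 41.4),
    "CharXiv Reasoning": (85.1, 86.6, 84.2, 83.3, 84.1),
    "GPQA-D": (95.5, 95.5, 92.0, 94.3, 93.6),
    "SciCode": (60.1, 58.7, 53.5, 58.9, 56.1),
    "τ³ Banking": (21.7, 20.6, 20.6, 8.4, 20.6),
    "Long Context Reasoning": (74.7, 73.3, 67.7, 72.7, 74.3),
    "MRCRv2": (86.6, 93.6, 87.9, 84.9, 94.8),
}


def leaders(row):
    """Return every model that holds the row's top score (ties included)."""
    best = max(row)
    return [model for model, score in zip(MODELS, row) if score == best]


orchestrator_wins = sum(
    1 for row in SCORES.values() if ORCHESTRATORS & set(leaders(row))
)
print(f"{orchestrator_wins} of {len(SCORES)}")  # 10 of 11
for name, row in SCORES.items():
    print(f"{name}: {', '.join(leaders(row))}")
```

Note that the 10-of-11 figure counts either Fugu variant as a win; Fugu Ultra alone leads or ties on seven rows.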

Sakana AI positions its Fugu models as standing shoulder-to-shoulder with Anthropic’s Fable 5 and Mythos Preview. Those two are not in Fugu’s pool, since they are not publicly accessible.

Use Cases

Sakana AI ran a beta with close to 500 early users. The published examples favor long, multi-step tasks.

AutoResearch: An agent improved a small GPT’s training recipe autonomously. It ran 123 experiments over roughly 14 hours on one H100 GPU. Fugu Ultra reached the best mean validation BPB of 0.9774, with a best single run of 0.9748.

Rubik’s Cube solver: Each model wrote a pure-Python solver, no libraries allowed. Fugu Ultra solved all 300 held-out cubes, averaging 19.72 moves. One baseline matched it closely at 19.76 moves. Two others crashed and solved none.

Classical Japanese kana reading order: On a 1610 letter, Fugu Ultra scored NED 0.80. The nearest baseline reached only 0.24.

Blindfold chess: Fugu played four games from memory, with no board shown. It beat three frontier models and a 2100-Elo Stockfish engine.

Online trading: On one 50-week window, Fugu Ultra returned +19.43% on average across five runs. The other frontier models stayed below +15%. Sakana AI notes past performance does not guarantee future results.
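The AutoResearch result above is reported in BPB. The article does not define the metric; assuming it means bits per byte, as is common for language-model validation, it can be computed from a summed negative log-likelihood as follows. The function name and arguments are illustrative, not taken from Sakana AI's setup.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (in nats) to bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Dividing by ln(2) converts nats to bits; dividing by the byte count
    # makes the result independent of the tokenizer.
    return total_nll_nats / (math.log(2) * total_bytes)


# A model that spends exactly one bit per byte on 1,000 bytes of text:
print(bits_per_byte(math.log(2) * 1000, 1000))
```

Lower is better, so under this reading the reported 0.9774 mean and 0.9748 best run correspond to slightly under one bit per byte of validation text.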

A Minimal API Example

Fugu uses an OpenAI-compatible API, so no SDK migration is required. Point an existing client at your console-provided endpoint.


    from openai import OpenAI

    # Endpoint and key come from your Sakana console (console.sakana.ai).
    client = OpenAI(
        base_url="https://<your-fugu-endpoint>/v1",  # from console.sakana.ai
        api_key="YOUR_SAKANA_API_KEY",
    )

    # Send one chat request; Fugu decides internally how to handle it.
    resp = client.chat.completions.create(
        model="fugu-ultra-20260615",  # or "fugu"
        messages=[
            {"role": "user",
             "content": "Reproduce the method in this paper and report the gap."},
        ],
    )

    print(resp.choices[0].message.content)

Token usage and cost are reported per request, so you can monitor spend in real time.
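One way to act on per-request reporting is to accumulate it client-side. The sketch below assumes the response carries the `usage` object found in OpenAI-style chat completions (`prompt_tokens`, `completion_tokens`, `total_tokens`). The `cost` attribute is a hypothetical field name, since the article does not say how Fugu exposes cost, and `SpendTracker` is an illustrative helper, not part of any SDK.

```python
from types import SimpleNamespace


class SpendTracker:
    """Accumulates token usage (and cost, if present) across responses."""

    def __init__(self) -> None:
        self.requests = 0
        self.total_tokens = 0
        self.total_cost = 0.0

    def record(self, resp) -> None:
        usage = getattr(resp, "usage", None)
        if usage is None:
            return  # nothing reported for this response
        self.requests += 1
        self.total_tokens += getattr(usage, "total_tokens", 0) or 0
        # "cost" is a hypothetical field name; check the Fugu docs for the real one.
        self.total_cost += getattr(usage, "cost", 0.0) or 0.0


# Stand-in response object so the sketch runs without network access.
fake_resp = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=120, completion_tokens=30,
                          total_tokens=150, cost=0.0021)
)

tracker = SpendTracker()
tracker.record(fake_resp)
print(tracker.requests, tracker.total_tokens, tracker.total_cost)
```

With a real client, `tracker.record(resp)` would be called right after each `client.chat.completions.create(...)` call.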

Community Reactions


Sakana Fugu: Early Community Sentiment

A manual review of public reaction on X and Hacker News, with links to every source. Captured June 22, 2026. 12 posts reviewed.

Sentiment split (n = 12): Supportive 3, Skeptical 6, Critical 3.

Early reaction skews skeptical. The “is this just a router or wrapper?” question dominates. The clearest supportive voices are Sakana-affiliated.

Press & analysis: Hacker News thread (50 pts), VentureBeat report, Clanker Cloud analysis.

Method: sentiment was assigned by hand from a small sample of public posts on June 22, 2026. This is not a statistical survey, and the split can shift as more reactions arrive. Two of the three supportive posts are from Sakana AI or its CEO. Quotes are shortened; follow each link for full context. The Reddit quote is as reported by VentureBeat.

Marktechpost · Sakana Fugu sentiment tracker

Sources: X · Hacker News · VentureBeat

Supportive

@SakanaAILabs (X; Sakana AI, official). Theme: Announcement. Launch announcement. Positions Fugu Ultra to match Fable and Mythos, without export-control risk. Source: https://x.com/SakanaAILabs/status/2068861630327443966

@hardmaru (X; David Ha, Sakana CEO). Theme: Vision. “Orchestration Models are the next frontier, beyond bigger models.” Frames it as a hedge against single-vendor risk. Source: https://x.com/hardmaru/status/2068884466056225025

Clanker Cloud (Blog; independent analysis). Theme: Analysis. Calls Fugu a productized orchestration layer and a healthy debate, but wants real observability into which agents ran. Source: https://clankercloud.ai/blog/sakana-fugu-release-model-orchestration-clanker-cloud

Skeptical

ed_mercer (Hacker News). Theme: Router framing. “So basically... openrouter?” Source: https://news.ycombinator.com/item?id=48625104

embedding-shape (Hacker News). Theme: Sovereignty. Asks how one API is not just swapping one single-vendor dependency for another. Source: https://news.ycombinator.com/item?id=48625312

bprasanna (Hacker News). Theme: Router framing. “Isn’t this what perplexity is?” Source: https://news.ycombinator.com/item?id=48625401

stygiansonic (Hacker News). Theme: Architecture. Reads it as a coordinator, not just fusion: an agent-of-agents, with token usage rising accordingly. Source: https://news.ycombinator.com/item?id=48625273

alasano (Hacker News). Theme: Architecture. Sees Fugu Ultra building a dynamic multi-model mini-plan, more than OpenRouter Fusion. Source: https://news.ycombinator.com/item?id=48625361

GreedyWorking1499 (Reddit, via VentureBeat). Theme: Router framing. “A highly advanced router/wrapper”, not a fundamental leap like Mythos or Fable, until proven otherwise. Source: https://venturebeat.com/orchestration/no-claude-fable-5-no-problem-sakana-achieves-frontier-performance-with-new-fugu-multi-model-auto-synthesis-system

Critical

@eliebakouch (X; Prime Intellect). Theme: Sovereignty / transparency. “This is not ‘AI sovereignty’.” Calls regular Fugu a router and flags opaque “Model A/B/C” baselines. Source: https://x.com/eliebakouch/status/2068939729811468503

@teortaxesTex (X; independent). Theme: Cost. Withholds excitement pending cost analysis. An orchestrator spending many frontier tokens may not beat best-of-n. Source: https://x.com/teortaxesTex/status/2068986775796687229

adamnemecek (Hacker News). Theme: Expectations. “Seems kinda underwhelming considering they raised like $400M.” Source: https://news.ycombinator.com/item?id=48625429


The post Sakana AI Launches Sakana Fugu: An Orchestration Model That Routes Tasks Across a Swappable Pool of Frontier LLMs appeared first on MarkTechPost.


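TLDR AI · ★4 · June 22, 2026, 09:00

Sakana AI's Fugu (3 minute read)

Sakana AI Labs has announced a new project or piece of research named after the fugu (pufferfish). The excerpt does not give specifics about the technology or its features.

AI News · ★4 · June 23, 2026, 01:11

Sakana AI's Fugu arrives as a multi-agent model that eases vendor lock-in

Japanese AI company Sakana AI has announced Fugu, an orchestration language model that calls a range of models to complete tasks, with the aim of reducing single-vendor risk. Users access this ecosystem through one endpoint.

Vercel Blog · ★4 · June 22, 2026, 09:00

Sakana Fugu Ultra now available on AI Gateway

Sakana AI's Fugu Ultra is now available on AI Gateway. Rather than a single model, it coordinates multiple frontier models to generate answers, with reasoning ability described as comparable to Claude and Fable 5.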
