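Cloudflare Blog · February 20, 2026, 23:00 · about 7 min read

Code Mode: give agents an entire API in 1,000 tokens

#AIAgents #MCP #ContextWindowOptimization #Cloudflare #APIIntegration #CodeExecution

TL;DR

Code Mode compresses the 2,500+ endpoints of the Cloudflare API into two tools and roughly 1,000 tokens, so AI agents can work with the whole API efficiently.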

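AI in-depth analysis · February 26, 2026, 15:41

Key points

Cloudflare announced an MCP server for AI agents built on its Code Mode technique.

It cuts token usage by 99.9% compared with a conventional MCP server (about 1,000 tokens vs. 1.17 million tokens).

Two tools, search() and execute(), give access to the entire Cloudflare API.

The Code Mode SDK is being open-sourced so that others can use the same technique.

Dynamic Worker Loader provides a safe environment for executing the generated code.

Impact analysis

This technique could substantially improve the practicality of AI agents by addressing the problem of using very large APIs efficiently. It changes how MCP, a de facto industry standard, can be implemented, and it is likely to influence other cloud providers.

Editorial comment

An approach that removes a major technical barrier to putting AI agents into practical use. It could become a new standard for integrating cloud APIs with AI.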

Code Mode: give agents an entire API in 1,000 tokens

Matt Carey

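Model Context Protocol (MCP) has become the standard way for AI agents to use external tools. But there is a tension at its core: agents need many tools to do useful work, yet every tool added fills the model's context window, leaving less room for the actual task.

Code Mode is a technique we first introduced for reducing context window usage during agent tool use. Instead of describing every operation as a separate tool, Code Mode lets the model write code against a typed SDK and executes that code safely in a Dynamic Worker Loader. The code acts as a compact plan: the model can explore tool operations, compose multiple calls, and return just the data it needs. Anthropic independently explored the same pattern in their Code Execution with MCP post.

Today we are introducing a new MCP server for the entire Cloudflare API (from DNS and Zero Trust to Workers and R2) that uses Code Mode. With just two tools, search() and execute(), the server provides access to the entire Cloudflare API over MCP while consuming only around 1,000 tokens. The footprint stays fixed, no matter how many API endpoints exist.

For a large API like the Cloudflare API, Code Mode reduces the number of input tokens used by 99.9%. An equivalent MCP server without Code Mode would consume 1.17 million tokens, more than the entire context window of the most advanced foundation models.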

Figure: Code Mode savings vs. native MCP, measured with tiktoken
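As a quick check on the headline figure, the reduction can be recomputed from the two token counts quoted above. The per-endpoint average is only a rough estimate: the post says "over 2,500" endpoints rather than giving an exact count, so the sketch assumes exactly 2,500.

```javascript
// Token counts quoted in the post.
const nativeMcpTokens = 1_170_000; // equivalent MCP server without Code Mode
const codeModeTokens = 1_000;      // approximate footprint of search() + execute()

// Fraction of input tokens saved by Code Mode.
const reduction = 1 - codeModeTokens / nativeMcpTokens;
const reductionPercent = (reduction * 100).toFixed(1);

// Rough average cost of one tool definition, assuming exactly 2,500 endpoints
// (the post only says "over 2,500", so this is an upper-bound estimate).
const tokensPerEndpoint = Math.round(nativeMcpTokens / 2500);

console.log(`${reductionPercent}% fewer input tokens`);   // 99.9% fewer input tokens
console.log(`~${tokensPerEndpoint} tokens per endpoint`); // ~468 tokens per endpoint
```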

You can start using this new Cloudflare MCP server today. We are also open-sourcing a new Code Mode SDK in the Cloudflare Agents SDK, so you can use the same approach in your own MCP servers and AI agents.

Server-side Code Mode

This new MCP server applies Code Mode server-side. Instead of thousands of tools, the server exports just two, search() and execute():

    [
      {
        "name": "search",
        "description": "Search the Cloudflare OpenAPI spec. All $refs are pre-resolved inline.",
        "inputSchema": {
          "type": "object",
          "properties": {
            "code": {
              "type": "string",
              "description": "JavaScript async arrow function to search the OpenAPI spec"
            }
          },
          "required": ["code"]
        }
      },
      {
        "name": "execute",
        "description": "Execute JavaScript code against the Cloudflare API.",
        "inputSchema": {
          "type": "object",
          "properties": {
            "code": {
              "type": "string",
              "description": "JavaScript async arrow function to execute"
            }
          },
          "required": ["code"]
        }
      }
    ]

To discover what it can do, the agent calls search().

When the agent is ready to act, it calls execute().
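For orientation, here is a sketch of what a call to one of these tools could look like on the wire. It assumes the standard MCP JSON-RPC `tools/call` request shape; the helper name `buildToolCall`, the request id, and the sample code string are illustrative and not part of the Cloudflare server.

```javascript
// Build an MCP "tools/call" JSON-RPC request whose only argument is a code string.
// buildToolCall is a hypothetical helper, not part of any SDK.
function buildToolCall(id, toolName, code) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: { code } },
  };
}

// The agent sends source code as a string, matching the "code" property
// declared in each tool's inputSchema.
const searchCode = "async () => Object.keys(spec.paths).length";
const request = buildToolCall(1, "search", searchCode);

console.log(JSON.stringify(request, null, 2));
```

Because both tools take a single `code` string, the request shape stays the same no matter which part of the API the agent is working with.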

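Both tools run the generated code inside a Dynamic Worker isolate: a lightweight V8 sandbox with no file system, no environment variables to leak through prompt injection, and external fetches disabled by default. Outbound requests can be explicitly controlled with outbound fetch handlers when needed.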
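The idea of explicitly controlling outbound requests can be illustrated with a small allowlist wrapper. This is a conceptual sketch only: `makeGuardedFetch`, the allowlist, and the stub fetch are assumptions for illustration. It does not use the Dynamic Worker Loader API, whose actual outbound handler interface is not shown in this post.

```javascript
// Wrap a fetch-like function so that only allowlisted hosts can be reached.
// Hypothetical helper; it illustrates the policy, not Cloudflare's implementation.
function makeGuardedFetch(innerFetch, allowedHosts) {
  const allowed = new Set(allowedHosts);
  return (input, init) => {
    const { hostname } = new URL(input);
    if (!allowed.has(hostname)) {
      // Deny by default, mirroring "external fetches disabled by default".
      throw new Error(`Outbound request to ${hostname} is not allowed`);
    }
    return innerFetch(input, init);
  };
}

// A stub stands in for the real network call so the sketch runs offline.
const stubFetch = (url) => ({ ok: true, url });
const guardedFetch = makeGuardedFetch(stubFetch, ["api.cloudflare.com"]);

console.log(guardedFetch("https://api.cloudflare.com/client/v4/zones").ok); // true

try {
  guardedFetch("https://example.com/exfiltrate");
} catch (err) {
  console.log(err.message); // Outbound request to example.com is not allowed
}
```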

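Example: Protecting an origin from DDoS attacks

Suppose a user tells their agent: "protect my origin from DDoS attacks." The agent's first step is to consult documentation. It might call the Cloudflare Docs MCP Server, use a Cloudflare Skill, or search the web directly. From the docs it learns: put Cloudflare WAF and DDoS protection rules in front of the origin.

Step 1: Search for the right endpoints

The agent calls search() with a function that filters the spec's paths for zone-level WAF and ruleset endpoints:

    async () => {
      const results = [];
      for (const [path, methods] of Object.entries(spec.paths)) {
        // Keep only zone-scoped WAF and ruleset paths
        if (path.includes('/zones/') &&
            (path.includes('firewall/waf') || path.includes('rulesets'))) {
          for (const [method, op] of Object.entries(methods)) {
            results.push({ method: method.toUpperCase(), path, summary: op.summary });
          }
        }
      }
      return results;
    }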

The server runs this code in a Workers isolate and returns:

    [
      { "method": "GET", "path": "/zones/{zone_id}/firewall/waf/packages", "summary": "List WAF packages" },
      { "method": "PATCH", "path": "/zones/{zone_id}/firewall/waf/packages/{package_id}", "summary": "Update a WAF package" },
      { "method": "GET", "path": "/zones/{zone_id}/firewall/waf/packages/{package_id}/rules", "summary": "List WAF rules" },
      { "method": "PATCH", "path": "/zones/{zone_id}/firewall/waf/packages/{package_id}/rules/{rule_id}", "summary": "Update a WAF rule" },
      { "method": "GET", "path": "/zones/{zone_id}/rulesets", "summary": "List zone rulesets" },
      { "method": "POST", "path": "/zones/{zone_id}/rulesets", "summary": "Create a zone ruleset" },
      { "method": "GET", "path": "/zones/{zone_id}/rulesets/phases/{ruleset_phase}/entrypoint", "summary": "Get a zone entry point ruleset" },
      { "method": "PUT", "path": "/zones/{zone_id}/rulesets/phases/{ruleset_phase}/entrypoint", "summary": "Update a zone entry point ruleset" },
      { "method": "POST", "path": "/zones/{zone_id}/rulesets/{ruleset_id}/rules", "summary": "Create a zone ruleset rule" },
      { "method": "PATCH", "path": "/zones/{zone_id}/rulesets/{ruleset_id}/rules/{rule_id}", "summary": "Update a zone ruleset rule" }
    ]

The full Cloudflare API spec has over 2,500 endpoints. The model narrowed that to the WAF and ruleset endpoints it needs, without any of the spec entering the context window.
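To experiment with search functions like the one above without touching the real server, the same filtering logic can be run against a small mock spec. The `mockSpec` object and the `findOperations` helper are made up for illustration; the real `spec` is the full Cloudflare OpenAPI document that the sandbox exposes to the agent's code.

```javascript
// A tiny stand-in for the OpenAPI document (hypothetical, three paths only).
const mockSpec = {
  paths: {
    "/zones/{zone_id}/rulesets": {
      get: { summary: "List zone rulesets" },
      post: { summary: "Create a zone ruleset" },
    },
    "/zones/{zone_id}/dns_records": {
      get: { summary: "List DNS records" },
    },
    "/accounts/{account_id}/rulesets": {
      get: { summary: "List account rulesets" },
    },
  },
};

// Return every operation whose path satisfies the predicate.
function findOperations(spec, matchesPath) {
  const results = [];
  for (const [path, methods] of Object.entries(spec.paths)) {
    if (!matchesPath(path)) continue;
    for (const [method, op] of Object.entries(methods)) {
      results.push({ method: method.toUpperCase(), path, summary: op.summary });
    }
  }
  return results;
}

// Same filter as the search above: zone-level WAF or ruleset endpoints.
const zoneRulesets = findOperations(
  mockSpec,
  (path) =>
    path.includes("/zones/") &&
    (path.includes("firewall/waf") || path.includes("rulesets"))
);

// Logs the GET and POST operations on /zones/{zone_id}/rulesets only.
console.log(zoneRulesets.map((r) => `${r.method} ${r.path}`));
```

Only the two matching operations come back; the DNS and account-level paths never reach the caller, which is the same property that keeps the real spec out of the context window.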

The model can also drill into a specific endpoint's schema before calling it. Here it inspects what phases are available on zone rulesets:

    async () => {
      const op = spec.paths['/zones/{zone_id}/rulesets']?.get;
      const items = op?.responses?.['200']?.content?.['application/json']?.schema;
      // Walk the schema to find the phase enum
      const props = items?.allOf?.[1]?.properties?.result?.items?.allOf?.[1]?.properties;
      return { phases: props?.phase?.enum };
    }

The result:

    {
      "phases": [
        "ddos_l4",
        "ddos_l7",
        "http_request_firewall_custom",
        "http_request_firewall_managed",
        "http_response_firewall_managed",
        "http_ratelimit",
        "http_request_redirect",
        "http_request_transform",
        "magic_transit",
        "magic_transit_managed"
      ]
    }

The agent now knows the exact phases it needs: ddos_l7 and http_request_firewall_managed.
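The fixed path through `allOf[1]` in the snippet above depends on the exact layout of the schema. A more tolerant variant walks the schema recursively until it finds a property with an `enum`. This is a sketch under assumptions: `findEnum` is a hypothetical helper, and `sampleSchema` is a simplified stand-in, not the real Cloudflare schema. It relies on the `$ref`s being resolved inline, as the tool description states.

```javascript
// Depth-first search of a JSON Schema fragment for `properties[propName].enum`.
function findEnum(schema, propName) {
  if (schema === null || typeof schema !== "object") return undefined;

  const candidate = schema.properties?.[propName]?.enum;
  if (Array.isArray(candidate)) return candidate;

  // Recurse into every nested value (allOf entries, items, properties, ...).
  for (const value of Object.values(schema)) {
    const found = findEnum(value, propName);
    if (found) return found;
  }
  return undefined;
}

// Simplified stand-in for a response schema with nested allOf wrappers.
const sampleSchema = {
  allOf: [
    { properties: { success: { type: "boolean" } } },
    {
      properties: {
        result: {
          type: "array",
          items: {
            allOf: [
              { properties: { id: { type: "string" } } },
              { properties: { phase: { enum: ["ddos_l7", "http_request_firewall_managed"] } } },
            ],
          },
        },
      },
    },
  ],
};

console.log(findEnum(sampleSchema, "phase"));
```

Inside a real search() call the helper would be declared within the async arrow function and applied to the operation's response schema.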

Step 2: Act on the API

The agent switches to execute() and calls the API with cloudflare.request(). It first lists the rulesets that exist on the zone:

    async () => {
      const response = await cloudflare.request({
        method: "GET",
        path: `/zones/${zoneId}/rulesets`
      });
      // Return only the fields needed to identify each ruleset
      return response.result.map(rs => ({ name: rs.name, phase: rs.phase, kind: rs.kind }));
    }

The result:

    [
      { "name": "DDoS L7", "phase": "ddos_l7", "kind": "managed" },
      { "name": "Cloudflare Managed", "phase": "http_request_firewall_managed", "kind": "managed" },
      { "name": "Custom rules", "phase": "http_request_firewall_custom", "kind": "zone" }
    ]

The agent sees that managed DDoS and WAF rulesets already exist. It can now chain calls to inspect their rules and update sensitivity levels in a single execution:

    async () => {
      // Get the current DDoS L7 entrypoint ruleset
      const ddos = await cloudflare.request({
        method: "GET",
        path: `/zones/${zoneId}/rulesets/phases/ddos_l7/entrypoint`
      });
      // Get the WAF managed ruleset
      const waf = await cloudflare.request({
        method: "GET",
        path: `/zones/${zoneId}/rulesets/phases/http_request_firewall_managed/entrypoint`
      });
    }

This entire operation, from searching the spec and inspecting a schema to listing rulesets and fetching DDoS and WAF configurations, took four tool calls.
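The chained snippet above fetches both rulesets but does not show what it returns. One way to keep the result small is to return only a per-ruleset summary. The sketch below runs against a mock `cloudflare.request()` so it is self-contained; the mock data, the `summarize` helper, the `zoneId` value, and the assumption that each entrypoint response carries a `result.rules` array are all illustrative.

```javascript
// Mock of the sandbox-provided client; returns canned data instead of calling the API.
const cloudflare = {
  async request({ path }) {
    const phase = path.split("/phases/")[1].split("/")[0];
    return {
      result: {
        phase,
        rules: [
          { id: "rule-1", action: "execute", enabled: true },
          { id: "rule-2", action: "execute", enabled: false },
        ],
      },
    };
  },
};
const zoneId = "example-zone-id"; // placeholder, not a real zone

// Reduce a ruleset to the few fields the agent needs to reason about.
function summarize(ruleset) {
  const rules = ruleset.rules ?? [];
  return {
    phase: ruleset.phase,
    ruleCount: rules.length,
    enabledCount: rules.filter((rule) => rule.enabled).length,
  };
}

// Same two GET calls as above, issued in parallel, returning only summaries.
const inspect = async () => {
  const phases = ["ddos_l7", "http_request_firewall_managed"];
  const responses = await Promise.all(
    phases.map((phase) =>
      cloudflare.request({
        method: "GET",
        path: `/zones/${zoneId}/rulesets/phases/${phase}/entrypoint`,
      })
    )
  );
  return responses.map((response) => summarize(response.result));
};

inspect().then((summaries) => console.log(summaries));
```

Returning summaries rather than full rulesets follows the point made earlier in the post: the code can compose several calls and hand back just the data the model needs.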

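The Cloudflare MCP server

We started with MCP servers for individual products. Want an agent that manages DNS? Add the DNS MCP server. Want Workers logs? Add the Workers Observability MCP server. Each server exported a fixed set of tools that mapped to API operations. This worked when the tool set was small, but the Cloudflare API has over 2,500 endpoints. No collection of hand-maintained servers could keep up.

The Cloudflare MCP server simplifies this. Two tools, roughly 1,000 tokens, and coverage of every endpoint in the API. When we add new products, the same search() and execute() tools expose them automatically.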

Our MCP server is built on the latest MCP specifications. It is OAuth 2.1 compliant, using Workers OAuth Provider to downscope the token to selected permissions approved by the user when connecting. The agent only gets the capabilities the user explicitly granted.
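Downscoping can be thought of as intersecting what the client asks for with what the user approves. The following is a conceptual sketch only; the `downscope` helper and the scope names are invented for illustration and do not reflect the Workers OAuth Provider API or Cloudflare's actual permission names.

```javascript
// Keep only the requested scopes that the user explicitly approved.
function downscope(requestedScopes, approvedScopes) {
  const approved = new Set(approvedScopes);
  return requestedScopes.filter((scope) => approved.has(scope));
}

// Hypothetical scope names.
const requested = ["dns:read", "dns:edit", "workers:edit"];
const approvedByUser = ["dns:read", "workers:edit", "r2:read"];

const granted = downscope(requested, approvedByUser);
console.log(granted); // [ 'dns:read', 'workers:edit' ]
```

In this sketch `dns:edit` is dropped because the user did not approve it, and `r2:read` is not granted because the client never requested it.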

For developers, this means you can use a simple agent loop and still give your agent access to the full Cloudflare API with built-in progressive capability discovery.
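A "simple agent loop" in this setting only has to route two tool names. The sketch below uses a scripted stand-in for the model and stub tool implementations so it runs offline and synchronously; `runAgentLoop`, the message shapes, and the stubs are all assumptions for illustration, not the Agents SDK API. A real loop would await the model and the MCP server.

```javascript
// Minimal loop: ask the model for the next step, run the tool it names, feed back the result.
function runAgentLoop(model, tools, userMessage, maxSteps = 5) {
  const transcript = [{ role: "user", content: userMessage }];
  for (let step = 0; step < maxSteps; step++) {
    const action = model(transcript);
    if (action.type === "final") return { answer: action.content, transcript };

    const tool = tools[action.tool];
    if (!tool) throw new Error(`Unknown tool: ${action.tool}`);
    transcript.push({ role: "tool", name: action.tool, content: tool(action.code) });
  }
  throw new Error("Step limit reached without a final answer");
}

// Stub tools: a real client would forward `code` to the MCP server instead.
const stubTools = {
  search: (code) => `search ran ${code.length} chars of code`,
  execute: (code) => `execute ran ${code.length} chars of code`,
};

// Scripted "model": search first, then execute, then answer.
const scriptedModel = (transcript) => {
  const toolTurns = transcript.filter((m) => m.role === "tool").length;
  if (toolTurns === 0) return { type: "tool", tool: "search", code: "async () => 1" };
  if (toolTurns === 1) return { type: "tool", tool: "execute", code: "async () => 2" };
  return { type: "final", content: "done" };
};

const outcome = runAgentLoop(scriptedModel, stubTools, "protect my origin");
console.log(outcome.answer, outcome.transcript.length); // done 3
```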

Comparing approaches to context reduction

Several approaches have emerged to reduce how many tokens MCP tools consume:

Client-side Code Mode was our first experiment. The model writes TypeScript against typed SDKs and runs it in a Dynamic Worker Loader on the client. The tradeoff is that it requires the agent to ship with secure sandbox access. Code Mode is implemented in Goose and in Anthropic's Claude SDK as Programmatic Tool Calling.

Command-line interfaces are another path. CLIs are self-documenting and reveal capabilities as the agent explores. Tools like OpenClaw and Moltworker convert MCP servers into CLIs using MCPorter to give agents progressive disclosure. The limitation is obvious: the agent needs a shell, which not every environment provides and which introduces a much broader attack surface than a sandboxed isolate.

Dynamic tool search, as used by Anthropic in Claude Code, surfaces a smaller set of tools hopefully relevant to the current task. It shrinks context use but now requires a search function that must be maintained and evaluated, and each matched tool still uses tokens.

Each approach solves a real problem. But for MCP servers specifically, server-side Code Mode combines their strengths: fixed token cost regardless of API size, no modifications needed on the agent side, progressive discovery built in, and safe execution inside a sandboxed isolate. The agent just calls two tools with code. Everything else happens on the server.

Get started today

The Cloudflare MCP server is available now. Point your MCP client at the server URL and you'll be redirected to Cloudflare to authorize and select the permissions to grant to your agent. Add this config to your MCP client:

    {
      "mcpServers": {
        "cloudflare-api": {
          "url": "https://mcp.cloudflare.com/mcp"
        }
      }
    }

For CI/CD, automation, or if you prefer managing tokens yourself, create a Cloudflare API token with the permissions you need. Both user tokens and account tokens are supported and can be passed as bearer tokens in the Authorization header.
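For the token-based setup, each request to the server carries the token in a standard bearer header. A minimal sketch, assuming the token is supplied through an environment variable named `CLOUDFLARE_API_TOKEN` (a name chosen here for illustration); `buildAuthHeaders` is a hypothetical helper, and how the header is configured depends on your MCP client.

```javascript
// Build HTTP headers carrying a Cloudflare API token as a bearer credential.
function buildAuthHeaders(token) {
  if (!token) throw new Error("Missing API token");
  return { Authorization: `Bearer ${token}` };
}

// Fall back to a placeholder so the sketch runs without a real token.
const apiToken =
  (typeof process !== "undefined" && process.env.CLOUDFLARE_API_TOKEN) ||
  "<your-api-token>";
const authHeaders = buildAuthHeaders(apiToken);

console.log(Object.keys(authHeaders)); // [ 'Authorization' ]
```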

More information on different MCP setup configurations can be found at the Cloudflare MCP repository.

Looking forward

Code Mode solves context costs for a single API. But agents rarely talk to one service. A developer's agent might need the Cloudflare API alongside GitHub, a database, and an internal docs server. Each additional MCP server brings the same context window pressure we started with.

Cloudflare MCP Server Portals let you compose multiple MCP servers behind a single gateway with unified auth and access control. We are building a first-class Code Mode integration for all your MCP servers, and exposing them to agents with built-in progressive discovery and the same fixed-token footprint, regardless of how many services sit behind the gateway.
