Claude Blog·2026年4月10日 09:00·約11分で読める

エージェントのように見る：Claude Codeにおけるツール設計の方法

#AIエージェント #ツール設計 #Claude Code #プロンプトエンジニアリング #ユーザーインタラクション #開発方法論

TL;DR

Claude Codeチームは、エージェントの視点に立ってツール設計を行う方法論を紹介し、AskUserQuestionツール開発における試行錯誤のプロセスを具体的に説明している。

AI深層分析2026年4月11日 05:41

注目/ 5段階

深度40%

キーポイント

エージェント視点のツール設計哲学

Claude Codeチームは、モデルの能力に合わせたツール設計の重要性を強調し、数学問題解決の例えを用いて、汎用ツールと特化ツールのバランスを考えるフレームワークを提示している。

AskUserQuestionツールの開発プロセス

ユーザーとの対話効率を向上させるため、最初はExitPlanToolへのパラメータ追加を試みたが混乱を招き、その後出力フォーマット変更へと設計を進化させた実践的経験を共有している。

ツール設計の実践的アプローチ

ツールの追加・削除の判断基準を明確化し、モデルの出力を注意深く観察し実験することで、エージェントの能力を理解する方法論を提案している。

AskUserQuestionツールの開発

Claudeがユーザーに質問するための専用ツールを開発し、構造化された出力を促し、ユーザーに複数の選択肢を提供できるようにした。

TodoからTaskツールへの進化

モデルの能力向上に伴い、TodoWriteツールは制約となり、より柔軟なTaskツールに置き換えられた。Taskツールは依存関係の管理やサブエージェント間の連携を可能にする。

ツール設計の継続的見直し

モデルの能力が向上すると、以前必要だったツールが制約になる可能性があるため、ツールの必要性に関する前提を常に見直すことが重要である。

プログレッシブディスクロージャーの導入

RAGからGrepツールを経て、エージェントが自らコンテキストを構築できるように進化し、プログレッシブディスクロージャーによって必要な情報を段階的に発見できるようになった。

影響分析・編集コメントを表示

影響分析

この記事は、AIエージェント開発における実践的なツール設計方法論を提供しており、開発者コミュニティに具体的な設計指針を与えると同時に、Claudeプラットフォームの技術的成熟度を示している。

編集コメント

企業ブログという性質上PR要素はあるが、具体的な開発プロセスの失敗談を含めており、実践的な知見として参考になる内容。ツール設計の哲学と実装のバランスがよく説明されている。

モデルの視点から考えることで、Claude Codeチームがツールをどのように設計、テスト、進化させているかを学びましょう。

カテゴリーClaude Code

製品Claude Code

日付2026年4月10日

読了時間5分

共有リンクをコピーhttps://claude.com/blog/seeing-like-an-agent

エージェントハーネスを構築する上で最も難しい部分の一つは、そのツールを構築することです。

Claudeはツール呼び出しを通じて完全に動作しますが、Claude APIではbash、スキル、コード実行などのプリミティブを使用してツールを構築する方法がいくつかあります。（@RLanceMartinの新しい記事で、Claude APIでのプログラムによるツール呼び出しについて詳しく読むことができます）。

では、エージェントのツールをどのように設計すればよいのでしょうか？ bashやコード実行のような汎用ツールを一つ与えるべきでしょうか？それとも、ユースケースごとに特化した50のツールを与えるべきでしょうか？

モデルの立場に立って考えるために、難しい数学の問題を与えられたと想像してみてください。それを解決するために、どんなツールが欲しいでしょうか？それはあなた自身のスキルセットに依存するでしょう！

紙が最低限必要ですが、手動計算に制限されます。電卓の方が良いでしょうが、より高度な機能の操作方法を知る必要があります。最も速くて強力な選択肢はコンピューターですが、コードを書いて実行する方法を知る必要があります。

これは、あなたのエージェントを設計するための有用な枠組みです。エージェント自身の能力に合わせて形作られたツールを与えたいのです。しかし、それらの能力が何であるかをどうやって知るのでしょうか？注意を払い、その出力を読み、実験します。あなたはエージェントのように見ることを学ぶのです。

もしあなたがエージェントを構築しているなら、私たちが直面したのと同じ疑問に直面するでしょう：いつツールを追加するか、いつツールを削除するか、そしてその違いをどう見分けるか。以下は、Claude Codeを構築しながら私たちがそれらにどう答えたか、最初にどこで間違えたかを含めて説明します。

AskUserQuestionツールによるエリシテーションの改善

AskUserQuestionツールを構築する際、私たちの目標はClaudeの質問能力（しばしばエリシテーションと呼ばれる）を改善することでした。

Claudeはプレーンテキストで単に質問をすることができましたが、それらの質問に答えることに不必要に時間がかかると感じられました。この摩擦をどうやって減らし、ユーザーとClaudeの間のコミュニケーションの帯域幅を増やせるでしょうか？

試行1: ExitPlanToolの編集

最初に試したアプローチは、ExitPlanToolにパラメータを追加して、計画とともに質問の配列を持たせることでした。これは実装が最も簡単な修正でしたが、計画と計画に関する一連の質問を同時に求めていたため、Claudeを混乱させました。もしユーザーの回答が計画の内容と矛盾したらどうなるでしょうか？ ClaudeはExitPlanToolを2回呼び出す必要があるでしょうか？この戦術はうまくいかないとわかっていたので、振り出しに戻りました。（プロンプトキャッシュに関する私たちの投稿で、なぜExitPlanToolを作ったのかについて詳しく読むことができます）

試行2: 出力形式の変更

次に、Claudeの出力指示を更新して、質問をするために使用できる少し修正されたマークダウン形式を提供することを試みました。例えば、ブラケット内に代替案を含む箇条書きの質問リストを出力するように依頼することができました。その後、その質問を解析してユーザー向けのUIとしてフォーマットすることができました。

Claudeは通常この形式を生成できましたが、確実ではありませんでした。余分な文章を追加したり、選択肢を省略したり、構造を完全に放棄したりしました。次のアプローチへ進みます。

試行3: AskUserQuestionツール

最終的に、Claudeがいつでも呼び出すことができるツールを作成することに落ち着きましたが、特に計画モード中にそうするよう促されました。ツールがトリガーされると、質問を表示するモーダルを表示し、ユーザーが回答するまでエージェントのループを停止させました。

このツールにより、構造化された出力をClaudeに促すことができ、Claudeがユーザーに複数の選択肢を与えることを確実にするのに役立ちました。また、ユーザーがこの機能を組み立てる方法も提供しました。例えば、Agent SDKで呼び出したり、スキルで参照したりすることです。

最も重要なことに、Claudeはこのツールを呼び出すことを好んでいるようで、その出力はうまく機能していることがわかりました。結局のところ、最高に設計されたツールでも、Claudeがそれを呼び出す方法を理解していなければ機能しません。

これがClaude Codeにおけるエリシテーションの最終形態でしょうか？私たちはそうは思いません。 Claudeがより能力を高めるにつれて、それを支えるツールも進化しなければなりません。次のセクションでは、かつて役立っていたツールが邪魔になり始めた事例を示します。

能力に合わせた更新: タスクとTo-doリスト

Claude Codeを最初にローンチしたとき、モデルが軌道に乗るためにTo-doリストが必要であることに気づきました。 To-doは作業の開始時に書き込まれ、モデルが作業を進めるにつれてチェックオフされることができます。これを行うために、私たちはClaudeにTodoWriteツールを与えました。これはTo-doを書き込んだり更新したりしてユーザーに表示するものです。

しかし、それでもなお、Claudeがやるべきことを忘れているのをよく目にしました。適応するために、5ターンごとにシステムリマインダーを挿入して、Claudeにその目標を思い出させました。

モデルが改善されるにつれて、To-doリストは制限的であると感じられるようになりました。 To-doリストのリマインダーを送られることで、Claudeはコースを変更する必要があると気づいたときにリストを修正するのではなく、リストに固執しなければならないと考えてしまうようになりました。また、Opus 4.5がサブエージェントの使用もはるかに上手くなっているのを見ましたが、サブエージェントは共有のTo-doリストをどのように調整できるでしょうか？

これを見て、私たちはTodoWrite機能をTaskツールに置き換えました。 To-doリストがモデルを軌道に乗せることに焦点を当てているのに対し、タスクはエージェント同士がコミュニケーションを取るのに役立ちます。タスクには依存関係を含めることができ、サブエージェント間で更新を共有し、モデルはそれらを変更したり削除したりすることができます。

モデルの能力が向上するにつれて、あなたのモデルがかつて必要としていたツールが、今ではそれらを制約しているかもしれません。どのツールが必要かについての以前の仮定を常に見直すことが重要です。これがまた、能力プロファイルがかなり類似している、サポートするモデルの小さなセットに固執することが有用である理由でもあります。

検索インターフェースの設計

私たちが構築した最も重要なツールは、Claudeが独自のコンテキストを見つけられるようにするものです。

Claude Codeが最初に内部でリリースされたとき、私たちはRAGを使用しました：ベクトルデータベースがコードベースを事前にインデックス化し、ハーネスは関連するスニペットを取得して各応答の前にClaudeに渡します。 RAGは強力で高速でしたが、インデックス化とセットアップが必要であり、さまざまな環境にわたって脆弱である可能性がありました。最も重要なことに、Claudeはコンテキストを見つける代わりに、このコンテキストを与えられていました。

しかし、もしClaudeがウェブ上で検索できるなら、なぜあなたのコードベースも検索できないのでしょうか？ GrepツールをClaudeに与えることで、ファイルを検索し、自分自身でコンテキストを構築させることができました。

Claudeがより賢くなるにつれて、適切なツールを与えられると、コンテキストを構築することがますます上手になります。

Agent Skillsを導入したとき、私たちはプログレッシブディスクロージャーの概念を形式化しました。これは、エージェントが探索を通じて関連するコンテキストを段階的に発見できるようにするものです。

Claudeは今やスキルファイルを読むことができ、それらのファイルはモデルが再帰的に読むことができる他のファイルを参照することができます。実際、スキルの一般的な使用法は、APIの使用方法やデータベースのクエリ方法についての指示を与えるなど、Claudeにさらなる検索能力を追加することです。

1年の間に、Claudeは独自のコンテキストをほとんど構築できない状態から、必要な正確なコンテキストを見つけるために複数のレイヤーのファイルにわたってネストされた検索を行うことができるようになりました。

プログレッシブディスクロージャーは現在、ツールを追加せずに新機能を追加するために私たちが使用する一般的な技術です。次のセクションで、その理由を説明します。

プログレッシブディスクロージャー: Claude Code Guideエージェント

Claude Codeは現在約20のツールを持っており、私たちのチームはClaudeが最も効果的であるためにそれらすべてが必要かどうかを頻繁に見直しています。新しいツールを追加するためのハードルは高く、これはモデルに考えるべき選択肢を一つ増やすからです。

例えば、私たちはClaudeがClaude Codeの使用方法について十分に知らないことに気づきました。 MCPを追加する方法やスラッシュコマンドが何をするのかを尋ねると、返答することができませんでした。

この情報すべてをシステムプロンプトに入れることもできましたが、ユーザーがこれについてめったに尋ねないことを考えると、コンテキストの腐敗を引き起こし、Claude Codeの主な仕事であるコードを書くことを妨害したでしょう。

代わりに、私たちはプログレッシブディスクロージャーを試みました：必要なときに読み込んで検索できるように、Claudeにそのドキュメントへのリンクを与えました。これは機能しましたが、Claudeはユーザーが一文で得られたかもしれない答えを見つけるために、大量のドキュメントをコンテキストに引き込んでいました。

そこで私たちはClaude Code Guideを構築しました — ユーザーがClaude Code自体について尋ねたときにClaudeが呼び出すサブエージェントです。サブエージェントは独自のコンテキストでドキュメント検索を行い、検索方法や抽出すべき内容についての詳細な指示に従い、答えだけを返します。メインエージェントのコンテキストはクリーンなままです。

これは完璧な解決策ではありませんが（Claudeに自身のセットアップ方法について尋ねると、まだ混乱することがあります）、新しいツールを追加することなく、Claudeのアクションスペースにものを追加することができました。

エージェントのように見ることは、科学ではなく芸術です

あなたのモデルのためのツールを設計することは、科学であると同様に芸術でもあります。それは、使用しているモデル、エージェントの目標、そしてそれが動作している環境に大きく依存します。

私たちの最良のアドバイスは？頻繁に実験し、あなたの出力を読み、新しいことを試してください。そして最も重要なことに、エージェントのように見ようと試みてください。

今日からClaude Codeを始めましょう。

著者について: Thariq ShihiparはAnthropicの技術スタッフの一員で、Claude Codeに取り組んでいます。

原文を表示

Seeing like an agent: how we design tools in Claude Code

Learn how the Claude Code team designs, tests, and evolves tools by thinking from the model's point of view.

CategoryClaude Code

ProductClaude Code

DateApril 10, 2026

Reading time5min

ShareCopy linkhttps://claude.com/blog/seeing-like-an-agent

One of the hardest parts about building an agent harness is constructing its tools.

Claude acts completely through tool calling, but there are a number of ways tools can be constructed in the Claude API with primitives like bash, skills and code execution. (You can read more about programmatic tool calling on the Claude API in @RLanceMartin's new article).

So how do you design your agents' tools? Do you give it one general-purpose tool like bash or code execution? Or fifty specialized tools, one for each use case?

To put yourself in the mind of the model, imagine being given a difficult math problem. What tools would you want in order to solve it? It would depend on your own skill set!

Paper would be the minimum, but you’d be limited by manual calculations. A calculator would be better, but you would need to know how to operate the more advanced options. The fastest and most powerful option would be a computer, but you would have to know how to use it to write and execute code.

This is a useful framework for designing your agent. You want to give it tools that are shaped to its own abilities. But how do you know what those abilities are? You pay attention, read its outputs, experiment. You learn to see like an agent.

If you're building an agent, you'll face the same questions we did: when to add a tool, when to remove one, and how to tell the difference. Here's how we've answered them while building Claude Code, including where we got it wrong first.

Improving elicitation with the AskUserQuestion tool

When building the AskUserQuestion tool, our goal was to improve Claude’s ability to ask questions (often called elicitation).

While Claude could just ask questions in plain text, we found answering those questions felt like they took an unnecessary amount of time. How could we lower this friction and increase the bandwidth of communication between the user and Claude?

Attempt 1: Editing the ExitPlanTool

The first approach we tried was adding a parameter to the ExitPlanTool to have an array of questions alongside the plan. This was the easiest fix to implement, but it confused Claude because we were simultaneously asking for a plan and a set of questions about the plan. What if the user’s answers conflicted with what the plan said? Would Claude need to call the ExitPlanTool twice? We knew this tactic wouldn’t work, so we went back to the drawing board. (You can read more about why we made an ExitPlanTool in our post on prompt caching)

Attempt 2: Changing output format

Next, we tried updating Claude’s output instructions to serve a slightly modified markdown format that it could use to ask questions. For example, we could ask it to output a list of bullet point questions with alternatives in brackets. We could then parse and format that question as UI for the user.

Claude could usually produce this format, but not reliably. It would append extra sentences, drop options, or abandon the structure altogether. Onto the next approach.

Attempt 3: The AskUserQuestion Tool

Finally, we landed on creating a tool that Claude could call at any point, but it was particularly prompted to do so during plan mode. When the tool triggered we would show a modal to display the questions and block the agent's loop until the user answered.

This tool allowed us to prompt Claude for a structured output and it helped us ensure that Claude gave the user multiple options. It also gave users ways to compose this functionality, for example calling it in the Agent SDK or using referring to it in skills.

Most importantly, Claude seemed to like calling this tool and we found its outputs worked well. After all, even the best designed tool doesn’t work if Claude doesn’t understand how to call it.

Is this the final form of elicitation in Claude Code? We doubt it. As Claude gets more capable, the tools that serve it have to evolve too. The next section shows a case where a tool that once helped started getting in the way.

Updating with capabilities: tasks & todos

When we first launched Claude Code, we realized that the model needed a todo list to keep it on track. Todos could be written at the start and checked off as the model did work. To do this we gave Claude the TodoWrite tool, which would write or update Todos and display them to the user.

But even then, we often saw Claude forgetting what it had to do. To adapt, we inserted system reminders every 5 turns that reminded Claude of its goal.

As models improved, they found To-do lists limiting. Being sent reminders of the todo list made Claude think that it had to stick to the list instead of modifying it when it realized it needed to change course. We also saw Opus 4.5 also get much better at using subagents, but how could subagents coordinate on a shared todo list?

Seeing this, we replaced the TodoWrite feature with the Task tool . Whereas todos are focused on keeping the model on track, tasks help agents communicate with each other. Tasks could include dependencies, share updates across subagents and the model could alter and delete them.

As model capabilities increase, the tools that your models once needed might now be constraining them. It’s important to constantly revisit previous assumptions on what tools are needed. This is also why it's useful to stick to a small set of models to support that have a fairly similar capabilities profile.

Designing a search interface

The most consequential tools we've built are the ones that let Claude find its own context.

When Claude Code was first released internally, we used RAG: a vector database would pre-index the codebase, and the harness would retrieve relevant snippets and hand them to Claude before each response.. While RAG was powerful and fast, it required indexing and setup and could be fragile across a host of different environments. Most importantly, Claude was given this context instead of finding the context itself.

But if Claude could search on the web, why couldn’t it also search your codebase? By giving Claude a Grep tool, we could let it search for files and build context itself.

As Claude gets smarter, it becomes increasingly good at building its context when given the right tools.

When we introduced Agent Skills, we formalized the idea of progressive disclosure, which allows agents to incrementally discover relevant context through exploration.

Claude could now read skill files and those files could then reference other files that the model could read recursively. In fact, a common use of skills is to add more search capabilities to Claude like giving it instructions on how to use an API or query a database.

Over the course of a year, Claude went from not really being able to build its own context to being able to do nested search across several layers of files to find the exact context it needed.

Progressive disclosure is now a common technique we use to add new functionality without adding a tool. In the next section, we explain why.

Progressive disclosure: the Claude Code Guide agent

Claude Code currently has ~20 tools, and our team frequently revisits if we need all of them for Claude to be most effective. The bar to add a new tool is high, because this gives the model one more option to think about.

For example, we noticed that Claude did not know enough about how to use Claude Code. If you asked it how to add a MCP or what a slash command did, it would not be able to reply.

We could have put all of this information in the system prompt, but given that users rarely asked about this, it would have added context rot and interfered with Claude Code’s main job: writing code.

Instead, we tried progressive disclosure: we gave Claude a link to its docs that it could load and search when needed. This worked, but Claude would pull large chunks of documentation into context to find an answer the user could have gotten in one sentence.

So we built the Claude Code Guide — a subagent Claude calls whenever a user asks about Claude Code itself. The subagent does the doc-searching in its own context, follows detailed instructions on how to search and what to extract, and hands back only the answer. The main agent's context stays clean.

While this isn’t a perfect solution (Claude can still get confused when you ask it about how to set itself up), we were able to add things to Claude's action space without adding a new tool.

Seeing like an agent is an art, not a science

Designing the tools for your models is as much an art as it is a science. It depends heavily on the model you're using, the goal of the agent and the environment it’s operating in.

Our best advice? Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.

Get started with Claude Code today.

About the author: Thariq Shihipar is a member of technical staff at Anthropic, working on Claude Code.

PrevPrev0/5NextNexteBook

curl -fsSL https://claude.ai/install.sh | bashCopy command to clipboardirm https://claude.ai/install.ps1 | iexCopy command to clipboardOr read the documentationTry Claude CodeTry Claude CodeTry Claude CodeDeveloper docsDeveloper docsDeveloper docsRelated posts

Explore more product news and best practices for teams building with Claude.

How and when to use subagents in Claude Code

Claude CodeHow and when to use subagents in Claude CodeHow and when to use subagents in Claude CodeHow and when to use subagents in Claude CodeHow and when to use subagents in Claude Code Mar 19, 2026Product management on the AI exponential

Claude CodeProduct management on the AI exponential Product management on the AI exponential Product management on the AI exponential Product management on the AI exponential Feb 23, 2026How AI helps break the cost barrier to COBOL modernization

Claude CodeHow AI helps break the cost barrier to COBOL modernizationHow AI helps break the cost barrier to COBOL modernizationHow AI helps break the cost barrier to COBOL modernizationHow AI helps break the cost barrier to COBOL modernization Feb 20, 2026Bringing automated preview, review, and merge to Claude Code on desktop

Claude CodeBringing automated preview, review, and merge to Claude Code on desktopBringing automated preview, review, and merge to Claude Code on desktopBringing automated preview, review, and merge to Claude Code on desktopBringing automated preview, review, and merge to Claude Code on desktopTransform how your organization operates with Claude

Get the developer newsletter

Product updates, how-tos, community spotlights, and more. Delivered monthly to your inbox.

SubscribeSubscribePlease provide your email address if you'd like to receive our monthly developer newsletter. You can unsubscribe at any time.

この記事をシェア

Anthropic Research★32026年3月6日 09:00

2026年3月6日 Frontier Red TeamによるClaudeのCVE-2026-2796エクスプロイトのリバースエンジニアリング

Frontier Red Teamが、Claudeの脆弱性CVE-2026-2796を悪用するエクスプロイトをリバースエンジニアリングした。