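Simon Willison Blog · May 7, 2026 00:58 · approx. 9 min read

Live blog: Code w/ Claude 2026

#Claude Code #Multi-agent #LLM #Anthropic #Infrastructure scaling

TL;DR

Anthropic announced no new model at its 2026 event. Instead it announced infrastructure and agent improvements, including higher Claude Code rate limits and a data center partnership with SpaceX.

AI in-depth analysis · July 5, 2026 06:18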

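Key points

A strategic shift: no new model

The event included no new language model announcement. The stated focus was on making existing products easier to use and more capable.

Relaxed Claude Code limits and expanded infrastructure

The Claude Code five-hour limit for Pro, Max and Enterprise users was doubled, and a partnership to use SpaceX's Colossus data center substantially expands available compute.

Progress in agent features and design ability

Autonomous task execution was strengthened through multi-agent orchestration, context windows that feel close to "infinite", and the visual design ability of Opus 4.7.

Explosive growth in API volume

API volume on the Anthropic platform is up 17x year-on-year, which points to rapidly growing adoption among developers.

Cost savings from the advisor strategy

Using Opus as an advisor to Sonnet produced better benchmark results, and one customer reported frontier-model quality at 5x lower cost.

New features in Claude Managed Agents

Three new features were announced: multi-agent orchestration for solving complex tasks, "Outcomes" for working toward a defined goal, and "Dreaming", which learns from past sessions in order to self-improve.

Trust in Claude Code and how it has evolved

Cat Wu thanked the users who have trusted Claude Code on their production databases since the time when Sonnet 3.7 was the top model.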

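Impact analysis

This announcement suggests that the AI industry is shifting from a pure contest over model performance toward the infrastructure and reliable agent features needed to support large-scale production use. In particular, the SpaceX partnership eases compute constraints, which makes long-running, complex development tasks and autonomous business automation more realistic options.

Editorial comment

Readers who were hoping for a new model may find this underwhelming, but it is an important step toward establishing the infrastructure that practical AI agents depend on. With compute constraints eased, more complex and longer-running autonomous tasks should become feasible.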

I’m at Anthropic’s Code w/ Claude event in 2026, and I’ll be live blogging the keynote and a few other notes throughout the day.

08:56 I'm now seated in the main room. The keynote starts at 9am.

09:03 Cute opening animation featuring the little orange Claude pixel art character.

09:05 On stage: Anthropic's Chief Product Officer Ami Vora - who replaced Mike Krieger earlier this year (he's now the co-lead of Anthropic Labs.)

09:07 Ami is sharing anecdotes about developer velocity - Scott MacVicar's team at Stripe, Felicia Curcuru's team at Binti.

09:07 (This is all a little bit too inspirational for my liking, I'm hoping for some new model / product / feature announcements!)

09:09 Now talking about Mythos reading the OpenBSD source tree and finding a 27-year-old vulnerability, to illustrate model improvement.

09:09 API volume is up 17x year-on-year on the Anthropic platform.

09:09 No new model today. "Today is about how we are making our products work better for you."

09:11 Updates to Claude managed agents - multi-agent orchestration. Claude Code routines.

"Most people will experience AI through one of the things you've built on the Claude platform"

09:12 "Sharing a little exciting news" - as of today, increased rate limits for developers on Claude Code and the API.

Doubling Claude Code five hour limit for Pro, Max, Enterprise customers.

"We're partnering with SpaceX to use all of the capacity of their Colossus data center".

09:14 Now up: Dianne Na Penn - Head of Product for Research.

09:16 Talking about the importance of tool use, long context, computer use, adaptive thinking, visual design, agentic loops. "The model intelligence - the core foundation - has got strong enough to support all of this."

09:17 Now talking about Claude Design. "Opus 4.7 has a real taste for visual design".

09:18 Higher judgment and code taste. "Context windows that feel infinite" when combined with high quality memory. Multi-agent coordination to help achieve big goals that could not be achieved using a single instance.

09:19 This time last year models could work for minutes. Today many people have them running for hours on end.

09:20 (So far the only news in this session has been the SpaceX Colossus deal. And I guess the 17x increase in API traffic since last year.)

09:21 Classic advice: design for the next model. Build things that don't quite work today on the assumption that they'll start working with a model upgrade in the future.

09:22 Dianne says that the teams getting the most out of Claude are focusing on automated evals, simple scaffolding and imaginative uses of models that others haven't figured out yet.

09:23 Now: Katelyn Lesse and Angela Kiang.

09:24 This bit is all about the Claude Platform, and "getting the right outcomes" from it.

09:25 "The advisor strategy" - where Opus can provide advice on demand to smaller models. They got better benchmark results for Sonnet calling Opus as an advisor - both higher benchmarks and lower cost. One customer, eve, got "frontier model quality at 5x lower cost".
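Here is a minimal sketch of how I picture the advisor pattern, not Anthropic's actual implementation. Everything in it is an assumption for illustration: `small_model` and `advisor_model` are plain Python callables standing in for Sonnet and Opus (no real API calls), and the self-reported `confidence` score and the `threshold` are hypothetical. The point is only that the expensive model is consulted on demand rather than on every task.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Draft:
    answer: str
    confidence: float  # hypothetical self-reported confidence in [0, 1]


def solve_with_advisor(
    task: str,
    small_model: Callable[[str, Optional[str]], Draft],
    advisor_model: Callable[[str, str], str],
    threshold: float = 0.7,
) -> tuple[str, int]:
    """Return (answer, number_of_advisor_calls)."""
    # First pass: the cheaper model works alone.
    draft = small_model(task, None)
    if draft.confidence >= threshold:
        return draft.answer, 0
    # Low confidence: ask the larger model for advice, then retry once.
    advice = advisor_model(task, draft.answer)
    revised = small_model(task, advice)
    return revised.answer, 1


# Stub models so the sketch runs without any network access.
def fake_small(task: str, advice: Optional[str]) -> Draft:
    if advice is None:
        confidence = 0.9 if "easy" in task else 0.3
        return Draft(answer=f"draft for {task}", confidence=confidence)
    return Draft(answer=f"revised for {task} using: {advice}", confidence=0.8)


def fake_advisor(task: str, draft_answer: str) -> str:
    return "check the edge cases"
```

If the cost claim holds, it presumably comes from most tasks taking the first branch, so the larger model is billed only for the hard cases.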

09:26 Speed and scale are difficult to achieve at the same time. Claude Managed Agents is meant to help teams ship "10 times faster". It bundles a lot of the best practices out of the box - things like memory.

09:28 Today: three new features for Claude Managed Agents. Multi-agent orchestration, for creating fleets of agents to solve complex tasks. Outcomes to set what success looks like so Claude can iterate and get it done - sounds like a Ralph loop. And "Dreaming" - Claude can inspect its previous sessions and figure out what it missed and self-improve.
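To make the Ralph loop comparison concrete, here is a rough sketch of an outcome-driven loop: keep attempting the task until a success check passes or an iteration budget runs out. This is my guess at the general shape, not the Outcomes API. The names `attempt`, `meets_outcome` and `max_iterations` are hypothetical, and the "agent" is a stub function.

```python
from typing import Callable


def run_until_outcome(
    attempt: Callable[[int, str], str],
    meets_outcome: Callable[[str], bool],
    max_iterations: int = 5,
) -> tuple[str, int, bool]:
    """Return (last_result, iterations_used, succeeded)."""
    feedback = ""
    result = ""
    for iteration in range(1, max_iterations + 1):
        # Each attempt sees feedback from the previous failed attempt.
        result = attempt(iteration, feedback)
        if meets_outcome(result):
            return result, iteration, True
        feedback = f"attempt {iteration} did not meet the success criteria"
    # Budget exhausted without meeting the outcome.
    return result, max_iterations, False


# Stub agent that only succeeds on its third try.
def fake_attempt(iteration: int, feedback: str) -> str:
    return "tests pass" if iteration >= 3 else "tests fail"
```

The iteration cap matters: without it, a success criterion the model can never satisfy would loop forever.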

09:28 Now an example, building a hypothetical product for landing drones on the moon.

09:30 Multiple agents to get this work done - a Commander, Detector and Navigator. I'm getting a little lost in the demo, hoping they publish detailed notes after the session.

09:32 Dreaming looks *really* interesting. You can run a task overnight which examines previous sessions and creates new memories - in this example it created a descent-playbook.md file.

09:33 Multiagent orchestration and Outcomes are both public beta. Dreaming is a research preview. I'm not sure what the difference between those two categories is.

09:34 Now up: Cat Wu, Head of Product, Claude Code.

09:34 "Thank you for trusting Claude Code on your production databases back when Sonnet 3.7 was our top model." (Nice.)

09:36 Here's documentation on Dreams. Looks like you need to request access to try it out (hence "research preview".)

09:37 Claude Code started with the CLI - all the latest customizations, the most control. Then added IDE - the same agents but in a UI where you can more easily follow the code changes it's making. The latest surface is Claude Code on Desktop - a surface for people who want a full screen GUI with full screen preview and images and rich outputs.

09:37 Both IDE and Desktop app are built on the same Claude Agent SDK that external developers can use themselves.

09:38 "We heard from you that you want to spend less time on code review" - so they launched Code Review, used by every team at Anthropic.

09:38 Remote Agents lets you control your laptop from your phone. I use Claude Code for web on my phone instead, then I don't even have to leave a laptop open somewhere.

09:39 I hadn't seen "CI auto-fix" before, which files automatic fixes against PRs. Only documentation I could find for that is this release notes entry.

09:41 Now boasting about some Claude Code customers - Shopify, Mercado Libre (who have 23,000 engineers!) - they are aiming for "90% autonomous coding by Q3 this year".

09:42 Cat mentions something I've been watching too: execs and managers are getting their hands dirty with code again, because you don't need so much time to be able to usefully contribute.

09:43 Now up: Boris Cherny, who created Claude Code. "Everything we are seeing today still feels magical to me, and I work on Claude Code every day."

09:44 Boris is running a demo with the Claude desktop app. "Claude is working on adding refunds to ACME's dashboard". With idempotency so you can't double-refund, multi-currency handling, audit logging for the compliance team. It's showing the in-development web UI in the right hand panel where you can see Claude directly using it and discovering an edge-case bug.
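The demo code was not shown in enough detail to reproduce, so this is a generic sketch of the idempotency idea rather than what Claude wrote for ACME. The `RefundService` class, its in-memory store and the `rf_` ID format are all hypothetical. A retried request carrying the same idempotency key returns the original refund instead of creating a second one, and every call is recorded in an audit log.

```python
from dataclasses import dataclass, field


@dataclass
class RefundService:
    # Maps idempotency key -> refund ID. In-memory only, for illustration.
    _seen: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def refund(
        self,
        idempotency_key: str,
        payment_id: str,
        amount_minor: int,  # amount in minor units (e.g. cents) to avoid float errors
        currency: str,
    ) -> str:
        if idempotency_key in self._seen:
            # Replay of an earlier request: return the same refund, move no money.
            refund_id = self._seen[idempotency_key]
            self.audit_log.append(("replayed", refund_id))
            return refund_id
        refund_id = f"rf_{len(self._seen) + 1}"
        self._seen[idempotency_key] = refund_id
        self.audit_log.append(("created", refund_id, payment_id, amount_minor, currency))
        return refund_id


svc = RefundService()
first = svc.refund("key-1", "pay_1", 1500, "EUR")
retry = svc.refund("key-1", "pay_1", 1500, "EUR")  # same key, same refund
other = svc.refund("key-2", "pay_2", 900, "JPY")
```

A production version would also need to reject a replayed key whose payment, amount or currency differs from the original request, and would keep the key store somewhere durable.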

09:45 ... but Boris has multiple sessions all running in the Claude desktop app at once, and can switch between them and see which ones need your input. "We think that going forward a lot of code is going to be written in an async way."

09:46 Boris says that today a lot of his code is built by routines. "Routines are higher-order prompts."

09:46 "With Routines, developers can set up async automations and wake up to PRs that are ready to merge."

09:48 The idea with the PR auto-fixes is that "The person who owns the PR is never going to see a red X". Claude is prompting Claude Code on its own.

09:49 Keynote session over. The theme of the day - unsurprisingly for an event called "Code w/ Claude" - appears to be learning the most effective ways to put the existing models to use.
