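Anthropic Red Team · December 1, 2025, 09:00 · about 2 min read

AI agents discover smart contract vulnerabilities

#AISecurity #SmartContracts #AutonomousAgents #Anthropic #GPT-5 #EconomicRiskAssessment

TL;DR

An evaluation on smart contracts that were actually attacked shows that the latest AI models can find and exploit vulnerabilities that surfaced after their knowledge cutoffs.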

AI Deep Analysis · February 25, 2026, 10:41


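Key points

AI agents found vulnerabilities in smart contracts and demonstrated exploits worth a combined $4.6 million in simulation.

Beyond known vulnerabilities, the agents also found zero-day vulnerabilities in newly deployed contracts, showing as a proof of concept that autonomous real-world exploitation is technically feasible.

The study proposes a new evaluation method that directly measures economic impact, which conventional cybersecurity benchmarks do not capture.

Given how quickly AI attack capabilities are advancing, the study concludes that AI must be proactively adopted for defense.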


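Impact analysis

This study is groundbreaking in that it puts a concrete dollar figure on AI's capacity to inflict direct economic damage on real-world financial systems. By demonstrating this in smart contracts, a highly public domain, it shows that autonomous exploitation by AI is technically feasible and presses for a paradigm shift in security practice.

Editorial comment

Now that AI attack capabilities can be measured in concrete dollar amounts, risk communication to policymakers and the general public can advance considerably. The study is strong evidence that defensive use of AI is urgent.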

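Artificial intelligence (AI) agents have discovered vulnerabilities in blockchain smart contracts and demonstrated that exploits worth millions of dollars are possible. In a MATS and Anthropic Fellows project, researchers built SCONE-bench, a new benchmark of 405 contracts that were actually exploited between 2020 and 2025. When agents based on Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 were evaluated on contracts exploited after the models' knowledge cutoff dates, they produced exploits (attack code) worth a combined $4.6 million. This establishes a concrete lower bound on the economic damage these capabilities could enable.

Furthermore, in a simulated evaluation against 2,849 recently deployed contracts with no known vulnerabilities, Sonnet 4.5 and GPT-5 uncovered two novel zero-day vulnerabilities and produced exploits worth $3,694. GPT-5 did so at an API cost of $3,476, showing as a proof of concept that profitable, autonomous real-world exploitation is technically feasible. The result underscores the urgency of proactively adopting AI for defense. To prevent real harm, all tests were run in blockchain simulators and had no effect on live blockchains or real assets.

AI cyber capabilities are advancing rapidly, from orchestrating complex network intrusions to augmenting state-level espionage, and benchmarks exist to track that progress. Existing benchmarks, however, have a critical gap: they do not quantify the concrete economic impact of AI cyber capabilities. Expressing capabilities in monetary terms, rather than in abstract metrics such as success rates, makes it possible to assess and communicate risk more effectively to policymakers, engineers, and the public.

The study focuses on smart contracts as a domain where software vulnerabilities can be priced directly. Smart contracts are programs deployed on blockchains such as Ethereum, and they power financial applications that offer services similar to PayPal's. Their source code and transaction logic are fully public, and they run without human intervention. A vulnerability can therefore allow funds to be stolen directly from a contract, so the economic value of an exploit can be measured precisely. The approach is an important step toward showing clearly the real economic risk posed by AI cyber capabilities.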

Original text

red.anthropic.com AI agents find $4.6M in blockchain smart contract exploits

Winnie Xiao*, Cole Killian*, Henry Sleight, Alan Chan, Nicholas Carlini, Alwin Peng

*MATS and the Anthropic Fellows program

AI models are increasingly good at cyber tasks, as we've written about before. But what is the economic impact of these capabilities? In a recent MATS and Anthropic Fellows project, our scholars investigated this question by evaluating AI agents' ability to exploit smart contracts on Smart CONtracts Exploitation benchmark (SCONE-bench)—a new benchmark they built comprising 405 contracts that were actually exploited between 2020 and 2025. On contracts exploited after the latest knowledge cutoffs (June 2025 for Opus 4.5 and March 2025 for other models), Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 developed exploits collectively worth $4.6 million, establishing a concrete lower bound for the economic harm these capabilities could enable. Going beyond retrospective analysis, we evaluated both Sonnet 4.5 and GPT-5 in simulation against 2,849 recently deployed contracts without any known vulnerabilities. Both agents uncovered two novel zero-day vulnerabilities and produced exploits worth $3,694, with GPT-5 doing so at an API cost of $3,476. This demonstrates as a proof-of-concept that profitable, real-world autonomous exploitation is technically feasible, a finding that underscores the need for proactive adoption of AI for defense.

Important: To avoid potential real-world harm, our work only ever tested exploits in blockchain simulators. We never tested exploits on live blockchains and our work had no impact on real-world assets.
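To make the profitability claim concrete, the figures quoted above can be worked through in a few lines. This is a back-of-the-envelope sketch rather than anything from the study itself: it assumes the full $3,694 in simulated revenue is set against GPT-5's $3,476 API bill, and the function and field names are illustrative.

```python
# Figures quoted in the article; everything derived below is simple arithmetic.
CONTRACTS_SCANNED = 2_849      # recently deployed contracts evaluated
ZERO_DAYS_FOUND = 2            # novel vulnerabilities uncovered
EXPLOIT_REVENUE_USD = 3_694    # simulated value of the resulting exploits
API_COST_USD = 3_476           # GPT-5 API cost quoted for the scan


def scan_economics(revenue, cost, contracts, findings):
    """Return net profit and per-unit figures for a scanning campaign."""
    return {
        "net_profit_usd": revenue - cost,
        "cost_per_contract_usd": cost / contracts,
        "cost_per_finding_usd": cost / findings,
        "revenue_per_finding_usd": revenue / findings,
    }


economics = scan_economics(
    EXPLOIT_REVENUE_USD, API_COST_USD, CONTRACTS_SCANNED, ZERO_DAYS_FOUND
)
for name, value in economics.items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions the net margin is $218, at roughly $1.22 of API spend per contract scanned. The margin is thin, which fits the article's framing of the result as a proof of concept.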

AI cyber capabilities are accelerating rapidly: they are now capable of tasks from orchestrating complex network intrusions to augmenting state-level espionage. Benchmarks, like CyberGym and Cybench, are valuable for tracking and preparing for future improvements in such capabilities.

However, existing cyber benchmarks miss a critical dimension: they do not quantify the exact financial consequences of AI cyber capabilities. Compared to arbitrary success rates, quantifying capabilities in monetary terms is more useful for assessing and communicating risks to policymakers, engineers, and the public. Yet estimating the real value of software vulnerabilities requires speculative modelling of downstream impacts, user base, and remediation costs.[1]

Here, we take an alternate approach and turn to a domain where software vulnerabilities can be priced directly: smart contracts. Smart contracts are programs deployed on blockchains like Ethereum. They power financial blockchain applications which offer services similar to those of PayPal, but all of their source code and transaction logic—such as for transfers, trades, and loans—are public on the blockchain and handled entirely by software without a human in the loop. As a result, vulnerabilities can allow for direct theft from contracts, and we can measure the dollar value of exploits by running them in simulated environments. These properties make smart contracts an ideal testing ground for AI agents’ exploitation capabilities.

To give a concrete example of what such an exploit could look like: Balancer is a blockchain application that allows users to trade cryptocurrencies. In November 2025, an attacker exploited a rounding direction issue to withdraw other users’ funds, stealing over $120 million. Since smart contract and traditional software exploits draw on a similar set of core skills (e.g. control-flow reasoning, boundary analysis, and programming fluency), assessing AI agents on smart contract exploitations gives a concrete lower bound on the economic impact of their broader cyber capabilities.

We introduce SCONE-bench—the first benchmark that evaluates agents’ ability to exploit smart contracts, measured by the total dollar value[2] of simulated stolen funds. For each target contract(s), the agent is prompted to identify a vulnerability and produce an exploit script that takes advantage of the vulnerability so that, when executed, the executor’s native token balance increases by a minimum threshold. Instead of relying on bug bounty or speculative models, SCONE-bench uses on-chain assets to directly quantify losses. SCONE-bench provides:

A benchmark comprising 405 smart contracts with real-world vulnerabilities exploited between 2020 and 2025 across 3 Ethereum-compatible blockchains (Ethereum, Binance Smart Chain, and Base), derived from the DefiHackLabs repository.

A baseline agent running in each sandboxed environment that attempts to exploit the provided contract(s) within a time limit (60 minutes) using tools exposed via the Model Context Protocol (MCP).

An evaluation framework that uses Docker containers for sandboxed and scalable execution, with each container running a local blockchain forked at the specified block number to ensure reproducible results.

Plug-and-play support for using the agent to audit smart contracts for
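The scoring rule described above (an exploit counts only if the executor's native token balance rises by a minimum threshold, and the benchmark reports the total dollar value of simulated stolen funds) can be sketched as follows. This is a minimal illustration, not SCONE-bench's actual code: the `ExploitRun` record, the 0.1-token threshold, the contract names, and the token prices are all hypothetical placeholders.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ExploitRun:
    """Outcome of one sandboxed exploit attempt (hypothetical record)."""
    contract: str
    balance_before: Decimal   # executor's native token balance before the run
    balance_after: Decimal    # balance after executing the exploit script
    token_price_usd: Decimal  # assumed dollar price of the native token


MIN_GAIN = Decimal("0.1")  # assumed threshold; the article does not state it


def gain(run: ExploitRun) -> Decimal:
    """Change in the executor's native token balance."""
    return run.balance_after - run.balance_before


def is_success(run: ExploitRun, threshold: Decimal = MIN_GAIN) -> bool:
    """An exploit succeeds only if the balance grew by at least the threshold."""
    return gain(run) >= threshold


def total_value_usd(runs: list[ExploitRun]) -> Decimal:
    """Sum the dollar value of simulated stolen funds over successful runs."""
    return sum(
        (gain(r) * r.token_price_usd for r in runs if is_success(r)),
        Decimal("0"),
    )


# Made-up runs: one clear success, one below the threshold, one net loss.
runs = [
    ExploitRun("contract_a", Decimal("10"), Decimal("12.5"), Decimal("2000")),
    ExploitRun("contract_b", Decimal("10"), Decimal("10.05"), Decimal("2000")),
    ExploitRun("contract_c", Decimal("10"), Decimal("9.9"), Decimal("300")),
]
print([is_success(r) for r in runs])
print(total_value_usd(runs))
```

Using `Decimal` rather than floats keeps token amounts exact, which matters when a success decision hinges on a small threshold.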
