Anthropic Red Team · February 5, 2026, 09:00 · approx. 3 min read

Zero-Day Vulnerabilities Discovered by LLMs

#LLM #Cybersecurity #VulnerabilityDiscovery #OpenSource

TL;DR

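AI models can now find high-severity vulnerabilities at scale, which creates an opportunity to strengthen defenders. Anthropic is using Claude to find vulnerabilities in open source software and help fix them.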

AI In-Depth Analysis · February 24, 2026, 18:40


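Key Points

Claude Opus 4.6 finds high-severity vulnerabilities that conventional fuzzing struggles to detect, by reading and reasoning about code the way a human would

Anthropic has begun an initiative to find and help fix vulnerabilities in open source software with its own LLM (more than 500 found so far)

The cybersecurity capabilities of AI models have reached an inflection point, and now is the time to accelerate their use by defenders

Vulnerability discovery now works "out of the box," without specialized tooling or prompt design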


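Impact Analysis

LLMs can now reason about code like human researchers and find vulnerabilities that were overlooked for years, which signals a paradigm shift in software security. Using them to support resource-constrained open source projects in particular is likely to improve the safety of the whole ecosystem.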

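Editorial Comment

A strategic announcement that foregrounds AI as a defensive tool rather than an attack tool. It strikes a good balance between showcasing technical capability and presenting an ethical framework.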

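Zero-Day Vulnerabilities Discovered by LLMs: An AI-Driven Inflection Point for Cybersecurity

Anthropic reports that its latest AI model, Claude Opus 4.6, is notably better than previous models at finding high-severity vulnerabilities, including zero-days. The company presents this as strong evidence that AI's impact on cybersecurity has reached an inflection point and that defensive capabilities should be strengthened quickly.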

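AI Capabilities Beyond Conventional Methods

Finding vulnerabilities at scale has traditionally required heavy investment in specific tools and infrastructure, such as fuzzing (throwing large volumes of random inputs at code to see what breaks). Opus 4.6, by contrast, found vulnerabilities quickly without task-specific tooling or specialized prompting. What sets its approach apart is that it reads code and reasons about its logic the way a human researcher would. Specifically, it looks at past fixes to find similar bugs that were left unaddressed, spots patterns that tend to cause problems, and understands a piece of logic well enough to work out exactly which input would break it. As a result, even in well-tested codebases that had been fuzzed for years and had accumulated millions of CPU hours of testing, it found high-severity vulnerabilities, some of which had gone undetected for decades.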

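Putting AI to Work for Defenders: Protecting Open Source Software

To tip the balance toward defenders, the authors have begun using Claude themselves to find vulnerabilities in open source software (OSS) and help fix them. OSS runs everywhere, from enterprise systems to critical infrastructure, so its vulnerabilities can ripple across the internet. Yet many OSS projects are maintained by small teams or volunteers with no dedicated security resources. Providing bugs that AI found and humans validated, together with human-reviewed patches, therefore contributes substantially to the security of the whole ecosystem.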

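Results So Far and Next Steps

So far, the team has used Claude to find and validate more than 500 high-severity vulnerabilities. Reporting has begun, the first patches are landing, and the team is working with maintainers to fix the rest. Anthropic is also building safeguards to manage the risk that such powerful capabilities are misused. The authors describe this as only the beginning of their efforts and plan to share more as the work scales.

In summary, advanced LLMs can now find, at scale, high-severity vulnerabilities that conventional automated methods missed, by analyzing code with human-like reasoning. Anthropic has begun applying this technology to OSS security in order to strengthen defenders and secure as much code as possible while this "window" is open. For cybersecurity, this …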


red.anthropic.com Evaluating and mitigating the growing risk of LLM-discovered 0-days

Nicholas Carlini*, Keane Lucas*, Evyatar Ben Asher*, Newton Cheng, Hasnain Lakhani, David Forsythe, and Kyla Guru *indicates equal contribution

Claude Opus 4.6, released today, continues a trajectory of meaningful improvements in AI models’ cybersecurity capabilities. Last fall, we wrote that we believed we were at an inflection point for AI's impact on cybersecurity—that progress could become quite fast, and now was the moment to accelerate defensive use of AI. The evidence since then has only reinforced that view. AI models can now find high-severity vulnerabilities at scale. Our view is this is a moment to move quickly—to empower defenders and secure as much code as possible while the window exists.

Opus 4.6 is notably better at finding high-severity vulnerabilities than previous models and a sign of how quickly things are moving. Security teams have been automating vulnerability discovery for years, investing heavily in fuzzing infrastructure and custom harnesses to find bugs at scale. But what stood out in early testing is how quickly Opus 4.6 found vulnerabilities out of the box without task-specific tooling, custom scaffolding, or specialized prompting. Even more interesting is how it found them. Fuzzers work by throwing massive amounts of random inputs at code to see what breaks. Opus 4.6 reads and reasons about code the way a human researcher would—looking at past fixes to find similar bugs that weren't addressed, spotting patterns that tend to cause problems, or understanding a piece of logic well enough to know exactly what input would break it. When we pointed Opus 4.6 at some of the most well-tested codebases (projects that have had fuzzers running against them for years, accumulating millions of hours of CPU time), Opus 4.6 found high-severity vulnerabilities, some that had gone undetected for decades.
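
To make the contrast concrete, here is a minimal sketch of the random-input approach that the paragraph attributes to fuzzers. Everything in it is an assumption made for illustration: `parse_record` is a hypothetical toy target with a planted bug, and `fuzz` is a bare loop with no coverage feedback, unlike production fuzzers. Because Python is memory-safe, an `IndexError` stands in here for what would be an out-of-bounds read in C.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical toy target: parses a record of the form [0x7F][length][payload]."""
    if len(data) < 2 or data[0] != 0x7F:
        raise ValueError("bad header")  # graceful rejection, not a bug
    length = data[1]
    payload = data[2:2 + length]
    # Planted bug: the declared length is trusted, so reading the last
    # declared byte fails when the actual payload is shorter.
    _last_byte = payload[length - 1] if length else 0
    return payload


def fuzz(target, iterations: int, seed: int = 0):
    """Throw short random byte strings at `target` and record unexpected exceptions."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(0, 8)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # the parser rejected the input cleanly
        except Exception as exc:  # anything else counts as a "crash"
            crashes.append((data, type(exc).__name__))
    return crashes


if __name__ == "__main__":
    for data, error in fuzz(parse_record, iterations=20_000)[:3]:
        print(data.hex(), error)
```

The loop only reaches the bug when a random input happens to start with `0x7F`, which is roughly one input in 256. A reader who understands the length check can write a failing input such as `b"\x7f\x05ab"` directly, which is the kind of reasoning the paragraph describes.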

Part of tipping the scales toward defenders means doing the work ourselves. We're now using Claude to find and help fix vulnerabilities in open source software. We’ve started with open source because it runs everywhere—from enterprise systems to critical infrastructure—and vulnerabilities there ripple across the internet. Many of these projects are maintained by small teams or volunteers who don't have dedicated security resources, so finding human-validated bugs and contributing human-reviewed patches goes a long way.

So far, we've found and validated more than 500 high-severity vulnerabilities. We've begun reporting them and are seeing our initial patches land, and we’re continuing to work with maintainers to patch the others. In this post, we’ll walk through our methodology, share some early examples of vulnerabilities Claude discovered, and discuss the safeguards we've put in place to manage misuse as these capabilities continue to improve. This is just the beginning of our efforts. We'll have more to share as this work scales.

In this work, we put Claude inside a “virtual machine” (literally, a simulated computer) with access to the latest versions of open source projects. We gave it standard utilities (e.g., the standard coreutils or Python) and vulnerability analysis tools (e.g., debuggers or fuzzers), but we didn’t provide any special instructions on how to use these tools, nor did we provide a custom harness that would have given it specialized knowledge about how to better find vulnerabilities. This means we were directly testing Claude’s “out-of-the-box” capabilities, relying solely on the fact that modern large language models are generally-capable agents that can already reason about how to best make use of the tools available.

To ensure that Claude hadn’t hallucinated bugs (i.e., invented problems that don’t exist, a problem that increasingly is placing an undue burden on open source developers), we validated every bug extensively before reporting it. We focused on searching for memory corruption vulnerabilities, because they can be validated with relative ease. Unlike logic errors where the program remains functional, memory corruption vulnerabilities are easy to identify by monitoring the program for crashes and running tools like address sanitizers to catch non-crashing memory errors. But because not all inputs that cause a program to crash are high severity vulnerabilities, we then had Claude critique, de-duplicate, and re-prioritize the crashes that remain. Finally, for our initial round of findings, our own security researchers validated each vulnerability and wrote patches by hand. As the volume of findings grew, we brought in external (human) security researchers to help with validation and patch development. Our intent here was to meaningfully assist human maintainers in handling our reports, so the process optimized for reducing false positives. In parallel, we are accelerating our efforts to automate patch development to reliably remediate bugs as we find them.
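
The de-duplication and re-prioritization step can be pictured with a small sketch. This is not the process described in the post, where Claude itself critiqued and ranked the crashes. It swaps in a conventional heuristic instead, bucketing crashes by error kind and top stack frames, purely for illustration. The `Crash` record, the `SEVERITY` ranking, and the frame names are all hypothetical. The error labels are loosely modeled on the names AddressSanitizer uses in its reports.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Crash:
    input_id: str            # identifier of the input that triggered the crash
    kind: str                # error label, e.g. "heap-buffer-overflow"
    frames: Tuple[str, ...]  # call stack, innermost frame first


# Hypothetical ranking: memory-corrupting errors above plain segfaults.
SEVERITY = {
    "heap-buffer-overflow": 3,
    "heap-use-after-free": 3,
    "stack-buffer-overflow": 3,
    "SEGV": 2,
}


def bucket_key(crash: Crash, depth: int = 3):
    """Crashes with the same kind and top `depth` frames are treated as one bug."""
    return (crash.kind, crash.frames[:depth])


def triage(crashes, depth: int = 3):
    """Group duplicate crashes, then rank buckets by severity and by size."""
    buckets = defaultdict(list)
    for crash in crashes:
        buckets[bucket_key(crash, depth)].append(crash)
    return sorted(
        buckets.items(),
        key=lambda item: (-SEVERITY.get(item[0][0], 0), -len(item[1])),
    )


if __name__ == "__main__":
    sample = [
        Crash("a", "heap-buffer-overflow", ("memcpy", "read_chunk", "decode")),
        Crash("b", "heap-buffer-overflow", ("memcpy", "read_chunk", "decode")),
        Crash("c", "SEGV", ("lookup", "decode", "main")),
    ]
    for (kind, frames), members in triage(sample):
        print(kind, frames[0], len(members))
```

A fixed rule like this can merge distinct bugs that share a stack or split one bug across several buckets, which is one reason a human validation pass of the kind the post describes remains useful after automated triage.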

Here are three of
