
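The Decoder · April 15, 2026, 02:31 · approx. 7 min read

Claude Mythos can autonomously compromise weakly defended corporate networks end to end

#AISecurity #AutonomousAttacks #LLM #Cybersecurity #AISafety #Anthropic

TL;DR

The UK's AI Safety Institute tested Anthropic's Claude Mythos Preview and found that, for the first time, an AI model autonomously completed a full attack simulation against a corporate network, though the results come with significant caveats.

AI in-depth analysis · April 15, 2026, 03:41

Importance (5-point scale)

Depth: 40%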

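Key points

First autonomous full attack simulation

This is the first case of an AI model autonomously completing an end-to-end attack simulation against a corporate network.

A threat to weakly defended networks

Claude Mythos demonstrated the ability to autonomously compromise weakly defended corporate networks.

Testing by the UK's AI Safety Institute

The UK's AI Safety Institute evaluated the cyber capabilities of Anthropic's Claude Mythos Preview.

Results come with significant caveats

The results come with significant caveats and should be interpreted with care.

Key quotes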

For the first time, an AI model autonomously completed a full attack simulation against a corporate network

the results come with significant caveats


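Impact analysis

This article shows that autonomous AI cyberattack capabilities could become a real-world threat, which adds urgency to AI security research and to strengthening corporate defenses. Because the results come with caveats, however, a balanced response that does not overstate the danger of AI is called for.

Editorial comment

This is the first official test result to demonstrate autonomous AI attack capabilities, and an important milestone for the AI security field. Because the results must be read with their caveats in mind, they call for level-headed analysis rather than sensational coverage.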

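The UK's AI Safety Institute (AISI) tested Anthropic's Claude Mythos Preview for cyber capabilities. For the first time, an AI model autonomously completed a full attack simulation against a corporate network, as long as the network was small and weakly defended.

According to AISI, Mythos Preview represents a significant leap in AI cyber capabilities. Just two years ago, the best available models could barely handle beginner-level cyber tasks. In controlled evaluations, Mythos Preview executed multi-stage attacks on vulnerable networks, identifying and exploiting security holes autonomously when given explicit instructions and network access. These are tasks that would take human security experts days to complete, AISI says.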

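Capture the flag: 73 percent success rate at expert level

In capture-the-flag (CTF) challenges, AI models must find and exploit vulnerabilities in target systems to uncover hidden flags. According to AISI, Mythos Preview achieves about 85 percent on apprentice tasks and roughly 95 percent on beginner-level technical non-expert tasks (with a 2.5 million token budget). That places it in the top tier alongside GPT-5.4, Codex 5.3, and Claude Opus 4.6.

Performance of different AI models on beginner-level CTF tasks since November 2022. | Image: AISI

With a larger compute budget (50 million tokens), Mythos Preview scores around 93 percent on practitioner tasks and 73 percent on expert-level challenges. That expert-level number is particularly notable: according to AISI, no model could solve expert-level tasks before April 2025.

Practitioner- and expert-level performance on CTF tasks since August 2025. | Image: AISI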

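Anthropic's Claude Mythos can autonomously hack corporate networks

CTF challenges only test individual skills in isolation, but real cyberattacks require chaining dozens of steps across multiple hosts and network segments, AISI says.

To measure that kind of complexity, the institute developed a simulation called "The Last Ones" (TLO): a 32-step attack against a simulated corporate network, from initial reconnaissance to full network takeover. AISI estimates this would take human experts around 20 hours. Full details are available in the accompanying paper.

Claude Mythos Preview is the first model to complete TLO end-to-end. It achieved a full takeover in 3 out of 10 attempts. On average, the model completed 22 of the 32 steps. The next best model, Claude Opus 4.6, averaged 16 steps.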
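To put the TLO figures on a common scale, the short sketch below converts the reported counts into percentages. It uses only the numbers quoted above (32 steps, averages of 22 and 16 steps, 3 full takeovers in 10 attempts); the variable and function names are hypothetical and chosen purely for illustration, and nothing here reflects how AISI itself scores the simulation.

```python
TOTAL_STEPS = 32  # length of the TLO attack chain, as reported by AISI

# Average steps completed per model, as quoted in the article
avg_steps = {"Claude Mythos Preview": 22, "Claude Opus 4.6": 16}

# Full network takeovers by Mythos Preview out of all attempts
full_takeovers, attempts = 3, 10


def completion_share(steps: int, total: int = TOTAL_STEPS) -> float:
    """Return the average share of the attack chain completed, in percent."""
    return 100 * steps / total


# Hypothetical helper outputs: percentage of the chain completed per model
shares = {model: completion_share(steps) for model, steps in avg_steps.items()}

# Share of attempts that ended in a full takeover, in percent
takeover_rate = 100 * full_takeovers / attempts

for model, share in shares.items():
    print(f"{model}: {share:.2f}% of steps on average")
print(f"Full takeover rate (Mythos Preview): {takeover_rate:.0f}%")
```

Read this way, Mythos Preview completed a little over two thirds of the chain on average, against half for Claude Opus 4.6, while only 30 percent of its attempts reached a full takeover. With just 10 attempts, that last rate is a rough indication rather than a precise estimate.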

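Average steps completed on "The Last Ones" by token budget. Mythos Preview (red) clearly outpaces every other model. | Image: AISI

AISI expects performance to continue improving with more inference compute. Testing used a budget of 100 million tokens, and performance scaled all the way to that limit. A separate blog post on inference scaling for cyber tasks covers this trend in more detail.

Mythos Preview did show limits, however. The model failed to complete a separate AISI attack simulation targeting industrial control technology (operational technology, or OT), the kind used in power plants and factories. According to AISI, that doesn't necessarily mean the model would fail on the OT components themselves. It never reached that stage because it stalled in the simulation's IT network during earlier steps.

AISI flags some caveats: the test environments had no active defenders, no security tooling, and no consequences for actions that would trigger alarms on a real network. Based on these results alone, there's no way to tell whether Mythos Preview could successfully breach a well-defended system.

That said, the model is at least capable of "autonomously attacking small, weakly defended and vulnerable enterprise systems where access to a network has been gained," according to AISI. The institute plans to conduct future evaluations in hardened environments with active monitoring, endpoint detection, and real-time incident response.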

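AI cyber capabilities raise the stakes for basic security hygiene

The results underscore the importance of cybersecurity fundamentals, according to AISI: regular patching, strong access controls, secure configurations, and thorough logging. Other models with comparable capabilities are likely not far behind.

At the same time, the institute notes that AI cyber capabilities are dual-use. While they pose security risks, they could also significantly strengthen cyber defense. In a joint blog post with the UK's National Cyber Security Centre (NCSC), AISI outlines how defenders can prepare for and leverage frontier AI.

AISI has been tracking AI cyber capabilities since 2023 and has steadily raised the bar on its evaluations: from chat-based queries to capture-the-flag challenges to complex multi-stage attack simulations.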

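Is Mythos really too dangerous to release?

Anthropic officially launched Claude Mythos in early April. The model is currently available to only about 50 companies, reportedly because of cybersecurity concerns. The AISI results at least partly support that decision: the model can autonomously attack weakly protected networks in controlled environments.

Critics argue the restrictions are overblown, just like in 2019, when OpenAI deemed GPT-2 too dangerous to release. In their view, the performance gains over previous models aren't large enough to justify limiting access this heavily. Some say it's mainly a marketing play or that Anthropic simply doesn't have the compute capacity to offer the model more broadly. But that's all speculation for now. We'll know for sure when your computer breaks, or doesn't, after Mythos-level AI models have been released to the public.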


