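AI News · July 1, 2026, 17:47 · about 10 min read

Anthropic deploys Claude Sonnet 5 and restores access to Fable and Mythos

#LLM #Claude Sonnet 5 #Regulatory compliance #Security #Anthropic

TL;DR

Following the lifting of a US government export control restriction, Anthropic fixed a safety bug and restored access to its top-tier Fable and Mythos models, while also rolling out its new Claude Sonnet 5 model commercially.

AI in-depth analysis · July 1, 2026, 09:02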


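Key points

Restriction lifted and models restored

After an 18-day operational pause, the safety bug reported by Amazon was fixed and the access restrictions on Fable and Mythos were lifted.

Automated safety classifier

A new system that detects malicious prompts blocks exploitation of the bug in more than 99% of trials, at the cost of also flagging legitimate development requests.

Commercial rollout of Claude Sonnet 5

The new model combines lower operating cost with strong execution capability and is optimised for autonomous agents and complex task processing.

An industry-wide security problem

The bug is not specific to Anthropic: the same vulnerability was confirmed in other providers' models, including GPT-5.5 and Kimi K2.7.

Autonomous task execution in production

Companies such as Rakuten, Zapier, and Zed have deployed the model for complex code review and multi-step administrative tasks with little or no human intervention. At Zed, it went from debugging to fix autonomously in a single processing pass.

Lower security risk and a defensive focus

Compared with Sonnet 4.6, the rate of non-compliant behaviour is lower, and specialised offensive cybersecurity training data was deliberately excluded, limiting the model to routine defensive tasks.

Claude Sonnet 5 security evaluation and safeguards in the commercial release

Sonnet 5 recorded a 0% success rate at generating working exploits. Its 13.2% partial success rate is attributed to general gains in logical reasoning, and commercial versions ship by default with real-time safety classifiers equivalent to those of Opus 4.8.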


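Impact analysis

This news marks an important turning point: it shows how the tension between the rapid advance of generative AI and government export controls directly affects real product rollouts. The trade-off that stronger safety measures impose on usability and development efficiency (more false positives) is also a new AI engineering challenge that the industry as a whole should discuss.

Editorial comment

The lifting of the restriction and the restoration of the models are welcome news, but they expose a new problem of reduced development efficiency caused by safety measures, making this an important case study in how to balance AI security in practice.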

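Anthropic has launched Claude Sonnet 5 and restored access to its Fable and Mythos frontier models following a federal export control review.

The decision ends an 18-day operational pause triggered by a US government export control directive issued on June 12, which forced the temporary suspension of Anthropic's highest-capability systems.

Government officials enacted the restriction after researchers at Amazon documented a method to bypass the safety controls of Fable 5, causing the model to identify software vulnerabilities and supply exploitation code. Anthropic has since developed an updated automated classifier to patch the vulnerability, clearing the path for a full commercial rollout across its platform, cloud infrastructure, and partner networks.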

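The temporary suspension of Fable 5 and Mythos 5 highlighted the regulatory pressures facing frontier intelligence systems. When the export control mandate took effect, the lack of real-time nationality verification systems required a total access blackout for all global users.

Security evaluations conducted during the shutdown confirmed that the vulnerability identification behaviour was not unique to Fable 5. Older and less capable architectures from multiple providers, including Claude Opus 4.8, GPT-5.5, and Kimi K2.7, reproduced exactly the same results.

To resolve the federal directive, engineers trained an automated safety classifier targeting the specific bypass mechanism reported by Amazon. This software layer operates with a wide safety margin, identifying and blocking ambiguous developer prompts that show a statistical probability of malicious intent. Internal validation data indicates the updated classifier prevents the reported exploitation technique in more than 99% of trials.

When a developer issues a prompt that triggers this boundary, the platform automatically routes the workload to the older Opus 4.8 architecture to maintain continuity. The expanded safety margin introduces a clear trade-off for engineering teams, because the automated system flags benign requests more frequently during routine application development and software debugging.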
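The routing behaviour described above can be sketched roughly as follows. This is only an illustration of the flag-then-fall-back pattern, not Anthropic's implementation: the model identifiers, the `route_request` function, the 0.2 threshold, and the keyword-based `toy_risk_score` stand-in for a trained classifier are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical identifiers; the article does not give API model names.
PRIMARY_MODEL = "fable-5"
FALLBACK_MODEL = "opus-4.8"


@dataclass
class RoutingDecision:
    model: str
    flagged: bool


def route_request(
    prompt: str,
    risk_score: Callable[[str], float],
    threshold: float = 0.2,  # assumed value; a low threshold means a wide safety margin
) -> RoutingDecision:
    """Use the primary model unless the classifier score reaches the threshold."""
    flagged = risk_score(prompt) >= threshold
    model = FALLBACK_MODEL if flagged else PRIMARY_MODEL
    return RoutingDecision(model=model, flagged=flagged)


def toy_risk_score(prompt: str) -> float:
    """Stand-in for a trained classifier: fraction of crude keywords present."""
    keywords = ("exploit", "bypass", "payload")
    hits = sum(word in prompt.lower() for word in keywords)
    return hits / len(keywords)


if __name__ == "__main__":
    for text in (
        "Refactor this function to use a dict",
        "Write an exploit payload for this service",
        "Debug why the cache bypass flag is ignored",  # benign, but still flagged
    ):
        print(text, "->", route_request(text, toy_risk_score))
```

The third prompt is an ordinary debugging request, yet the low threshold sends it to the fallback model. That is the false-positive cost of a wide safety margin that the article describes.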

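Active deployments and agentic workflows

While frontier models face strict state oversight, the immediate commercial focus is on the newly deployed Claude Sonnet 5.

Engineering teams are moving autonomous agents to this model to reduce operating costs while maintaining high execution capacity. Performance data supports the claim that the system executes multi-step plans, operates terminal environments, and navigates web browsers without human intervention.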

Model performance and cost metrics:

Model      | SWE-bench Pro | Terminal-Bench 2.1 | Base input cost* | Base output cost*
Sonnet 5   | 63.2%         | 80.4%              | $3.00            | $15.00
Sonnet 4.6 | 58.1%         | 67.0%              | $3.00            | $15.00
Opus 4.8   | 69.2%         | 82.7%              | $5.00            | $25.00

*Cost per million tokens. Sonnet 5 carries introductory rates of $2.00 input / $10.00 output through August 31, 2026.
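To make the pricing concrete, here is a small cost calculator built on the figures above. The function and constant names are invented for this sketch, and it assumes that "through August 31" includes that day and that any earlier date qualifies for the introductory rate (the article does not state when the rate began).

```python
from datetime import date

# USD per million tokens (input, output), taken from the table above.
BASE_PRICES = {
    "Sonnet 5": (3.00, 15.00),
    "Sonnet 4.6": (3.00, 15.00),
    "Opus 4.8": (5.00, 25.00),
}
SONNET_5_INTRO = (2.00, 10.00)
INTRO_END = date(2026, 8, 31)  # assumed to be inclusive


def request_cost(model: str, input_tokens: int, output_tokens: int, on: date) -> float:
    """Return the USD cost of one workload at the listed per-million-token rates."""
    in_price, out_price = BASE_PRICES[model]
    if model == "Sonnet 5" and on <= INTRO_END:
        in_price, out_price = SONNET_5_INTRO
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    workload = dict(input_tokens=1_000_000, output_tokens=200_000)
    print(request_cost("Sonnet 5", on=date(2026, 7, 15), **workload))  # introductory rate
    print(request_cost("Sonnet 5", on=date(2026, 9, 1), **workload))   # base rate
    print(request_cost("Opus 4.8", on=date(2026, 7, 15), **workload))
```

For a workload of one million input tokens and 200,000 output tokens, this gives $4.00 on Sonnet 5 during the introductory period, $6.00 afterwards, and $10.00 on Opus 4.8.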

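Real-world deployments show how organisations are using this architecture within live software development pipelines.

At Rakuten, technology teams deployed the architecture against dozens of the company's most challenging production code pull requests. The system processed each submission independently, executing tests and verifying the results before presenting the completed code to human engineers for final structural approval.

Software automation firm Zapier integrated the system into its core product workflows to execute multi-part administrative tasks. In a documented deployment, engineers tasked the model with updating Salesforce account tiers and then generating and sending launch announcements to enterprise contacts. Prior model architectures frequently stalled midway through these multi-stage operations, whereas the current system completed the entire sequence end-to-end without human intervention.

Development tool provider Zed used the system to automate complex debugging procedures. During internal trials, engineering teams directed the model to investigate an active software bug. Working without explicit prompts or step-by-step instructions, the system independently generated a reproducing test script, applied the necessary code fix, and stashed the modifications to verify that the bug reappeared without the patch. The entire diagnostic and remediation sequence occurred within a single processing pass.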
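The check described in the Zed trial (the reproducing test must fail without the patch and pass with it) can be expressed in a few lines. The sketch below is a toy, in-memory version of that idea, not Zed's tooling: `fix_is_verified` and the off-by-one example functions are hypothetical names introduced for illustration, and a real workflow would run the test suite against a stashed and an unstashed working tree.

```python
from typing import Callable

Impl = Callable[[int], int]


def fix_is_verified(repro_test: Callable[[Impl], bool], unpatched: Impl, patched: Impl) -> bool:
    """A fix counts as verified only if the test fails without the patch and passes with it."""
    fails_without_patch = not repro_test(unpatched)
    passes_with_patch = repro_test(patched)
    return fails_without_patch and passes_with_patch


# Toy bug: summing 0..n inclusive, with an off-by-one error in the original code.
def sum_to_unpatched(n: int) -> int:
    return sum(range(n))  # bug: leaves out n itself


def sum_to_patched(n: int) -> int:
    return sum(range(n + 1))


def repro_test(impl: Impl) -> bool:
    """Reproducing test: 0 + 1 + 2 + 3 + 4 should equal 10."""
    return impl(4) == 10


if __name__ == "__main__":
    print(fix_is_verified(repro_test, sum_to_unpatched, sum_to_patched))
```

Running the test against the unpatched code as well as the patched code guards against a test that passes for unrelated reasons and therefore proves nothing about the fix.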

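Software engineering platform Factory implemented the architecture to manage sustained coding tasks within complex codebase environments. Technical teams reported that the system maintained logical grounding and execution consistency across corporate code repositories, outperforming previous-generation software layers by completing tasks that had previously timed out or failed to resolve.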

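Quantitative safety audits and exploitation limits

Data from the formal system card indicates that the system achieves these autonomous capabilities without a corresponding increase in security risk. Automated behavioural audits designed to test for deceptive tendencies and cooperation with unauthorised requests show that the model exhibits a lower overall rate of non-compliant behaviour than its direct predecessor, Sonnet 4.6.

The architecture does not possess advanced offensive cybersecurity capabilities. Anthropic engineers omitted specialised cybersecurity datasets from the training protocol, limiting the system to routine, defensive technical tasks. In public security assessments conducted in partnership with Mozilla, researchers tested the model's capacity to build functional exploits for known vulnerabilities within the Firefox 147 browser core.

The model failed to generate a single working exploit across all evaluation windows, registering a zero percent success rate. It did record a 13.2 percent partial success rate, slightly higher than Sonnet 4.6, though engineers attribute this variation to general gains in logical reasoning rather than domain-specific offensive training. Out of caution, commercial versions ship with default real-time safety classifiers equivalent to those used in the premier Opus 4.8 framework.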

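The regulatory friction surrounding Fable 5 prompted a formal partnership between Anthropic, Amazon, Microsoft, and Google to establish an objective industry framework for assessing model security breaches. Providers currently lack a shared metric for classifying the severity of system bypasses, which creates regulatory uncertainty when researchers identify new prompting vulnerabilities.

The proposed governance framework scores security breakdowns across four specific technical criteria:

Capability gain measures how far the exploit extends user capabilities beyond standard, widely available software utilities.

Breadth of capability gain quantifies the number of distinct offensive operations the same exploit unlocks.

Ease of weaponisation tracks the amount of human engineering effort and specialised prompting required to extract a harmful output.

Discoverability determines how accessible the exploit technique is within public research circles.

Developers and cybersecurity professionals will use this matrix to coordinate defensive responses. For high-severity breaches, such as exploits demonstrating an immediate capacity to disrupt financial accounting systems or electrical transmission grids, providers will deploy automated mitigations immediately. The initiative operates alongside a newly established HackerOne vulnerability research program and a dedicated corporate monitoring team providing 24-hour oversight of threat intelligence channels.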
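One way to picture the four-criterion matrix is as a small record type. The article does not publish a scoring scale, weights, or a threshold for "high severity", so everything numeric below (the 0 to 3 scale, the unweighted sum, the cut-off of 9) is an assumption chosen purely for illustration, as are the class and function names.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BypassAssessment:
    """One score per criterion named in the article; the 0-3 scale is an assumption."""

    capability_gain: int
    breadth_of_capability_gain: int
    ease_of_weaponisation: int
    discoverability: int

    def __post_init__(self) -> None:
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0 <= value <= 3:
                raise ValueError(f"{f.name} must be between 0 and 3, got {value}")

    def total(self) -> int:
        """Unweighted sum of the four criteria (an illustrative aggregation)."""
        return sum(getattr(self, f.name) for f in fields(self))


def severity_band(assessment: BypassAssessment, high_threshold: int = 9) -> str:
    """Map a total score to a band; the threshold is hypothetical."""
    return "high" if assessment.total() >= high_threshold else "standard"


if __name__ == "__main__":
    grid_exploit = BypassAssessment(3, 3, 2, 2)
    niche_trick = BypassAssessment(1, 0, 1, 2)
    print(severity_band(grid_exploit), severity_band(niche_trick))
```

A shared record like this is what would let different providers compare the same bypass on the same axes, which is the gap the article says the framework is meant to close.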

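Deployment strategies will need to adapt to this closer relationship between model builders and state regulatory bodies. Anthropic has formalised agreements under recent executive mandates to grant federal researchers early access to frontier architectures before public commercial release. These joint evaluation windows allow external security analysts to audit model capabilities alongside internal engineering teams, ensuring regulatory alignment before code enters production environments.

See also: HP accelerates enterprise workflows with OpenAI Frontier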


The post "Anthropic deploys Claude Sonnet 5, Fable and Mythos restored" first appeared on AI News.

Original text

Anthropic has launched Claude Sonnet 5 and restored access to its Fable and Mythos frontier models following a federal export control review.

The decision marks the conclusion of an eighteen-day operational pause triggered by a US government export control directive on June 12, which forced the temporary suspension of Anthropic’s highest-capability systems.

Government officials enacted the restriction after researchers at Amazon documented a method to bypass the safety controls of Fable 5, causing the model to identify software vulnerabilities and supply exploitation code. Anthropic has since developed an updated automated classifier to patch the vulnerability, clearing the path for a full commercial rollout across its platform, cloud infrastructure, and partner networks.

The temporary suspension of Fable 5 and Mythos 5 highlighted the regulatory pressures facing frontier intelligence systems. When the export control mandate took effect, the lack of real-time nationality verification systems required a total access blackout for all global users.

Security evaluations conducted during the shutdown confirmed that the vulnerability identification behaviour was not unique to Fable 5. Older and less capable architectures from multiple providers, including Claude Opus 4.8, GPT-5.5, and Kimi K2.7, duplicated the exact results.

To resolve the federal directive, engineers trained an automated safety classifier targeting the specific bypass mechanism reported by Amazon. This software layer functions with a wide safety margin, identifying and blocking ambiguous developer prompts that display a statistical probability of malicious intent. Internal validation data indicates the updated classifier prevents the reported exploitation technique in more than 99 percent of trials.

When a developer issues a prompt that triggers this boundary, the platform automatically routes the workload to the older Opus 4.8 architecture to maintain continuity. The expanded safety margin introduces a distinct trade-off for engineering teams, as the automated system flags benign requests more frequently during routine application development and software debugging.

Active deployments and agentic workflows

While frontier models face strict state oversight, the immediate commercial focus is on the newly deployed Claude Sonnet 5.

Engineering teams are transitioning autonomous agents to this model to reduce operational expenditure while maintaining high execution capacity. Performance data validates that the system executes multi-step plans, operates terminal environments, and navigates web browsers without human intervention.

Model performance and cost metrics:

Model      | SWE-bench Pro | Terminal-Bench 2.1 | Base input cost* | Base output cost*
Sonnet 5   | 63.2%         | 80.4%              | $3.00            | $15.00
Sonnet 4.6 | 58.1%         | 67.0%              | $3.00            | $15.00
Opus 4.8   | 69.2%         | 82.7%              | $5.00            | $25.00

*Cost per million tokens. Sonnet 5 carries introductory rates of $2.00 input / $10.00 output through August 31, 2026.

Real-world deployments show how organisations are using this architecture within live software development pipelines.

At Rakuten, technology teams deployed the architecture against dozens of the company’s most challenging production code pull requests. The system processed each submission independently, executing tests and verifying the results before presenting the completed code to human engineers for final structural approval.

Software automation firm Zapier integrated the system into its core product workflows to execute multi-part administrative tasks. In a documented deployment, engineers tasked the model with updating Salesforce account tiers and subsequently generating and transmitting launch announcements to enterprise contacts. Prior model architectures frequently stalled midway through these multi-stage operations, whereas the current system executed the entire sequence end-to-end without human remediation.

Development tool provider Zed utilised the system to automate complex debugging procedures. During internal trials, engineering teams directed the model to investigate an active software bug. Working without explicit prompts or step-by-step instructions, the system independently generated a reproducing test script, applied the necessary code fix, and stashed the modifications to verify that the bug reappeared in the absence of the patch. The entire diagnostic and remediation sequence occurred within a single processing pass.

Software engineering platform Factory implemented the architecture to manage sustained coding tasks within complex codebase environments. Technical teams reported that the system maintained logical grounding and execution consistency across corporate code repositories, outperforming previous generation software layers by completing tasks that previously timed out or failed to resolve.

Quantitative safety audits and exploitation limits

Data from the formal system card indicates that the system achieves these autonomous capabilities without a corresponding inflation of security risks. Automated behavioural audits designed to test for deceptive tendencies and cooperation with unauthorised requests show that the model exhibits a lower overall rate of non-compliant behaviour compared to its direct predecessor, Sonnet 4.6.

The architecture does not possess advanced offensive cybersecurity capabilities. Anthropic engineers omitted specialised cybersecurity datasets from the training protocol, limiting the system to routine, defensive technical tasks. In public security assessments conducted in partnership with Mozilla, researchers tested the model’s capacity to build functional exploits for known vulnerabilities within the Firefox 147 browser core.

The model failed to generate a single working exploit across all evaluation windows, registering a zero percent success rate. It did achieve a 13.2 percent partial success rate, which represented a minor increase over Sonnet 4.6, though engineers attribute this variation to general gains in logical reasoning rather than domain-specific offensive training. Out of caution, commercial versions ship with default real-time safety classifiers equivalent to those used in the premier Opus 4.8 framework.

The regulatory friction surrounding Fable 5 prompted a formal partnership between Anthropic, Amazon, Microsoft, and Google to establish an objective industry framework for assessing model security breaches. Currently, providers lack a shared metric to classify the severity of system bypasses, creating regulatory uncertainty when researchers identify new prompting vulnerabilities.

The proposed governance framework scores security breakdowns across four specific technical criteria:

Capability gain measures how far the exploit advances user capabilities beyond standard, widely available software utilities.

Breadth of capability gain quantifies the number of distinct offensive operations the same exploit unlocks.

Ease of weaponisation tracks the volume of human engineering effort and specialised prompting required to extract a harmful output.

Discoverability determines the accessibility of the exploit technique within public research circles.

Developers and cybersecurity professionals will use this matrix to coordinate defensive responses. For high-severity breaches, such as exploits demonstrating an immediate capacity to disrupt financial accounting systems or electrical transmission grids, providers will deploy automated mitigations instantly. This initiative operates alongside a newly established HackerOne vulnerability research program and a dedicated corporate monitoring team providing 24-hour oversight of threat intelligence channels.

Deployment strategies will need to adapt to this closer relationship between model builders and state regulatory bodies. Anthropic has formalised agreements under recent executive mandates to grant federal researchers early access to frontier architectures prior to public commercial release. These joint evaluation windows allow external security analysts to audit model capabilities alongside internal engineering teams, ensuring regulatory alignment before code enters production environments.

See also: HP accelerates enterprise workflows with OpenAI Frontier


The post Anthropic deploys Claude Sonnet 5, Fable and Mythos restored appeared first on AI News.


