TLDR AI · June 10, 2026, 09:00 · approx. 25 min read

Three Labs With Plans and a Memorandum (22 minute read)

#AI governance #research ethics #transparency #Three Labs

TL;DR

Three Labs announced a new plan and memorandum to strengthen governance and transparency in AI research, signaling a direction that will affect the industry as a whole.

AI Deep Analysis · June 11, 2026, 00:05

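Key points

Announcement of a new research plan

Three Labs announced a new plan that clarifies the specific directions and priorities of its future AI research.

Stronger governance and transparency

It drew up a memorandum to ensure responsible conduct and transparency in the AI development process, and sought consensus among stakeholders.

Industry-wide impact

The move is not merely an internal policy change. It carries implications for the norms and ethical standards of the AI research community as a whole.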

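Impact analysis

Three Labs' announcement is a reminder of the importance of organizational responsibility and transparency as AI technology advances rapidly. In particular, by stating the direction of its research and development clearly, it points toward laying a foundation for the industry as a whole to proceed within a more cooperative and ethical framework.

Editorial comment

Although this is a press-release-style announcement that lacks specific technical details and numerical data, it is an important indicator of rising governance awareness in AI development.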

The big story today is the release of Claude Fable 5, the version of Claude Mythos that Anthropic believes they can safely distribute to the people. You should absolutely be switching over to that model and trying it out. But as always, this blog does not rush into commenting on a new model until we have a few days to play around with it and see what our new baby can (and can’t) do. This will be no exception, and coverage of Fable in earnest will start Friday or Monday.

Today I instead bring you several related stories around policies and plans for AI, that came out before the Fable announcement.

First we have the Administration giving us an AI memorandum, that I read as an attempt to legally implement ‘Anthropic is fired forever and we will use any models we have for whatever we want no matter what’ combined with some good government and diffusion plans.

Second, OpenAI has come out with a plan for how to ensure AGI benefits everyone. It includes a very strong call for international coordination among key actors to ensure the ability to slow down AI development in the name of doing it safely. This echoes the same call made previously by Anthropic and by Demis Hassabis of Google DeepMind. There is broad support for the idea of preparing for a potential coordinated slowdown.

The rest of the OpenAI proposal here is then concerned with the opposite problem, of concentration of power, and concentrates its rhetoric on that danger and AI’s promise. Notice that the document uses only ‘catastrophic’ risk rather than existential or extinction, and it does not take seriously the need to retain control in the hands of humans, only fearing the wrong humans will command these AIs. And OpenAI’s plan is, very explicitly, for AI to go into recursive self-improvement.

I appreciate the honesty, but the inherent contradictions remain, and are not addressed, nor is the failure to address them itself addressed, and so on.

This leads into Joshua Achiam’s claim on Twitter about the difference in philosophy between OpenAI and Anthropic, where Anthropic employees report he is miscategorizing their views, but where he makes a good directional point.

This one is entitled National Security Presidential Memorandum/NSPM-11.

This seems to be a combination of an actual Anthropic ban including on subcontractors, with a potential 1 year delay, a statement of allowing all (legal?) use, and some good governance instructions including adaptation of tech from multiple vendors.

As always Section 1 declares principles.

President Trump: Under my Administration, the United States can and will responsibly accelerate the use of AI across intelligence and warfighting domains in line with American values.

… with full confidence that those tools will be available when they matter most.

Section 2 lays out four pillars: Adoption, Adaptation, Assurance and Accountability.

Adoption and Adaptation are straight up good.

Accountability is good. The problem here is via negativa. AI use must be consistent with the Constitution, lawful and authorized, and the responsible people are responsible for that. Great. But as we’ve been over many times, what the national security state thinks is legal, and even what their courts will say is legal, is rather broad. There are limits, but there aren’t that many limits, so combined with Assurance you can be assured they will do pretty much anything they feel like doing.

Assurance is the one to watch.

The national security enterprise shall assure that all AI technologies adopted are designed to be reliable, robust, steerable, and controllable, and that they operate, in accordance with applicable laws, government policies, and guidance.

To protect American warfighters, the national security enterprise shall ensure, through contractual clauses or other means, that no commercial entity or adversary possesses the capability to prevent use of, disable or degrade, or materially modify without Federal Government knowledge and approval, an AI system that our men and women depend on for their missions.

In addition, rigorous security and functionality measures, including testing, evaluation, validation, and verification, shall be implemented to assure the appropriate confidentiality, integrity, reliability, availability, and interoperability of AI systems across the national security enterprise.

The first and third paragraphs should be uncontroversial, although without implementation it is cheap talk. The devil is in paragraph two, where no other entity can, without knowledge and approval, ‘prevent use of, disable or degrade, or materially modify’ any AI system that ‘our men and women depend on’, which could be interpreted to include a wide range of systems, including civilian ones.

As in, once you turn this model over to us, we can do whatever the f*** we want with it, and there is nothing you can do about it. Your contract cannot have any enforceable mechanism, should the government decide to ignore your terms of service.

If we didn’t have the history of the DoW-Anthropic confrontation, it would be reasonable to interpret this charitably, as operational security. Given that encounter, this clearly is ‘all lawful use’ minus the word lawful.

Just All Use. It’s cleaner.

The good news is that Section 3 allows them to just issue a waiver and ignore that, and repeat that waiver indefinitely, which seems reasonably likely to happen.

Section 3 asks for an update to DoD directive 3000.09, and for it to be updated yearly, in case their commitment to following it in the OpenAI deal gets in the way of anything.

Then they all but say ‘we will never use Anthropic at DoW again,’ if you ever tried to tell us we can’t do anything we want then begone. And no, our contractors can’t use Anthropic either.

Consistent with roles and responsibilities outlined in the Federal Information Security Modernization Act of 2014 (44 U.S.C. 3551 et seq.), the Secretary of War for systems described in section 3553(e)(2) of that Act, the Director of National Intelligence (DNI) for systems described in section 3553(e)(3) of that Act, and the heads of relevant agencies for systems described in section 3557 of that Act, shall direct, to the maximum extent permissible by law, termination for default or for convenience contracts with companies that have repeatedly demonstrated a pattern of conduct that is inconsistent with policies laid out in section 2 of this memorandum.

This includes contracts under which such companies provide services to the applicable agencies as subcontractors.

The heads of these agencies may establish a waiver process to grant limited exceptions of a defined duration, to exceed no longer than 1 year, where such relationships are necessary to responsibly steward United States national security.

Exceptions may include operational imperatives, test and evaluation arrangements, threat intelligence sharing, and other mission-critical applications, subject to appropriate risk mitigation measures and enhanced oversight.

Except, you know, right now one of those companies can hack anything on the planet, so maybe we’re going to delay that order a bit. As a treat. But a year from now, the NSA will totally stop using Claude, unless a year from now we issue another waiver, because of reasons.

Section 4 calls for onboarding of the most advanced models from what vendors they are willing to use, and helping AI companies do security in various forms, and for analysis of foreign AI tech.

Section 5 is about working around barriers to hiring and training, prioritizing R&D, doing testing and verification, and so on. Sure.

Section 6 is definitions and Section 7 is standard provisions. That’s all we got.

Dean W. Ball: This seems like a solidly smart policy document. Congratulations to all involved!

Divyansh Kaushik: The Administration did a great job with this NSPM. Lots of good stuff in here.

Neil Chilson also seems content.

Vinh Nguyen and Michael Horowitz provide an analysis at CFR that paints this all as highly reasonable and considered policy, a response to government needing this level of trust in its AI systems, and also continuous with Biden’s NSM-25 despite its criticism of NSM-25. They use the term ‘unlawful domestic surveillance’ multiple times, as if to forget that it is completely different from ‘mass domestic surveillance,’ and take the Accountability section remarkably seriously. They don’t seem to see the problems the administration’s position creates, beyond loss of trust with Congress.

Charlie Bullock thinks This Is Fine, mostly, but notices it further undermines the case against Anthropic since it implements the obvious solution of ‘just fire Anthropic’ (https://x.com/CharlieBull0ck/status/2062990417180696928).

I agree that they did a great job of implementing the ‘respect my authoritah and f*** you, Anthropic’ approach and also the good government things.

I don’t think going all in on that first part is wise, but they disagree. If you take that as a given, then yeah, good job all around I suppose.

The Department of War includes the NSA.

Dean W. Ball: SuPpLy ChAiN rIsK

Demetri: SCOOP #NSA is using #Mythos to conduct offensive cyber operations. Anthropic engineers are embedded in the US intelligence agency.

Cristina Criddle: scoop: Anthropic has installed forward deployed engineers in the US National Security Agency to help deploy Claude Mythos for cyber offensive operations w/ @AsiaLens

Yes, the NSA is using Mythos for offensive cyber operations, because it is the NSA.

dave kasten: Interesting that it’s confirmed, although I basically assumed this was happening.

OpenAI gives us its plan to ensure AGI benefits everyone.

It includes one very welcome statement, calling for an international organization to enable slowing frontier development of AI in the name of catastrophic risks, although they do not dare say ‘existential’ or ‘extinction’ here.

The document is a strange beast. It simultaneously does and does not take intelligence seriously, and the same goes for concentration of power and also gradual disempowerment. I am unsure what to make of the thinking behind the plan.

They commit to ‘build AI in service of humanity’ and to ‘empower people broadly’ and ensure power is broadly distributed.

Sam Altman and Jakub Pachocki: Entirely automating everything is not the future we want. It would be unfulfilling, and it would be dangerous. AI should help people pursue their goals, not become untethered from them. As AI systems become more capable, the human role becomes more important: setting direction, making tradeoffs, applying judgment, and bringing values, taste, care, and responsibility to the work.

A key long-term role for people will be deciding what is worth doing.

I mean, look, that is a nice pair of sentiments, but you do realize you kind of have to pick one or the other, right?

As in, if you distribute AI to everyone to help them pursue their goals? Then they are going to use it to automate everything, and turn actions over to it. They will let their AIs decide what is worth doing, and the AIs will compete. So either you can restrict their ability to have or use it, or you can not restrict it.

They do understand the whole ‘RSI be dangerous’ issue, at least a little:

Sam Altman and Jakub Pachocki: We believe that AI doing AI research will become the determining factor of the pace of progress within the next few years. That matters because alignment is itself a hard research problem. To make fast and deep progress, our researchers will need AI systems that can help test ideas, find mistakes, explore alternatives, and iterate alongside us.

But faster technical progress makes human judgment and public coordination more important, not less. The future should be shaped by people, institutions, and societies, not only by the companies building the most capable systems.

This is a repeated confusion between ‘is’ and ‘ought.’ Yes, the future ‘should’ be shaped by humans, and ideally humans broadly. You’re causing this how?

International coordination of leading AI efforts to advance safety and allow coordinated actions, including slowing down.

Oh. Yes, that’s actually a really good start to an answer.

Sam Altman and Jakub Pachocki: As frontier AI development continues, we expect national and global coordination to become more important. We have long believed there should ultimately be an international organization that helps coordinate leading AI efforts to reduce catastrophic risk. Cooperation and shared safety standards are an important part of the path forward, especially because the incentives around commercial and national competition are hard to escape. One goal of such an organization should be to make it possible for the world to take coordinated action, including slowing frontier development when needed, so societal resilience, safety, and alignment can keep pace.

If you have long believed this, it would have been good to have spoken up this plainly earlier, but I will happily take this statement now.

Okay, on to the actual plan.

Sam Altman and Jakub Pachocki:

Build an automated AI researcher—an AI system that can accelerate and increasingly automate the research process itself, while remaining steerable, accountable, and connected to people. Our internal belief is that by March of 2028 we may have a significant fraction of our research being done by AI systems in tandem with our own researchers. To make sufficient progress on alignment, we believe we will need AIs to iterate alongside us. This will help us navigate the transition to the post-AGI world so that we collectively decide the path toward the future.

Accelerate the economy, by accelerating scientific progress, productivity, and economic growth, while working to ensure the gains are widely shared. Everyone should have an opportunity for a meaningful share in the prosperity AI creates.

Give everyone on Earth a personal AGI, empowering them to benefit from one of humanity’s most transformative technologies in whatever way they choose.

So the plan is:

Recursive self-improvement.
Use this for abundance and distribute gains widely.
Give everyone an AGI.

I notice that ‘give everyone an AGI’ comes after the RSI. Presumably the AGI they get will be the toy home version, not the industrial strength superintelligence that OpenAI is keeping as a mere tool somewhere else. Or maybe not?

This is the dilemma with such a plan. If you give everyone the full thing in equal measure, humans have lost control of the future and gradual disempowerment occurs non-gradually. If you don’t, then you have not actually stopped concentration of power.

Alternatively: You either ensure that there is a group of humans in control with the ability to steer events, or else you don’t.

In broad strokes, if you are going to develop superintelligence at all, yes obviously in some form you will want to:

Safely develop superintelligence.
Generate abundance of good things.
Distribute that abundance of good things to the humans.

Alas, that doesn’t tell us any of the interesting details.

The main philosophical position here is that OpenAI is focusing on avoiding concentration of power, as opposed to avoiding diffusion or loss of power, as the bigger risk. But framing it as this one-sided is in direct conflict with their correct recognition that we will need international coordination to be able to proceed safely. The core contradiction is not resolved.

I read OpenAI Chief Futurist (and former head of mission alignment) Joshua Achiam here as trying to contrast the good OpenAI plan of ‘entrust humanity with the tools of its own progress and destiny’ (difficulty of matching to reality of sufficiently advanced AI and what people will do with it and keeping it as a tool: impossible) with the bad Anthropic plan of ‘creating a machine God’ (derogatory) (difficulty of matching its alignment to our survival and flourishing: impossible but in the game difficulty sense rather than literally impossible, if you don’t take the description too literally).

I did not find this a good description of Anthropic’s values or vision, and I believe that to the extent this describes OpenAI’s values and vision the best term is ‘pipe dream.’

I do buy that the neutrally presented version of this would be directionally correct, as one thing happening among many, which is what makes it interesting.

Joshua Achiam (Chief Futurist, OpenAI): The OAI / Anthropic values difference is deeply misunderstood, even within the walls of both. Should a loving ensouled machine God watch over humanity? Vote Anthropic. Should humanity be entrusted with the tools of its own progress and destiny? Vote OpenAI.

If your lens for analyzing this is “consumer v enterprise business” your ability to understand what’s going on is unfixably borked

If you think one will predominate over the other, running away with an unsurpassable lead, totally borked; humanity wants both these outcomes in about equal measure.

Joshua Achiam (OpenAI): It’s actually not a binary; these aren’t mutually exclusive, nor are they requisite. You can vote both, you can vote neither. But it is a divergence in the worldviews between the orgs. It’s complicated to describe “the worldview of an org” because orgs are composed of individuals with a range of views, but there is a kind of net culture and this is an attempt to describe it.

My Twitter followers are good enough, and involve enough Anthropic followers, that I can do this and not get killed by the Lizardman Constant. Sweet.

One could reframe this as Anthropic taking superintelligence and its consequences seriously, versus OpenAI trying to deny that those consequences exist.

It is not unrelated to Anthropic embracing virtue ethics and OpenAI being stuck on deontology with only humans as patients, as another semi-Fake Framework.

Or one could take Fable’s framing, which I think might be even better: That this is actually a disagreement about facts and the viability of OpenAI’s approach, in particular OpenAI’s assumption that you can have recursive self-improvement while the AI remains a mere tool, and it is being framed as a difference in values. You should ‘vote’ largely based on whether you think OpenAI’s aspiration is even possible.

I definitely agree that this is mostly not about consumer versus enterprise business.

I put this to the test and asked Anthropic employees if they agreed. Along with the above quiz, here were the individual answers.

Amanda Askell (Anthropic): Personally, no. I think the binary of ‘moral saint’ versus ‘tool for humans’ is a false one, and its very simplicity should make people suspicious of it. I think the ideal target tries to balance the benefits and risks of both positions.

Drake Thomas (Anthropic): Kinda both? Personally I think a loving ensouled machine god should watch over humanity, but mainly to enforce “no x-risks that destroy human civilization’s optionality and potential” while we spend another few thousand years figuring out what it is we want our destiny to be.

Sarah Chen (Anthropic): Coming out of the closet to strongly disavow this description. Many Ants, myself included, view a “the Culture”-type outcome as a disastrous disempowerment scenario. I think we are simply more intellectually honest in acknowledging the challenges in controlling powerful AI.

I agree with Sarah Chen on both levels. The Culture is a disastrous scenario, although obviously many other scenarios are far worse. And I think quite a lot of Anthropic agrees this would not be a good scenario. Drake Thomas goes somewhat farther towards ‘actually yes machine God’ but in a very Eliezer Yudkowsky Beyond The Reach Of God kind of way. Amanda Askell tries to thread the needle, because she notices neither approach is viable in its presented form.

The ‘humanity wants both these outcomes’ and ‘don’t expect a huge lead’ comments feel bizarre, as if ‘what humanity wants’ will determine whether competition remains close between the two companies, or their visions, or whether the two could exist simultaneously. Even if they were both possible, one rules out the other.

The other question is: sure, you believe these things, but what are you doing differently?

Seán Ó hÉigeartaigh: As different as these visions are, so far OAI/Anthropic are building things that are functionally almost indistinguishable. At what point do the companies’ AI systems meaningfully diverge along these paths? A loving ensouled machine God is a very different thing than a toolkit for human progress, even if the former can provide the latter.

Feels like an important question, because there are quite different alignment and governance questions along these paths.

David Manheim: I think they diverge when we hit ASI - the point that both companies have said they are aiming for - and the visions diverge based on whether the companies see loss of control as possible to avoid.

I think they already have diverged. This philosophical divide also means the difference between OpenAI’s deontological Model Spec approach, versus Claude’s virtue ethical Constitution, along with the general training approaches. You see the differences in the models, and I absolutely am on Anthropic’s side on that. You also see it in Anthropic refusing the Department of War, and OpenAI basically giving in, which raises questions about commitment to avoiding concentration of power.
