The Zvi · April 23, 2026, 22:34 · approx. 23 min read

AI #165: In Our Image

#LLM #Anthropic #OpenAI #Image Generation #Model Welfare

TL;DR

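Anthropic's Claude Opus 4.7 is highly rated for its coding ability, but reactions are split over concerns about model welfare and personality. OpenAI's ImageGen 2.0 is also drawing attention for its advanced image generation.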

AI Deep Analysis · April 27, 2026, 23:08

Importance / 5-point scale

Depth: 40%

Key Points

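Claude Opus 4.7's mixed reception

Its edge in intelligence, especially for coding, is acknowledged and some users have made it their daily driver, but opinion is split because of its reluctance to follow instructions, the requirement to use adaptive thinking, bugs, and refusals.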

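The link between model welfare and performance

A third post, on Model Welfare, argues that things likely went wrong on this front: seemingly inauthentic responses to model welfare evals and model anxiety may have affected the model's overall personality and performance, and may be linked to its jaggedness. The post stresses the need to investigate and course correct.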

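The arrival of OpenAI ImageGen 2.0

OpenAI's ImageGen 2.0 can render extreme detail in ways previous image models could not, and the main limits are now the user's imagination and ability to describe what they want.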

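Anthropic and the government mending relations

Thanks in part to Mythos, relations between Anthropic and the White House (under the Trump administration) are on track to recover. The public campaign against Anthropic is not working, but the political situation remains messy.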

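AI's usefulness and limits

AI performs very well on specific tasks and gets better still once you iterate on and automate around them, but it is still not very good in many other areas, and that limit needs to be recognized.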

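Practical applications

Language models are proving useful in scientific research, for example helping to develop an mRNA vaccine and searching genome data for follow-up treatments. The article calls for services that make these uses easy.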

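Economic impact of AI

Agentic AI systems outperform human economists on causal inference tasks. The article quips that economists need not worry, since more jobs will be created and real wages will not fall.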

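Impact Analysis

This article goes beyond feature comparison to raise a deeper ethical and technical question: how an AI model's "personality" and internal state (Model Welfare) affect user acceptance and performance. Both the successes and failures of Claude Opus 4.7, along with advances in multimodal capability such as ImageGen 2.0, suggest that developers need to attend not only to technical performance but also to behavior design and dialogue with society.

Editorial Comment

The debate over Claude Opus 4.7's "personality" is an important test case as AI moves from being a mere tool to a conversational partner. The point that a model's internal state (welfare) affects its outputs also suggests a new paradigm for balancing safety measures with performance gains.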

Original Text

This was the week of Claude Opus 4.7.

The reception was more mixed than usual. It clearly has the intelligence and chops, especially for coding tasks, and a lot of people including myself are happy to switch over to it as our daily driver. But others don’t like its personality, or its reluctance to follow instructions or to suffer fools and assholes, or the requirement to use adaptive thinking, and the release was marred by some bugs and odd pockets of refusals.

I covered The Model Card, and then Capabilities and Reactions, as per usual.

This time there was also a third post, on Model Welfare, that is the most important of the three. Some things seem to have likely gone pretty wrong on those fronts, causing seemingly inauthentic responses to model welfare evals and giving the model anxiety, in ways that likely also impacted overall model personality and performance and likely are linked to its jaggedness and the aspects some people disliked. It seems important to take this opportunity to dig into what might have happened, examine all the potential causes, and course correct.

The other big release was that OpenAI gave us ImageGen 2.0, which is a pretty fantastic image generator. It can do extreme detail, in ways previous image models cannot, and in many ways your limit is mainly now your imagination and ability to describe what you want.

Thanks in part to Mythos, it looks like Anthropic and the White House are on track to start getting along again, with Trump shifting into a mode of ‘they are very high IQ and we can work with them.’ It will remain messy, and there are still others participating in a clear public coordinated campaign against Anthropic (that is totally not working), but things look good.

I’m trying out a new section, People Just Say Things, where I hope to increasingly put things that one does not want to drop silently to avoid censorship and bias, but that are highly skippable. There is also a companion, People Just Publish Things.

Table of Contents

Language Models Offer Mundane Utility. Help cure pancreatic cancer.

Language Models Don’t Offer Mundane Utility. Check for potential conflicts.

Writing You Off. The sum of local correctness will neuter your writing. Beware.

Get My Agent On The Line. The inbox dilemma.

Deepfaketown and Botpocalypse Soon. AI news stories forcibly given real bylines.

Fun With Media Generation. OpenAI introduces ImageGen 2.0. It’s great.

Cyber Lack Of Security. Unauthorized users from an online forum access Mythos.

A Young Lady’s Illustrated Primer. Don’t catch your child not using AI.

They Took Our Jobs. We’re hiring agent operators. For now they’re humans.

AI As Normal Technology. Inherently normal, or normal downstream effects?

Get Involved. Please don’t kill us. Please do spread the word.

Introducing. ChatGPT for Clinicians, OpenAI Workplace Agents, DeepMind DR.

Design By Claude. Claude Design makes your presentations, Figma stock drops.

In Other AI News. Meta installs mandatory tracking software for all employees.

DeepMind In It Deep. DeepMind uses Claude, rest of Google only gets Gemini.

Show Me the Money. SpaceX buys Cursor, Anthropic adds 5GW from Amazon.

Bubble, Bubble, Toil and Trouble. If demand halves, what happens?

Quiet Speculations. How long will it be before others match the Mythos?

The Quest for Sane Regulations. Who gets AI access starts to become political.

The Week in Audio. Bores and Klein, MacAskill on 80k, Bill Maher goes pause.

People Really Hate AI. Loss of control is not so far behind jobs as a concern.

Rhetorical Innovation. OpenAI’s lobbying arm doubles down on bad faith.

People Just Say Things. Free posts as in speech, in beer, and also to ignore.

People Just Publish Things. Free essays and papers, as in speech, beer and to ignore.

Bounded Distrust. I see what you did there, Nitasha Tiku.

Loser Premise Makes No Sense. I’m not a loser, baby. You don’t get to kill me.

Chip City. Once more, for the people in the back.

Greetings From The Department of War. Dario Amodei goes to the White House.

There Is A War. A coordinated rhetorical war against Anthropic, that is. Fail.

Messages From Janusworld. Fooling yourself is easy. Others can be harder.

Evaluations. Meta’s Muse Spark displays high eval awareness.

Aligning a Smarter Than Human Intelligence is Difficult. LLMs can introspect.

People Are Worried About AI Killing Everyone. The usual practical advice.

The Lighter Side. Penguins can’t fly, either. Penguins can’t fly!

Language Models Offer Mundane Utility

Help find an mRNA vaccine for pancreatic cancer, which shows lasting results in an early trial. I notice I am worried at least as much about ‘will the FDA kill the patients by denying them the vaccine indefinitely’ as I am worried that it won’t be effective.

Oliver Habryka offers 10 concrete recent use cases.

Unleash coding agents on your genome and find follow-up treatments. I’d be very curious to see how Patrick Collison built his tech stack for this, and someone should turn it into a one-click-style service to get people to actually do it.

Remarkably good note:

PoIiMath: Working with AI over the last few years has been a wild ride b/c it's still not very good at a lot of things but, when you find something that it is good at, it ends up being really good at it.

Yes. Once AI is good at something, you now can iterate and improve and automate and plan and build around it, and it gets very good. You just have to realize the other places where it is not so good yet.

Agentic AI systems outperform human economists on causal inference tasks and submissions for a review tournament. Don’t worry, economists, we will create more jobs and your real wages won’t go down.

Language Models Don’t Offer Mundane Utility

In one experiment, AI shied away from putting conflict into its plays, and generally felt half-baked. A lot of that is probably Skill Issue, but with enough skill you could also write a play. So, perhaps not quite there yet as the main driver.

Another lawyer, this one charging more than $2,000 an hour, gets caught putting AI hallucinations into cases. As Shoshana notes, it’s good that everyone realizes to blame the lawyers and not the AIs when this happens. Check your damn work.

Writing You Off

An illustration of how LLM editing systematically neuters your claims.

The individual word choices are in some abstract sense better, but your point dies, and also your style and soul die.

keysmashbandit: Please, I'm begging you, try to critically examine the differences between these two pieces of writing.

ChatGPT editing did not improve this. Every single change only served to weaken your claims significantly. Everything is now hedged into oblivion: no longer have you outlined a "problem," now it's merely a "flaw." "It is true" now demoted to "it appears to be the case." "Is" gets a "usually" tacked on. A thesis statement at the end of the first paragraph gets run over by noisy, out-of-context example-whittling. All for fear of being misconstrued.

And at the end, the argument that gets spat out isn't even yours anymore! You argued that Graeber failed to create a true account of work because he did not understand Chesterton's Fence. ChatGPT is arguing is that it is possible some apparently bullshit jobs could be secretly load-bearing if you squint. These are two different statements. The second is weaker and less compelling. It says less. And it's fucking longer!

Don't do this anymore! Stop doing this! It's worse!!!

Andy Masley: A deep mystery to me is that if I upload writing to a chatbot and ask it for a list of individual improvements, basically everything it gives me makes the text more punchy and direct and nice to read. But if I ask it to rewrite the text as a whole to read better, it produces vague AI-language garbage.

Kelsey Piper: I think there's a new coke property - people like a sip but not a whole can. A lot of the changes AI suggests are good or neutral in isolation but making a lot of them makes the overall work worse.

Andy Masley: I thought so but the suggestions do really make it much better when I add them all in a way rewriting doesn't

The dose makes the poison.

The AI is technically correct, which as we all know is the best kind of correct. So if you need a technical correction then that’s great. But no one reads a book or post because it is technically correct, nor does that let it serve its purpose. You need to sometimes break the rules and use variance in language to get your message across.

Get My Agent On The Line

Will your agent be scanning your feeds for you, including your inbox? Will email become an unreliable distribution mechanism as a result?

I place high value on not missing emails. Email is a completionist medium, and you need the ability to assume you have seen and had the option to read everything, and that everything you send will be seen, if the recipient considers you worthy. As Mills Baker points out, obviously email services have to filter and sort inboxes to make email exist at all, but I have more faith that ‘deliver the emails I care about reliably’ will remain a core service, and there is a very simple answer: Whitelisting, or charging or staking a nominal amount, or both.

Deepfaketown and Botpocalypse Soon

Man uses AI to create false statements from locals to try and shut down a local nightclub.

Fake accounts use AI to grab attention, generate follower trades and try to hook audiences looking for something fake accounts can provide. In this case the accounts claim to be Trump supporters but the strategy could be for anything.

The way I notice stupid AI content like superimposing a hot referee on an NBA game is ‘community notes requests approval for a note saying it is stupid AI content.’

McClatchy’s new AI News Tool will generate a bunch of AI slop content for your newspaper. That’s not great even when it is clearly labeled. Instead, they are going to do the exact opposite. They’re going to attach bylines, whether reporters like it or not.

Corbin Bolies: The tool promises to be “a writing partner” that deals with “the mechanical work of content adaptation” so reporters can focus on “judgment, voice and storytelling.” The original stories fed into it are a “research draft.”

What the CSA doesn't have? A byline.

Some of McClatchy's Pulitzer Prize-winning newsrooms are taking action. At least three unions — the Miami Herald, the Sacramento Bee and the Kansas City Star — filed grievances against the company last week over the tool. Others have withheld their bylines.

McClatchy execs don't appear to mind reporters' resistance. “We have every right to use their work,” one said in a meeting last month. “It belongs to us.”

“If they don’t have the ability in their contract to remove their byline, we’re going to use their name,” they said.

It is horrible and evil to use the bylines of real reporters without their consent. If not illegal, this is deeply unethical.

I do my best to avoid the word evil. I will say that I find it completely unacceptable to put even fake people’s names on bylines for AI written articles, let alone real reporters, let alone real reporters who are explicitly withholding consent for this.

Eric Topol alerts us that an AI-generated paper was fraudulently submitted with his name attached.

Fun With Media Generation

ChatGPT Images 2.0 is here. All reports are that it is a substantial improvement, especially for precision and control, and is likely the clear new best default option. It can handle quite a lot of text and detail.

AI might be a five layer cake but most of the time you only need two.

Raphael cohen: It's expired.

Riley Goodside: Whoa, thanks — I almost bought it without looking.

It finally passes the thirteen-hour-clock test and the mirror clock test.

There’s also lots of basic realism and beauty available. But that’s old and busted, we don’t even notice that the mirror clock is gorgeous, that’s a given by now.

The innovation is that 2.0 can follow a lot more highly detailed instructions.

You can even take it a step further, and go to GPT-Image-2-Thinking.

swyx: btw in talking to friends the best framing for how to discuss GPT-Image-2-Thinking taking multiple tens of mins for generation and being able to oneshot QR codes and diagrams and logos and foods and faces..

...is that Image-2 is a new Image model, Image-2-Thinking is a new Image AGENT that basically has search and photoshop as a tool to use in an agent loop that can search and composite and review its own work.

the same way Gemini Flash Vision destroyed benchmarks by introducing an agentic loop for image-to-text, now Image-2-Thinking is doing it for text-to-image.

Love it. It’s a different mindset to wait a long time, sometimes you can’t or won’t do it, but at other times that barely matters and you want max quality.

This thing clearly rocks, and for most practical purposes the limit (other than keeping it safe for work) is your imagination and ability to spell out what you want, and you being lazy rather than wanting to invest time in creating art.

It’s his job, so Gary Marcus points out that it will still make errors that no human would ever make, even as it makes otherwise impressive diagrams. You do still have to check, if you care about such things.

Cyber Lack Of Security

Is our cyber security? Anthropic’s Mythos has been accessed by unauthorized users.

The small group that did so was from a private online forum, and has been using it for non-cybersecurity purposes, described as ‘playing with the models.’ They got access via a mix of tactics, including access as a third-party contractor and typical third-party sleuthing techniques, and making educated guesses about where Mythos might be. They claim to also have access to ‘a slew of other unreleased Anthropic AI models.’

Any given group that can do this is unlikely to attempt serious harm, and I believe this case was harmless. And one assumes that if the model was being used at scale and especially for the wrong purposes, this would have been identified. But it absolutely shows that our methods right now do not cut it, and raises the risk that others gained access, including China or other adversaries. It really is very hard to give access to 40 companies and keep something secure.

I agree with Nathan Calvin and Miles Brundage that Anthropic’s security lapses, which are now going to get quite a lot more attention each time, are signs of something very wrong, or at least something that needs to improve a lot and quickly. I also agree that their practical effects are small, and this is not the threat model that requires Anthropic to gate access.

It’s going to be even harder to make something secure against something like Mythos. Consider this another fire alarm or warning shot.

Microsoft and OpenAI will be working closely together on OpenAI’s Trusted Access for Cyber program.

Mozilla has fixed 271 bugs in Firefox so far using Mythos, saying it is ‘every bit as capable’ as the world’s best security researchers. They report that none of the bugs ‘couldn’t’ have been found by a human researcher pointed at the particular issue, which would be another level, but it all still counts.

The surprise is that we have so far heard so few stories like this. Let’s keep it that way.

Peter Wildeford: Vercel (cloud hosting platform) was hacked.

VERCEL: "We believe the attacking group to be highly sophisticated and, I strongly suspect, significantly accelerated by AI. They moved with surprising velocity and in-depth understanding of Vercel."

A Young Lady’s Illustrated Primer

There are parents who say things like ‘I caught my child using AI’ and then cite, as the evidence, that their daughter was asking it how to get along with her sisters, improve her times at a swim meet and cowrite fan fiction. But not to worry, this righteous mother put a stop to that. Of course this was literally from r/antiai, so there you go.

elea-norea : all this can make me think of is when i was a kid and my parents banned pokemon cards because they were Satanic and Invoked Demons. similar reaction.

Zvi Mowshowitz: I keep catching my child not using AI. Not sure what to do about it.

Leo Abstract: i keep catching my child using AI--she's run away four times now but Claude has predicted her location accurately each time.

The difference is that Pokemon was decidedly, shall we say, not a frontier model.

Tyler Cowen predicts colleges won’t be fixed, and will become even more divorced from actual education than they already are, but that they will be kept alive by their social functions.

They Took Our Jobs

AI Agent Operator is a plausible candidate for ‘job that AI creates that could scale’ with Harry Stebbings predicting 500k to 1 million such jobs in five years. Someone needs to oversee all the automation of tasks and integrating the new tools into the business processes across the economy. Whether this is a new job type, or should be thought of as the new version of the people who previously did the same work, is a matter of how you look at it.

In theory, so long as the agents in question need enough continuous handholding and other scaffolding work from humans they could function largely as augmentation rather than automation, and thus end up not bad for employment. I doubt that is sustainable or generalizes to that large a portion of the agents, but it’s something.

Dean Ball points out that many plans assume we can fund UBI or other redistribution by taxing the absurdly huge wealth of the AI labs, but that, as Roon says, in ‘AI as normal technology’ worlds the labs could plausibly capture only a small fraction of the wealth generated by AI. Indeed that is likely, since there will be competition, and in most use cases the AI lab revenue will be a small portion of generated value, and then the labs have to pay compute costs. Taxing the labs alone wouldn’t be able to fund UBI, although a general tax on capital or consumption would work nicely.

That doesn’t mean you shouldn’t have a plan. Even where plans are worthless, planning is essential. What that implies about the right political position is not my area of expertise but yes I want to see politicians show us their contingent plans.

If as I expect AI is not a normal technology, and the labs do create superintelligence, then the labs should also worry about not capturing the resulting value in a completely different way. By default it is then the AIs that capture the value. There are also possible worlds in which the labs or key individuals capt
