TLDR AI·2026年5月7日 09:00·約24分

4 月にすべての AI プランが崩壊した（18 分読了）

#LLM #価格戦略 #サブスクリプションモデル #トークン消費量 #OpenAI #Anthropic

TL;DR

主要 AI プロバイダーが相次いで価格改定や機能制限を発表した背景には、既存のサブスクリプション設計が急激に進化する製品能力と利用パターンに耐えられなくなっているという業界全体の構造的課題がある。

AI深層分析2026年7月5日 22:08

最重要/ 5段階

深度40%

キーポイント

主要プロバイダーによる相次ぐ価格・機能改定

Anthropic、OpenAI、GitHub の 3 社が 4 月だけで 5 回の重大な発表を行い、サブスクリプションの条件変更や API 価格の倍増を断行した。

サブスクリプション設計の限界露呈

従来の「定額制」モデルが、トークン使用量の急増や新機能（エージェント実行など）への対応において経済的に破綻し始めていることを示している。

技術的変更がビジネスプロセスに与える影響

単なる価格改定ではなく、トークン計算方式の変更やモデルの削除といった技術的決定が、顧客への通知不足や混乱（パニック）を招くケースが多発した。

利用パターンと収益モデルのミスマッチ

開発者や企業の利用行動が変化し、従来の「1 ユーザーあたり」や「固定トークン数」の課金モデルではコスト回収が困難になったことが背景にある。

Proプランの価格引き上げと機能制限

高度な機能を求めるプロユーザーには、月額100ドル以上の「Max」プランへの移行が事実上必須となり、料金が最大5倍に跳ね上がりました。

Maxプラン利用者の現状と警告

著者は20倍の容量を持つMaxプランを利用しており問題ありませんが、現在Proプランを使用中でこの変更を知らないユーザーへの注意喚起を行っています。

フラットレート価格モデルの終焉

OpenAI、Anthropicなど主要AIプロバイダーは、エージェントワークロードにおける定額制プランを廃止し、トークンベースの使用量課金へ一斉に転換している。

影響分析・編集コメントを表示

影響分析

この一連の動きは、生成 AI 業界が「実験・成長期」から「成熟・収益化期」へ移行する転換点であることを示唆しています。企業側は従来のサブスクリプションモデルの見直しを迫られ、より細分化された課金体系や、利用実態に即した柔軟なプランへの移行が加速すると予想されます。また、技術変更と顧客コミュニケーションの重要性が再認識され、製品管理プロセス全体の見直しが業界全体で求められることになります。

編集コメント

これは単なる値上げのニュースではなく、AI サービスのビジネスモデルが技術進化に追いつけず、根本的な再設計を余儀なくされている「成長の痛み」を示す重要な事例です。開発者や企業は、今後の利用コスト計画を見直す必要があります。

AI の価格設定を追跡してきた人々にとって、4 月は奇妙な月でした。私は主要なラボからの意味のあるパッケージ化と価格設定の動きを随時記録しています。4 月の第 3 週には、その月のメモがページからはみ出し、別のドキュメントに溢れ始めていました。5 つの主要な発表があり、4 つの最大プロバイダーのうち 3 つが関与し、すべてが 3 週間以内に発生し、おおよそ同じ方向を指していました。以下は、発生した順の時系列です。

4 月 4 日。Anthropic は、Claude Pro および Max のサブスクライバーに対し、OpenClaw などのサードパーティ製エージェント・ハーネスがサブスクリプションによって支えられなくなることを、24 時間未満の通知で告知しました。

4 月 9 日。OpenAI は、重度の Codex ユーザーを対象に、新たな月額 100 ドルの「Pro 5x」ティアを立ち上げました。

4 月 13 日。GitHub は「濫用」を理由に、新しい Copilot Pro のトライアルを凍結しました。

4 月 16 日。Anthropic は、同じ入力テキストに対して最大で 35% 多くのトークンを消費する新しいトークナイザーを搭載した Opus 4.7 をリリースしました。

4 月 20 日。GitHub は Copilot Pro、Pro+、および Student プランの新たな個人登録を完全に一時停止し、Pro から Opus モデルを削除しました。

4 月 21 日、Anthropic の価格設定ページから Claude Code がプロプランから静かに削除されました。4 月 23 日には Hacker News で 12 時間にわたって議論が交わされた後、Anthropic はこの変更を撤回しました。

同じく 4 月 23 日、OpenAI が GPT-5.5 をリリースし、API の価格を 100 万トークンあたり 2.50 ドル/15 ドルから 5 ドル/30 ドルへと倍増させました。

世界で最も規模の大きい商用 AI プロバイダー 4 社のうち 3 社が、わずか 3 週間でパニックに陥り、5 つの動きを行いました。これらには共通する糸目があります：

*彼らのサブスクリプションプランの元の設計は、進化し続ける製品機能と利用パターンによって挑戦されています。*

私はこのパターンをより小規模なスケールでも毎週目にします。プロダクトチームが新しいプランをリリースしたいと考えている一方で、営業部門は既存のコホートの特例適用（グランドファーザー）を求め、財務部門はモデルの課金方式を seat 単位からトークン単位へ切り替えたいと考えます。機能へのアクセス権限（エンタイトルメント）を計算するロジックが、推論処理を行う同じサービス内に組み込まれているため、「小さな変更」が四半期にわたる移行プロジェクトへと膨れ上がり、その移行が X 上でのパニック発表へと繋がってしまうのです。4 月に起きたのは、Anthropic、OpenAI、Microsoft の各社で同時にこのパターンが大規模に爆発した出来事でした。

本稿では、何が壊れたのか、なぜ壊れたのか、そして非決定論的なエージェントループの downstream にユニットエコノミクスが存在する場合、どのような適切なファイナンシャルエンジニアリングが求められるのかについて解説します。

この件に関する最も率直な発言は、Anthropic の成長責任者であるアモル・アヴァサレ（Amol Avasare）が 4 月 22 日のXで行ったものです。彼は、Claude Code がなぜプロプランのリストから 12 時間にわたり静かに姿を消したのかを説明しようとしていました。ゆっくりと読んでください…

「昨年 Max をローンチした際、Claude Code は含まれておらず、Cowork も存在せず、数時間稼働するエージェントはまだ一般的ではありませんでした。Max は重いチャット利用のために設計されたもので、それだけのことです。その後、私たちは Claude Code を Max にバンドルし、Opus 4 の登場以降爆発的な人気となりました。Cowork が導入され、長時間非同期で動作するエージェントが日常のワークフローになりました。人々が実際に Claude サブスクリプションを利用する方法は根本的に変化しました。その過程で小規模な調整（週ごとの利用制限やピーク時の厳格な制限など）を行ってきましたが、利用状況は大きく変わり、現在のプランではこの新しい需要に対応しきれていません。」

これは被害拡大防止のために発表された構造的な大改修の宣言です。請求管理システムを一度でも担当したことがあるエンジニアなら、必ず心に響くべきフレーズがあります。「その過程で小規模な調整を行ってきました」という部分です。これらの「小規模な調整」とは、まさに接着テープのようなものです。根本的に製品を正しく規定できないプランに、週ごとの制限やピーク時のスロットリングという接着テープを貼り付けている状態なのです。

サブスクリプションプランは、機械的には権限契約です。顧客が何を、どの制限まで、いつまで行えるかを宣言するものです。もしそのプランがプロダクトがチャットボックスだった時に作成され、かつその権限が推論を実行する同じコードパスに実装されている場合、6 時間自律的に動作するプロダクトを管理するためにそのプランを拡張することはできません。できるのは、それを回避するためのパッチを当てることだけです。それが現在、各プロバイダーが顧客の間に立たされた状態で公に行っていることです。

最も明確な例は、4 月 4 日の OpenClaw によるカットオフですApril 4。OpenClaw はオープンソースのエージェントハネスで、ユーザーが任意の大規模言語モデル（LLM）を独自の自動化ループに接続できるものです。2026 年初頭には GitHub スター数が 10 万を超え、開発者コミュニティが個人向けの月額 200 ドルの Max サブスクリプションを使用して、Claude を経由して自律的なワークロードをルーティングしていました。

Anthropic は金曜日の夜に、太平洋標準時翌日正午から、OpenClaw を含むサードパーティ製のハネスはサブスクリプションクォータを利用できなくなると発表しました。ユーザーは従量課金の「追加利用料」または個別の API キーを通じて支払う必要があります。退出を希望する人については 4 月 17 日まで返金期間が設けられています。

この不可避な事態に至った計算式は残酷です。Aakash Gupta はカットオフの前日、次のように投稿しました:

「月額20ドルの食べ放題ビュッフェはついに閉店しました。1日だけ稼働する単一のOpenClawエージェントが、API換算で1,000ドルから5,000ドルのコストを消費します。これは月額200ドルのMaxサブスクリプションにおける話です。Anthropic社は、サードパーティ製のハーンネル（harness）を経由するすべてのユーザーに対して、この差額を負担していました。」

これは1日あたりユーザーごとに5倍から25倍の補助金に相当します。Anthropic社のClaude Code責任者であるBoris Chernyは、技術的な理由を説明しました。Claude CodeやCoworkといったファーストパーティ製ツールは、プロンプトキャッシュのヒット率を最大化するように設計されています。キャッシュされた入力トークンのコストは標準料金の10%です。一方、サードパーティ製のハーンネルはキャッシュに優しいリクエストパターンに従わないため、すべてのやり取りがフル価格で会話コンテキストの再処理を引き起こします。同じ名目上のトークン量であっても、経済性は桁違いに崩壊してしまいます。

技術的な事実は確かに存在します。しかし、その実行は広く批判され、特に 24 時間未満の通知期間と、既存利用者の権利を認める措置（グランドファーシング）の欠如が問題視されました。OpenClaw の創設者であるピーター・シュタインバーガー氏は 2 月に OpenAI に加入しましたが、6 日後に Anthropic から「不審な活動」の理由で一時利用停止処分を受けました。その後、彼の投稿がバズった数時間後には利用制限が解除されています。カットオフから 4 日後、Anthropic は自社の有料エージェントランタイムである「Claude Managed Agents」をリリースし、OpenClaw と同じカテゴリで競合する製品となりました。公平かどうかは別として、その印象（オプティクス）は「利用ケースを吸収した上で、抜け穴を塞ぐ」というものでした。

技術的な外観よりも重要なのは、金融工学の観点です。Anthropic のスタック内部には、「チャットボックスに入力するユーザー」と「1 日中 18 時間稼働する自律型エージェントを起動するユーザー」を区別できる権限境界が存在しませんでした。両者は同じ OAuth トークンで認証されていました。そのため、ポリシー変更を実装する唯一の方法は、二値的なカットオフ（全停止）でした。非ファーストパーティのツールも、あらゆる利用パターンも、一律に無効化されました。

4 月 20 日はさらに悪化しました。GitHub は Copilot Pro、Pro+、および学生向けプランの新しい登録を完全に一時停止しました。それから 2 日後には、フリープランとチームプランを利用している組織に対しても、セルフサービスによる Copilot Business の新規登録を一時停止しました。残された個人向けの登録経路は無料版 Copilot だけとなりました。既存の契約者については 5 月 20 日まで返金期間が設けられましたが、一つ大きな条件がありました。キャンセルは取り消し不可能であり、一度キャンセルすれば元には戻れないのです。

プロダクト担当バイスプレジデントである Joe Binder のブログ投稿は、あるプロバイダーから見たユニットエコノミクス（単位経済性）の崩壊について、私がこれまでに見た中で最も率直な説明です：

「現在では、数回のリクエストで発生するコストがプラン価格を超えてしまうことが一般的になっています。長時間実行され並列化されたセッションは、もともとプラン構造がサポートするために設計されたものよりもはるかに多くのリソースを消費することが常態化しています。」

Ed Zitron の内部文書に基づく報告によると、今年初め以来 GitHub Copilot を運用する週次コストは倍増している[1]。プラン料金は同じままなのに、基盤となるコストが 2 倍になったのだ。GitHub 自身が推奨したエージェント駆動型ワークフロー（並列エージェントのdispatchのための自社の/fleet コマンドも現在、ユーザーに制限を要請する対象としてリストされている[2]）を加味すると、既存のプラン構造は財政的に持続不可能となる。

表向きには扉を閉ざす一方で、GitHub は裏側でも微妙な変更を行った。Opus モデルはもはや Pro プランでは利用できなくなった。Opus 4.7 は Pro+ に限定され、Opus 4.5 と 4.6 は Pro+ から完全に削除される。さらに新登場の Opus 4.7 は、7.5 倍のプレミアムリクエスト乗数[3]を伴っており、これは Opus 4.6 の 3 倍からの上昇である[4]。ただし、Anthropic の API 価格では両モデルとも百万トークンあたり$5/$25で同一である。

7.5 倍という乗数が「正当化」される理由は、Opus 4.7 が新しいトークナイザーを搭載しており、同じ入力テキストに対して最大 35% 多くのトークンを生成するためである。Anthropic はこの点について価格文書[5]で確認している。つまりレートカード自体は据え置きだが、請求書の内容は異なるのだ。

[1]: https://thenewstack.io/github-copilot-signups-paused/

[2]: https://github.blog/news-insights/company-news/changes-to-github-copilot-individual-plans/

[3]: 乗数 (multiplier)

[4]: https://github.com/orgs/community/discussions/192814

[5]: https://platform.claude.com/docs/en/about-claude/pricing

次の文を注意深く読んでください。35% のインフレはメーターへの静かな変更です。乗数（Copilot の 7.5 倍、Cursor の「API 同等クレジット$20」、Replit のチェックポイントカウンター）をハードコードしたすべての下流ツールは、古い乗数と新しいトークナイザーの現実との差によって、ユーザーに過剰請求しています。これらのツールの一部はこれを検知して調整するでしょうが、他の一部はそうしません。いずれもユーザーに通知していません。

すでに請求書で確認できます。GitHub コミュニティディスカッション#192911 には、自分が選択したこともないモデル（Sonnet 4.5, Haiku 4.5, Gemini 3 Flash）の使用記録がすべて誤った 7.5 倍のレートで請求されているというユーザーの声が文書化されています。幻影モデルの課金です。複合乗数です。これは、単一の真実源を共有しない 3 つの異なるサービスを通じて価格計算ロジックがインラインに重ねて実装されたときに起こることです。

そして 4 月 21 日の午後、claude.com のライブ価格ページが静かに変更され、Pro ティアの Claude Code リストがチェックマークから X に変わりました。サポートドキュメントも同時に、「Pro または Max プランで Claude Code を使用する」から「Max プランのみ」へと変更されました。モバイル版とデスクトップ版の両方でです。プレスリリースもメールも、変更履歴へのエントリーもありません。

Hacker News と Reddit は数時間以内にこれに気づいた。Avasare は X（旧 Twitter）に投稿し、これは「新規プロシューマーの約 2% を対象とした小規模テスト」であり、既存のサブスクライバーには影響がないと主張した。Ed Zitron は明白な点を指摘した。すべての人に対して公開されている価格ページを更新する 2% のテストなど、それはもはや 2% のテストではない。この変更は 24 時間以内に元に戻された。

[Amol Avasare@TheAmolAvasare 明確に言うと、私たちは新規プロシューマーの約 2% を対象とした小規模テストを実行中です。既存の Pro および Max サブスクライバーには影響はありません。George Pu @TheGeorgePu Anthropic はついに Claude Code（Claude Code）を Pro プランから削除しました。

これを必要とする Pro ユーザーは、今すぐ Max に移行する必要があります。

月額 100 ドルが最低ラインです。5 倍の値上げです。

私は Max を 20 倍利用しているので問題ありません。

Pro ユーザーでこれから事態に気づく人に向けてお知らせします。

発表はありません。価格ページの編集のみでした。7:55 AM · Apr 22, 2026 · 6.83M Views1.36K Replies · 684 Reposts · 5.26K Likes](https://x.com/TheAmolAvasare/status/2046724659039932830?s=20)

Sam Altman は、決して見逃すことのない人物として、Avasare のスレッドの下で「ok boomer」と返信した。

実際に重要だったのは、 again、ダメージコントロールの中に埋め込まれていた構造的見直しに関する一文でした。Anthropic は、Claude Code が 20 ドルプランの一部ではなくなる未来を公然とモデル化しています。Avasare 氏はそのように述べています。「ユーザーに素晴らしい体験を提供し続けるための異なる選択肢を検討しています。それらが具体的にどのようなものかはまだ不明です。現在、私たちはそれをテストし、フィードバックを得ている最中です。」これは 2% の実験に関する話ではありません。プラン構造の変更が必要であることを理解し、発表の政治的側面を練り直している企業を見ているのです。

見出しとなる出来事は劇的ですが、その下で進行中のスローモーションのような構造的変化の方がより重要です。同じ月にあった他の 3 つの動きは、すべての主要プロバイダーが同じ結論に収束していることを示しています：エージェントワークロードに対する定額制料金は死滅しました。

OpenAI は、4 月 2 日に Codex の課金方式をメッセージ単位の請求からトークン単位 API スタイルの請求へ変更し（Plus、Pro、Business、Enterprise の新規顧客向け）、その後 4 月 23 日に Edu、Health、Gov、Teachers を含むすべての既存 Enterprise カスタマーに適用しました。現在のレートカードでは、入力トークン、キャッシュされた入力、出力トークンのそれぞれ 100 万単位あたりクレジットを請求し、これは基盤となる API メーターに直接対応しています。

Anthropic は 2 月から 4 月にかけて企業向け契約の再編成を行いました。The Information初報は、この変更を 4 月 14 日に伝えています。以前存在していた、割引されたトークン付与額を含む月額 200 ドルの固定料金プラン（1 セートあたり）はもはやありません。新しい契約では、チャット専用または Claude Code の利用料として 1 セートあたり月額 20 ドルに加え、すべての使用量に対して標準 API レートが適用され、Anthropic があなたの使用量を推定して設定した月額支出コミットメント（達成の有無にかかわらず支払いが必要）も含まれます。以前適用されていた 10〜15% のボリュームディスカウントは廃止されました。Redress Compliance の Fredrik Filipsson 氏によると、大規模な企業ユーザーの請求額は倍増または 3 倍になると見込まれています。

そして 4 月 23 日、OpenAI はGPT-5.5をリリースし、API 料金は 100 万トークンあたり 2.50 ドル/15 ドルから 5 ドル/30 ドルへと倍増しました。彼らは、トークンの効率性を考慮すると「実質的な価格上昇は 20%」であると主張しています。いずれにせよ、同じモデルファミリーにおいて、単一のリリースで headline rate（基本料金）が倍になっています。

これは偶然ではありません。その背後にある数理は至る所で同じです。Anthropic は現在、非常に重いエージェントワークロードを処理しており、2025 年に推論コストが内部予測を 23% 上回りましたその IPO の詳細はこちら。これにより粗利益率は The Information によると約 40% に押し下げられています。OpenAI の API は、10 月の1 分間に 60 億トークンから、3 月には1 分間に 150 億トークンへと増加しました。これは 5 ヶ月間で 2.5 倍の複合的な負荷曲線です。定額制プランはマーケティング上の判断でしたが、トークン単位の課金メーターは、プロバイダーが先送りできない運用上の現実です。

*ここまでお読みいただいた方へ、請求管理を担当するチームのエンジニアにこの投稿を共有してください。彼らは望むと望まざるとに関わらず、まさに今、この物語を生きているのです。

このニュースレターを読むすべての人にとって、最も重要なのは以下の点です。価格変更が不愉快なものになった理由は、プロバイダーが悪意を持っているからではありません。彼らの収益化インフラストラクチャが、製品コードとは切り離された別個の課題として構築されていなかったからです。

「Pro プランからこのユーザーが Claude Code を呼び出せるか」を決定する権限がリクエストハンドラー内に存在する場合、その権限を変更する方法はコードをリリースすることしかありません。現在の 5 時間ウィンドウ内でこのユーザーが消費したトークン数を追跡するメーターが、レート制限を行う同じサービス内に存在する場合、新しいメーターを導入する方法は推論層の再デプロイしかありません。価格とクレジットの変換（Cursor の 20 ドル相当、Copilot のプレミアムリクエスト乗数、Replit の努力ベースのチェックポイント）が定数としてハードコードされている場合、Opus 4.7 のようなトークナイザーの変更に対応するには新しいビルドをプッシュする必要があり、ビルドをプッシュしないツールは静かに過剰課金を行っています。

多くのチームが持っているパターン、そして 4 月に破綻した可能性のあるものは、おおよそ以下のようになります：

ハンドラー内の各行は、将来のパニック声明が起きるのを待っているようなものです。ENABLE_CLAUDE_CODE_FOR_PRO をサインアップの 2% に切り替えて「テスト」をリリースします。SESSION_LIMITS[plan] を変更して「制限強化」をリリースします。異なるトークナイザーを持つ新しいモデルを追加して回帰（regression）をリリースします。すべてのビジネス判断がコード変更になります。すべてのコード変更が課金側の副作用のリスクを抱えます。すべての顧客が価格設定チームのためのベータテスターになります。

壊れないパターンとは、権限（entitlements）、プラン、メーターがリクエストハンドラーによって照会される別々の課金層に存在し、プラン構造そのものがデータとなり、新しいプランが設定情報となり、顧客の祖父条項（grandfathering）がクエリとなり、トークナイザーの変更がメーターに限定され、価格実験がデプロイするエンジニアではなく UI 上で製品マネージャーによって実行されるというものです。

世界で最も大きな 2 つの AI 企業はすでにこれを内部で結論付けています。OpenAI はサラ・コンロン（Sara Conlon）をリーダーとする「財務工学機能（Financial Engineering function）」[https://openai.com/index/beyond-rate-limits/] を創設し、価格設定とパッケージング、インフラストラクチャ、財務自動化、決済の各ポッドに編成しました。Anthropic は 2025 年 6 月、Turo で決済および請求エンジニアリングを担当したシャンムガスンダラム・アラグムトゥ（Shanmugasundaram Alagumuthu）を招聘し、請求プラットフォームチームのリーダーとしました。彼らは、課金において前任者が SRE（サイト信頼性エンジニアリング）に対して行ったことと同じことを実行しています。専任チームと専門分野を持つ生産インフラストラクチャとして扱っているのです。なぜなら、誤った場合の財務コストが、四半期末に企業の S-1 書類上で、そして見出しで明確に見えるようになったからです。

もし AI 推論と課金顧客の両方を含む何かを構築しているなら、次の四半期は直近の四半期と同じような様子になるでしょう。毎週の上限設定が増え、ピーク時のスロットリングが強化され、静かにトークナイザーが切り替わり、Hacker News のスレッドが頂点に達するまで機能を 2% の実験で削除し、その後復元するようなプランが続きます。この変動性は構造的なものです。AI プロバイダーはまだ安定した単価経済を確立していません。彼らは顧客がその変動性を吸収しながら、公の場でそれを模索しているのです。

この期間が終わった時点で賢明に見える企業は、コードを実装することなく変動性を吸収できる収益化レイヤーを持つ企業です。設定変更でワークロードを再メーターできる企業。クエリで既存のカスタマーコホートを継承できる企業。デプロイなしで価格実験を実行できる企業。それは機能ではありません。必要となる一、二四半期前に下されたアーキテクチャ上の決断なのです。

Ramp の CEO エリック・グリマンは、先月トークン追跡ツールをリリースした[1]が、顧客側の現実を要約しました。Ramp の顧客ベース全体における AI 支出は、過去 12 ヶ月間で 13 倍に増加しています[2]。誰もこれをどのように予算化すればよいか知りません。グリマン氏は、4 月 17 日の CNBC の記事[3]で最も難しい問いを指摘しました。「あなたのビジネスモデルが最大限のトークン支出を引き出すことに依存している場合、顧客が AI をより効率的に利用するのを助けるインセンティブはありますか？」この問いこそが、どの AI 企業が IPO（新規株式公開）の機会を生き残るかを決定づけることになります。Anthropic は、S-1 提出書類上の収益ラインが顧客が実際に価値を感じるものと一致するようにするため、トークン単価ベースへの移行を進めています。OpenAI も同様に、より渋々ながら同じ方向へ進んでいます。なぜなら、そうでなければスケールした際にマージンの危機に直面することになるからです。

[1] https://ramp.com/ai-cost-monitoring

[2] https://ramp.com/blog/trillion-dollar-ai-blindspot

[3] https://www.cnbc.com/2026/04/17/ai-tokens-anthropic-openai-nvidia.html

出典：https://ramp.com/blog/trillion-dollar-ai-blindspot

来週、この物語をさらに一層深く掘り下げていきます。3 月に Claude Code に対して提出された単一の GitHub イシュー#41930があります。これはコミュニティのエンジニアたちが Ghidra と MITM（Man-in-the-Middle）プロキシを使用して Claude Code のバイナリを逆解析して作成したもので、数千文字に及びます。そこには、主要な AI プロバイダーのメータリングシステムが実際にどのように機能しているか、その問題点、そしてなぜ以前のプロンプトで特定の単語を入力したことでアカウントから 200 ドル相当の使用クレジットが消えてしまうのかという、最も詳細な公開記録が含まれています。私たちはこれらを一緒に見ていきましょう。キャッシュ無効化、セネチン文字列置換、誤ったレート制限、プロンプトラウティングのバグ、そして本来あるべきメータリングの姿についてです。

それまでは。メーターやエンタイトルメントに触れる何かを構築する場合は、4 月を警告弾として扱ってください。「すべての AI プランが壊れた 4 月」はいつか必ず起こる出来事でした。今問われているのは、あなたの会社が収益化モデルを「曲がるもの」として構築しているのか、「同じように壊れるもの」として構築しているのかということです。

原文を表示

April was a strange month for anyone who’s been tracking AI pricing. I keep a running file of the meaningful packaging and pricing moves from the major labs. By the third week of April my notes for the month had outgrown the page and started spilling into a separate document. Five major announcements, three of the four biggest providers, all in three weeks, all pointing in roughly the same direction. Here is the chronology, in the order it happened.

April 4. Anthropic gives Claude Pro and Max subscribers fewer than 24 hours notice that their subscriptionswill no longer power third-party agent harnesses like OpenClaw.

April 9. OpenAI launches abrand new $100/month “Pro 5x” tier targeting heavy Codex users.

April 13. GitHubfreezes new Copilot Pro trials over “abuse.”

April 16. Anthropicships Opus 4.7 with a new tokenizer that uses up to 35% more tokens for the same input text.

April 20. GitHub pausesnew individual signups for Copilot Pro, Pro+, and Student plans entirely, and removes Opus models from Pro.

April 21. Anthropic’s pricing pagequietly drops Claude Code from Pro. By April 23, after Hacker News spends 12 hours on it, Anthropic walks the change back.

The same April 23, OpenAI ships GPT-5.5 withAPI prices doubled from $2.50/$15 to $5/$30 per million tokens.

Five panicked moves in three weeks, from three of the four biggest commercial AI providers in the world, with one common thread:

*The original design of their subscription plans is being challenged by evolving product capabilities and usage patterns.*

I see this pattern at smaller scale every week. A product team wants to ship a new plan. Sales wants to grandfather a cohort. Finance wants to flip a model from per-seat to per-token. The entitlement that gates the feature is computed inside the same service that does the inference. So the “small change” turns into a quarter-long migration, and the migration turns into a panic announcement on X. What happened in April was that pattern blowing up at the scale at Anthropic, OpenAI, and Microsoft simultaneously.

This post is about what broke, why it broke, and what proper financial engineering looks like when your unit economics live downstream a non-deterministic agent loop.

The most honest line about any of this came from Amol Avasare, Anthropic’s head of growth, onX on April 22, trying to explain why Claude Code had silently disappeared from the Pro plan listing for 12 hours. Read it slowly…

“When we launched Max a year ago, it didn’t include Claude Code, Cowork didn’t exist, and agents that run for hours weren’t a thing. Max was designed for heavy chat usage, that’s it. Since then, we bundled Claude Code into Max and it took off after Opus 4. Cowork landed. Long-running async agents are now everyday workflows. The way people actually use a Claude subscription has changed fundamentally. We’ve made small adjustments along the way (weekly caps, tighter limits at peak), but usage has changed a lot and our current plans weren’t built for this.”

That is a structural overhaul sentence delivered as damage control. The phrase that should land for any engineer who has ever owned a billing system is “small adjustments along the way.” Those small adjustments are the duct tape. They are weekly caps and peak-hour throttles bolted onto plans that fundamentally cannot represent the product they’re supposed to govern.

A subscription plan is, mechanically, an entitlement contract. It declares what a customer can do, with what limits, until when. If the plan was authored when the product was a chat box, and the entitlements live in the same code path that runs inference, you cannot extend that plan to govern a product that runs autonomously for six hours. You can only patch around it. Which is what every provider is now doing in public, with their customers in the middle.

The clearest example was the OpenClaw cutoff onApril 4. OpenClaw is an open-source agent harness that lets users wire any LLM into their own automation loops. By early 2026 it had north of 100,000 GitHub stars and a developer community routing autonomous workloads through Claude using their personal $200/month Max subscriptions.

Anthropic announced on a Friday evening that starting noon Pacific the next day, third-party harnesses including OpenClaw could no longer draw from subscription quotas. Users would have to pay through pay-as-you-go “extra usage” or a separate API key. Refund window through April 17 for anyone who wanted out.

The math that made this inevitable is brutal. Aakash Gupta postedthe day before the cutoff:

“The $20/month all-you-can-eat buffet just closed. A single OpenClaw agent running for one day burns $1,000 to $5,000 in API-equivalent costs. On a $200 Max subscription. Anthropic was eating that difference on every user who routed through a third-party harness.”

That is a 5x to 25x subsidy per user per day. Boris Cherny, Anthropic’s head of Claude Code,explained the technical reason. First-party tools like Claude Code and Cowork are engineered to maximize prompt cache hit rates. Cached input tokens cost 10% of the standard rate. Third-party harnesses don’t follow the cache-friendly request patterns, so every interaction triggers full reprocessing of the conversation context at full price. The economics break by an order of magnitude on the same nominal token volume.

The technical case is real. The execution was widely criticized, specifically due to the sub-24-hour notice and lack of grandfathering. The OpenClaw creator Peter Steinberger, who joined OpenAI in February,got banned from Anthropic six days later for “suspicious activity,” then unbanned within hours after his post went viral. Four days after the cutoff, Anthropiclaunched Claude Managed Agents, its own paid agent runtime competing in the same category as OpenClaw. The optics, fairly or not, were “absorb the use case, then close the loophole.”

The financial engineering point matters more than the optics. There was no entitlement boundary inside Anthropic’s stack that could distinguish “user typing into a chat box” from “user spawning autonomous agents that run 18 hours a day.” The same OAuth token authenticated both. So the only way to ship a policy change was a binary cutoff. Any non-first-party tool, any usage pattern, blanket disabled.

April 20 was worse. GitHub paused new signups forCopilot Pro, Pro+, and Student plans entirely. Two days later they pausednew self-serve Copilot Business signups for organizations on Free and Team plans too. The only remaining individual signup path was free Copilot. Existing subscribers got a refund window through May 20 with one twist. Cancellation is irreversible. You can’t come back.

VP of Product Joe Binder’sblog post is the most candid description of unit economics breaking I’ve seen from any provider:

“It’s now common for a handful of requests to incur costs that exceed the plan price... Long-running, parallelized sessions now regularly consume far more resources than the original plan structure was built to support.”

According to Ed Zitron, citing internal documents,the weekly cost of running GitHub Copilot doubled since the start of the year. Same plan price. Twice the underlying cost. Add the agent-driven workflows GitHub itself encouraged (their own /fleet command for parallel agent dispatch is nowlisted as something they’re asking users to limit) and the existing plan structure becomes financially unsustainable.

While they were closing the front door, GitHub also did something subtle to the back. Opus models are no longer available on the Pro plan. Opus 4.7 is restricted to Pro+. Opus 4.5 and 4.6 will be removed from Pro+ entirely. And the new Opus 4.7 carries a 7.5x premium request multiplier,up from 3x for Opus 4.6, even though Anthropic’s API price for the two models is identical at $5/$25 per million tokens.

The reason 7.5x is “justified” is that Opus 4.7 ships with a new tokenizer that produces up to 35% more tokens for the same input text. Anthropicconfirms this in their pricing documentation. So the rate card is flat. The invoice is not.

Now read the next sentence carefully. The 35% inflation is a silent change to the meter. Every downstream tool that hardcoded a multiplier (Copilot’s 7.5x, Cursor’s “$20 of API-equivalent credit,” Replit’s checkpoint counter) is now overcharging users by the difference between their old multiplier and the new tokenizer reality. Some of those tools will catch it and adjust. Others won’t. None of them notified users.

You can already see it in the bills. GitHub community discussion#192911 documents users seeing usage records for models they never selected (Sonnet 4.5, Haiku 4.5, Gemini 3 Flash) all billed at the wrong 7.5x rate. Phantom model billing. Compound multipliers. This is what happens when pricing logic is layered inline through three different services that don’t share a single source of truth.

Then on the afternoon of April 21,the live pricing page on claude.com silently changed the Pro tier’s Claude Code listing from a checkmark to an X. Support docs changed at the same time, from “Using Claude Code with your Pro or Max plan” to just “Max plan.” Mobile and desktop both. No press release, no email, no changelog entry.

Hacker News and Reddit caught it within hours. Avasare posted on X claiming it wasa “small test on ~2% of new prosumer signups” and that existing subscribers were unaffected. Ed Zitron pointed out the obvious. A 2% test that updates the public-facing pricing page for everyone is not a 2% test. The change reverted within 24 hours.

[Amol Avasare@TheAmolAvasareFor clarity, we're running a small test on ~2% of new prosumer signups. Existing Pro and Max subscribers aren't affected.George Pu @TheGeorgePuAnthropic just pulled Claude Code from the Pro plan.

Pro users wanting it need Max now.

$100/month minimum. 5x jump.

I'm on Max 20x so I'm fine.

Flagging for anyone on Pro who's about to find out.

No announcement. Just a pricing page edit.7:55 AM · Apr 22, 2026 · 6.83M Views1.36K Replies · 684 Reposts · 5.26K Likes](https://x.com/TheAmolAvasare/status/2046724659039932830?s=20)Sam Altman, never one to miss a moment,replied “ok boomer” under Avasare’s thread.

The part that actually mattered, again, was the structural overhaul sentence buried inside the damage control. Anthropic is openly modeling a future where Claude Code is no longer part of the $20 plan. Avasare said as much. “We’re looking at different options to keep delivering a great experience for users. We don’t know exactly what those look like yet. That’s what we’re testing and getting feedback on right now.” That is not about a 2% experiment. We’re looking ata company that knows the plan structure has to change and is workshopping the politics of the announcement.

The headline incidents are dramatic, but the slow-motion structural shift underneath them is more important. Three other moves the same month show every major provider converging on the same conclusion: flat-rate pricing on agentic workloads is dead.

OpenAI moved Codex from per-message billing to per-token API-style billing onApril 2 for new Plus, Pro, Business, and Enterprise customers, then extended it toall existing Enterprise customers on April 23 including Edu, Health, Gov, and Teachers. The rate card now bills credits per million input, cached input, and output tokens, mapping directly to the underlying API meter.

Anthropic restructured enterprise contracts in February through April. The Informationfirst reported the shift on April 14. The previous flat $200/seat/month tier with a discounted token allowance is gone. New contracts are $20/seat for chat-only or Claude Code, plus standard API rates on all consumption, plus a monthly spending commitment based on Anthropic’s estimate of your usage that you pay whether you hit it or not. Volume discounts of 10 to 15% that previously applied are gone. Heavy enterprise users will seebills double or triple, per Fredrik Filipsson at Redress Compliance.

And onApril 23 OpenAI shipped GPT-5.5 with the API price doubled from $2.50/$15 to $5/$30 per million tokens. They claim a “20% effective price increase” once token efficiency is factored in. Either way, the headline rate doubled, on the same model family, in a single release.

This is not a coincidence. The math underneath it is the same everywhere. Anthropic now processes such heavy agentic workloads thatits inference costs surged 23% beyond internal projections in 2025, pushing gross margin to roughly 40% per The Information. OpenAI’s APIs went from6 billion tokens per minute in October to15 billion per minute by March. That is a 2.5x compounding load curve in five months. Flat-rate plans were a marketing decision. Per-token meters are an operational reality the providers cannot defer.

*If you’ve gotten this far, share this post with the engineer on your team who owns billing. They are about to live this story whether they want to or not.*

Here is the part that should matter most to anyone reading this newsletter. The reason the pricing changes have been ugly is not because the providers are evil. It is because their monetization infrastructure was not built as a separate concern from their product code.

When the entitlement that decides “can this user invoke Claude Code from a Pro plan” lives inside the request handler, the only way to change that entitlement is to ship code. When the meter that tracks “how many tokens has this user consumed in the current 5-hour window” lives in the same service that does the rate limiting, the only way to ship a new meter is to redeploy the inference layer. When the price-to-credit conversion (Cursor’s $20-equivalent, Copilot’s premium request multipliers, Replit’s effort-based checkpoints) is hardcoded as constants, the only way to handle a tokenizer change like Opus 4.7 is to push a new build, and any tool that doesn’t push a build is silently overcharging.

The pattern most teams have, and the one that may broke for in April, looks roughly like this:

Every line of that handler is a future panicked announcement waiting to happen. Toggle ENABLE_CLAUDE_CODE_FOR_PRO for 2% of signups and you ship a “test.” Change SESSION_LIMITS[plan] and you ship a “tightening.” Add a new model with a different tokenizer and you ship a regression. Every business decision becomes a code change. Every code change risks a billing side effect. Every customer becomes a beta tester for the pricing team.

The pattern that doesn’t break is one where entitlements, plans, and meters live in a separate monetization layer that the request handler queries. The plan structure itself becomes data. New plans become configuration. Customer grandfathering becomes a query. Tokenizer changes are isolated to the meter. Price experiments are run by product managers in a UI, not engineers in a deploy.

The two biggest AI companies in the world have already concluded this internally. OpenAI created aFinancial Engineering function led by Sara Conlon, organized into pods for Pricing & Packaging, Infrastructure, Financial Automation, and Payments. Anthropic hired Shanmugasundaram Alagumuthu in June 2025 to lead its Billing Platform team, after he ran payments and billing engineering at Turo. They are doing for monetization what their predecessors did for SRE. Treating it as production infrastructure with a dedicated team and a dedicated discipline. Because the financial cost of getting it wrong is now visible at quarter-end, on the company’s S-1, and in the headlines.

If you build anything that involves AI inference and a paying customer, the next quarter is going to look like the last one. There will be more weekly caps. More peak-hour throttles. More silent tokenizer shifts. More plans that drop features in 2% experiments and restore them after the Hacker News thread crests. The volatility is structural. AI providers do not yet have stable unit economics. They are figuring it out in public, with their customers absorbing the variance.

The companies that look smart at the end of this period will be the ones whose monetization layer can absorb the volatility without shipping code. The ones who can re-meter a workload in a config change. The ones who can grandfather a customer cohort in a query. The ones who can run a pricing experiment without a deploy. That is not a feature. That is an architectural decision made one or two quarters before you need it.

Ramp’s CEO Eric Glyman, whoshipped a token-tracking tool last month, summed up the customer-side reality. AI spend across Ramp’s customer basegrew 13x in 12 months. Nobody knows how to budget for it. Glyman pointed out the hardest question ina CNBC piece on April 17. If your business model depends on extracting maximum token spend, do you have any incentive to help customers use AI more efficiently? That question is going to define which AI companies survive the IPO window. Anthropic is moving toward per-token because it wants its revenue line on the S-1 to match what customers actually value. OpenAI is moving the same direction at the same time, more reluctantly, because the alternative is a margin crisis at scale.

Next week I’m going to take this story one layer deeper. There’s a single GitHub issue,#41930, filed against Claude Code in March. It runs to thousands of words and was written by community engineers who reverse-engineered the Claude Code binary using Ghidra and a MITM proxy. Inside it is the most detailed public account of how a major AI provider’s metering system actually works, what’s wrong with it, and why $200 of extra usage credits can vanish from your account because you typed certain words in a previous prompt. We’ll go through it together. Cache invalidation, sentinel string replacement, false rate limits, prompt routing bugs, and what proper metering looks like instead.

Until then. If you build anything that touches a meter or an entitlement, treat April as the warning shot. The April every AI plan broke was always going to happen. The question now is whether your company is building monetization that bends, or that breaks the same way.

この記事をシェア

TechCrunch AI2026年7月5日 00:51

ミストラル AI とは？OpenAI の競合企業に関する全知識

The Zvi重要度42026年7月3日 22:12

Fable #6：王の帰還

KDnuggets2026年7月3日 21:00

Python で Claude API を使い始めるガイド

今日のまとめ

AI日報で今日の重要ニュースをまとめ読み

ニュース一覧に戻る元記事を読む