The Decoder · March 12, 2026, 20:54 · about 1 min read

Meta unveils four generations of custom AI chips to cut inference costs for billions of users

#Custom silicon #AI inference optimization #Low-precision compute (MX4/MX8) #Meta MTIA #Infrastructure cost reduction

TL;DR

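Meta has announced four generations of its custom AI chip series, MTIA, co-developed with Broadcom, aiming to cut inference costs and speed up AI deployment across its very large user base.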

AI deep analysis · March 12, 2026, 21:40


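Key points

Generation-by-generation MTIA roadmap and production status

MTIA 300 is already in production, MTIA 400 is being rolled out to data centers, and MTIA 450 and 500 are scheduled for mass production in 2027.

GenAI inference optimization and hardware specifications

Support for the low-precision MX4 and MX8 formats, together with large increases in HBM bandwidth and capacity, is aimed at inference performance beyond that of existing commercial products.

Compatibility with the standard software stack

The chips support PyTorch, vLLM, and Triton, so existing models can be ported without special adaptations and run alongside GPUs.

Multi-vendor infrastructure strategy

Meta pairs its in-house chips with AMD and Nvidia GPUs, avoiding vendor lock-in while balancing scalability and cost.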


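Impact analysis

The announcement is a clear sign that large platform companies intend to rebuild the cost structure of AI inference from the ground up. Combining in-house chips with a standard software stack lowers the adoption barrier for developers while reducing infrastructure dependence on outside suppliers, which could put new price pressure on the cloud market and on GPU vendors. Over the long term, it may mark an important turning point in the contest for infrastructure dominance that shapes how fast AI services spread and how profitable they are.

Editorial comment

Securing software compatibility for in-house chips is a key step in moving away from dependence on a single hardware supplier. If the promised reduction in inference cost materializes, it will feed directly into the revenue model and adoption pace of large-scale AI services.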


Meta has announced four new generations of custom AI chips focused on inference, part of its effort to reduce its reliance on GPU makers such as Nvidia and AMD.

This article, "Meta unveils four generations of custom AI chips to cut inference costs for billions of users", was first published on The Decoder.


Meta has unveiled four new generations of custom AI chips—MTIA 300, 400, 450, and 500—designed to make AI cheaper to run across its platforms.

The chips are being developed in partnership with Broadcom and are built to make AI applications more cost-effective for the billions of users on Meta's platforms. Meta says it's following a roughly six-month development cycle per chip generation. From MTIA 300 to 500, memory bandwidth (HBM) increases by a factor of 4.5 and computing power jumps 25x.
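The two scaling factors quoted above can be checked against the figures in the article's spec table. The short sketch below does that arithmetic. It assumes the 25x figure compares MTIA 300's FP8/MX8 peak with the 30 PFLOPs MX4 figure, read here as belonging to MTIA 500; the article does not state which numbers Meta used.

```python
# Figures copied from the article's spec table.
hbm_bandwidth_tbs = {"MTIA 300": 6.1, "MTIA 500": 27.6}

# Assumption: the 25x compares MTIA 300's FP8/MX8 peak with the
# 30 PFLOPs MX4 figure, taken here to be MTIA 500's.
mtia300_fp8_pflops = 1.2
mtia500_mx4_pflops = 30.0

bandwidth_gain = hbm_bandwidth_tbs["MTIA 500"] / hbm_bandwidth_tbs["MTIA 300"]
compute_gain = mtia500_mx4_pflops / mtia300_fp8_pflops

print(f"HBM bandwidth gain: {bandwidth_gain:.2f}x")
print(f"Peak compute gain:  {compute_gain:.0f}x")
```

Under that reading, both results line up with the quoted factors of 4.5 and 25. Part of the compute gain then comes from moving to a lower-precision format rather than from raw silicon alone.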

MTIA 300 is optimized for ranking and recommendation models (R&R) and is already in production, according to Meta. MTIA 400 is the first generation that Meta says can compete with leading commercial products on raw performance. A rack of 72 chips forms a single scale-up domain. MTIA 400 has completed lab testing and is currently being rolled out to data centers.

MTIA 450 and 500 target generative AI inference

MTIA 450 and 500 are specifically optimized for generative AI inference. MTIA 450 doubles the HBM bandwidth compared to MTIA 400, outperforming existing commercial products, according to Meta. The chips support low-precision data formats like MX4 and MX8, which cut the computing power needed for inference without significantly hurting model quality. MTIA 500 adds another 50 percent HBM bandwidth and up to 80 percent more HBM capacity. Both chips are scheduled for mass production in 2027.
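The article does not say how Meta's MX4 and MX8 formats are encoded. As a rough illustration of why block-scaled low-precision formats reduce memory and compute, the sketch below assumes the layout of the OCP Microscaling MXFP4 format: blocks of 32 values share one 8-bit power-of-two scale, and each value is stored as a 4-bit E2M1 number. The function names are hypothetical and the rounding is simplified.

```python
import math

# Magnitudes representable by a 4-bit E2M1 element (sign is stored separately).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_EMAX = 2     # largest E2M1 exponent: 6.0 = 1.5 * 2**2
BLOCK_SIZE = 32   # number of elements sharing one scale (assumed, per OCP MX)
SCALE_BITS = 8    # width of the shared power-of-two scale (assumed, per OCP MX)


def quantize_block(values):
    """Return (shared_exponent, element_values) for one block of floats."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0.0] * len(values)
    # Pick the scale so the largest value lands in the top E2M1 binade.
    shared_exp = math.floor(math.log2(max_abs)) - E2M1_EMAX
    scale = 2.0 ** shared_exp
    elements = []
    for v in values:
        # Clamp to the largest representable magnitude, then snap to the
        # nearest grid point (ties go to the smaller magnitude).
        mag = min(abs(v) / scale, E2M1_GRID[-1])
        nearest = min(E2M1_GRID, key=lambda g: abs(g - mag))
        elements.append(math.copysign(nearest, v))
    return shared_exp, elements


def dequantize_block(shared_exp, elements):
    """Reconstruct approximate floats from a quantized block."""
    scale = 2.0 ** shared_exp
    return [e * scale for e in elements]


def bits_per_element(element_bits):
    """Average storage per value, including the amortized shared scale."""
    return element_bits + SCALE_BITS / BLOCK_SIZE


block = [1.0, -2.0, 3.0, 0.0] + [0.5] * 28
exp, elems = quantize_block(block)
restored = dequantize_block(exp, elems)
```

Under these assumptions a 4-bit element costs 4.25 bits on average, against 16 bits for BF16, so weights take roughly a quarter of the storage and memory traffic. Values that fall between grid points are rounded, which is the source of the quality loss the article describes as not significant.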

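| Metric | MTIA 300 | MTIA 400 | MTIA 450 | MTIA 500 |
| --- | --- | --- | --- | --- |
| Workload Focus | R&R Training | General | GenAI Inference | GenAI Inference |
| Module TDP | 800 W | 1200 W | 1400 W | 1700 W |
| HBM Bandwidth | 6.1 TB/s | 9.2 TB/s | 18.4 TB/s | 27.6 TB/s |
| FP8/MX8 Performance | 1.2 PFLOPs | 6 PFLOPs | 7 PFLOPs | 10 PFLOPs |
| BF16 Performance | 0.6 PFLOPs | 3 PFLOPs | 3.5 PFLOPs | 5 PFLOPs |

The remaining rows list fewer than four values, so their column assignment is not shown; values appear in their original order.

HBM Capacity: 216 GB, 288 GB, 384-512 GB
MX4 Performance: 12 PFLOPs, 21 PFLOPs, 30 PFLOPs
Scale-up domain size: (no values listed)
Scale-up network (unidirectional bandwidth*): 1 TB/s, 1.2 TB/s
Scale-out network (unidirectional bandwidth*): 200 GB/s**, 100 GB/s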
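One way to read the spec figures is to relate peak compute to memory bandwidth and to power. The sketch below derives two such ratios from the rows that list a value for all four chips. The helper names are hypothetical, and the ratios are back-of-the-envelope indicators computed from peak numbers, not measured performance.

```python
# Per-module figures copied from the spec rows that list all four chips.
specs = {
    "MTIA 300": {"tdp_w": 800, "hbm_tbs": 6.1, "fp8_pflops": 1.2},
    "MTIA 400": {"tdp_w": 1200, "hbm_tbs": 9.2, "fp8_pflops": 6.0},
    "MTIA 450": {"tdp_w": 1400, "hbm_tbs": 18.4, "fp8_pflops": 7.0},
    "MTIA 500": {"tdp_w": 1700, "hbm_tbs": 27.6, "fp8_pflops": 10.0},
}


def flops_per_byte(chip):
    """Peak FP8/MX8 FLOPs available per byte of HBM traffic."""
    s = specs[chip]
    return (s["fp8_pflops"] * 1e15) / (s["hbm_tbs"] * 1e12)


def tflops_per_watt(chip):
    """Peak FP8/MX8 TFLOPs per watt of module TDP."""
    s = specs[chip]
    return s["fp8_pflops"] * 1000 / s["tdp_w"]


for chip in specs:
    print(f"{chip}: {flops_per_byte(chip):6.1f} FLOPs/byte, "
          f"{tflops_per_watt(chip):.2f} TFLOPs/W")
```

The FLOPs-per-byte ratio falls from MTIA 400 to MTIA 450 and 500, meaning bandwidth grows faster than compute in those generations. That is consistent with the article's description of them as tuned for generative AI inference, a workload that is often limited by memory bandwidth rather than arithmetic.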

On the software side, Meta built the chips around industry standards like PyTorch, vLLM, and Triton. Developers can port existing models to MTIA without special adaptations and run them on GPUs and MTIA at the same time. More technical details are available on Meta's blog.

Meta also continues to work with AMD and Nvidia for GPUs. In early February 2026, Meta announced a billion-dollar deal with AMD to provide up to six gigawatts of AMD Instinct GPU computing power for Meta's AI workloads.

