TLDR AI · May 7, 2026, 09:00 · approx. 6 min read

NVIDIA Spectrum-X: The AI-Native Open Ethernet Fabric Sets the Standard for Gigascale AI, Adds MRC

#NVIDIA Spectrum-X #RDMA #AI Infrastructure #GPU Clustering

TL;DR

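NVIDIA has implemented MRC on Spectrum-X, its platform for large-scale AI training networks. MRC is a new protocol that spreads the traffic of a single connection across multiple paths to improve throughput and availability.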

AI Deep Analysis · May 7, 2026, 23:08

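Key Points

Introduction of the MRC protocol

The Multipath Reliable Connection (MRC) protocol, which distributes the traffic of a single RDMA connection across multiple network paths, has been implemented.

Optimizing large-scale AI training

By balancing load across all available paths, MRC maximizes GPU utilization and improves the throughput and availability of scalable AI training fabrics.

Simplified operations

Administrators gain fine-grained visibility into and control over traffic paths, which speeds up troubleshooting and operations in large-scale environments.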

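Impact Analysis

This technical advance matters for training large AI models on thousands to tens of thousands of GPUs, where it removes network bottlenecks and lets hardware resources be used to the fullest. Because it also reduces the operational burden of increasingly complex large-scale fabrics and directly helps shorten development cycles, it is likely to cement its position as a standard protocol in the AI infrastructure industry.

Editorial Comment

Implementing MRC, which safeguards network reliability and efficiency in large-scale AI training, is an important milestone in next-generation data center design.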

The race to build the world’s most powerful AI factories demands networking that keeps pace with the ambitions of AI itself.

NVIDIA Spectrum-X Ethernet scale-out infrastructure stands at the forefront of that race as the most advanced AI networking technology available today, deployed by industry leaders who can’t afford to compromise on performance, resilience or scale.

That includes OpenAI, Microsoft and Oracle.

Companies including NVIDIA, Microsoft and OpenAI have demonstrated industry leadership by introducing Multipath Reliable Connection (MRC), an RDMA transport protocol. MRC enables a single RDMA connection to distribute traffic across multiple network paths, improving throughput, load balancing and availability for large-scale AI training fabrics.

Think of it as replacing a single-lane road spanning a town with a cleverly laid-out street grid system paired with an on-the-fly traffic app, enabling drivers to reroute around slowdowns and road closures.
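To make the idea of one connection using many paths concrete, here is a minimal toy sketch in Python. It is not the MRC specification: the function names (`spray`, `deliver`, `reassemble`), the round-robin policy, and the shuffle used to imitate unequal path delays are all assumptions made for illustration. It only shows that per-packet sequence numbers let a receiver restore order when packets of a single connection arrive over different paths.

```python
import random
from itertools import cycle


def spray(payloads, num_paths):
    """Tag each payload with a sequence number and assign a path round-robin.

    Returns a list of (sequence_number, path_index, payload) tuples.
    """
    paths = cycle(range(num_paths))
    return [(seq, next(paths), data) for seq, data in enumerate(payloads)]


def deliver(packets, seed=0):
    """Imitate unequal path delays by shuffling the arrival order."""
    arrived = list(packets)
    random.Random(seed).shuffle(arrived)
    return arrived


def reassemble(arrived):
    """Restore the original order using only the sequence numbers."""
    ordered = sorted(arrived, key=lambda packet: packet[0])
    return [data for _seq, _path, data in ordered]


message = [f"chunk-{i}" for i in range(8)]
packets = spray(message, num_paths=4)
print(sorted({path for _seq, path, _data in packets}))  # [0, 1, 2, 3]
print(reassemble(deliver(packets)) == message)          # True
```

In this toy model the receiver sorts in software; the article describes MRC as running natively on the NIC and switch hardware.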

“Deploying MRC in the Blackwell generation was very successful and was made possible by a strong collaboration with NVIDIA,” said Sachin Katti, head of industrial compute at OpenAI. “MRC’s end-to-end approach enabled us to avoid much of the typical network-related slowdowns and interruptions and maintain the efficiency of frontier training runs at scale.”

In addition, Microsoft and NVIDIA have a longstanding collaboration focused on advancing the infrastructure required for the next generation of AI. Microsoft’s Fairwater and Oracle Cloud Infrastructure’s (OCI) Abilene data center, two of the largest AI factories purpose-built for training and deploying leading-edge frontier LLMs, rely on MRC to deliver on performance, scale and efficiency requirements. NVIDIA Spectrum-X Ethernet is suited for this environment, helping provide the network foundation needed to run large-scale AI models and applications with confidence.

MRC was proven first in production, with performance optimized on NVIDIA Spectrum-X Ethernet hardware, and has now been released as an open specification through the Open Compute Project. It demonstrates the power of the Spectrum-X Ethernet platform: purpose-built hardware, deep telemetry and intelligent fabric control working together to take a new protocol (a set of rules that controls how data moves between two systems across a network) from concept to gigascale AI production.

MRC delivers high levels of GPU utilization by load-balancing traffic across all available paths, enabling every GPU to get the bandwidth it needs throughout a training run. It sustains high bandwidth even under congestion by dynamically avoiding overloaded paths in real time.
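The congestion-avoidance behavior described above can be sketched as a simple "least-loaded path" policy. This is an illustrative assumption, not the algorithm MRC or Spectrum-X actually uses: the function names, the byte-count load metric, and the packet sizes are all hypothetical.

```python
def pick_path(load):
    """Return the index of the path with the fewest outstanding bytes.

    Ties go to the lowest index, because min() keeps the first minimum.
    """
    return min(range(len(load)), key=lambda i: load[i])


def send_all(packet_sizes, initial_load):
    """Assign each packet to the currently least-loaded path.

    Returns (final per-path load, path chosen for each packet).
    """
    load = list(initial_load)
    assignment = []
    for size in packet_sizes:
        path = pick_path(load)
        load[path] += size
        assignment.append(path)
    return load, assignment


# Hypothetical scenario: path 2 already carries a 9000-byte backlog.
final_load, assignment = send_all([1500] * 6, initial_load=[0, 0, 9000, 0])
print(assignment)  # [0, 1, 3, 0, 1, 3]
print(final_load)  # [3000, 3000, 9000, 3000]
```

The congested path receives no new traffic until the others catch up, which is the effect the paragraph describes, here reduced to a few lines of software.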

When data loss occurs, intelligent retransmission enables rapid, precise recovery. This minimizes the impact of short-lived interruptions on long-running jobs and helps avoid GPU idle time.
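The article does not say how MRC decides what to resend. One common way to get precise recovery is selective retransmission, where only the packets that were actually lost are sent again. The sketch below assumes that approach and compares it with a go-back-N style resend; all names are hypothetical.

```python
def packets_to_retransmit(sent, received_seqs):
    """Selective recovery: resend only the sequence numbers that never arrived."""
    missing = sorted(set(range(len(sent))) - set(received_seqs))
    return [(seq, sent[seq]) for seq in missing]


def go_back_n_count(sent, received_seqs):
    """For comparison: resend everything from the first missing packet onward."""
    missing = set(range(len(sent))) - set(received_seqs)
    if not missing:
        return 0
    return len(sent) - min(missing)


sent = ["a", "b", "c", "d", "e"]
received = {0, 1, 3, 4}  # packet 2 was lost in transit
print(packets_to_retransmit(sent, received))  # [(2, 'c')]
print(go_back_n_count(sent, received))        # 3
```

With one lost packet out of five, the selective approach resends one packet where go-back-N would resend three, which is why precise recovery shortens the stall seen by a long-running job.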

Administrators also gain fine-grained visibility and control over traffic paths, simplifying operations and accelerating troubleshooting at scale.

MRC, deployed on Spectrum-X Ethernet, is optimized and engineered for resilience at massive scale. Its failure bypass technology can — in just microseconds — detect a network path failure and reroute traffic automatically in hardware.

This failure bypass technology matters for AI training clusters where thousands of GPUs must stay synchronized, as even a brief network disruption can slow or interrupt an entire training job. Spectrum-X Ethernet prevents that by responding at hardware speed, keeping traffic flowing along precise pathways across gigascale AI fabrics.
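As a rough software analogy for failure bypass, the sketch below keeps a set of healthy paths and skips any path marked as failed. The `PathSet` class and its methods are hypothetical. According to the article, the real detection and rerouting happen in hardware within microseconds, which this toy model does not attempt to reproduce.

```python
class PathSet:
    """Round-robin path selection that skips paths marked as failed."""

    def __init__(self, num_paths):
        self.num_paths = num_paths
        self.healthy = set(range(num_paths))
        self._next = 0

    def mark_failed(self, path):
        """Remove a path from rotation once a failure is detected."""
        self.healthy.discard(path)

    def next_path(self):
        """Return the next healthy path in round-robin order."""
        if not self.healthy:
            raise RuntimeError("no healthy paths remain")
        # At most num_paths probes are needed to find a healthy path.
        for _ in range(self.num_paths):
            candidate = self._next
            self._next = (self._next + 1) % self.num_paths
            if candidate in self.healthy:
                return candidate


paths = PathSet(4)
print([paths.next_path() for _ in range(4)])  # [0, 1, 2, 3]
paths.mark_failed(1)
print([paths.next_path() for _ in range(6)])  # [0, 2, 3, 0, 2, 3]
```

Traffic keeps flowing over the remaining paths without the connection being torn down, which is the property that keeps synchronized GPUs from stalling.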

Another innovation key to achieving gigascale AI factories is multiplanar network designs, which OpenAI deploys with Spectrum-X Ethernet in conjunction with MRC. A multiplane network consists of multiple independent network fabrics, or planes, with each providing an alternate communication path between GPUs.

The NVIDIA Spectrum-X Multiplane capability enhances this network architecture by supporting hardware-accelerated load balancing across the planes, boosting resiliency and scale without sacrificing performance. This keeps latencies predictably low while scaling to hundreds of thousands of GPUs.
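A back-of-the-envelope model shows why independent planes help resilience: losing one plane removes only its share of capacity, and the remaining planes absorb the traffic. The plane count and per-plane bandwidth below are made-up numbers for illustration, not Spectrum-X figures.

```python
def usable_bandwidth_gbps(num_planes, per_plane_gbps, failed_planes=0):
    """Aggregate bandwidth still available after some planes fail."""
    if not 0 <= failed_planes <= num_planes:
        raise ValueError("failed_planes must be between 0 and num_planes")
    return (num_planes - failed_planes) * per_plane_gbps


def per_plane_share_gbps(total_traffic_gbps, num_planes, failed_planes=0):
    """Traffic each healthy plane carries when load is balanced evenly."""
    healthy = num_planes - failed_planes
    if healthy <= 0:
        raise ValueError("at least one plane must be healthy")
    return total_traffic_gbps / healthy


# Hypothetical example: 4 planes at 100 Gb/s each.
print(usable_bandwidth_gbps(4, 100))                   # 400
print(usable_bandwidth_gbps(4, 100, failed_planes=1))  # 300
print(per_plane_share_gbps(300, 4, failed_planes=1))   # 100.0
```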

With Spectrum-X Ethernet, customers are provided with a choice of RDMA transport models. Both Spectrum-X Ethernet Adaptive RDMA and MRC protocols, as well as other custom protocols, run natively across NVIDIA ConnectX SuperNICs and Spectrum-X Ethernet switches and support multiplanar network designs at gigascale.

In this way, the Spectrum-X Ethernet hardware and software infrastructure that powers today’s largest AI clusters gives customers the flexibility to choose the right transport for their workload.

The MRC transport protocol is the latest example of how the industry is using Spectrum-X Ethernet as a flexible, composable platform that integrates across the full breadth of modern AI infrastructure.

As AI factories continue to scale, the network must do more than move data quickly. It must be intelligent, resilient and based on open standards. NVIDIA Spectrum-X Ethernet delivers on all three, and with MRC, it continues to set the standard for advanced AI networking.

NVIDIA collaborated on MRC development with AMD, Broadcom, Intel, Microsoft and OpenAI.

Learn more about NVIDIA Spectrum-X Ethernet on the webpage, datasheet and technical whitepaper.

See notice regarding software product information.
