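NVIDIA Developer Blog · June 2, 2026, 07:00 · About a 13-minute read

Run Local AI Agents on NVIDIA DGX Spark with Faster Models and Multi-Node Clustering

#Local AI #NVIDIA DGX Spark #NemoClaw #Qwen3.6 #Multi-node cluster

TL;DR

With a new software release for DGX Spark, NVIDIA has greatly simplified building and running local AI agents, and it has strengthened multi-node cluster support for developers who prioritize security and privacy.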

AI in-depth analysis · June 11, 2026, 23:07


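Key points

A new execution environment for local AI agents

Together, NVIDIA DGX Spark and NemoClaw turn model selection and backend configuration, which could take the better part of a day, into an out-of-box experience (OOBE) that is ready to run within minutes of unboxing.

Security and cost optimization

Running locally removes the cloud dependency: sensitive information stays on the device, which protects privacy, and there are no per-token costs.

Better scalability and model performance

In addition to support for high-performance models such as Qwen3.6, a guided multi-node cluster setup for team-scale use extends capacity beyond a single device.

Faster initial setup on DGX Spark

In the June 2026 release, over-the-air (OTA) updates are no longer installed by default during initial setup, which shortens the time to reach the Ubuntu desktop.

NemoClaw integration and OpenShell security

NemoClaw packages open models, an agent harness, and NVIDIA OpenShell, a secure execution environment. Together they add access controls and privacy protections to autonomous agent operation.

A seamless NemoClaw install flow

When installing NemoClaw on DGX Spark, the desktop install flow covers the path from OOBE completion through model download to first agent launch.

Execution environment for local AI agents

NVIDIA DGX Spark makes it possible to build local AI agents backed by fast models and multi-node clustering.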


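Impact analysis

This announcement is an important turning point that accelerates the shift toward on-premises and local deployment of AI agents. By removing the barrier of complex infrastructure setup, it makes autonomous AI systems a realistic option, particularly for companies with strict security requirements and for developers who want to limit running costs.

Editorial comment

Being "ready to run in minutes" sharply lowers the infrastructure difficulty that has been a bottleneck in AI agent development. This is a landmark update that clarifies the path to practical use.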


The rise of autonomous, long-running AI agents has introduced a new class of compute demand, namely tasks that maintain large context windows, spawn concurrent subagents, and iterate continuously without cloud dependency. Security and privacy concerns are also accelerating the shift toward local agents.

By running autonomous agents on hardware they own, with NVIDIA NemoClaw orchestrating execution, developers can keep sensitive context on-device, retain direct control over what an agent can access, and eliminate per-token costs.

NVIDIA DGX Spark is designed to build and run autonomous agents locally. At Computex 2026, NVIDIA is making it significantly easier to get there, introducing a streamlined path from unboxing to running AI agents in minutes (excluding initial model download, which depends on network speed). There are also model performance improvements with Qwen3.6 and a guided multi-node cluster setup for teams that need to scale beyond a single device.

This post will cover what these updates mean for developers building agentic AI systems, including how to install NVIDIA NemoClaw, what it sets up, and how to build and run your first agent with OpenClaw on DGX Spark.

Prerequisites

Active internet connection for the initial model download

Familiarity with a terminal for optional configuration steps

From unboxing to running a local agent

Getting a local AI agent running has historically involved sourcing the right model, configuring an inference backend, installing a runtime, and wiring them together. That process could take the better part of a day even for experienced developers. The new streamlined NemoClaw installation path changes that.

For new systems, the experience begins with unboxing and first-time setup of DGX Spark. The latest version of the DGX Spark system software, the June 2026 release, delivers the most streamlined out-of-box experience (OOBE) yet so users can reach local agents faster. With this release, over-the-air updates are no longer installed by default during initial setup, reducing setup time and getting users to the Ubuntu desktop sooner.

NemoClaw is an open source blueprint that packages three things into a single install: open models, an agent harness such as Hermes Agent or OpenClaw, and the NVIDIA OpenShell runtime. OpenShell is a secure, sandboxed execution environment designed for running autonomous agents more safely. It adds access controls, privacy protections, and operational guardrails to the agent loop. Combined with on-device inference, this gives developers a stronger default security and privacy posture for agentic workloads.

Step 1: Install NemoClaw

Figure 1, below, shows the full path from OOBE completion to a running NemoClaw agent on DGX Spark.

Figure 1. The NemoClaw desktop install flow on DGX Spark, from OOBE completion through model download to first agent launch

After completing OOBE, DGX Spark reboots and opens build.nvidia.com/spark with the NemoClaw playbook prominently displayed for a guided walkthrough. Run this single command to install Node.js (if needed), install OpenShell, clone the latest stable NemoClaw release, build the CLI, and run the onboard wizard to create a sandbox.

curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash

The installation wizard walks you through setup:

Accept NemoClaw and OpenClaw licenses — Confirm by entering yes

Run express install — Confirm by entering Y

Local Ollama is set up and Qwen3.6-35B is downloaded automatically

Learn more about how to install NemoClaw on your DGX Spark/GB10 system: Start with NemoClaw on DGX Spark →

Step 2: Access your agent

Once the install completes, you are ready to customize your agents.

First, interact with the agent through the WebUI. Retrieve the gateway token for your sandbox:

nemoclaw <sandbox name> gateway-token --quiet

Then open the tokenized URL in a browser: http://127.0.0.1:18789/#token=<WEBUI_TOKEN>. Use 127.0.0.1 exactly — the gateway origin check requires it (not localhost).

Send a quick test message, such as "hello" or "what can you do?", to confirm the full stack is up. The local Ollama model is already selected; NemoClaw configures this automatically during onboarding.
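The token and URL step above can also be scripted. The following is a minimal sketch, not part of NemoClaw: it assumes that `gateway-token --quiet` prints only the token on stdout, and the helper names `gateway_token` and `webui_url` are made up for illustration. The host and port are the ones shown in the URL above.

```python
import subprocess

GATEWAY_HOST = "127.0.0.1"  # the gateway origin check requires this, not "localhost"
GATEWAY_PORT = 18789        # port shown in the URL in this post


def gateway_token(sandbox_name: str) -> str:
    """Run `nemoclaw <sandbox name> gateway-token --quiet` and return its output.

    Assumption: with --quiet the command prints only the token on stdout.
    """
    result = subprocess.run(
        ["nemoclaw", sandbox_name, "gateway-token", "--quiet"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def webui_url(token: str) -> str:
    """Build the tokenized WebUI URL in the form given in this post."""
    token = token.strip()
    if not token:
        raise ValueError("empty gateway token")
    return f"http://{GATEWAY_HOST}:{GATEWAY_PORT}/#token={token}"


# Example (hypothetical sandbox name; requires nemoclaw on PATH):
#   print(webui_url(gateway_token("my-sandbox")))
```

Keeping the host as a constant avoids accidentally substituting localhost, which the origin check would reject.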

Step 3: Build your first agent

With your sandbox running, the NemoClaw Applications playbook offers four ready-to-run agents to get started — each with policy setup, a starter prompt, and personalization guidance:

Daily Personal News Digest — a scheduled morning briefing that sweeps your topics and posts a structured digest to Telegram

Software Development Agent — reads a local project directory, builds a plan, writes and reviews its own code, all with no outbound network beyond local inference

Deck and Document Reviewer — red-teams a file before it goes out, returning a severity-ranked punch list of inconsistencies, unsourced claims, and accessibility issues

Calendar Negotiator — a scheduling chief-of-staff that turns “when can we meet?” threads into a confirmed calendar event

Step 4: Further customizations

With the sandbox running, the main levers for shaping agent behavior are:

System prompt — Edit the agent’s instructions from the dashboard to shape how it responds and what it should ask before acting. More specific prompts produce more reliable agents.

Tool permissions — OpenShell network policies control which external destinations the agent can call. Narrower permissions reduce unexpected behavior.

Integrations — If you enabled a messaging channel during onboarding, the agent is already reachable there. Send it a message from your phone and it responds using the same local model.
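To make the idea of narrow tool permissions concrete, here is a generic default-deny allowlist check. This is only an illustration of the principle: the post does not show OpenShell's policy format, so nothing below reflects its actual configuration. `ALLOWED_HOSTS` and `is_allowed` are hypothetical names, and the single allowed host is the public Telegram Bot API endpoint, chosen because the news digest agent posts to Telegram.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for a news-digest agent: deny everything by default,
# then name each permitted destination explicitly.
ALLOWED_HOSTS = frozenset({"api.telegram.org"})


def is_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only when the URL's host is on the allowlist."""
    host = urlsplit(url).hostname  # None when the string has no network location
    return host is not None and host in allowed_hosts
```

Starting from an empty allowlist and adding destinations one at a time makes it easier to see which permission enabled a given behavior.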

Developers can further customize by swapping in different models, adjusting OpenShell permissions, and connecting the agent to local workflows. To spin up a new sandbox with a different model, run nemoclaw onboard --fresh --gpu and select a different model during the wizard. Note that --fresh destroys and recreates the existing sandbox; use --name <new-name> to create an additional sandbox without affecting existing ones. The full NemoClaw install instructions and model catalog are available on NVIDIA NGC.

Tip: Start narrow. Give the agent a single, well-scoped task on your first run, such as summarizing a file or answering a question from a local document. Verify that the response and tool calls look right before expanding its permissions.

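A few commands worth keeping handy as you iterate:

Command | Function

nemoclaw <sandbox name> status | Show sandbox state and inference health

nemoclaw <sandbox name> logs --follow | Stream sandbox logs in real time

nemoclaw list | List all registered sandboxes

Table 1. NemoClaw CLI commands useful for monitoring and managing agent sandboxes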

DGX Spark agents using Qwen3.6-35B

Developers can experience up to 2.6x faster inference with top agentic models like Qwen3.6-35B on vLLM, using NVIDIA's NVFP4 quantized checkpoint with Multi-Token Prediction (MTP) optimizations. Additional improvements cover vLLM CUDA Graph support for MTP with FlashInfer, BF16 autotuning across FlashInfer MoE kernels, and the TinyGEMM and cuBLAS BF16 paths.

Figure 2. Computex optimizations deliver a 2.6× improvement in overall throughput performance for Qwen3.6-35B on DGX Spark with vLLM
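For a rough intuition of why predicting several tokens per step can speed up decoding, the sketch below uses the standard idealized speculative-decoding estimate. If a step drafts k extra tokens and each is accepted independently with probability a, the expected number of tokens emitted per verification pass is (1 - a^(k+1)) / (1 - a). Treating Multi-Token Prediction (MTP) this way is an assumption made for illustration, and the acceptance rates and draft lengths are hypothetical. The 2.6x figure above is NVIDIA's reported result, and this model does not derive it.

```python
def expected_tokens_per_step(accept_rate: float, draft_tokens: int) -> float:
    """Expected tokens emitted per verification pass under an idealized model.

    Assumes each drafted token is accepted independently with probability
    accept_rate and that the pass always yields at least one token.
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be between 0 and 1")
    if draft_tokens < 0:
        raise ValueError("draft_tokens must be non-negative")
    if accept_rate == 1.0:
        # Every draft is accepted: k drafted tokens plus the one extra token.
        return float(draft_tokens + 1)
    return (1.0 - accept_rate ** (draft_tokens + 1)) / (1.0 - accept_rate)
```

Tokens per step is not the same as wall-clock speedup, because drafting and verification add their own cost; the estimate only shows how acceptance rate bounds the benefit.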

Scaling up: The cluster assistant in NVIDIA Sync

For developers who need more memory or throughput than a single DGX Spark can provide, the cluster assistant in NVIDIA Sync automates the process of connecting two to four DGX Spark units into a high-bandwidth cluster.

Clustering matters at the model level: two DGX Spark nodes provide 256 GB of unified memory (sufficient for ~400B-parameter models), and four nodes provide 512 GB. That’s enough to run large MoE models, multi-agent pipelines with multiple concurrent inference instances, or fine-tuning jobs that benefit from distributed memory.
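The memory figures above follow from simple arithmetic, sketched here under stated assumptions. The 128 GB per node is inferred from the 256 GB quoted for two nodes. The post does not say what precision the ~400B figure assumes, so the 4-bit case below is a guess based on the NVFP4 checkpoint mentioned earlier. The estimate covers weights only and ignores quantization scale factors, KV cache, and activations, so real headroom is smaller.

```python
GB_PER_NODE = 128  # inferred from the post: 2 nodes -> 256 GB, 4 nodes -> 512 GB


def cluster_memory_gb(nodes: int) -> int:
    """Total unified memory for a cluster of 1 to 4 DGX Spark nodes."""
    if not 1 <= nodes <= 4:
        raise ValueError("this sketch covers 1 to 4 nodes")
    return nodes * GB_PER_NODE


def weight_footprint_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB taken as 1e9 bytes)."""
    # (params_billion * 1e9 params) * (bits / 8 bytes) / (1e9 bytes per GB)
    return params_billion * bits_per_param / 8.0
```

Under these assumptions a 400B-parameter model needs about 200 GB at 4 bits per parameter, which fits in a two-node cluster, while the same model at 16 bits would need about 800 GB and would not fit even on four nodes.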

Setting up the cluster requires configuring the ConnectX-7 networking. Each DGX Spark has ConnectX-7 NICs that support 200 Gbps RoCE, but using them correctly requires configuring netplan, setting up node-to-node SSH trust, verifying bandwidth across each link, and knowing the right IP assignment scheme for the target topology. The cluster assistant simplifies the network configuration through a guided workflow inside Sync.

What Sync configures

Starting from devices already enrolled in Sync, the cluster assistant walks through: system readiness checks (OTA version, sudo access); CX-7 topology detection, using a probe that runs on each node in parallel and combines LLDP/BPDU evidence with interface and IP checks; IP planning and deconfliction; netplan application; bandwidth and latency validation via ib_write_bw / ib_write_lat; and inter-node SSH setup using keys routed over the CX-7 fabric.
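The IP planning and deconfliction step can be pictured with Python's standard ipaddress module. This is a sketch of the general idea, not Sync's actual algorithm: the function name `plan_link_subnets`, the example address pool, and the choice of one /30 subnet per point-to-point link are all assumptions made for illustration.

```python
import ipaddress


def plan_link_subnets(pool: str, in_use: list[str], link_count: int,
                      prefix: int = 30) -> list[tuple[str, str, str]]:
    """Pick one free subnet per link, skipping ranges that are already in use.

    Returns (subnet, address_a, address_b) for each link.
    """
    if link_count < 1:
        raise ValueError("link_count must be at least 1")
    pool_net = ipaddress.ip_network(pool)
    used = [ipaddress.ip_network(n) for n in in_use]
    plan = []
    for candidate in pool_net.subnets(new_prefix=prefix):
        if any(candidate.overlaps(u) for u in used):
            continue  # deconfliction: never reuse an occupied range
        hosts = list(candidate.hosts())
        plan.append((str(candidate), str(hosts[0]), str(hosts[1])))
        if len(plan) == link_count:
            return plan
    raise ValueError("not enough free subnets in the pool")


# Three links, as in a three-node ring, avoiding a range already on the LAN.
ring_plan = plan_link_subnets("192.168.100.0/24", ["192.168.100.0/29"], 3)
```

Here the first two /30 blocks overlap the range already in use, so the plan starts at 192.168.100.8/30.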

Supported physical configurations are two-node direct connection (single QSFP cable, no switch), three-node ring (three QSFP cables, both CX-7 ports active per node), and two to four nodes via a QSFP switch. The switch must meet these minimum requirements:

Minimum of 4 QSFP56-DD ports

Breakout to 25/50/100/200/400 G

Recommended max port speed of 200G-400G per port

One 1/10GbE management Ethernet port

Supports RoCE v2

Switching capacity/throughput: Minimum 0.8-1.6 Tbps
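When shortlisting a switch, the minimums above can be turned into a quick check. The sketch below is illustrative only: `SwitchSpec` and `unmet_requirements` are hypothetical names, the values would come from a vendor datasheet, and the lower bound of the quoted capacity range (0.8 Tbps) is treated as the floor. Breakout modes and recommended port speeds are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchSpec:
    """Datasheet values for a candidate switch (hypothetical structure)."""
    qsfp56_dd_ports: int
    supports_roce_v2: bool
    has_mgmt_ethernet: bool
    switching_capacity_tbps: float


def unmet_requirements(spec: SwitchSpec) -> list[str]:
    """Return the minimum requirements from the list above that the spec misses."""
    problems = []
    if spec.qsfp56_dd_ports < 4:
        problems.append("needs at least 4 QSFP56-DD ports")
    if not spec.supports_roce_v2:
        problems.append("needs RoCE v2 support")
    if not spec.has_mgmt_ethernet:
        problems.append("needs a 1/10GbE management Ethernet port")
    if spec.switching_capacity_tbps < 0.8:
        problems.append("needs at least 0.8 Tbps switching capacity")
    return problems
```

An empty list means the candidate clears these minimums. It does not confirm compatibility with the cluster assistant, which the NVIDIA Sync documentation covers.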

For documentation on the NVIDIA Sync cluster assistant and supported topologies, see the NVIDIA Sync documentation.

Explore more on DGX Spark

All three capabilities are available now:

NemoClaw streamlined install: Start with NemoClaw on DGX Spark →

Example NemoClaw Applications: Setup Example NemoClaw Agents →

Bring DGX Spark to NVIDIA Brev: Register your DGX Spark on NVIDIA Brev →

Start building

The DGX Spark updates at Computex 2026 reduce the two biggest blockers to building production-quality local agents: time to first agent and access to the compute needed to run large models.

The streamlined NemoClaw install gets developers from unboxing to a running OpenClaw agent with Qwen3.6-35B as the default model and a built-in secure execution environment. For teams that need more, the cluster assistant in Sync removes the expertise barrier to spinning up a multi-node cluster with full ConnectX-7 performance.

Start building on NVIDIA DGX Spark →
