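TLDR AI · June 25, 2026 09:00 · about 3 min read

Generating Triangle Splats from Video Diffusion Latents (5 minute read)

#Video Diffusion Models #3D Reconstruction #Generative AI #Computer Vision

TL;DR

A research team has presented a method that generates triangle splats directly from the latent representations of a video diffusion model, a step toward more efficient 3D reconstruction.

AI in-depth analysis · June 26, 2026 00:03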


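Key points

Direct generation from video latents

Rather than running a conventional multi-stage pipeline, the method decodes triangle splats directly from the internal latents of a video diffusion model.

More efficient 3D reconstruction

By predicting the scene in a single forward pass instead of optimizing each scene separately, the method is expected to cut the computational cost and processing time of building 3D triangle geometry from video.

Advantages over existing methods

Compared with approaches based on classical geometric estimation or multi-view stereo, the method is said to make more effective use of the temporal consistency of video.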


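Impact analysis

This technique could substantially reduce the compute and time needed to generate 3D models from video, and may change workflows in game development and virtual reality content production. The ability to exploit geometric information held in a diffusion model's latent space also points to a new research direction for 3D AI.

Editorial comment

The applications of video generation AI are expanding rapidly from 2D into 3D space, and this approach is expected to lower the barrier to practical 3D content creation.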



FLAT: Feedforward Latent Triangle Splatting for geometrically accurate scene generation.

Decode explicit surface-aligned triangle splats from video diffusion latents in a single forward pass.

Orest Kupyn1,2, Goutam Bhat1, Philipp Henzler1, Fabian Manhardt1, Christian Rupprecht1,2, Federico Tombari1,3

1 Google Research

2 University of Oxford, Visual Geometry Group

3 Technical University of Munich

FLAT shows that compressed video diffusion latents can be mapped directly to explicit non-volumetric scene parameters. Instead of decoding 3D Gaussians, it predicts triangle splats in one pass, improving geometric accuracy while preserving competitive visual quality and enabling rasterization with simple triangle renderers and physics-based interaction after lightweight refinement.

Direct Triangle Decoding

FLAT turns compressed video diffusion latents into explicit triangle splats directly, avoiding the usual generate-then-optimize path used by many feedforward scene pipelines.

Geometry-Specific Training

Ray-centered triangle parameterization and a product window rendering function stabilize triangle regression, where small orientation errors would otherwise break gradient flow.
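The two ideas named above can be illustrated with a small sketch. It is not FLAT's implementation: the text gives no formulas, so everything here is an assumption made for illustration. `triangle_from_ray` assumes "ray-centered" means the triangle is anchored at a predicted depth along a camera ray with per-vertex offsets, and `product_window` assumes the window is a product of one soft indicator per edge, so that every edge contributes a smooth gradient. The function names, the sigmoid form, and the `sharpness` parameter are all hypothetical.

```python
import math

def triangle_from_ray(origin, direction, depth, offsets):
    """Place a triangle around the point at `depth` along a camera ray.

    Assumed parameterization: anchor = origin + depth * unit(direction),
    and each vertex is the anchor plus a predicted 3D offset.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    anchor = [o + depth * u for o, u in zip(origin, unit)]
    return [tuple(a + k for a, k in zip(anchor, off)) for off in offsets]

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def edge_signed_distance(p, a, b):
    """Signed distance from 2D point p to the line a->b (positive on the left)."""
    ex, ey = b[0] - a[0], b[1] - a[1]
    cross = ex * (p[1] - a[1]) - ey * (p[0] - a[0])
    return cross / math.hypot(ex, ey)

def product_window(p, tri, sharpness=20.0):
    """Soft inside/outside weight for a counter-clockwise 2D triangle.

    Multiplies one sigmoid per edge, so the weight is near 1 well inside
    the triangle, near 0 outside, and differentiable in every vertex.
    """
    a, b, c = tri
    weight = 1.0
    for start, end in ((a, b), (b, c), (c, a)):
        weight *= sigmoid(sharpness * edge_signed_distance(p, start, end))
    return weight

tri2d = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
inside = product_window((0.25, 0.25), tri2d)
outside = product_window((1.0, 1.0), tri2d)
tri3d = triangle_from_ray((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 3.0,
                          [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, -1.0, 0.0)])
```

In this sketch a point just outside one edge still receives a small nonzero weight, which is one way a loss could keep pulling a slightly misoriented triangle back toward its target.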

Refinement to Opaque Assets

A lightweight test-time refinement step converts the predicted triangle soup into a fully opaque representation that fits standard rendering and game-engine-style interaction.

Inspect generated scenes as explicit triangle geometry.

FLAT outputs scenes that can be explored immediately with a simple triangle renderer. This makes the viewer fast and portable across devices, without depending on a heavy rendering engine. On touch devices, drag inside the scene to look around and use the on-screen movement buttons to navigate.
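As a rough illustration of why explicit triangles are easy to hand to a simple renderer, the sketch below serializes a triangle soup to Wavefront OBJ text, which most mesh viewers can open. The input layout (a list of triangles, each a tuple of three `(x, y, z)` vertices) and the function name `triangles_to_obj` are assumptions for this example. FLAT's actual output format is not described here, and this sketch ignores color and opacity.

```python
def triangles_to_obj(triangles):
    """Serialize a triangle soup to Wavefront OBJ text.

    Vertices are not shared between triangles, matching a "soup":
    each triangle contributes three `v` lines and one `f` line.
    """
    lines = []
    for tri in triangles:
        for x, y, z in tri:
            lines.append(f"v {x:.6f} {y:.6f} {z:.6f}")
    for i in range(len(triangles)):
        base = 3 * i + 1  # OBJ vertex indices are 1-based
        lines.append(f"f {base} {base + 1} {base + 2}")
    return "\n".join(lines) + "\n"

soup = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    ((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)),
]
obj_text = triangles_to_obj(soup)
```

Writing `obj_text` to a `.obj` file is enough for a generic viewer to display the geometry, without any splat-specific renderer.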


Appearance and surface structure stay aligned.

We target geometric accuracy, not only image realism. These paired renders show that FLAT's novel views and surface normals stay consistent across viewpoints, making the geometry signal legible instead of hiding it behind appearance alone.
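For readers unfamiliar with surface-normal renders, the following sketch shows the standard way a normal is obtained from explicit triangle geometry and mapped to a display color. It is generic geometry rather than FLAT's rendering code, and the function names are made up for this example.

```python
import math

def triangle_normal(a, b, c):
    """Unit normal of triangle (a, b, c) via the cross product of two edges."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(x * x for x in n))
    if length == 0.0:
        raise ValueError("degenerate triangle has no defined normal")
    return tuple(x / length for x in n)

def normal_to_rgb(normal):
    """Map a unit normal from [-1, 1]^3 to RGB in [0, 1]^3 (common normal-map encoding)."""
    return tuple(0.5 * (x + 1.0) for x in normal)

# A counter-clockwise triangle in the XY plane faces +Z.
n = triangle_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))  # (0.0, 0.0, 1.0)
rgb = normal_to_rgb(n)                                 # (0.5, 0.5, 1.0)
```

Because the normal depends only on the triangle's vertices, a world-space normal computed this way is identical from every camera, which is what makes it a useful check on geometry independent of appearance.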

[Figure: paired novel-view and surface-normal renders produced by FLAT, pair 01 of 07]

BibTeX

@misc{kupyn2026flat,
  title        = {FLAT: Feedforward Latent Triangle Splatting for Geometrically Accurate Scene Generation},
  author       = {Orest Kupyn and Goutam Bhat and Philipp Henzler and Fabian Manhardt and Christian Rupprecht and Federico Tombari},
  year         = {2026},
  note         = {Preprint}
}
