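Building Custom Atomistic Simulation Workflows for Chemistry and Materials Science with the NVIDIA ALCHEMI Toolkit
NVIDIA has announced the ALCHEMI Toolkit, which lets researchers build custom atomistic simulation workflows and targets the long-standing trade-off between accuracy and speed in computational chemistry and materials science.
Key points
Addressing the accuracy-speed trade-off
Conventional ab initio methods such as DFT are accurate but computationally expensive, while empirical force fields are fast but limited in accuracy. ALCHEMI addresses this gap.
Availability of the ALCHEMI Toolkit
The toolkit, provided by NVIDIA, lets researchers build custom atomistic simulation workflows.
Applications in chemistry and materials science
Practical use is expected across a broad range of research areas in computational chemistry and materials science, including drug discovery, materials design, and catalysis research.
Integration with the NVIDIA platform
The toolkit is integrated with NVIDIA's GPU computing and AI platform, so it can take advantage of high-performance computing.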
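Impact analysis
This announcement could change the research paradigm in computational chemistry and materials science. As researchers become able to design and run simulations more flexibly and efficiently, development cycles for new drugs and materials are expected to shorten substantially. It also marks an important step in the deepening of NVIDIA's hardware and software ecosystem.
Editorial comment
NVIDIA's full-scale entry into an important area of scientific computing, with practical tools for researchers, is noteworthy. Although it takes the form of a developer blog post, the content amounts to a technical announcement with significant impact on the industry.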
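For decades, computational chemistry has been caught in a tug-of-war between accuracy and speed. First-principles methods such as density functional theory (DFT) offer high accuracy, but their computational cost makes them impractical for large systems or long-timescale simulations. Classical methods such as molecular dynamics (MD) and empirical potentials are fast, but they lack a detailed description of electronic structure, which limits accurate prediction of chemical reactions and material properties.
To bridge this gap, NVIDIA announced the ALCHEMI (AI Lab for Chemistry and Materials Innovation) Toolkit. This open source framework combines machine learning potentials (MLPs), fast quantum mechanical calculations, and scalable workflow management, so that researchers can build custom simulations at scales and accuracies that were out of reach for conventional methods. ALCHEMI is built on tools such as PyTorch, JAX, and NVIDIA CUDA Quantum, and it integrates workflows such as GPU-accelerated molecular dynamics, reaction pathway search, and materials screening.
At the core of ALCHEMI is its modular architecture. Through a Python-based interface, researchers can flexibly combine components for data generation, machine learning model training, simulation execution, and results analysis. For example, a user can start from a small number of DFT calculations, expand the training data efficiently with ALCHEMI's active learning tools, and then run fast MLP-driven molecular dynamics simulations of thousands of atoms over timescales of several nanoseconds. This approach makes it possible to study complex chemical processes such as battery electrolyte degradation and catalytic surface reactions.
As a materials science example, ALCHEMI has been used to investigate defect dynamics in perovskite solar cells. Conventional DFT was limited to systems of a few hundred atoms, whereas researchers using ALCHEMI simulated ion migration and recombination processes in models containing tens of thousands of atoms and identified a new mechanism of efficiency loss. Similarly, in pharmaceutical research, free energy calculations for protein-ligand complexes have been accelerated 100-fold compared with conventional methods, helping to speed up drug discovery pipelines.
ALCHEMI emphasizes community-driven development. The codebase published on GitHub includes prebuilt workflows for organic electronic materials, alloy design, catalyst screening, and more. NVIDIA also supports computational chemists and materials scientists in using the toolkit effectively through regular workshops and documentation. Looking ahead, the integration of quantum machine learning and hybrid quantum-classical algorithms is expected to enable more advanced simulation capabilities.
In summary, the NVIDIA ALCHEMI Toolkit embodies a paradigm shift in computational chemistry and materials science. It addresses the long-standing trade-off between accuracy and speed and lets researchers explore complex atomic-scale phenomena in greater detail and at larger scale than before. With its open and extensible design, ALCHEMI is positioned to become a foundation for next-generation materials discovery and chemical insight, from academic research to industrial applications.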
Original text
For decades, computational chemistry has faced a tug-of-war between accuracy and speed. Ab initio methods like density functional theory (DFT) provide high fidelity but are computationally expensive, limiting researchers to systems of a few hundred atoms. Conversely, classical force fields are fast but often lack the chemical accuracy required for complex bond-breaking or transition-state analysis.
Machine learning interatomic potentials (MLIPs) have emerged as the bridge, offering quantum accuracy at classical speeds. However, the software ecosystem is a new bottleneck. While the MLIP models themselves run on GPUs, the surrounding simulation infrastructure often relies on legacy CPU-centric code.
NVIDIA ALCHEMI (AI Lab for Chemistry and Materials Innovation) helps address these challenges by accelerating chemical and materials discovery with AI. We have previously announced two components of the ALCHEMI portfolio:
ALCHEMI NIM microservices: Scalable, cloud‑ready microservices for AI-accelerated batched atomistic simulations in chemistry and materials science
ALCHEMI Toolkit-Ops: A set of foundational GPU kernels designed to accelerate the calculations behind simulations, such as neighbor lists, dispersion corrections, and electrostatics
Today, we are introducing the NVIDIA ALCHEMI Toolkit, a collection of GPU-accelerated simulation building blocks that incorporates and expands on ALCHEMI Toolkit-Ops. ALCHEMI Toolkit is designed to manage the data flow between accelerated chemistry and materials domain-specific kernels and deep learning models. ALCHEMI Toolkit extends beyond individual models and kernels to provide a modular, PyTorch-native structure for researchers and developers to compose custom simulation workflows.
Figure 1 shows the ALCHEMI architectural stack and product features supported in this initial release of ALCHEMI Toolkit, including expanded functionality in Toolkit-Ops. This release includes capabilities for geometry relaxation and molecular dynamics, and the supporting pipeline infrastructure for combining multiple simulation workflows.
Figure 1. NVIDIA ALCHEMI Toolkit is a collection of GPU-accelerated simulation building blocks to enable large-scale, batched simulations with AI
How does ALCHEMI Toolkit advance digital chemistry?
ALCHEMI Toolkit is not just a collection of scripts. It’s designed to enable researchers and developers to build custom, performant atomistic simulation workflows with ease.
Expanding ALCHEMI Toolkit-Ops
ALCHEMI Toolkit leverages the capabilities of Toolkit-Ops to handle the underlying calculations of the simulations. The previous release included several key operations:
Neighbor list constructions
DFT-D3 dispersion corrections
Long-range electrostatic interactions
This release broadens the scope of common operations addressed to include:
Batched dynamics kernels
JAX support (for v0.2.0 release features)
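To make the first of these operations concrete, the following is a minimal CPU sketch of a neighbor list built with a cell list, checked against a brute-force search. It is pure Python for illustration only and is not the Toolkit-Ops implementation: the function names, the non-periodic box, and the random test coordinates are all assumptions.

```python
import itertools
import math
import random
from collections import defaultdict


def brute_force_pairs(positions, cutoff):
    """Reference O(N^2) search: every pair (i, j), i < j, closer than cutoff."""
    pairs = set()
    for i, j in itertools.combinations(range(len(positions)), 2):
        if math.dist(positions[i], positions[j]) < cutoff:
            pairs.add((i, j))
    return pairs


def cell_list_pairs(positions, cutoff):
    """Cell-list search: bin atoms into cubes of edge `cutoff`, then compare
    each atom only with atoms in its own and the 26 surrounding cells."""
    cells = defaultdict(list)
    for index, pos in enumerate(positions):
        key = tuple(math.floor(c / cutoff) for c in pos)
        cells[key].append(index)

    pairs = set()
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    for key, members in cells.items():
        for offset in offsets:
            neighbor_key = tuple(k + o for k, o in zip(key, offset))
            # .get() avoids creating empty cells while we iterate.
            for j in cells.get(neighbor_key, ()):
                for i in members:
                    if i < j and math.dist(positions[i], positions[j]) < cutoff:
                        pairs.add((i, j))
    return pairs


# Hypothetical test system: 200 random atoms in a 20 x 20 x 20 box.
random.seed(0)
atoms = [tuple(random.uniform(0.0, 20.0) for _ in range(3)) for _ in range(200)]
fast = cell_list_pairs(atoms, cutoff=5.0)
slow = brute_force_pairs(atoms, cutoff=5.0)
print(f"{len(fast)} pairs found; methods agree: {fast == slow}")
```

Because the cell edge equals the cutoff, any two atoms within the cutoff must sit in the same or adjacent cells, which is why the restricted search returns the same pairs as the exhaustive one while scaling roughly linearly with the number of atoms at fixed density.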
Integration with the atomistic simulation ecosystem
ALCHEMI Toolkit is designed to integrate seamlessly with the broader atomistic simulation ecosystem. We’re excited to announce the following integrations with leading platforms in the chemistry and materials science community.
Orbital
Orbital develops advanced AI foundation models used to accelerate the discovery of novel cooling systems for data centers and sustainable materials. Orbital has integrated ALCHEMI Toolkit into their new OrbMolv2 model to drastically reduce the time required for inference. The new model will leverage ALCHEMI Toolkit components such as PME electrostatics for periodic Coulomb interactions and the MTK integrator for batched constant-pressure molecular dynamics. The existing Orb models already leverage Toolkit-Ops for GPU-accelerated graph construction, providing a ~1.7x acceleration for large systems and ~33x for batched smaller systems with TorchSim support.
Materials Graph Library (MatGL)
MatGL is an open source framework for state-of-the-art graph-based MLIPs. ALCHEMI Toolkit is integrating with the MatGL TensorNet model to significantly accelerate materials simulation and property prediction workflows. By leveraging ALCHEMI Toolkit GPU-native kernels and batching infrastructure, MatGL users can achieve higher computational efficiency and lower memory consumption for simulations at scale.
Matlantis
Matlantis enables rapid materials discovery by combining universal MLIPs with high-performance cloud computing. Matlantis is actively exploring the ALCHEMI Toolkit and identifying where its composable dynamics can deliver the greatest value for industrial materials simulation customers. This builds on its proven integration of ALCHEMI Toolkit-Ops—including Warp-optimized neighbor list construction and DFT-D3 dispersion corrections—which significantly reduces computational overhead of atomistic interactions with speedups of up to 10x.
Furthermore, by evaluating specific components within ALCHEMI Toolkit, this collaboration has the potential to enable Matlantis to move beyond single-structure optimization to high-throughput, parallel relaxation of millions of molecular configurations. Ultimately, this integration aims to further power small-scale research and industrial-scale materials design, accelerating chemical evaluation with unparalleled GPU efficiency.
How to get started with ALCHEMI Toolkit
This section walks you through the requirements and installation steps for getting started with ALCHEMI Toolkit.
System and package requirements
Python ≥3.11, <3.14
PyTorch ≥2.8
CUDA Toolkit 12+, NVIDIA driver 470.57.02+
Operating System: Linux (primary), macOS
NVIDIA GPU (RTX 20xx or newer), CUDA Compute Capability ≥ 7.0
Minimum 4 GB RAM (16 GB recommended for large systems)
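The version bounds in this list can be checked programmatically before installing. The sketch below encodes only the Python and compute-capability bounds stated above. The helper names are hypothetical, and the capability tuple is passed in by hand. On a machine with PyTorch installed it could come from `torch.cuda.get_device_capability()`.

```python
import sys

# Bounds copied from the requirements list above.
MIN_PYTHON = (3, 11)              # inclusive lower bound
MAX_PYTHON = (3, 14)              # exclusive upper bound
MIN_COMPUTE_CAPABILITY = (7, 0)   # inclusive lower bound


def python_supported(version):
    """Return True if a (major, minor, ...) version satisfies >=3.11, <3.14."""
    return MIN_PYTHON <= tuple(version[:2]) < MAX_PYTHON


def gpu_supported(compute_capability):
    """Return True if a (major, minor) CUDA compute capability is >= 7.0."""
    return tuple(compute_capability) >= MIN_COMPUTE_CAPABILITY


print("Current interpreter supported:", python_supported(sys.version_info))
print("Compute capability 7.5 supported:", gpu_supported((7, 5)))
```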
Installation
Use the following code to install ALCHEMI Toolkit:
# Install Atomic Simulation Environment (ASE, used in the examples below)
uv pip install ase

# Using pip
pip install nvalchemi-toolkit

# Using uv
uv venv --seed --python 3.12
uv pip install nvalchemi-toolkit

# Install from source
cd nvalchemi-toolkit
uv sync --all-extras

# Add nvalchemi as a project dependency
uv add nvalchemi-toolkit
For more information, reference the NVIDIA/nvalchemi-toolkit GitHub repo and the ALCHEMI Toolkit documentation.
Key features of ALCHEMI Toolkit for building end-to-end workflows
This section dives into four core ALCHEMI Toolkit features: customizable batched simulation workflows, build-your-own dynamics classes, model wrappers, and advanced data management. These features provide researchers and developers with the tools and flexibility needed to create bespoke end-to-end workflows that maximize efficiency and performance on NVIDIA GPUs.
Customizable batched simulation workflows
The distinctive feature of the NVIDIA ALCHEMI Toolkit is the GPU-native batched dynamics engine. No single MLIP model is perfect for every chemical environment, especially when dealing with nonlocal, long-range interactions.
ALCHEMI Toolkit enables researchers to combine modular chemistry and materials science domain-specific kernels and models into customized simulation workflows. This architecture supports the development of specialized compute workflows and running virtual laboratories with millions of concurrent atomic interactions without the latency of traditional software stacks.
Capabilities
Composable calculators combining MLIPs with physics-based corrections
High-performance wrappers (MACE, TensorNet, AIMNet2)
API example
The following example constructs the data, sets up the MLIP, and configures a FIRE2 geometry optimization that is then used as a starting point for velocity Verlet (microcanonical) dynamics:
from ase import Atoms
from nvalchemi.data import AtomicData, Batch
from nvalchemi.dynamics import ConvergenceHook
from nvalchemi.dynamics.optimizers import FIRE2
from nvalchemi.dynamics.integrator import VelocityVerlet

# Build 16 structures on the GPU and collate them into a single batch.
atomic_data = [AtomicData.from_atoms(Atoms(...), device="cuda") for _ in range(16)]
batch = Batch.from_data_list(atomic_data)

# Any wrapped MLIP model.
mlip = ...

# Force-based convergence criteria for the geometry optimization.
conv_criteria = ConvergenceHook(
    criteria=[
        {"key": "forces", "threshold": 0.05, "reduce_op": "norm"},
        {"key": "forces", "threshold": 0.1, "reduce_op": "max"},
    ]
)

# FIRE2 relaxation, followed by velocity Verlet (microcanonical) dynamics.
optimizer = FIRE2(mlip, convergence_hook=conv_criteria, n_steps=200)
velverlet = VelocityVerlet(mlip, n_steps=1000)
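For readers unfamiliar with the integrator named in this example, here is a minimal, self-contained sketch of the velocity Verlet scheme applied to a single one-dimensional harmonic oscillator. It is plain Python rather than the toolkit's batched GPU kernel, and the spring constant, mass, and time step are arbitrary illustrative values.

```python
def velocity_verlet(position, velocity, force_fn, mass, dt, n_steps):
    """Integrate one particle in 1D with the velocity Verlet scheme."""
    force = force_fn(position)
    for _ in range(n_steps):
        velocity += 0.5 * dt * force / mass   # first half kick
        position += dt * velocity             # drift
        force = force_fn(position)            # force at the new position
        velocity += 0.5 * dt * force / mass   # second half kick
    return position, velocity


def total_energy(position, velocity, mass, k_spring):
    """Kinetic plus harmonic potential energy."""
    return 0.5 * mass * velocity**2 + 0.5 * k_spring * position**2


K_SPRING, MASS = 1.0, 1.0   # illustrative values
x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(
    x0, v0, lambda x: -K_SPRING * x, MASS, dt=0.01, n_steps=1000
)
e0 = total_energy(x0, v0, MASS, K_SPRING)
e1 = total_energy(x1, v1, MASS, K_SPRING)
print(f"relative energy drift after 1000 steps: {abs(e1 - e0) / e0:.2e}")
```

The small, bounded energy drift is the property that makes this scheme a common choice for microcanonical (constant-energy) dynamics.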
You can run and scale the simulation pipelines in one of two ways: on a single GPU or across multiple CPUs and GPUs.
Run and scale the pipeline on a single GPU: The FusedStage class is formed by “adding” two or more dynamics objects together. This enables wrapping the end-to-end workflow in torch.compile and sharing CUDA stream contexts.
fused = optimizer + velverlet
with fused:
    fused.run(batch)
With this approach, you can build simulation workflows in which each sample in the batch moves on to the next step as soon as it converges, making optimal use of your GPU.
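The idea of advancing samples individually can be sketched without any GPU code. In the toy example below, each "sample" is a one-dimensional harmonic well relaxed by steepest descent. As soon as its force falls below a tolerance it is handed to a placeholder dynamics stage, while slower samples keep relaxing. The function name, the stage encoding, and all parameters are hypothetical, and a real implementation would use per-sample masks on the GPU rather than a Python loop.

```python
def fused_run(positions, stiffness, force_tol=0.05, step_size=0.1,
              md_steps=5, max_iters=500):
    """Advance every sample through relax -> dynamics without waiting for
    the slowest sample in the batch to finish relaxing."""
    positions = list(positions)
    n = len(positions)
    stage = [0] * n             # 0 = relaxing, 1 = dynamics, 2 = finished
    md_done = [0] * n
    handoff_iter = [None] * n   # iteration at which each sample left stage 0

    for iteration in range(max_iters):
        for i in range(n):
            force = -stiffness[i] * positions[i]
            if stage[i] == 0:
                if abs(force) < force_tol:
                    stage[i] = 1                      # converged: hand off now
                    handoff_iter[i] = iteration
                else:
                    positions[i] += step_size * force  # steepest-descent step
            elif stage[i] == 1:
                md_done[i] += 1                       # placeholder for one MD step
                if md_done[i] == md_steps:
                    stage[i] = 2
        if all(s == 2 for s in stage):
            break
    return stage, handoff_iter


# Three samples with different stiffness converge at different iterations.
final_stage, handoff = fused_run(
    positions=[1.0, 1.0, 1.0], stiffness=[4.0, 1.0, 0.25]
)
print("hand-off iteration per sample:", handoff)
```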
Run and scale the pipeline across multiple CPUs and GPUs: The second approach is to distribute the pipeline across multiple CPUs/GPUs. Using the pipe operator on two dynamics classes will then distribute the FIRE2 optimization onto one GPU, and the velocity Verlet integration on another.
pipeline = optimizer | velverlet
with pipeline:
    pipeline.run(batch)
While this example is deliberately simplified for illustration, the same abstraction lets users scale a pipeline up to multiple GPUs on a node and out to multiple nodes, for arbitrarily large datasets and numbers of ranks.
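The `+` and `|` composition style described above relies on ordinary Python operator overloading. The sketch below shows one way such an interface could be structured. `Stage`, `Fused`, and `Pipeline` are hypothetical stand-ins and not the nvalchemi classes. Here both composites simply run their stages in order, whereas the real toolkit would differ in where each stage executes.

```python
class Stage:
    """Hypothetical stand-in for a dynamics object."""

    def __init__(self, name):
        self.name = name

    def run(self, batch):
        return batch + [self.name]   # record that this stage ran

    def __add__(self, other):
        # "+" fuses stages that are meant to share one device.
        return Fused(_flatten(self, Fused) + _flatten(other, Fused))

    def __or__(self, other):
        # "|" chains stages that are meant to run on separate devices.
        return Pipeline(_flatten(self, Pipeline) + _flatten(other, Pipeline))


def _flatten(obj, kind):
    """Unpack an existing composite of the same kind, else wrap in a list."""
    return list(obj.stages) if isinstance(obj, kind) else [obj]


class _Composite(Stage):
    def __init__(self, stages):
        super().__init__(" -> ".join(stage.name for stage in stages))
        self.stages = stages

    def run(self, batch):
        for stage in self.stages:
            batch = stage.run(batch)
        return batch

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False   # do not suppress exceptions


class Fused(_Composite):
    """Composite built by "+"."""


class Pipeline(_Composite):
    """Composite built by "|"."""


optimizer, dynamics = Stage("fire2"), Stage("velocity_verlet")
with optimizer + dynamics as fused:
    print(type(fused).__name__, fused.run([]))
with optimizer | dynamics as pipeline:
    print(type(pipeline).__name__, pipeline.run([]))
```

Returning a composite object from the operator, rather than running anything immediately, is what allows the whole workflow to be configured first and then executed inside a single `with` block.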
The following example configures eight GPUs to run geometry optimization, which pipelines the results to run Langevin dynamics on another eight GPUs:
from torch import distributed as dist
from torch.utils.data.distributed import DistributedSampler
from nvalchemi.data.datapipes import Dataset, DataLoader

# FIRE2 and mlip are assumed to be defined as in the earlier example.
# Langevin and DistributedPipeline come from nvalchemi; their import
# paths are not shown here.

dist.init_process_group()

dataset = Dataset(...)
data_sampler = DistributedSampler(
    dataset, num_replicas=dist.get_world_size(), rank=dist.get_rank()
)
loader = DataLoader(
    dataset, batch_size=128, sampler=data_sampler, use_stream=True
)

# Ranks 0-7 run FIRE2 and pass each result on to rank index + 8.
optimizers = [FIRE2(mlip, ..., next_rank=index + 8) for index in range(8)]
# Ranks 8-15 run Langevin dynamics on structures received from rank index.
dynamics = [Langevin(mlip, ..., prior_rank=index) for index in range(8)]

pipeline = DistributedPipeline(
    {index: stage for index, stage in enumerate(optimizers + dynamics)}
)

with pipeline:
    for batch in loader:
        pipeline.run(batch)
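The rank wiring in this example (`next_rank=index + 8` on the optimizers, `prior_rank=index` on the dynamics stages) can be laid out as a small table. The sketch below only reproduces that bookkeeping in plain Python. `build_topology` is a hypothetical helper, not part of the toolkit.

```python
N_PAIRS = 8   # eight optimizer ranks, each feeding one dynamics rank


def build_topology(n_pairs):
    """Map rank -> (role, prior_rank, next_rank) for a two-stage pipeline."""
    topology = {}
    for index in range(n_pairs):
        # Optimizer on rank `index` sends results to rank `index + n_pairs`.
        topology[index] = ("optimize", None, index + n_pairs)
        # Dynamics on rank `index + n_pairs` receives from rank `index`.
        topology[index + n_pairs] = ("dynamics", index, None)
    return topology


topology = build_topology(N_PAIRS)
for rank, (role, prior_rank, next_rank) in sorted(topology.items()):
    print(f"rank {rank:2d}: {role:8s} prior={prior_rank} next={next_rank}")
```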
Build-your-own dynamics classes
ALCHEMI Toolkit offers a modular architecture to build and customize dynamics classes from the ground up. This approach enables the community to integrate new sampling methods or thermodynamic ensembles into the ALCHEMI environment while maintaining direct access to underlying kernels. This transforms dynamics into a fully customizable environment where users can construct specialized dynamics classes from scratch.
Capabilities
Specialized GPU-first trajectory analysis tools
Integrated and customizable dynamics kernels (Velocity Verlet, NPT, Langevin thermostats)
FIRE and FIRE2 optimizers
API example
from enum import Enum

import torch

from nvalchemi.data import Batch
from nvalchemi.dynamics import ConvergenceHook
from nvalchemi.dynamics.base import BaseDynamics, DynamicsStage
from nvalchemi.hooks import Hook, HookContext
from nvalchemi.models.base import BaseModelMixin


class MySimulatedAnnealer(Hook):
    """Hook that lowers the target temperature geometrically toward t_end."""

    def __init__(
        self,
        t_start: float,
        t_end: float,
        cooldown_steps: int,
        frequency: int,
        stage: DynamicsStage,
    ) -> None:
        self.frequency = frequency
        self.t_start = t_start
        self.t_end = t_end
        self.cooldown_steps = cooldown_steps
        self.stage = stage
        # Factor that takes t_start to t_end after cooldown_steps applications.
        self.decay = (t_end / t_start) ** (1.0 / cooldown_steps)
        self._current_temp = t_start

    def __call__(self, ctx: HookContext, stage: Enum) -> None:
        dynamics = ctx.workflow
        dynamics.target_temperature = max(
            dynamics.target_temperature * self.decay, self.t_end
        )


class VelocityVerlet(BaseDynamics):
    __needs_keys__ = {"energies", "forces", "masses", "velocities"}
    __provides_keys__ = {"positions"}

    def __init__(
        self,
        model: BaseModelMixin,
        n_steps: int,
        dt: float = 1.0,
        target_temperature: float = 300.0,
        tau: float = 10.0,
        hooks: list[Hook] | None = None,
        convergence_hook: ConvergenceHook | dict | None = None,
        **kwargs,
    ):
        super().__init__(
            model=model,
            n_steps=n_steps,
            hooks=hooks,
            convergence_hook=convergence_hook,
        )
        self.dt = dt
        self.target_temperature = target_temperature
        self.tau = tau
        self._prev_accelerations = None

    def pre_update(self, batch: Batch) -> None:
        # Position update: x += v * dt + 0.5 * a * dt^2.
        with torch.no_grad():
            accelerations = batch.forces / batch.masses
            self._prev_accelerations = accelerations.clone()
            batch.positions.add_(
                batch.velocities * self.dt + 0.5 * accelerations * self.dt**2.0
            )

    def post_update(self, batch: Batch) -> None:
        with torch.no_grad():
            # Velocity update with the average of old and new accelerations.
            new_accelerations = batch.forces / batch.masses
            batch.velocities.add_(
                0.5 * (self._prev_accelerations + new_accelerations) * self.dt
            )
            # Kinetic energy per atom, reduced per system to get the temperature.
            ke_per_atom = 0.5 * batch.masses * (batch.velocities**2).sum(
                dim=-1, keepdim=True
            )
            total_ke = scatter_add_(...)
            current_temp = 2.0 * total_ke / (batch.num_atoms * 3.0)
            # Rescale velocities toward the target temperature, clamped per step.
            ratio = self.target_temperature / current_temp
            lam = torch.sqrt(
                torch.tensor(1.0 + (self.dt / self.tau) * (ratio - 1.0))
            ).clamp(min=0.8, max=1.2)
            batch.velocities.mul_(lam)


my_velverlet = VelocityVerlet(
    ...,
    hooks=[
        MySimulatedAnnealer(
            t_start=900.0,
            t_end=300.0,
            cooldown_steps=10,
            frequency=100,
            stage=DynamicsStage.BEFORE_STEP,
        )
    ],
)
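Two pieces of arithmetic in this example are easy to check in isolation: the geometric cooling factor used by the annealing hook, and the velocity scaling factor, which has the form of a Berendsen-style weak-coupling thermostat. The sketch below reproduces both with plain floats. The function names are hypothetical, and the inputs reuse the values from the example (900 K to 300 K in 10 steps, `dt=1.0`, `tau=10.0`).

```python
import math


def geometric_decay(t_start, t_end, cooldown_steps):
    """Per-call factor so that t_start reaches t_end after cooldown_steps calls."""
    return (t_end / t_start) ** (1.0 / cooldown_steps)


def annealing_schedule(t_start, t_end, cooldown_steps, n_calls):
    """Temperatures after each hook call, never cooling below t_end."""
    decay = geometric_decay(t_start, t_end, cooldown_steps)
    temperatures, temperature = [], t_start
    for _ in range(n_calls):
        temperature = max(temperature * decay, t_end)
        temperatures.append(temperature)
    return temperatures


def velocity_scale(current_temp, target_temp, dt, tau, low=0.8, high=1.2):
    """Scaling factor lambda = sqrt(1 + dt/tau * (T_target/T - 1)), clamped."""
    lam = math.sqrt(1.0 + (dt / tau) * (target_temp / current_temp - 1.0))
    return min(max(lam, low), high)


schedule = annealing_schedule(900.0, 300.0, cooldown_steps=10, n_calls=12)
print("temperature after 10 calls:", round(schedule[9], 6))
print("scale when too hot (600 K -> 300 K):", velocity_scale(600.0, 300.0, 1.0, 10.0))
print("scale when far too cold (clamped):", velocity_scale(1.0, 300.0, 1.0, 10.0))
```

The clamp keeps a single step from rescaling velocities by more than 20 percent, which matters when the instantaneous temperature is far from the target.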
Model wrappers
With ALCHEMI Toolkit, you can use your own pretrained models with accelerated physics components. It provides the essential infrastructure for importing your own models into the pipeline, ensuring that proprietary or domain-specific architectures can leverage GPU-native orchestration. This abstracts the complexity of different model types, providing a standardized path to move from a standalone model to a production-ready, high-throughput simulation.
Capabilities
MLIP support (MACE, TensorNet, AIMNet2)
Composable calculators
Standardized model configuration
API example
from typing import Any

from beartype import beartype
from torch import nn

from super_mlip import BestMLIPModel
from nvalchemi._typing import ModelOutputs
from nvalchemi.data import Batch
from nvalchemi.models.base import BaseModelMixin, ModelConfig, NeighborConfig


class BestMLIPWrapper(nn.Module, BaseModelMixin):
    def __init__(self, model: BestMLIPModel, **kwargs):
        super().__init__(**kwargs)
        self.model = model
        self.model_config = ModelConfig(
            outputs=frozenset({"energy", "forces", "hessians"}),
            required_inputs=frozenset({"positions", "atomic_numbers"}),
            autograd_outputs=frozenset({"forces"}),
            neighbor_config=NeighborConfig(cutoff=5.0, format="coo"),
        )

    def adapt_input(self, data: Batch, **kwargs) -> dict[str, Any]:
        # Map toolkit field names onto the argument names the model expects.
        model_inputs = super().adapt_input(data, **kwargs)
        model_inputs["atom_numbers"] = data.atomic_numbers
        model_inputs["coords"] = data.positions
        return model_inputs

    def adapt_output(self, model_output: Any, data: Batch) -> ModelOutputs:
        # Copy the model's results into the standardized output structure.
        output = super().adapt_output(model_output, data)
        energies = model_output["energies"]
        output["energies"] = energies
        if "forces" in self.model_config.active_outputs:
            output["forces"] = model_output["forces"]
        return output

    @beartype
    def forward(self, data: Batch, **kwargs) -> ModelOutputs:
        model_inputs = self.adapt_input(data, **kwargs)
        # Call the wrapped model with the adapted inputs.
        model_outputs = self.model(**model_inputs)
        return self.adapt_output(model_outputs, data)
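Stripped of the toolkit's base classes, the wrapper above is an adapter: it renames fields on the way in, calls the model, and filters results on the way out. The following self-contained sketch shows that pattern with plain dictionaries. `ToyModel` and `ToyWrapper` are hypothetical, and the harmonic "model" exists only so the example runs.

```python
class ToyModel:
    """Hypothetical third-party model with its own argument names."""

    def __call__(self, coords, atom_numbers):
        # Toy harmonic energy per atom, E = 0.5 * |r|^2, so F = -r.
        energies = [0.5 * sum(c * c for c in xyz) for xyz in coords]
        forces = [[-c for c in xyz] for xyz in coords]
        return {"energies": energies, "forces": forces}


class ToyWrapper:
    """Translate pipeline field names to the model's names and back."""

    # pipeline field name -> model argument name
    INPUT_MAP = {"positions": "coords", "atomic_numbers": "atom_numbers"}

    def __init__(self, model, active_outputs=("energies", "forces")):
        self.model = model
        self.active_outputs = set(active_outputs)

    def adapt_input(self, data):
        return {arg: data[field] for field, arg in self.INPUT_MAP.items()}

    def adapt_output(self, model_output):
        # Keep only the outputs the pipeline asked for.
        return {k: v for k, v in model_output.items() if k in self.active_outputs}

    def forward(self, data):
        return self.adapt_output(self.model(**self.adapt_input(data)))


sample = {"positions": [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]], "atomic_numbers": [1, 8]}
print(ToyWrapper(ToyModel()).forward(sample))
print(ToyWrapper(ToyModel(), active_outputs=("energies",)).forward(sample))
```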
Advanced data management
Traditionally, the “memory tax” of moving data between the CPU and GPU is a significant bottleneck in AI-driven discovery. ALCHEMI Toolkit acts as the specialized orchestrator for scientific data, providing the infrastructure required to build custom ingestion pipelines to move information from standard research files into optimized GPU tensors.
This lets discovery workflows scale and makes industrial-scale simulations accessible through familiar interfaces. By standardizing how atomic information is represented and loaded, ALCHEMI Toolkit keeps data resident on the device, so the entire simulation stays on the GPU. This enables batched simulations that improve GPU utilization and removes host-device communication overhead.
Capabilities
High-performance data loaders
ASE and Pymatgen interface
AtomicData and batch objects
API example
from nvalchemi import AtomicData, Batch
from nvalchemi import data
from ase.build import slab

# Convert an ASE structure into GPU-resident atomic data.
atoms = slab(...)
atomic_data = AtomicData.from_atoms(atoms, device="cuda")
atomic_data                      # -> ...
atomic_data.node_properties
atomic_data.system_properties

# Collate several structures into one batch and index into it.
batch = Batch.from_data_list([atomic_data, atomic_data, atomic_data])
batch.num_graphs
batch.get_data(0)
batch[:2]
batch[mask]
batch["energies"]                # -> ...
batch.from_atoms([ase.Atoms, ...])

# Write the batch to a Zarr store, then stream it back in batches.
writer = data.AtomicDataZarrWriter("atom_dataset.zarr")
writer.write(batch)
reader = data.AtomicDataZarrReader("atom_dataset.zarr")
dataset = data.Dataset(reader, device="cuda", num_workers=4)
dataloader = data.DataLoader(dataset, batch_size=16)
for batch in dataloader:
    ...
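Batching structures with different atom counts is commonly done by concatenating all atoms into flat arrays and keeping two pieces of bookkeeping: a per-atom system index and a pointer array of offsets. The sketch below illustrates that layout with plain lists. It is an assumption about the general technique, not a description of how `Batch` is implemented, and the helper names are hypothetical.

```python
def collate(systems):
    """Concatenate per-system atom lists into flat lists plus bookkeeping.

    Each system is a dict with 'numbers' and 'positions' of equal length.
    """
    numbers, positions, batch_index, ptr = [], [], [], [0]
    for system_id, system in enumerate(systems):
        n_atoms = len(system["numbers"])
        numbers.extend(system["numbers"])
        positions.extend(system["positions"])
        batch_index.extend([system_id] * n_atoms)   # which system owns each atom
        ptr.append(ptr[-1] + n_atoms)               # running atom offsets
    return {"numbers": numbers, "positions": positions,
            "batch": batch_index, "ptr": ptr}


def get_system(batch, system_id):
    """Recover one system from the flat layout using the pointer array."""
    start, stop = batch["ptr"][system_id], batch["ptr"][system_id + 1]
    return {"numbers": batch["numbers"][start:stop],
            "positions": batch["positions"][start:stop]}


def per_system_sum(values, batch_index, n_systems):
    """Segment sum over atoms, the role a scatter-add plays on the GPU."""
    totals = [0.0] * n_systems
    for value, system_id in zip(values, batch_index):
        totals[system_id] += value
    return totals


water = {"numbers": [8, 1, 1],
         "positions": [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]}
hydrogen = {"numbers": [1, 1],
            "positions": [(0.0, 0.0, 0.0), (0.74, 0.0, 0.0)]}

flat = collate([water, hydrogen])
print("ptr:", flat["ptr"], "batch:", flat["batch"])
print("second system:", get_system(flat, 1)["numbers"])
print("per-system sums:", per_system_sum([1.0, 2.0, 3.0, 4.0, 5.0], flat["batch"], 2))
```

With this layout, one kernel launch can process every atom in the batch, and per-system quantities such as total energy or kinetic energy fall out of a single segment reduction.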
Get started building molecular workflows with ALCHEMI Toolkit
ALCHEMI Toolkit provides researchers and developers with the low-level primitives and high-level abstractions needed to build end-to-end, GPU-native molecular workflows. Moving critical bottlenecks—such as neighbor list construction, structural relaxation, and integration steps—into the PyTorch ecosystem eliminates the host-to-device memory transfer overhead that has traditionally throttled MLIP-driven simulations.
Whether you’re composing hybrid ML or physics potentials or scaling batched molecular dynamics, ALCHEMI Toolkit exposes the necessary API hooks to manage complex tensorized states without sacrificing performance.
To accelerate your chemistry and materials science simulations and explore building your own custom workflows, visit the NVIDIA/nvalchemi-toolkit GitHub repo and ALCHEMI Toolkit documentation. As we continue to expand the library of supported operations and architectures, we encourage you to clone the repository, explore the provided Jupyter notebooks, and begin integrating these GPU-accelerated workflows into your own discovery pipelines.
Acknowledgments
We’d like to thank James Gin, Tim Duignan, and Vaidas Šimkus of Orbital; Professor Shyue Ping Ong of MatGL; and Susumu Ohno, Ryuhei Okuno, and Jethro Tan of Matlantis for working with us to adopt NVIDIA ALCHEMI Toolkit into their platforms. We would also like to thank Nikita Fedik, Roman Zubatyuk, Atul Thakur, and Logan Ward for their contributions to this post.