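AWS Machine Learning Blog · April 24, 2026, 01:17 · about 14 min read

Applying multimodal biological foundation models in therapeutics and patient care

#MultimodalAI #BiologicalFoundationModels(BioFMs) #DrugDiscoverySupport #AWS Machine Learning #PersonalizedMedicine

TL;DR

AWS offers a platform that uses multimodal biological foundation models (BioFMs) to integrate and analyze multiple kinds of biological data, from protein structure analysis to clinical records, in support of decision making in drug development and patient care.

AI deep analysis · April 24, 2026, 01:56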

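Key points

BioFM coverage by domain

Commonly used BioFMs are spread across drug discovery and molecule design (about 20%), omics (about 30%), medical imaging (15%), and clinical records (about 35%), so integration is needed to get past the limits of single-data-type analysis.

From unimodal to multimodal

Building on single-data-type models, whose protein structure prediction work was recognized by the 2024 Nobel Prize in Chemistry, the shift toward multimodal BioFMs that process text, images, and biological data simultaneously is accelerating.

AWS's unified development environment

AWS provides a foundation that brings together biological data, model development, scalable compute, and partner tools, supporting faster and more accurate decision making across the drug development lifecycle.

Key quotes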

Healthcare and life sciences decision making increasingly relies on multimodal data to diagnose diseases, prescribe medicine and predict treatment outcomes, develop and optimize innovative therapies accurately.

Biological foundation models (BioFMs) are AI models pre-trained on large biological datasets.

AWS provides a unified environment for multimodal biological foundation models (BioFMs), enabling you to make more confident, timely decision-making in personalized medicine.

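Impact analysis

The spread of multimodal BioFMs translates directly into shorter drug development timelines and lower costs, and accelerates the realization of personalized medicine. As cloud platforms such as AWS lower the barrier to entry, small and mid-sized biotech companies and medical institutions will find it easier to take part in advanced AI-driven drug discovery. Privacy protection for biological data and bias validation of models remain the keys to practical deployment.

Editorial comment

Although this is an article introducing the AWS platform, the direction it describes, in which multimodal BioFMs integrate the siloed data of drug discovery and the clinic, is steadily becoming the industry standard. The coming challenge is likely to be standardization and quality control of biological data rather than implementation of the technology.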

Healthcare and life sciences decision making increasingly relies on multimodal data to diagnose diseases, prescribe medicine, predict treatment outcomes, and develop and optimize innovative therapies accurately. Traditional approaches analyze fragmented data, such as omics for drug discovery, medical images for diagnostics, clinical trial reports for validation, and electronic health records (EHR) for patient treatment. As a result, decision makers (CxOs, VPs, Directors) often miss critical insights hidden in the relationships between data types. Recent advancements in AI enable you to integrate and analyze these fragmented data streams efficiently to support a more complete understanding of therapeutics and patient care.

AWS provides a unified environment for multimodal biological foundation models (BioFMs), enabling more confident, timely decision-making in personalized medicine. This AI system combines biological data, model development, scalable compute, and partner tools to support the drug development life cycle. In this post, we explore how multimodal BioFMs work, showcase real-world applications in drug discovery and clinical development, and describe how AWS enables organizations to build and deploy multimodal BioFMs.

Multimodal biological foundation models

Biological foundation models (BioFMs) are AI models pre-trained on large biological datasets. BioFMs demonstrate advanced capabilities on specific healthcare and life sciences tasks. The commonly used BioFMs span drug discovery and clinical development domains, particularly in protein structure and molecule design (~20%), omics data analysis including DNA, epigenetic, and RNA (~30%), medical imaging (15%), and clinical documentation (~35%) (Delile et al. 2025).

Unimodal BioFMs are trained exclusively on a single data modality (for example, amino acid sequences) for relevant downstream applications like predicting protein structures, work that was recognized in the 2024 Nobel Prize in Chemistry. Multimodal BioFMs train across multiple data types (text, audio, image, and video, hereafter “modalities”) and can infer across different streams simultaneously in a single model (for example, generating new images from text prompts or matching images to captions).
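To make the unimodal/multimodal distinction concrete, here is a minimal sketch of one simple way to combine modalities: feature-level fusion by concatenating per-modality embeddings. It is an illustration only and does not describe how any of the models named in this post work. The modality names, embedding sizes, and the `fuse` helper are all assumptions made for this example.

```python
from math import sqrt


def l2_normalize(vec):
    """Scale a vector to unit length so no modality dominates by magnitude."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(embeddings, dims):
    """Concatenate per-modality embeddings in a fixed (alphabetical) order.

    embeddings: dict of modality name -> list of floats (modalities may be absent)
    dims: dict of modality name -> expected embedding length
    A missing modality is zero-filled and flagged 0 in the presence mask.
    """
    fused, mask = [], []
    for name in sorted(dims):
        vec = embeddings.get(name)
        if vec is None:
            fused.extend([0.0] * dims[name])
            mask.append(0)
        else:
            if len(vec) != dims[name]:
                raise ValueError(f"{name}: expected {dims[name]} values, got {len(vec)}")
            fused.extend(l2_normalize(vec))
            mask.append(1)
    return fused, mask


# Hypothetical modalities and sizes; real encoders produce far larger vectors.
DIMS = {"clinical_text": 3, "imaging": 2, "omics": 4}
patient = {"imaging": [3.0, 4.0], "omics": [1.0, 0.0, 0.0, 0.0]}  # no clinical text
fused, mask = fuse(patient, DIMS)
print(len(fused), mask)
```

The presence mask lets a downstream model tell a modality that was never collected apart from one that is genuinely all zeros, which matters because clinical datasets rarely have every modality for every patient.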

Notable multimodal BioFM examples include:

Latent Labs’ Latent-X1 and Latent-X2 not only predict 3D structures of proteins, but also generate novel binders like antibodies, macrocyclic peptides, and miniproteins and predict how they interact with targets.

Arc Institute’s Evo 2 maps the central dogma of biology to interpret and predict the structure and function of DNA, RNA, and proteins.

Insilico Medicine’s Nach01 integrates natural language, chemical intelligence, and 3D molecular structure data to accelerate drug discovery.

Bioptimus’ M-Optimus decodes histology and clinical data for rich biological insights, supporting multiple stages from research to patient care.

Harvard and AstraZeneca’s MADRIGAL integrates structural, pathway, cell viability, and transcriptomic data to predict the clinical outcomes of drug combinations, identify adverse interactions, and optimize polypharmacy management.

John Snow Labs’ vision language model Medical VLM-24B processes clinical notes, lab reports, and imaging (X‑ray, MRI, CT) for unified, context‑aware diagnostics.

GEHC’s 3D magnetic resonance imaging (MRI) foundation model is designed to enable developers to build applications for tasks such as image retrieval, classification, image segmentation, and report generation.

The multimodal advantage

The current frontier of models pushes the boundary of multimodal understanding and generation capabilities. General-purpose models like Amazon Nova 2 Omni can process text, images, video, and speech inputs while generating both text and images. This multimodality trend extends to BioFMs, where combining multiple data types like medical images and clinical documentation achieves higher predictive accuracy and broader applicability across diverse clinical outcomes (Siam et al. 2025).

Integrating diverse biological data types yields measurable performance gains:

Enhanced diagnostic accuracy: Models integrating genomics, imaging, and clinical data yield 4-7% average gains in area under the curve (AUC) over unimodal baselines for diagnoses (e.g., Alzheimer’s, brain cancer) and phenotypes (Sun et al. 2024). Moreover, models integrating lab data, patient exercise metrics, and clinical notes during patient screening achieve 92.74% accuracy with 93.21 AUC in cardiovascular risk prediction (Guo and Wu, 2025).

Targeted therapeutic strategies: You can use models integrating genomic profiles, medical images, and clinical histories to guide selection of effective interventions for individual patients (Parvin et al. 2025). This proves especially impactful for cancer patients where tumor genomics and radiological imaging can facilitate therapeutic decisions like chemotherapy regimens (Restrepo et al. 2023).

New disease mechanisms: Single-cell multi-omics models show how cancer cells grow and resist treatments in blood diseases like leukemia, helping physicians improve survival rates by spotting hidden cancer cells, tracking how mutations drive disease progression, and selecting personalized treatments for patients (Kim and Takahashi, 2025).

Accurate risk prediction: You can use models integrating lab results, medications, clinical notes, discharge summaries, and other clinical data to predict 30-day hospital readmission risk with 76% accuracy, delivering ~$3.4 million in net savings per hospital annually while improving overall clinical outcomes for high-risk heart failure patients through targeted interventions (Golas et al. 2018).

Predictive, Preventative, Personalized, Participatory (P4) medicine: Models combining wearable health technologies with patient health data can extract target signals with 96-97% accuracy for diabetes and heart disease diagnosis (Mansour et al. 2021).
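Several of the gains above are reported as AUC, so a short sketch of how that metric is computed may help when reading them. The code below uses the rank-based definition: AUC is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as half. The labels and scores are made-up toy values for illustration, not results from any cited study, and the variable names are assumptions.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties = 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data only: 1 = disease present, 0 = absent.
labels = [1, 1, 1, 0, 0, 0]
unimodal_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]    # e.g. imaging alone
multimodal_scores = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]  # e.g. imaging + clinical + genomics

auc_uni = roc_auc(labels, unimodal_scores)
auc_multi = roc_auc(labels, multimodal_scores)
absolute_gain = auc_multi - auc_uni   # difference in AUC points
relative_gain = absolute_gain / auc_uni  # gain as a fraction of the baseline
print(round(auc_uni, 3), round(auc_multi, 3))
```

The sketch reports both an absolute and a relative gain because a figure such as "4-7%" can be read either way. The cited paper is the place to check which definition applies.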

BioFMs in action at AWS customers

These performance gains explain why leading biopharma organizations are increasingly adopting multimodal BioFMs. These organizations invest in BioFMs for analyzing biologic (Merck and Novo Nordisk), genomic (AstraZeneca), pathology (Bayer), and clinical (Roche) data. You can realize up to 50% in cost and time savings for drug development and up to 90% in time savings for medical image diagnosis when using these specialized AI models (State of the Art-ificial Intelligence 2025, Jeong et al. 2025). Multimodal BioFMs show promise in multiple stages of the healthcare and life sciences value chain (Figure 1).

Figure 1. Multimodal BioFMs integrate various biological data types (for example, protein, small molecule, omics, imaging, sensors, clinical documentation) to power applications across the drug development lifecycle (research, clinical development, manufacturing, commercial).

For a deeper dive, we’ve selected two use cases: drug discovery and clinical development.

Designing therapeutic proteins for undruggable disease targets. Multimodal BioFMs integrating computational predictions, structural biology, and biophysical validation enable new approaches to previously inaccessible protein targets (Figure 2). Early applications predicted 3D structures but struggled with multidomain targets featuring discontinuous epitopes. Advanced drug discovery now integrates iterative design-make-test-analyze (DMTA) loops that span structural, computational, and biophysical data. The 3D protein structural data captured through cryo-electron microscopy (cryo-EM) is evaluated alongside computational metrics like interface predicted template modeling score (iPTM), interface predicted aligned error (iPAE), and root mean square deviation (RMSD), then validated against biophysical measurements such as dose-response curves, biolayer interferometry (BLI), and enzyme-linked immunosorbent assay (ELISA) to accelerate and de-risk drug discovery. For example, Onava’s integrated “AI-human-wet lab” loop represents a step forward in this space by combining generative AI for de novo protein design with rapid experimental validation through an “epitope expansion” strategy, compressing design-to-validation timelines from months to weeks (Calman et al. bioRxiv 2025). You can develop next-generation biologics using multimodal BioFMs like Latent Labs’ Latent-X2 and Chai Discovery’s Chai-2 through AWS services including Amazon Bio Discovery, Amazon SageMaker AI for training generative models, Amazon Elastic Compute Cloud (EC2) for model inference, Amazon Simple Storage Service (Amazon S3) for storing structural and experimental data, Amazon Elastic File System (EFS) for shared design libraries, and Amazon Virtual Private Cloud (VPC) for secure infrastructure.

Figure 2. Multimodal BioFMs integrate 3D protein structure, computational metrics, and biophysical measurements through iterative design-validation loops to accelerate therapeutic protein discovery for undruggable multidomain disease targets.
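As a rough illustration of how the computational metrics in this use case might be used to triage designs before wet-lab testing, the sketch below computes RMSD between two coordinate sets and applies a simple pass/fail gate on iPTM, iPAE, and RMSD. The thresholds, the `passes_gate` helper, and the dictionary layout are hypothetical choices for this example, not values from the post or from any named tool. The RMSD function also assumes the two structures are already superimposed.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) points.

    Assumes the structures are already aligned; no superposition is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return sqrt(total / len(coords_a))


def passes_gate(design, min_iptm=0.8, max_ipae=10.0, max_rmsd=2.0):
    """Hypothetical in-silico filter: higher iPTM is better, lower iPAE/RMSD is better."""
    return (
        design["iptm"] >= min_iptm
        and design["ipae"] <= max_ipae
        and design["rmsd"] <= max_rmsd
    )


# Toy coordinates: every atom is displaced by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

design = {"iptm": 0.85, "ipae": 6.0, "rmsd": rmsd(reference, predicted)}
print(design["rmsd"], passes_gate(design))
```

In a DMTA loop, designs that pass a gate like this would go on to biophysical assays such as BLI or ELISA, and the measured results would feed the next design round.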

Predicting immunotherapy resistance in cancer patients during clinical development. Multimodal BioFM developers work towards addressing oncology’s 90% clinical trial failure rate. Today’s multimodal BioFMs simulate tumor microenvironments by integrating sequencing, single-cell data, spatial biology, and patient records to uncover resistance mechanisms, which helps reduce patient drop-offs from ineffective treatments, and to discover new therapeutic targets for previously untreatable patient subgroups (Figure 3). For example, Noetik’s Oncology Counterfactual Therapeutics Oracle (OCTO) simulated 873,000 virtual immune cells across 1,399 patient tumors and revealed why lung cancer patients with KRAS and STK11 gene mutations develop “immune cold” environments blocking immunotherapy effectiveness (Xie et al. Poster presented at SITC 2025). Notably, Noetik achieved 40% faster training time and doubled processing speed through Amazon SageMaker HyperPod’s fault-tolerant infrastructure on AWS with NVIDIA H100 GPUs. To build your own multimodal BioFMs, you can take a similar approach using Amazon SageMaker HyperPod for distributed AI training across GPUs, Amazon Elastic Compute Cloud (EC2) for compute capacity, Amazon Simple Storage Service (Amazon S3) for data storage, and Amazon Athena for analyzing petabytes of patient data.

Figure 3. The multimodal BioFM approach combines sequencing, spatial transcriptomics, pathology, and patient records to simulate tumor microenvironments and prioritize patient subpopulations, potentially reducing early-phase trial failures.

Solution: AWS environment for multimodal BioFMs

AWS provides a unified environment for building, training, and deploying multimodal BioFMs that help you convert healthcare and life science data into actionable insights. This environment comprises four layers: an AI solution for model development, a unified data foundation for biological data management, scalable infrastructure for compute and storage, and partner integrations that extend capabilities across the drug development lifecycle.

AI System

Amazon Bio Discovery gives scientists direct access to AI agents that select the right BioFMs, optimize inputs, evaluate candidates, send them to lab partners for testing, and automatically return results for refinement, in a lab-in-the-loop cycle that builds institutional knowledge.

Amazon SageMaker HyperPod delivers distributed training infrastructure for large-scale models. Amazon SageMaker AI complements this with built-in explainability tools, bias detection, and comprehensive audit trails to support the regulatory confidence needed from model development through production deployment.

Amazon Nova Forge, released at AWS re:Invent 2025, uses the Amazon Nova model family as a starting point, so you can train from the points that maximize learning from proprietary datasets while minimizing additional training and continued pretraining.

Amazon Bedrock AgentCore includes the Runtime service to host long-running deep research agents and the Gateway service to securely connect agents to BioFMs and other domain-specific tools.
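The lab-in-the-loop cycle mentioned above for Amazon Bio Discovery can be sketched generically as a propose, rank, test, and learn loop. The code below is a conceptual illustration only and is not the Amazon Bio Discovery API. The `lab_in_the_loop` function, its parameters, and the toy `propose`, `score`, and `assay` callables are all hypothetical stand-ins, with a single number playing the role of a candidate molecule.

```python
def lab_in_the_loop(propose, score, assay, rounds=3, batch=4, top_k=2):
    """Run a generic propose -> rank -> test -> learn cycle.

    propose(history, n): returns n new candidates, informed by past lab results
    score(candidate):    cheap in-silico estimate used to rank candidates
    assay(candidate):    stand-in for the (expensive) lab measurement
    Returns the accumulated (candidate, measurement) history.
    """
    history = []
    for _ in range(rounds):
        candidates = propose(history, batch)
        ranked = sorted(candidates, key=score, reverse=True)
        for cand in ranked[:top_k]:  # only the best-ranked go to the lab
            history.append((cand, assay(cand)))
    return history


# Toy stand-ins: the in-silico proxy and the "lab" disagree slightly on the optimum.
def propose(history, n):
    best = max(history, key=lambda h: h[1])[0] if history else 0.0
    return [best + 0.5 * (i + 1) for i in range(n)]


def score(x):
    return -abs(x - 3.0)


def assay(x):
    return -((x - 3.2) ** 2)


history = lab_in_the_loop(propose, score, assay)
best_candidate = max(history, key=lambda h: h[1])[0]
print(len(history), best_candidate)
```

Keeping the full history rather than only the current best is the part that corresponds to building institutional knowledge: each round's proposals can draw on every measurement made so far.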

Unified Data Foundation

AWS HealthOmics can orchestrate multi-step AI workflows and handle omics data (DNA, RNA, proteomics) at the petabyte scale, serving as a biological data backbone that powers multimodal BioFM workflows.

AWS HealthLake and AWS HealthImaging aggregate heterogeneous data into governed lakehouses, automating harmonization across clinical records and medical imaging.

この記事をシェア

AWS Machine Learning Blog重要度42026年7月23日 00:54

monday.com、Bedrock で AI エージェントを運用

AWS Machine Learning Blog2026年7月22日 01:23

Amazon Nova、自己蒸留型推論をSFTに活用

AWS Machine Learning Blog2026年7月21日 02:25

AWS DeepRacer、カスタムOSインストール可能に

今日のまとめ

AI日報で今日の重要ニュースをまとめ読み

ニュース一覧に戻る元記事を読む

AWS Machine Learning Blog·2026年4月24日 01:17·約14分

治療と患者ケアにおけるマルチモーダル生物基盤モデルの応用

#マルチモーダルAI #生物基礎モデル(BioFMs)#創薬支援 #AWS Machine Learning #個別化医療

TL;DR

AI深層分析2026年4月24日 01:56

注目/ 5段階

深度40%

キーポイント

BioFMsの領域別適用比率

単一モーダルからマルチモーダルへ

AWSの統合開発環境

重要な引用

Healthcare and life sciences decision making increasingly relies on multimodal data to diagnose diseases, prescribe medicine and predict treatment outcomes, develop and optimize innovative therapies accurately.

Biological foundation models (BioFMs) are AI models pre-trained on large biological datasets.

AWS provides a unified environment for multimodal biological foundation models (BioFMs), enabling you to make more confident, timely decision-making in personalized medicine.

影響分析・編集コメントを表示

影響分析

編集コメント

マルチモーダル生物基盤モデル

代表的なマルチモーダルBioFMsの例には以下が含まれます：

Latent LabsのLatent-X1およびLatent-X2は、タンパク質の3次元構造（3D structures）を予測するだけでなく、抗体、マクロシクリックペプチド（macrocyclic peptides）、ミニタンパク質（miniproteins）のような新規バインダーを生成し、標的との相互作用も予測します。

Arc InstituteのEvo 2は、生物学の中核原理（central dogma）をマッピングし、DNA、RNA、タンパク質の構造と機能を解釈・予測します。

Insilco MedicineのNach01は、自然言語、化学インテリジェンス（chemical intelligence）、3次元分子構造データを統合し、創薬を加速します。

BioptimusのM-Optimusは、組織病理学（histology）データと臨床データを解読して豊富な生物学的知見を得られ、研究から患者ケアまでの複数の段階をサポートします。

ハーバードとAstraZenecaのMADRIGALは、構造データ、パスウェイ（経路）データ、細胞生存率データ、トランスクリプトームデータ（transcriptomic data）を統合し、薬物併用療法の臨床結果を予測し、有害相互作用を特定し、多剤併用管理（polypharmacy management）を最適化します。

John Snow Labのビジョン言語モデル（vision language model）Medical VLM-24Bは、臨床メモ、検査報告書、画像（X線、MRI、CT）を処理し、統一された文脈認識型診断を実現します。

GEHCの3次元磁気共鳴画像法（MRI）ファウンデーションモデル（foundation model）は、画像検索、分類、画像セグメンテーション、レポート生成などのタスク向けアプリケーションを構築できるように設計されています。

多モーダル（multimodal）の利点

多様な生物学的データタイプを統合することで、測定可能な性能向上が得られます：

診断精度の向上：ゲノム学、画像解析、臨床データを統合したモデルは、単一モーダルベースライン（unimodal baselines）と比較して、診断（アルツハイマー病、脳がんなど）や表現型において平均4〜7%の曲線下面積（Area Under the Curve, AUC）向上をもたらす（Sun et al. 2024）。さらに、患者スクリーニング時に検査データ、患者の運動指標、臨床メモを統合したモデルは、心血管疾患リスク予測（cardiovascular risk prediction）において92.74%の精度と93.21 AUCを達成している（Guo and Wu, 2025）。

ターゲット治療戦略（Targeted therapeutic strategies）：ゲノムプロファイル（genomic profiles）、医療画像、臨床病歴を統合したモデルを用いて、個別の患者に対する効果的な介入法の選択をガイドすることができる（Parvin et al. 2025）。これは特にがん患者において顕著な影響を及ぼし、腫瘍ゲノム学と放射線画像解析（radiological imaging）が化学療法のレジメン（chemotherapy regimens）などの治療決定を支援する（Restrepo et al. 2023）。

新しい疾患メカニズム（disease mechanisms）：単一細胞マルチオミクスモデル（Single-cell multi-omics models）は、白血病などの血液疾患においてがん細胞がどのように増殖し治療に抵抗するかを示しており、隠れたがん細胞の発見、変異が疾患進行をどのように駆動するかの追跡、患者への個別化治療（personalized treatments）の選択を通じて、医師が生存率を向上させるのを支援する（Kim and Takahashi, 2025）。

正確なリスク予測（risk prediction）：検査結果、投薬情報、臨床メモ、退院サマリーその他の臨床データを統合したモデルを用いて、30日以内の再入院リスク（hospital readmission risk）を76%の精度で予測できる。これにより、個別化介入を通じて高リスク心不全患者の全体的な臨床アウトカム（clinical outcomes）を改善すると同時に、病院あたり年間約340万ドルの純節約を実現する（Golas et al. 2018）。

P4医療（Predictive, Preventative, Personalized, Participatory medicine）：ウェアラブルヘルステック（wearable health technologies）と患者の健康データを組み合わせたモデルは、糖尿病および心疾患の診断において標的シグナルを96〜97%の精度で抽出できる（Mansour et al. 2021）。

AWS顧客におけるBioFMs（生物基盤モデル）の実践

image

さらに詳しく掘り下げるため、創薬（drug discovery）と臨床開発の2つのユースケースを選択しました。

治療不可能な疾患標的に対する治療用タンパク質の設計。計算予測、構造生物学、生体物理学的検証を統合したマルチモーダルBioFMs（Biological Foundation Models）は、以前アクセス不可能だったタンパク質標的に対する新たなアプローチを可能にします（図2）。初期の応用では3次元構造の予測が可能でしたが、不連続エピトープを特徴とするマルチドメイン標的では課題が残りました。高度な創薬プロセスでは現在、構造、計算、生体物理学的データを横断する反復的な設計・合成・評価・分析（Design-Make-Test-Analyze: DMTA）ループが統合されています。低温電子顕微鏡（Cryo-EM：Cryo-electron microscopy）で取得されたタンパク質の3次元構造データは、インターフェース予測テンプレートモデリングスコア（iPTM：interface predicted template modeling score）、インターフェース予測整列誤差（iPAE：interface predicted aligned error）、二乗平均平方根偏差（RMSD：root mean square deviation）などの計算指標と共に評価され、その後、用量反応曲線、バイオレイヤー干渉法（BLI：biolayer interferometry）、酵素結合免疫吸着アッセイ（ELISA：enzyme-linked immunosorbent assay）などの生体物理学的測定値に対して検証され、創薬の加速とリスク低減が図られています。例えば、Onava社の統合された「AI-人間-湿式実験室」ループは、de novoタンパク質設計のための生成AIと、「エピトープ展開（epitope expansion）」戦略による迅速な実験的検証を組み合わせることで、設計から検証までの期間を数ヶ月から数週間に圧縮し（Calman et al. bioRxiv 2025）、この分野における一歩を前進させたものです。Latent LabsのLatent-X2やChai DiscoveryのChai-2のようなマルチモーダルBioFMsを用いて次世代のバイオロジクスを開発する際、生成モデルの学習にはAmazon SageMaker AI、モデル推論にはAmazon Elastic Compute Cloud（EC2）、構造および実験データの保存にはAmazon Simple Storage Service（Amazon S3）、共有設計ライブラリにはAmazon Elastic File System（EFS）、安全なインフラストラクチャにはAmazon Virtual Private Cloud（VPC）を提供するAWSサービス、ならびにAmazon Bio Discoveryを通じて開発を進めることができます。

image

臨床開発におけるがん患者の免疫療法の耐性予測。マルチモーダル生物基盤モデル（Multimodal BioFM）の開発者は、腫瘍分野の90％という臨床試験の失敗率に対処することを目指しています。今日のマルチモーダル生物基盤モデルは、配列解析（sequencing）、単一細胞データ（single-cell data）、空間生物学（spatial biology）、患者記録を統合して腫瘍微小環境（tumor microenvironments）をシミュレートし、無効な治療による患者の離脱を減らす耐性メカニズムを発見するとともに、それまで治療対象外だった患者サブグループに対する新たな治療標的を探索しています（図3）。例えば、NoetikのOncology Counterfactual Therapeutics Oracle（OCTO）は1,399症例のがん腫瘍にわたって87万3,000個の仮想免疫細胞をシミュレートし、KRASおよびSTK11遺伝子変異（KRAS and STK11 gene mutations）を持つ肺がん患者が免疫療法の効果を阻害する「免疫寒冷（immune cold）」環境を発症する理由を解明しました（Xieら、SITC 2025でポスター発表）。特筆すべきは、NoetikがAWS上のNVIDIA H100 GPU（NVIDIA H100 GPUs）を用いたAmazon SageMaker HyperPodの耐障害性インフラストラクチャにより、トレーニング時間を40％短縮し処理速度を2倍に向上させたことです。Amazon SageMaker HyperPodによるGPU分散AIトレーニング、計算容量のためのAmazon Elastic Compute Cloud（EC2）、データストレージ用のAmazon Simple Storage Service（Amazon S3）、そしてペタバイト単位の患者データ分析のためのAmazon Athenaを活用することで、独自のマルチモーダル生物基盤モデルを構築し、同様のアプローチを取ることができます。

image

ソリューション：マルチモーダル生物基盤モデル用のAWS環境

AIシステム

Amazon SageMaker HyperPodは、大規模モデル向けの分散型トレーニングインフラストラクチャ（distributed training infrastructure）を提供します。Amazon SageMaker AIは、これに組み込みの説明可能性ツール（explainability tools）、バイアス検出（bias detection）機能、包括的な監査証跡（audit trails）を追加し、モデル開発から本番環境へのデプロイに至るまで必要な規制上の信頼性を支えます。

AWS re:Invent 2025で発表されたAmazon Nova Forgeは、Amazon Novaモデルファミリーを開始点として使用し、トレーニングと継続的プレトレーニング（continued pretraining）を最小限に抑えながら独自データセットの学習を最大化する最適なポイントでトレーニングを行います。

Amazon Bedrock AgentCoreには、長時間実行されるディープリサーチエージェントをホストするためのRuntimeサービス（Runtime service）と、エージェントをBioFMモデルやその他のドメイン固有ツールに安全に接続するためのGatewayサービス（Gateway service）が含まれています。

Unified Data Foundation

AWS HealthLakeとAWS HealthImagingは、管理されたデータ湖家（lakehouses）に異種データを統合し、臨床記録と医療画像 across の調和を自動化します。

原文を表示

Multimodal biological foundation models

Notable multimodal BioFM examples include:

Latent Labs’ Latent-X1 and Latent-X2 not only predict 3D structures of proteins, but also generate novel binders like antibodies, macrocyclic peptides, and miniproteins and predict how they interact with targets.

Arc Institute’s Evo 2 maps the central dogma of biology to interpret and predict the structure and function of DNA, RNA, and proteins.

Insilco Medicine’s Nach01 integrates natural language, chemical intelligence, and 3D molecular structure data to accelerate drug discovery.

Bioptimus’ M-Optimus decodes histology and clinical data for rich biological insights, supporting multiple stages from research to patient care.

Harvard and AstraZeneca’s MADRIGAL integrates structural, pathway, cell viability, and transcriptomic data to predict drug combination clinical outcome, identify adverse interactions, and optimize polypharmacy management.

John Snow Lab’s vision language model Medical VLM-24B processes clinical notes, lab reports, and imaging (X‑ray, MRI, CT) for unified, context‑aware diagnostics.

GEHC’s 3D magnetic resonance imaging (MRI) foundation model, designed to enable developers to build applications for tasks such as image retrieval, classification, image segmentation, and report generation.

The multimodal advantage

Integrating diverse biological data types yields measurable performance gains:

Enhanced diagnostic accuracy: Models integrating genomics, imaging, and clinical data yield 4-7% average gains in area under the curve (AUC) over unimodal baselines for diagnoses (e.g., Alzheimer’s, brain cancer) and phenotypes (Sun et al. 2024). Moreover, models integrating lab data, patient exercise metrics, and clinical notes during patient screening achieve 92.74% accuracy with 93.21 AUC in cardiovascular risk prediction (Guo and Wu, 2025).

Targeted therapeutic strategies: You can use models integrating genomic profiles, medical images, and clinical histories to guide selection of effective interventions for individual patients (Parvin et al. 2025). This proves especially impactful for cancer patients where tumor genomics and radiological imaging can facilitate therapeutic decisions like chemotherapy regimens (Restrepo et al. 2023).

New disease mechanisms: Single-cell multi-omics models show how cancer cells grow and resist treatments inside blood diseases like leukemia, helping physicians improve survival rates by spotting hidden cancer cells, tracking how mutations drive disease progression, and selecting personalized treatments for patients (Kim and Takahashi, 2025).

Accurate risk prediction: You can use models integrating lab results, medications, clinical notes, and discharge summaries and other clinical data to predict 30-day hospital readmission risk with 76% accuracy—delivering ~$3.4 million in net savings per hospital annually while improving overall clinical outcomes for high-risk heart failure patients through targeted interventions (Golas et al. 2018).

Predictive, Preventative, Personalized, Participatory (P4) medicine: Models combining wearable health technologies with patient health data can extract target signals with 96-97% accuracy for diabetes and heart disease diagnosis (Mansour et al. 2021).

BioFMs in action at AWS customers

For a deeper dive, we’ve selected two use cases: drug discovery and clinical development.

Designing therapeutic proteins for undruggable disease targets. Multimodal BioFMs integrating computational predictions, structural biology, and biophysical validation enable new approaches to previously inaccessible protein targets (Figure 2). Early applications predicted 3D structures but struggled with multidomain targets featuring discontinuous epitopes. Advanced drug discovery now integrates iterative design-make-test-analyze (DMTA) loops that span structural, computational, and biophysical data. The 3D protein structural data captured through cryo-electron microscopy (Cryo-EM) is evaluated alongside computational metrics like interface predicted template modeling score (iPTM), interface predicted aligned error (iPAE), and root mean square deviation (RMSD) then validated against biophysical measurements such as dose-response curves, biolayer interferometry (BLI), and enzyme-linked immunosorbent assay (ELISA) to accelerate and de-risk drug discovery. For example, Onava’s integrated “AI-human-wet lab” loop represents a step forward in this space by combining generative AI for de novo protein design with rapid experimental validation through an “epitope expansion” strategy, compressing design-to-validation timelines from months to weeks (Calman et al. bioRxiv 2025). You may develop next-generation biologics using multimodal BioFMs like Latent Labs Latent-X2 and Chai Discovery Chai-2 through AWS services including Amazon Bio Discovery, Amazon SageMaker AI for training generative models, Amazon Elastic Compute Cloud (EC2) for model inference, Amazon Simple Storage Service (Amazon S3) for storing structural and experimental data, Amazon Elastic File System (EFS) for shared design libraries, and Amazon Virtual Private Cloud (VPC) for secure infrastructure.

Predicting immunotherapy resistance in cancer patients during clinical development. Multimodal BioFM developers are working to address oncology’s 90% clinical trial failure rate. Today’s multimodal BioFMs simulate tumor microenvironments by integrating sequencing, single-cell data, spatial biology, and patient records. These simulations uncover resistance mechanisms, which helps reduce patient drop-offs from ineffective treatments, and identify new therapeutic targets for previously untreatable patient subgroups (Figure 3). For example, Noetik’s Oncology Counterfactual Therapeutics Oracle (OCTO) simulated 873,000 virtual immune cells across 1,399 patient tumors and revealed why lung cancer patients with KRAS and STK11 gene mutations develop “immune cold” environments that block immunotherapy effectiveness (Xie et al. Poster presented at SITC 2025). Notably, Noetik achieved 40% faster training time and doubled processing speed through Amazon SageMaker HyperPod’s fault-tolerant infrastructure on AWS with NVIDIA H100 GPUs. You can take a similar approach to build your own multimodal BioFMs using Amazon SageMaker HyperPod for distributed AI training across GPUs, Amazon Elastic Compute Cloud (EC2) for compute capacity, Amazon Simple Storage Service (Amazon S3) for data storage, and Amazon Athena for analyzing petabytes of patient data.

Solution: AWS environment for multimodal BioFMs

AI System

Amazon SageMaker HyperPod delivers distributed training infrastructure for large-scale models. Amazon SageMaker AI complements this with built-in explainability tools, bias detection, and comprehensive audit trails to support the regulatory confidence needed from model development through production deployment.

Amazon Nova Forge, released at AWS re:Invent 2025, uses the Amazon Nova model family as a starting point and lets you begin training at the most suitable point, maximizing what the model learns from proprietary datasets while minimizing the training and continued pretraining required.

Amazon Bedrock AgentCore includes the Runtime service to host long-running deep research agents and the Gateway service to securely connect agents to BioFMs and other domain-specific tools.

Unified Data Foundation

AWS HealthLake and AWS HealthImaging aggregate heterogeneous data into governed lakehouses, automating harmonization across clinical records and medical imag
