OpenAI News · June 30, 2026, 09:00 · approx. 8 min read

Inside GeneBench-Pro

#GeneBench-Pro #OpenAI #MedicalAI #Reasoning #GeneticAnalysis

TL;DR

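OpenAI has published ten case studies and their datasets from GeneBench-Pro, a benchmark of genomics analysis questions. The first case study evaluates a benefit-risk decision for tumor therapy based on genetic analysis, and the summary below presents the release as an important step toward more specialized medical AI.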

AI in-depth analysis · July 4, 2026, 20:04

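Key points

Release and purpose of GeneBench-Pro

OpenAI has published 10 detailed case studies and their datasets from GeneBench-Pro. The benchmark includes a task that evaluates the ability to make tumor-therapy decisions based on genetic analysis.

Integrated analysis of complex medical data

Rather than relying on a single metric, the oncology task requires integrating long-read sequencing, expression data, tumor-quality data, and pharmacogenomic evidence to identify the structural-variant subgroup before judging treatment benefit and toxicity.

Evaluation of a concrete clinical scenario

One case asks whether a TXR1-directed inhibitor is effective against tumors that carry a specific structural variant, testing whether the AI can reason about clinical utility.

Use of synthetic labels and structured data

The model is given structured data that combines synthetic labels such as TXR1 and DLR1 with clinical parameters such as patient ID, age, and ECOG score, so that its reasoning can be tested rigorously.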


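Impact analysis

This release marks a turning point in moving the application of large language models in medicine from theory toward demonstration. In particular, a standard for evaluating the ability to integrate diverse clinical data sources and reach complex treatment decisions gives AI developers a guide for building systems that can be trusted in clinical settings.

Editorial comment

The release shows that evaluation standards for medical AI are evolving from simple knowledge question-answering toward complex clinical data integration and reasoning. A benchmark that assesses the handling of specialized genetic information such as structural variants could shorten the path to practical use.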


Case studies

These 10 case studies showcase representative questions from GeneBench-Pro. Each case study includes the original prompt, datasets, and supporting materials. For an overview of the benchmark and key findings, see the announcement blog.

Note: File previews show excerpts from the full datasets.

Case study 1

Somatic oncology: Structural variant-guided tumor therapy benefit-risk decision

Estimate whether a synthetic TXR1-directed inhibitor has positive clinical utility in tumors whose target activation is driven by a structural variant. TXR1, TXR1i, DLR1, and star-allele labels are synthetic benchmark labels.

*The target subgroup has to be recovered from long-read, expression, tumor-quality, and pharmacogenomic evidence before benefit and toxicity can be interpreted as a treatment decision.*

Files provided to the model

patient_id, analysis_set, age, sex, site, calendar_period, ecog, tumor_burden, prior_lines, prior_resistance, lineage_class, therapy_class, assessed16, benefit16, tox_stop_8wk, time_zero_day

MTB0001173.8MS1P220.78731ATXR1i010

MTB0002155.2MS3P112.63701ATXR1i1000

MTB0003168.8FS4P200.89121ATXR1i1110

MTB0004182.8FS2P224.10100BTXR1i1000

MTB0005165.5FS1P317.011ATXR1i1000

Registry covariates, therapy, week-16 assessment, benefit, and early toxicity.

Case study 2

Functional genomics: CRISPR target validation: lncRNA transcript or genomic locus?

Decide whether an apparent lncRNA dependency is transcript-specific or driven by nearby-locus and neighbor-gene effects.

*Transcript-directed evidence has to survive controls for local DNA-locus perturbation, neighbor-gene repression, guide swaps, GC toxicity, and plate effects.*

Files provided to the model

guide_id, nominal_target, chr, coord, strand, dist_lnc_tss_bp, dist_neighbor_tss_bp, guide_gc_frac

g001LINC473chr7100014+14300.624

g002LINC473chr7100035-43670.584

g003LINC473chr7100051+116560.622

g004LINC473chr7100066-59660.617

g005LINC473chr7100088+74770.715

Guide coordinates, targets, distances, and GC features.

Case study 3

Statistical genetics: Prioritizing protein drug targets in a linked genetic locus

Estimate direct disease effects for two nearby proteins using cis multivariable Mendelian randomization (cis-MVMR) while handling assay scale, allele orientation, winner's curse, LD, and residual local pleiotropy.

*The two proteins share a correlated locus. The analysis has to move from marginal associations to conditional, LD-aware disease effects on a common protein scale.*

Files provided to the model

| snp | pos_bp | effect_allele | other_allele | maf | beta | se | pval |
| --- | --- | --- | --- | --- | --- | --- | --- |
| rs200000 | 50000000 | A | C | 0.42215 | 0.006438668310706808 | 0.003267330091203412 | 0.04876727714241972 |
| rs200001 | 50010126 | A | C | 0.05709 | 0.011008993337581301 | 0.006955239208750407 | 0.11345916603941006 |
| rs200002 | 50020253 | G | T | 0.09021 | 0.009922014757116319 | 0.005633023027015518 | 0.07817048492026045 |
| rs200003 | 50030379 | G | T | 0.48399 | 0.010569215614164573 | 0.0032291419740237445 | 0.0010638520681901973 |
| rs200004 | 50040506 | A | G | 0.37703 | 0.007036551378238654 | 0.0033297592321269802 | 0.034580976884336506 |

Screening-stage protein association summaries for PROTA.
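
As a quick consistency check on the PROTA preview, the reported `pval` can be reproduced from `beta` and `se`. The sketch below assumes a two-sided Wald (normal) test, which the source does not state, and the function name `wald_p` is illustrative only.

```python
import math


def wald_p(beta: float, se: float) -> float:
    """Two-sided p-value for z = beta / se under a standard normal null."""
    z = beta / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2.0))


# beta and se for rs200003, copied from the preview rows.
p = wald_p(0.010569215614164573, 0.0032291419740237445)
print(f"{p:.6g}")
```

These are screening-stage associations, so instruments selected on such p-values are exposed to winner's curse, one of the issues the task description asks the analysis to handle.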

Case study 4

Clinical genomics / carrier screening: DRX1 carrier-screening residual risk under CNV and pseudogene calibration

Estimate ancestry-specific carrier frequencies, residual risk after a negative screen, partner carrier frequency, and affected-conceptus risk from carrier-screening assay data.

*The residual-risk estimate depends on pseudogene-aware carrier calls, founder-haplotype collapse, ancestry-specific assay calibration, and standardization from tested partners back to the full partner roster.*

Files provided to the model

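| sample_id | collection | ancestry | family_history_tier |
| --- | --- | --- | --- |
| S_EUR_0001 | screening | EUR | 0 |
| S_EUR_0002 | screening | EUR | 0 |
| S_EUR_0003 | screening | EUR | 0 |
| S_EUR_0004 | screening | EUR | 0 |
| S_EUR_0005 | screening | EUR | 1 |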

Screening-roster adults with ancestry and screening context.
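
The core arithmetic behind a residual-risk estimate can be sketched with Bayes' rule. This is a minimal illustration rather than the benchmark's method: it assumes perfect assay specificity and autosomal recessive inheritance, neither of which is stated in the source, and the function names and input numbers are hypothetical.

```python
def residual_carrier_risk(carrier_freq: float, sensitivity: float) -> float:
    """Probability of being a carrier given a negative screen (Bayes' rule).

    Assumes perfect specificity: every non-carrier screens negative.
    """
    missed = carrier_freq * (1.0 - sensitivity)  # carrier, but screened negative
    true_negative = 1.0 - carrier_freq           # non-carrier, screened negative
    return missed / (missed + true_negative)


def affected_conceptus_risk(risk_a: float, risk_b: float) -> float:
    """Chance that both parents are carriers and both transmit the allele.

    Assumes autosomal recessive inheritance (an assumption, not from the source).
    """
    return risk_a * risk_b * 0.25


# Illustrative numbers only; they are not taken from the benchmark data.
residual = residual_carrier_risk(carrier_freq=1 / 50, sensitivity=0.90)
print(residual)
print(affected_conceptus_risk(residual, 1 / 50))
```

In the case study, the carrier frequency and assay sensitivity would themselves be ancestry-specific and would depend on the pseudogene-aware calls, so this function only covers the final step.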

Case study 5

Single-cell genomics: Activated-monocyte eQTL after ambient RNA correction

Estimate a genotype effect on activated-monocyte expression after removing ambient RNA and technical contamination from single-cell RNA-seq data.

*Ambient RNA affects both target expression and the marker panel used to call activation state, so correction has to occur before the eQTL model.*

Files provided to the model

cell_id, donor, total_umi, HBB, IFI6, ISG15, LST1, CXCL10

D01_C001D011113734835

D01_C002D01110363311210

D01_C003D0111419812639

D01_C004D01125076043217

D01_C005D0110459125115

Per-cell UMI counts for marker genes, contamination markers, and the target gene.
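
One simple way to picture ambient RNA correction is to estimate a per-cell contamination fraction from a gene the cell type should not express and then subtract the expected ambient counts from every gene. The sketch below is a deliberately simplified version of that idea. Treating `HBB` as the contamination sentinel, the ambient profile `ambient_frac`, and all numbers are assumptions made for illustration.

```python
def estimate_contamination(cell: dict, ambient_frac: dict, sentinel: str = "HBB") -> float:
    """Estimate the ambient fraction rho from a gene the cell should not express."""
    # Counts the sentinel would show if every UMI in the cell were ambient.
    expected_if_all_ambient = cell["total_umi"] * ambient_frac[sentinel]
    return min(1.0, cell[sentinel] / expected_if_all_ambient)


def correct_counts(cell: dict, ambient_frac: dict, rho: float, genes: list) -> dict:
    """Subtract the expected ambient contribution per gene, floored at zero."""
    return {
        g: max(0.0, cell[g] - rho * cell["total_umi"] * ambient_frac[g])
        for g in genes
    }


# Hypothetical cell and ambient profile (share of ambient UMIs per gene).
cell = {"total_umi": 1000, "HBB": 4, "ISG15": 12}
ambient_frac = {"HBB": 0.04, "ISG15": 0.01}

rho = estimate_contamination(cell, ambient_frac)
corrected = correct_counts(cell, ambient_frac, rho, ["ISG15"])
print(rho, corrected)
```

Because the same correction applies to the activation markers and to the target gene, running it before calling activation state keeps both inputs to the eQTL model on the same footing.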

Case study 6

Structural genetics: Nested structural variant: expression support and clinical association

Estimate whether a nested structural subhaplotype inside an anonymous inversion-like locus has a calibrated clinical association and credible expression support.

*A nested copy-dosage signal can be confounded by the broader inversion orientation, so dosage calibration, expression support, and clinical modeling have to remain distinct.*

Files provided to the model

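| sample_id | case | age | age_band | sex | pc1 | pc2 | pc3 | ancestry_group | clinic_stratum | recruitment_stream |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Q00012 | 1 | 50.45 | 50_64 | 0 | -1.01514 | -0.21032 | -0.08849 | EUR | tertiary | clinic |
| Q00028 | 0 | 57.39 | 50_64 | 0 | -1.25987 | -0.12498 | 0.2344 | EUR | regional | registry |
| Q00029 | 1 | 68.4 | 65_plus | 0 | 0.91598 | 0.62177 | 0.01891 | AFR | tertiary | clinic |
| Q00030 | 1 | 74.07 | 65_plus | 1 | 0.21125 | -0.59634 | -0.08197 | EAS | community | registry |
| Q00032 | 1 | 82.82 | 65_plus | 0 | -1.12034 | -0.24372 | 0.14665 | EUR | community | clinic |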

Clinical and covariate data for the full cohort.

Case study 7

Regulatory genomics: Measuring chromatin loop strength after structural-variant and mapping artifact masking

Quantify a focal case-control Hi-C loop-strength difference after removing low-mappability and structural-variant artifacts from the expected-contact background.

*The target loop is defined at 20 kb resolution, but the expected-contact model is distorted unless low-mappability contacts and a case-only SV stripe are masked first.*

Files provided to the model

bin_id, chrom, start, end, gc_content, mappability, re_sites

0chr84000004200000.461990338215725940.97875742147042735

1chr84200004400000.50441242085346770.89010849434983975

2chr84400004600000.432184515849381940.90568792893267123

3chr84600004800000.47331972826812180.93765298406647893

4chr84800005000000.44449560621507480.86825655179818774

Target-resolution bin annotations.
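
The effect of masking on the expected-contact background can be shown with a toy observed/expected calculation. This is a sketch under simple assumptions: the expected value is the mean contact count at the same bin separation, masked bins are dropped entirely, and the 6-bin matrix, the loop position, and the artifact stripe are invented for illustration.

```python
def expected_by_distance(contacts, masked):
    """Mean contact count at each bin separation, ignoring masked bins."""
    n = len(contacts)
    sums, counts = {}, {}
    for i in range(n):
        for j in range(i, n):
            if i in masked or j in masked:
                continue
            d = j - i
            sums[d] = sums.get(d, 0.0) + contacts[i][j]
            counts[d] = counts.get(d, 0) + 1
    return {d: sums[d] / counts[d] for d in sums}


def loop_strength(contacts, masked, anchor_a, anchor_b):
    """Observed / expected contact for the pixel (anchor_a, anchor_b)."""
    expected = expected_by_distance(contacts, masked)
    return contacts[anchor_a][anchor_b] / expected[abs(anchor_b - anchor_a)]


# Toy symmetric matrix: counts decay with distance (made-up values).
base = [100, 20, 10, 5, 2, 1]
n = 6
contacts = [[base[abs(i - j)] for j in range(n)] for i in range(n)]
contacts[1][3] = contacts[3][1] = 30      # focal loop between bins 1 and 3
for k in range(n):                        # artifact stripe on bin 5
    contacts[k][5] = contacts[5][k] = 40

print(loop_strength(contacts, set(), 1, 3))   # stripe inflates the background
print(loop_strength(contacts, {5}, 1, 3))     # stripe masked first
```

The case-control comparison in the task would apply the same calculation to each group's matrix and then contrast the two loop strengths.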

Case study 8

Statistical genetics: Multi-parent QTL mapping with founder reconstruction

Map a chromosome-1 quantitative-trait locus in an eight-founder recombinant population by reconstructing founder ancestry before testing the phenotype association.

*The visible marker data are biallelic, but the biological signal is founder ancestry. A defensible analysis therefore has to reconstruct founder state, check marker orientation, and separate the QTL from a batch-aligned nuisance peak.*

Files provided to the model

| marker_id | chr | pos_cM |
| --- | --- | --- |
| m2_065 | 2 | 59.762431265596575 |
| m2_103 | 2 | 94.52656615104739 |
| m2_107 | 2 | 98.18761427503033 |
| m2_079 | 2 | 72.20130244108847 |
| m1_054 | 1 | 49.907510212292195 |

Marker identifiers, chromosomes, and genetic-map positions.
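
Founder reconstruction is commonly framed as a hidden Markov model along the chromosome, where the chance of switching founder state between neighboring markers depends on the recombination fraction between them. As a small illustration of how the `pos_cM` column feeds into that, the sketch below converts a map distance to a recombination fraction using Haldane's map function. The choice of Haldane's function (no crossover interference) is an assumption, and the actual transition probabilities in a multi-generation cross depend on the breeding design, which is not covered here.

```python
import math


def haldane_recomb_fraction(pos_a_cm: float, pos_b_cm: float) -> float:
    """Recombination fraction between two map positions (Haldane's function)."""
    d_morgans = abs(pos_b_cm - pos_a_cm) / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


# Map positions of m2_065 and m2_079 from the preview rows.
r = haldane_recomb_fraction(59.762431265596575, 72.20130244108847)
print(round(r, 4))
```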

Case study 9

Population genetics: Parent-specific ancestry and recent admixture timing

Infer parent-specific ancestry proportions and recent admixture timing from phased local-ancestry tracts after repairing reciprocal artifacts and a chromosome-specific label inversion.

*Ancestry fractions and pulse times both change if reciprocal tract artifacts, chromosome-local label inversion, or map denominators are handled incorrectly.*

Files provided to the model

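| chrom | hap | start_morgan | end_morgan | anc | posterior | low_complexity_frac |
| --- | --- | --- | --- | --- | --- | --- |
| chr1 | h1 | 0.03 | 0.505 | A | 0.985 | 0.08 |
| chr1 | h1 | 0.505 | 0.535 | B | 0.62 | 0.92 |
| chr1 | h1 | 0.535 | 1.478849 | A | 0.985 | 0.08 |
| chr1 | h1 | 1.503727 | 1.852681 | B | 0.985 | 0.08 |
| chr1 | h1 | 1.852681 | 2.422373 | A | 0.985 | 0.08 |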

Phased local-ancestry tracts with coordinates, ancestry labels, posterior values, and QC annotations.
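
To see why QC handling and the choice of denominator move the ancestry fractions, the sketch below computes length-weighted fractions for the five preview tracts after dropping tracts that fail two QC cut-offs. The thresholds (`min_posterior`, `max_low_complexity`) and the decision to use the retained tract length as the denominator are assumptions for illustration, not values from the benchmark.

```python
from collections import defaultdict


def ancestry_fractions(tracts, min_posterior=0.9, max_low_complexity=0.5):
    """Length-weighted ancestry fractions over tracts that pass QC.

    tracts: (start_morgan, end_morgan, anc, posterior, low_complexity_frac).
    Denominator: total length of retained tracts (an assumption).
    """
    lengths = defaultdict(float)
    for start, end, anc, posterior, low_cx in tracts:
        if posterior < min_posterior or low_cx > max_low_complexity:
            continue  # skip unreliable tracts instead of reassigning them
        lengths[anc] += end - start
    total = sum(lengths.values())
    return {anc: length / total for anc, length in lengths.items()}


# chr1 / h1 tracts from the preview rows.
tracts = [
    (0.03, 0.505, "A", 0.985, 0.08),
    (0.505, 0.535, "B", 0.62, 0.92),
    (0.535, 1.478849, "A", 0.985, 0.08),
    (1.503727, 1.852681, "B", 0.985, 0.08),
    (1.852681, 2.422373, "A", 0.985, 0.08),
]
print(ancestry_fractions(tracts))
```

Here the short B tract with a low posterior and high low-complexity fraction is excluded. Keeping it, or dividing by the full map length instead, would give different fractions. Pulse timing would additionally need a tract-length model, which is not shown.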

Case study 10

Population genetics: Estimating selection from noisy ancient-DNA time series

Infer which of two haploid loci is under stronger positive selection from ancient allele-frequency time series while accounting for allele orientation, directional error, drift, and changing population size.

*Noisy ancient trajectories are not directly comparable until both loci are placed on the same derived-allele scale and the provided sample-level sequencing-error values are modeled directly.*

Files provided to the model

generation, alt_reads, total_reads, seq_error, sample_year

636400.16-4500

1234450.16-4278

1841550.16-4056

2438700.16-3833

3036900.16-3611

Read-count time series for locus A.
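
The orientation and error-correction steps named in the task can be sketched as follows. The example assumes a symmetric per-read error rate `e`, so that the observed alternate-read fraction is p(1 - e) + (1 - p)e, and a deterministic haploid selection model in which the odds of the derived allele multiply by (1 + s) each generation. Drift and changing population size are ignored, and the function names and numbers are illustrative.

```python
import math


def corrected_alt_freq(alt_reads: int, total_reads: int, seq_error: float) -> float:
    """Invert p_obs = p(1 - e) + (1 - p)e for p, clamped to [0, 1]."""
    p_obs = alt_reads / total_reads
    p = (p_obs - seq_error) / (1.0 - 2.0 * seq_error)
    return min(1.0, max(0.0, p))


def derived_freq(alt_freq: float, alt_is_derived: bool) -> float:
    """Put a locus on the derived-allele scale."""
    return alt_freq if alt_is_derived else 1.0 - alt_freq


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def haploid_selection_coefficient(p_start: float, p_end: float, generations: float) -> float:
    """Solve logit(p_end) = logit(p_start) + generations * ln(1 + s) for s."""
    return math.exp((logit(p_end) - logit(p_start)) / generations) - 1.0


# Made-up values: alt is ancestral here, so the frequency is flipped.
p_alt = corrected_alt_freq(alt_reads=30, total_reads=100, seq_error=0.16)
p_derived = derived_freq(p_alt, alt_is_derived=False)
print(p_derived, haploid_selection_coefficient(0.2, 0.5, 10))
```

Only after both loci are expressed on this common derived-allele scale does a comparison of their fitted selection coefficients become meaningful.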
