
A modality gap in personal-genome prediction by sequence-to-function models

Mostafavi, S.; Tu, X.; Spiro, A.; Chikina, M.

2026-02-03 bioinformatics
10.64898/2026.02.01.702969 bioRxiv

Sequence-to-function (S2F) models trained on reference genomes have achieved strong performance on regulatory prediction and variant-effect benchmarks, yet they still struggle to predict inter-individual variation in gene expression from personal genomes. We evaluated AlphaGenome on personal genome prediction in two molecular modalities (gene expression and chromatin accessibility) and observed a striking dichotomy: AlphaGenome approaches the heritability ceiling for chromatin accessibility variation, but remains far below baseline for gene-expression variation, despite improving over Borzoi. Context truncation and fine-mapped QTL analyses indicate that accessibility is governed by local regulatory grammar captured by current architectures, whereas gene-expression variation requires long-range regulatory integration that remains challenging.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.