
Integrating Genomics into Multimodal EHR Foundation Models

Amar, J.; Liu, E.; Breschi, A.; Zhang, L.; Kheradpour, P.; Li, S.; Soleymani Lehmann, L.; Giulianelli, A.; Edwards, M.; Nola, D.; Mani, R.; Vats, P.; Tetreault, J.; Chen, T. J.; McLean, C. Y.

2025-10-27 bioinformatics
10.1101/2025.10.26.684668 bioRxiv

ABSTRACT

This paper introduces an innovative Electronic Health Record (EHR) foundation model that integrates Polygenic Risk Scores (PRS) as a foundational data modality, moving beyond traditional EHR-only approaches to build more holistic health profiles. Leveraging the extensive and diverse data from the All of Us (AoU) Research Program, this multimodal framework aims to learn complex relationships between clinical data and genetic predispositions. The methodology extends advancements in generative AI to the EHR foundation model space, enhancing predictive capabilities and interpretability. Evaluation on AoU data demonstrates the model's predictive value for the onset of various conditions, particularly Type 2 Diabetes (T2D), and illustrates the interplay between PRS and EHR data. The work also explores transfer learning for custom classification tasks, showcasing the architecture's versatility and efficiency. This approach is pivotal for unlocking new insights into disease prediction, proactive health management, risk stratification, and personalized treatment strategies, laying the groundwork for more personalized, equitable, and actionable real-world evidence generation in healthcare.

Matching journals

The top 3 journals together account for just over 50% of the predicted probability mass (25.9% + 18.3% + 12.4% = 56.6%).

1. IEEE Journal of Biomedical and Health Informatics: 34 papers in training set, Top 0.1%, 25.9%
2. Journal of Biomedical Informatics: 45 papers in training set, Top 0.1%, 18.3%
3. npj Digital Medicine: 97 papers in training set, Top 0.4%, 12.4%
(50% of probability mass above)
4. Advanced Science: 249 papers in training set, Top 4%, 4.3%
5. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 3%, 2.6%
6. BMC Medical Informatics and Decision Making: 39 papers in training set, Top 1%, 2.4%
7. Journal of the American Medical Informatics Association: 61 papers in training set, Top 1%, 2.1%
8. Database: 51 papers in training set, Top 0.3%, 1.8%
9. IEEE Transactions on Computational Biology and Bioinformatics: 17 papers in training set, Top 0.2%, 1.7%
10. Genome Medicine: 154 papers in training set, Top 4%, 1.7%
11. Bioinformatics: 1061 papers in training set, Top 7%, 1.7%
12. GigaScience: 172 papers in training set, Top 2%, 1.3%
13. Nature Machine Intelligence: 61 papers in training set, Top 2%, 1.2%
14. Scientific Reports: 3102 papers in training set, Top 66%, 1.2%
15. PLOS Computational Biology: 1633 papers in training set, Top 20%, 1.2%
16. IEEE Access: 31 papers in training set, Top 0.6%, 1.1%
17. PLOS ONE: 4510 papers in training set, Top 61%, 1.1%
18. Nature Communications: 4913 papers in training set, Top 58%, 1.0%
19. Journal of Personalized Medicine: 28 papers in training set, Top 0.9%, 0.9%
20. European Journal of Human Genetics: 49 papers in training set, Top 1%, 0.8%
21. Expert Systems with Applications: 11 papers in training set, Top 0.4%, 0.7%
22. BioData Mining: 15 papers in training set, Top 0.9%, 0.7%
23. iScience: 1063 papers in training set, Top 32%, 0.7%
24. Frontiers in Genetics: 197 papers in training set, Top 10%, 0.7%
25. Computers in Biology and Medicine: 120 papers in training set, Top 5%, 0.7%
26. JMIR Medical Informatics: 17 papers in training set, Top 2%, 0.6%
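
The 50% cutoff in the ranking above can be checked by accumulating the listed probabilities from the top. The following is a minimal sketch, not the site's actual method: the helper name `journals_to_reach` is hypothetical, and only the first five probabilities from the list are copied in.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-5, copied from the list above.
probs = [25.9, 18.3, 12.4, 4.3, 2.6]

def journals_to_reach(probs, threshold):
    """Hypothetical helper: smallest number of top-ranked entries whose
    cumulative probability reaches `threshold`, plus that cumulative sum."""
    for k, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return k, total
    raise ValueError("threshold not reached by the supplied probabilities")

k, total = journals_to_reach(probs, 50.0)
print(k, round(total, 1))
```

Under these assumptions the first two journals sum to 44.2%, so the third is needed to cross the 50% mark.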