Data-driven prioritization of mouse strains for improved preclinical modeling of rare and common disease

Ball, R. L.; Klein, A.; Gerring, M. W.; Berger-Liedtka, A. K.; Kim, M. J.; Berry, M. A.; Gargano, M. A.; Mukherjee, G.; Fisher, H. S.; Nichols-Meade, T.; Castellanos, F.; Smith, C. L.; Karlebach, G.; Murray, S. A.; Bult, C. J.; Robinson, P. N.; Chesler, E. J.

2026-04-30 bioinformatics
10.64898/2026.04.27.721175 bioRxiv
Choosing an appropriate mouse genetic background is a persistent challenge for successful translation of preclinical disease modeling. We present Strain Recommender, a genomic framework that prioritizes inbred mouse strains as relatively vulnerable or resilient to a disease state using disease-associated gene signatures and strain-specific transcriptome predictions. The method represents disease states as weighted gene scores, ranks 657 strains based on resemblance to the disease state, and estimates uncertainty via a permutation-derived false positive rate (FPR). In a prospective validation of connective tissue disorder predictions, vulnerable and resilient Collaborative Cross strains showed significantly different cardiovascular abnormalities. In a global retrospective validation predicting previously reported strain background effects, Strain Recommender achieved ≥90% sensitivity for 86.6% of diseases with 94.4% mean sensitivity (95% CI: 94.0-94.8%) across 5,890 diseases, including 92.3% (95% CI: 91.6-93.0%) for 2,598 rare diseases, demonstrating its potential to improve the validity of mouse models of human disease.
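
The three steps named in the abstract (weighted gene scores, ranking strains by resemblance, and a permutation-derived FPR) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the scoring function or the permutation scheme, so the weighted-sum score, the gene-label permutation, and every name and value below (`resemblance`, `permutation_fpr`, `rank_strains`, the toy genes and strains) are assumptions made for illustration only.

```python
import random

def resemblance(weights, expression):
    """Assumed score: weighted sum of predicted expression over signature genes."""
    return sum(w * expression[g] for g, w in weights.items() if g in expression)

def permutation_fpr(weights, expression, n_perm=200, seed=0):
    """Assumed null: reassign the signature weights to randomly chosen genes."""
    rng = random.Random(seed)
    genes = sorted(expression)
    # Only signature genes that have a predicted expression value are used.
    values = [w for g, w in weights.items() if g in expression]
    observed = resemblance(weights, expression)
    hits = 0
    for _ in range(n_perm):
        sampled = rng.sample(genes, len(values))
        null_score = sum(w * expression[g] for w, g in zip(values, sampled))
        if null_score >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

def rank_strains(weights, strains, n_perm=200, seed=0):
    """Return (strain, score, fpr) tuples, most disease-like strain first."""
    rows = [
        (name, resemblance(weights, expr), permutation_fpr(weights, expr, n_perm, seed))
        for name, expr in strains.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Hypothetical toy data: two strains, four genes, a two-gene disease signature.
signature = {"g1": 1.0, "g2": 0.5}
strains = {
    "strain_A": {"g1": 2.0, "g2": 1.0, "g3": -1.0, "g4": 0.0},
    "strain_B": {"g1": -2.0, "g2": -1.0, "g3": 1.0, "g4": 0.0},
}
ranked = rank_strains(signature, strains)
for name, score, fpr in ranked:
    print(name, score, round(fpr, 3))
```

In this toy setup the strain with the highest score would be read as relatively vulnerable and the lowest as relatively resilient, with the FPR indicating how often a random gene set scores at least as high. A real analysis would need the paper's actual signature weights, transcriptome predictions, and null model.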

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Genome Medicine, 154 papers in training set, Top 0.1%, 23.3%
2. Nature Communications, 4913 papers in training set, Top 17%, 10.4%
3. Nature Genetics, 240 papers in training set, Top 2%, 3.7%
4. The American Journal of Human Genetics, 206 papers in training set, Top 1%, 3.7%
5. Cell Systems, 167 papers in training set, Top 4%, 3.2%
6. Scientific Reports, 3102 papers in training set, Top 44%, 2.7%
7. PLOS Computational Biology, 1633 papers in training set, Top 13%, 2.4%
8. Bioinformatics, 1061 papers in training set, Top 6%, 2.1%

50% of probability mass above

9. Nature Machine Intelligence, 61 papers in training set, Top 1%, 2.1%
10. Frontiers in Genetics, 197 papers in training set, Top 4%, 1.9%
11. Nature Methods, 336 papers in training set, Top 4%, 1.9%
12. Cell Genomics, 162 papers in training set, Top 3%, 1.7%
13. Advanced Science, 249 papers in training set, Top 10%, 1.7%
14. Communications Biology, 886 papers in training set, Top 8%, 1.7%
15. PLOS ONE, 4510 papers in training set, Top 52%, 1.7%
16. Nucleic Acids Research, 1128 papers in training set, Top 12%, 1.5%
17. Bioinformatics Advances, 184 papers in training set, Top 3%, 1.5%
18. Briefings in Bioinformatics, 326 papers in training set, Top 4%, 1.5%
19. npj Digital Medicine, 97 papers in training set, Top 2%, 1.5%
20. Disease Models & Mechanisms, 119 papers in training set, Top 1%, 1.4%
21. Cell Reports Medicine, 140 papers in training set, Top 5%, 1.4%
22. Science Translational Medicine, 111 papers in training set, Top 4%, 1.3%
23. iScience, 1063 papers in training set, Top 20%, 1.3%
24. BMC Medical Genomics, 36 papers in training set, Top 0.8%, 1.1%
25. European Journal of Human Genetics, 49 papers in training set, Top 0.9%, 1.1%
26. Nature Biomedical Engineering, 42 papers in training set, Top 2%, 0.8%
27. Molecular Systems Biology, 142 papers in training set, Top 2%, 0.8%
28. Science Advances, 1098 papers in training set, Top 29%, 0.8%
29. Cell Reports Methods, 141 papers in training set, Top 5%, 0.8%
30. Genome Research, 409 papers in training set, Top 4%, 0.8%