Disease-guided functional gene mapping across species reveals translational correspondences beyond sequence orthology

Yan, J.; Cao, Z.

2026-05-13 bioinformatics
10.64898/2026.05.10.720506 bioRxiv
Selecting the correct mouse gene to model a human disease phenotype is critical for translational research, yet sequence-based orthology fails when genes have been lost, duplicated, or functionally rewired between species. Here we present BRIDGE (Biological Rank Integration for Disease Gene Equivalence), a framework that identifies functional mouse equivalents of human disease genes without sequence input. BRIDGE integrates 3.37 million disease-gene associations, biological pathways, and Gene Ontology annotations into a unified heterogeneous graph (94,897 nodes, ~8.3 million edges), encoded by a heterogeneous graph transformer with fused Gromov-Wasserstein alignment and multi-strategy reciprocal rank fusion. On two sequence-independent benchmarks, BRIDGE achieves Recall@5 of 61.8-66.7%, compared with 0.0-20.1% for Ensembl Compara. We validate BRIDGE through case studies including neutrophil pathway rewiring (CXCL8→Cxcl1/2/5), acute-phase divergence (CRP→Apcs), and immune checkpoint substitution (LILRB2→Pirb), and demonstrate complementarity with sequence methods in drug-translation analysis. Prospective validation of 30 novel predictions against three independent data modalities (tissue expression, cell-type expression, and phenotype concordance) shows that BRIDGE picks are favoured in 64 of 65 orthogonal tests (sign test P = 3.6 × 10⁻¹⁸) and significantly outperform all tested baselines including Ensembl Compara, BLAST RBH, and ESM-2. BRIDGE provides a benchmarked framework for functional cross-species gene mapping in disease-model design.
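
The reported sign-test result (64 of 65 orthogonal tests favouring BRIDGE) can be sanity-checked with a few lines of arithmetic. The sketch below assumes an exact two-sided binomial sign test with no ties and a null success probability of 0.5; the abstract does not say which variant the authors used, and the function name `sign_test_two_sided` is purely illustrative.

```python
from math import comb


def sign_test_two_sided(successes: int, trials: int) -> float:
    """Exact two-sided sign test p-value under H0: P(success) = 0.5.

    Assumption: ties are excluded, so every trial is a win or a loss.
    """
    # Use the more extreme of the two counts (wins or losses).
    k = max(successes, trials - successes)
    # Number of outcomes at least as extreme in one tail.
    tail = sum(comb(trials, i) for i in range(k, trials + 1))
    # Double the one-sided tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail / 2 ** trials)


p_value = sign_test_two_sided(64, 65)
print(f"{p_value:.1e}")  # prints 3.6e-18
```

Under these assumptions the tail contains only 66 of the 2^65 equally likely outcomes, so the p-value is 132 / 2^65, or roughly 3.6 × 10⁻¹⁸.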

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Nature Communications; 4913 papers in training set; Top 15%; 12.0%
2. Nature Methods; 336 papers in training set; Top 1%; 9.9%
3. Bioinformatics Advances; 184 papers in training set; Top 0.3%; 8.2%
4. Bioinformatics; 1061 papers in training set; Top 3%; 8.2%
5. Cell Systems; 167 papers in training set; Top 2%; 6.7%
6. Nucleic Acids Research; 1128 papers in training set; Top 3%; 6.7%

50% of probability mass above

7. Genome Medicine; 154 papers in training set; Top 2%; 4.2%
8. Briefings in Bioinformatics; 326 papers in training set; Top 1%; 4.2%
9. Nature Biotechnology; 147 papers in training set; Top 3%; 2.7%
10. Advanced Science; 249 papers in training set; Top 8%; 2.5%
11. Cell Reports Methods; 141 papers in training set; Top 2%; 2.0%
12. Scientific Reports; 3102 papers in training set; Top 56%; 1.7%
13. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 5%; 1.7%
14. Nature Machine Intelligence; 61 papers in training set; Top 2%; 1.7%
15. iScience; 1063 papers in training set; Top 17%; 1.6%
16. Patterns; 70 papers in training set; Top 1%; 1.4%
17. PLOS ONE; 4510 papers in training set; Top 59%; 1.3%
18. Genome Biology; 555 papers in training set; Top 6%; 1.2%
19. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 39%; 1.2%
20. BMC Bioinformatics; 383 papers in training set; Top 6%; 1.2%
21. Nature Genetics; 240 papers in training set; Top 6%; 1.2%
22. Genome Research; 409 papers in training set; Top 4%; 0.9%
23. PLOS Computational Biology; 1633 papers in training set; Top 22%; 0.9%
24. Cell Genomics; 162 papers in training set; Top 7%; 0.7%
25. npj Digital Medicine; 97 papers in training set; Top 4%; 0.7%
26. Nature Biomedical Engineering; 42 papers in training set; Top 2%; 0.7%
27. eLife; 5422 papers in training set; Top 59%; 0.7%
28. NAR Genomics and Bioinformatics; 214 papers in training set; Top 4%; 0.7%
29. Cell Reports Medicine; 140 papers in training set; Top 9%; 0.7%
30. npj Systems Biology and Applications; 99 papers in training set; Top 3%; 0.6%