CAPRINI-M: An AI-curated Cardiac-Specific Atlas of Protein Interactions in Mice

Gjerga, E.; Wiesenbach, P.; Goerner, C.-A.; Zhang, Y.; Pelz, K.; List, M.; Dieterich, C.

2026-03-09 bioinformatics
10.64898/2026.03.06.710104 bioRxiv
Motivation: Protein-protein interactions (PPIs) are fundamental to cardiovascular disease biology, but the corresponding knowledge is dispersed across the literature and heterogeneous databases, making systematic curation time-consuming. Moreover, many existing PPI resources may be biased and lack detailed information on structural interaction interfaces or associated thermodynamic parameters.

Results: We present CAPRINI-M (CArdiac PRotein INteractions In Mice), a web-based tool hosting an AI-curated atlas of cardiac protein interactions. We mined 9,105 cardiobiology manuscripts and used open-source LLMs (LLaMA-3.3 70B) to extract 11,189 protein-protein interactions. We then used AlphaFold3 to infer interaction interfaces, estimate thermodynamic properties related to complex stability, and predict the likelihood that each protein pair forms a complex. In our benchmarking analysis, CAPRINI-M showed stronger performance than the comparator PPI resources tested here. Predicted interaction favourability also agreed with published experimental evidence, with lower predicted Gibbs free energy associated with experimentally preferred binding partners. Overall, CAPRINI-M provides a more comprehensive, mechanistically annotated view of cardiovascular disease-relevant protein-protein interactions by integrating literature evidence with structural, interface-level, and stability-related information.

Availability: The CAPRINI-M web application is available at https://shiny.dieterichlab.org/app/caprinim. The source code used in this study is linked in the manuscript's Availability section.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Bioinformatics; 1061 papers in training set; Top 0.7%; 28.8%
2. Nucleic Acids Research; 1128 papers in training set; Top 2%; 8.8%
3. Database; 51 papers in training set; Top 0.1%; 7.5%
4. Nature Communications; 4913 papers in training set; Top 31%; 5.1%
50% of probability mass above
5. Briefings in Bioinformatics; 326 papers in training set; Top 1%; 4.1%
6. Bioinformatics Advances; 184 papers in training set; Top 2%; 2.7%
7. Patterns; 70 papers in training set; Top 0.4%; 2.6%
8. Communications Biology; 886 papers in training set; Top 4%; 2.5%
9. Genome Medicine; 154 papers in training set; Top 3%; 2.2%
10. PLOS Computational Biology; 1633 papers in training set; Top 13%; 2.2%
11. npj Digital Medicine; 97 papers in training set; Top 2%; 2.0%
12. Scientific Reports; 3102 papers in training set; Top 57%; 1.7%
13. iScience; 1063 papers in training set; Top 17%; 1.5%
14. BMC Bioinformatics; 383 papers in training set; Top 5%; 1.5%
15. Computers in Biology and Medicine; 120 papers in training set; Top 2%; 1.5%
16. Nature Cardiovascular Research; 28 papers in training set; Top 0.3%; 1.4%
17. GigaScience; 172 papers in training set; Top 2%; 1.3%
18. Genome Biology; 555 papers in training set; Top 6%; 0.9%
19. Advanced Science; 249 papers in training set; Top 18%; 0.8%
20. Scientific Data; 174 papers in training set; Top 2%; 0.8%
21. Nature Methods; 336 papers in training set; Top 6%; 0.8%
22. European Heart Journal - Digital Health; 15 papers in training set; Top 0.6%; 0.7%
23. Genetics in Medicine; 69 papers in training set; Top 1.0%; 0.7%
24. Nature Genetics; 240 papers in training set; Top 7%; 0.7%
25. European Heart Journal; 16 papers in training set; Top 0.9%; 0.7%
26. Journal of the American Heart Association; 119 papers in training set; Top 4%; 0.7%
27. Nature Computational Science; 50 papers in training set; Top 2%; 0.7%
28. EBioMedicine; 39 papers in training set; Top 1%; 0.7%
29. PLOS Digital Health; 91 papers in training set; Top 3%; 0.7%
30. Journal of Translational Medicine; 46 papers in training set; Top 4%; 0.5%
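
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The sketch below is only an illustration: the helper name `journals_to_reach` is hypothetical, the values are the first five probabilities copied from the listing, and how the site itself computes its cutoff is not stated on the page.

```python
# Predicted probabilities (in percent) for the five highest-ranked journals,
# copied from the listing; the remaining entries are not needed for this check.
probabilities = [28.8, 8.8, 7.5, 5.1, 4.1]


def journals_to_reach(probs, threshold):
    """Return (rank, cumulative) for the first rank whose running sum
    reaches the threshold, or (None, total) if it is never reached.

    Hypothetical helper for illustration only.
    """
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None, total


rank, cumulative = journals_to_reach(probabilities, 50.0)
print(rank, round(cumulative, 1))
```

With these values the running sum is about 45.1% after three journals and about 50.2% after four, which matches the position of the "50% of probability mass above" marker.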