
Unsupervised co-optimization of a graph neural network and a knowledge graph embedding model to prioritize causal genes for Alzheimer's Disease

Liu, K.; Prabhakar, V.

2022-10-06 health informatics
10.1101/2022.10.03.22280657 medRxiv

Data obtained from clinical trials for a given disease often capture reliable, high-quality empirical features but are limited to a few studies or experiments. In contrast, knowledge extracted from biomedical literature captures a wide range of clinical information relevant to the disease, though it may not be as reliable as the experimental data. We therefore propose a novel training method that co-optimizes two AI algorithms, one on experimental data and the other on knowledge-based information from the literature, so that the learning of one supplements that of the other, and we apply this method to prioritize (rank) causal genes for Alzheimer's Disease (AD). One algorithm generates unsupervised embeddings for gene nodes in a protein-protein interaction network associated with experimental data. The other generates embeddings for the nodes (entities) in a knowledge graph constructed from biomedical literature. The two algorithms are co-optimized to leverage information from each other's domain. A downstream inference task that ranks causal genes for AD therefore takes into account both the experimental and the literature data available to implicate any given gene in the gene set. Rank-based evaluation metrics computed to validate the gene rankings prioritized by our algorithm showed that the top-ranked positions were highly enriched with genes from a ground-truth set that were experimentally verified to be causal for the progression of AD.
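The abstract does not say which rank-based metrics were used. As an illustration only, the sketch below computes two common ones, precision at k and average precision, for a ranked gene list against a ground-truth set. The function names and the gene labels (G1 to G6) are hypothetical placeholders, not data or code from the paper.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], truth: Set[str], k: int) -> float:
    """Fraction of the top-k ranked genes that belong to the ground-truth set."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(gene in truth for gene in ranked[:k]) / k


def average_precision(ranked: Sequence[str], truth: Set[str]) -> float:
    """Mean of the precision values taken at each rank where a true gene appears."""
    if not truth:
        return 0.0
    hits = 0
    total = 0.0
    for position, gene in enumerate(ranked, start=1):
        if gene in truth:
            hits += 1
            total += hits / position
    return total / len(truth)


# Toy example with placeholder labels (assumption, not real results).
ranked_genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
ground_truth = {"G1", "G2", "G5"}

p_at_2 = precision_at_k(ranked_genes, ground_truth, 2)
ap = average_precision(ranked_genes, ground_truth)
```

Higher values of either metric mean that ground-truth genes are concentrated near the top of the ranking, which is the kind of enrichment the abstract reports.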

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. Bioinformatics; 1061 papers in training set; Top 2%; 12.8%
2. Frontiers in Aging Neuroscience; 67 papers in training set; Top 0.4%; 8.3%
3. Computers in Biology and Medicine; 120 papers in training set; Top 0.3%; 6.4%
4. Scientific Reports; 3102 papers in training set; Top 23%; 4.9%
5. IEEE Journal of Biomedical and Health Informatics; 34 papers in training set; Top 0.3%; 4.9%
6. PLOS ONE; 4510 papers in training set; Top 34%; 4.4%
7. Artificial Intelligence in the Life Sciences; 11 papers in training set; Top 0.1%; 4.0%
8. PLOS Computational Biology; 1633 papers in training set; Top 9%; 3.7%
9. Journal of Biomedical Informatics; 45 papers in training set; Top 0.4%; 3.6%

50% of probability mass above

10. Artificial Intelligence in Medicine; 15 papers in training set; Top 0.1%; 3.6%
11. Journal of Personalized Medicine; 28 papers in training set; Top 0.2%; 2.4%
12. BMC Medical Informatics and Decision Making; 39 papers in training set; Top 1%; 1.9%
13. BioMed Research International; 25 papers in training set; Top 2%; 1.7%
14. Alzheimer's Research & Therapy; 52 papers in training set; Top 1%; 1.7%
15. Briefings in Bioinformatics; 326 papers in training set; Top 4%; 1.7%
16. NeuroImage: Clinical; 132 papers in training set; Top 2%; 1.5%
17. Expert Systems with Applications; 11 papers in training set; Top 0.1%; 1.5%
18. Frontiers in Artificial Intelligence; 18 papers in training set; Top 0.4%; 1.3%
19. Biomedicines; 66 papers in training set; Top 2%; 1.2%
20. Patterns; 70 papers in training set; Top 1%; 1.2%
21. Journal of the American Medical Informatics Association; 61 papers in training set; Top 2%; 1.0%
22. Medical Image Analysis; 33 papers in training set; Top 0.9%; 0.9%
23. JMIR Medical Informatics; 17 papers in training set; Top 1%; 0.9%
24. Experimental Neurology; 57 papers in training set; Top 1%; 0.8%
25. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 8%; 0.8%
26. Advanced Science; 249 papers in training set; Top 19%; 0.8%
27. BMC Bioinformatics; 383 papers in training set; Top 7%; 0.8%
28. Biology Methods and Protocols; 53 papers in training set; Top 2%; 0.8%
29. Communications Biology; 886 papers in training set; Top 23%; 0.8%
30. Journal of Medical Internet Research; 85 papers in training set; Top 4%; 0.8%
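
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. A minimal sketch follows, using the first ten percentages from the list above. The helper name journals_to_reach is hypothetical.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Predicted probabilities (percent) for ranks 1 to 10, copied from the list above.
top_probs = [12.8, 8.3, 6.4, 4.9, 4.9, 4.4, 4.0, 3.7, 3.6, 3.6]


def journals_to_reach(probs: Sequence[float], threshold: float) -> Optional[int]:
    """Return the smallest number of top-ranked entries whose sum reaches threshold."""
    for count, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return count
    return None  # threshold is never reached with the given entries


n_journals = journals_to_reach(top_probs, 50.0)
print(n_journals)  # prints 9
```

The running total is about 49.4% after eight journals and about 53.0% after nine, so nine is the smallest count that reaches the 50% mark.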