E2EGraph: An End-to-end Graph Learning Model for Interpretable Prediction of Pathological Stages in Prostate Cancer

Zhan, W.; Song, C.; Das, S.; Rebbeck, T. R.; Shi, X.

2023-03-12 bioinformatics
10.1101/2023.03.09.531924 bioRxiv

Prostate cancer is one of the deadliest cancers worldwide. Accurate prediction of pathological stages from the expression and interactions of genes is valuable for clinical assessment and treatment. However, identifying interactions through biological procedures is time-consuming and prohibitively expensive. A graph is a powerful representation for the complex interactome of genes, their transcripts, and proteins. Recently, Graph Neural Networks (GNNs) have gained great attention in machine learning due to their ability to capture the graphical interactions among data entities. To leverage GNNs for predicting pathological stages, we developed an end-to-end graph representation and learning model, E2EGraph, which automatically generates a graph representation from gene expression data and uses a multi-head graph attention network to learn the strength of interactions among genes and make the prediction. To ensure the reliability of model predictions, we identify critical components of the graph representation and the GNN model to interpret prediction results from multiple perspectives at the gene and patient levels. We evaluated E2EGraph on predicting pathological stages of prostate cancer using The Cancer Genome Atlas (TCGA) data. Our experimental results demonstrate that E2EGraph reaches state-of-the-art prediction performance while effectively identifying marker genes indicated by interpretability. Our results point to a direction in which adaptive graph construction and attention-based GNNs can be leveraged for prediction tasks and the interpretation of model predictions in a variety of data domains, including disease prediction.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1 | Bioinformatics | 1061 papers in training set | Top 3% | 10.1%
2 | IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.1% | 8.4%
3 | IEEE Transactions on Computational Biology and Bioinformatics | 17 papers in training set | Top 0.1% | 8.4%
4 | Journal of Computational Biology | 37 papers in training set | Top 0.1% | 6.8%
5 | PLOS ONE | 4510 papers in training set | Top 25% | 6.8%
6 | IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 papers in training set | Top 0.1% | 6.4%
7 | Scientific Reports | 3102 papers in training set | Top 28% | 4.3%
50% of probability mass above
8 | Briefings in Bioinformatics | 326 papers in training set | Top 2% | 3.1%
9 | Frontiers in Genetics | 197 papers in training set | Top 3% | 3.1%
10 | Neurocomputing | 13 papers in training set | Top 0.1% | 2.6%
11 | PLOS Computational Biology | 1633 papers in training set | Top 12% | 2.6%
12 | BMC Bioinformatics | 383 papers in training set | Top 3% | 2.4%
13 | Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 3% | 1.7%
14 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 4% | 1.7%
15 | BMC Medical Informatics and Decision Making | 39 papers in training set | Top 2% | 1.5%
16 | Expert Systems with Applications | 11 papers in training set | Top 0.2% | 1.3%
17 | Bioinformatics Advances | 184 papers in training set | Top 4% | 1.2%
18 | BioData Mining | 15 papers in training set | Top 0.5% | 1.2%
19 | IEEE Access | 31 papers in training set | Top 0.6% | 1.2%
20 | Journal of Chemical Information and Modeling | 207 papers in training set | Top 2% | 1.1%
21 | Computers in Biology and Medicine | 120 papers in training set | Top 4% | 0.9%
22 | iScience | 1063 papers in training set | Top 29% | 0.8%
23 | Quantitative Biology | 11 papers in training set | Top 0.7% | 0.7%
24 | Life | 27 papers in training set | Top 0.4% | 0.7%
25 | Journal of Molecular Biology | 217 papers in training set | Top 4% | 0.6%
26 | Neural Networks | 32 papers in training set | Top 0.9% | 0.6%
27 | Frontiers in Computational Neuroscience | 53 papers in training set | Top 2% | 0.6%
28 | Artificial Intelligence in Medicine | 15 papers in training set | Top 0.8% | 0.6%
29 | Nature Communications | 4913 papers in training set | Top 65% | 0.6%
30 | Bioengineering | 24 papers in training set | Top 2% | 0.6%
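
The 50% cutoff stated for the top 7 journals can be checked by accumulating the listed probabilities in rank order. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, and the values are the first ten percentages copied from the ranking above, not output from the prediction model itself.

```python
# Predicted probabilities (percent) for ranks 1-10, copied from the ranking above.
probs = [10.1, 8.4, 8.4, 6.8, 6.8, 6.4, 4.3, 3.1, 3.1, 2.6]


def journals_to_reach(probs, threshold):
    """Return (count, cumulative) for the shortest ranked prefix whose sum reaches threshold.

    Hypothetical helper for illustration; returns (None, total) if the
    threshold is never reached.
    """
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None, total


count, total = journals_to_reach(probs, 50.0)
print(count, round(total, 1))
```

With these values, the first six journals sum to about 46.9%, so the seventh entry is the one that carries the cumulative mass past 50%.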