Joint Clinical And Molecular Subtyping Of COPD With Variational Autoencoders

Maiorino, E.; De Marzio, M.; Weiss, S.; Silverman, E.; Castaldi, P.; Glass, K.

2023-08-20 respiratory medicine
10.1101/2023.08.19.23294298 medRxiv

Chronic Obstructive Pulmonary Disease (COPD) is a complex, heterogeneous disease. Traditional subtyping methods generally focus on either the clinical manifestations or the molecular endotypes of the disease, resulting in classifications that do not fully capture the disease's complexity. Here, we bridge this gap by introducing a subtyping pipeline that integrates clinical and gene expression data with variational autoencoders. We apply this methodology to the COPDGene study, a large study of current and former smoking individuals with and without COPD. Our approach generates a set of vector embeddings, called Personalized Integrated Profiles (PIPs), that recapitulate the joint clinical and molecular state of the subjects in the study. Prediction experiments show that the PIPs have a predictive accuracy comparable to or better than other embedding approaches. Using trajectory learning approaches, we analyze the main trajectories of variation in the PIP space and identify five well-separated subtypes with distinct clinical phenotypes, expression signatures, and disease outcomes. Notably, these subtypes are more robust to data resampling than those identified using traditional clustering approaches. Overall, our findings provide new avenues to establish fine-grained associations between the clinical characteristics, molecular processes, and disease outcomes of COPD.

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1 | Communications Medicine | 85 papers in training set | Top 0.1% | 10.2%
2 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 5% | 10.2%
3 | Scientific Reports | 3102 papers in training set | Top 14% | 6.9%
4 | npj Digital Medicine | 97 papers in training set | Top 0.7% | 6.4%
5 | Nature Communications | 4913 papers in training set | Top 35% | 4.4%
6 | International Journal of Epidemiology | 74 papers in training set | Top 0.5% | 4.0%
7 | European Respiratory Journal | 54 papers in training set | Top 0.5% | 3.6%
8 | eBioMedicine | 130 papers in training set | Top 0.4% | 3.1%
9 | Nature Machine Intelligence | 61 papers in training set | Top 1% | 2.8%
50% of probability mass above
10 | Journal of Translational Medicine | 46 papers in training set | Top 0.4% | 2.6%
11 | Computers in Biology and Medicine | 120 papers in training set | Top 1% | 2.4%
12 | iScience | 1063 papers in training set | Top 11% | 1.9%
13 | Communications Biology | 886 papers in training set | Top 7% | 1.8%
14 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 1% | 1.8%
15 | Medical Image Analysis | 33 papers in training set | Top 0.6% | 1.7%
16 | American Journal of Respiratory and Critical Care Medicine | 39 papers in training set | Top 0.5% | 1.7%
17 | Frontiers in Cell and Developmental Biology | 218 papers in training set | Top 5% | 1.5%
18 | eLife | 5422 papers in training set | Top 45% | 1.5%
19 | PLOS Computational Biology | 1633 papers in training set | Top 18% | 1.5%
20 | Patterns | 70 papers in training set | Top 1% | 1.2%
21 | Bioinformatics | 1061 papers in training set | Top 8% | 1.2%
22 | Science Advances | 1098 papers in training set | Top 23% | 1.2%
23 | IEEE Access | 31 papers in training set | Top 0.6% | 1.2%
24 | Advanced Science | 249 papers in training set | Top 15% | 1.1%
25 | Journal of Biomedical Informatics | 45 papers in training set | Top 1% | 0.9%
26 | Human Molecular Genetics | 130 papers in training set | Top 3% | 0.9%
27 | Journal of Allergy and Clinical Immunology | 25 papers in training set | Top 0.7% | 0.8%
28 | Nucleic Acids Research | 1128 papers in training set | Top 16% | 0.8%
29 | Genomics | 60 papers in training set | Top 3% | 0.8%
30 | Respiratory Research | 19 papers in training set | Top 0.5% | 0.8%
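
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The following is a minimal Python sketch, assuming the displayed percentages are rounded values, so the running total is only approximate. The helper name `journals_to_reach` is hypothetical and is not part of the page or of any prediction tool.

```python
# Predicted probabilities (%) for ranks 1-30, copied from the list above.
# These are the rounded values shown on the page, so totals are approximate.
probs = [
    10.2, 10.2, 6.9, 6.4, 4.4, 4.0, 3.6, 3.1, 2.8, 2.6,
    2.4, 1.9, 1.8, 1.8, 1.7, 1.7, 1.5, 1.5, 1.5, 1.2,
    1.2, 1.2, 1.2, 1.1, 0.9, 0.9, 0.8, 0.8, 0.8, 0.8,
]


def journals_to_reach(probabilities, threshold):
    """Return how many top-ranked entries are needed for the running
    total to reach `threshold`, or None if it is never reached.

    Hypothetical helper for illustration only.
    """
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return count
    return None


if __name__ == "__main__":
    print(journals_to_reach(probs, 50.0))
```

With the values listed here, the running total first passes 50% at rank 9, which matches the cutoff marker placed after Nature Machine Intelligence.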