Unsupervised machine-learning identifies clinically distinct subtypes of ALS that reflect different genetic architectures and biological mechanisms

Spargo, T. P.; Marriott, H.; Hunt, G.; Pain, O.; Kabiljo, R.; Bowles, H.; Sproviero, W.; Gillett, A. C.; Fogh, I.; Project MinE ALS Sequencing Consortium; Andersen, P. M.; Basak, N. A.; Shaw, P.; Corcia, P.; Couratier, P.; de Carvalho, M.; Drory, V.; Glass, J. D.; Gotkine, M.; Hardiman, O.; Landers, J. E.; McLaughlin, R.; Mora Pardina, J. S.; Morrison, K. E.; Pinto, S.; Povedano, M.; Shaw, C. E.; Silani, V.; Ticozzi, N.; van Damme, P.; van den Berg, L. H.; Vourch, P.; Weber, M.; Veldink, J.; Dobson, R.; Al Khleifat, A.; Cummis, N.; Stahl, D.; Al-Chalabi, A.; Iacoangeli, A.

2023-06-13 neurology
10.1101/2023.06.12.23291304 medRxiv
Background: Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease characterised by a highly variable clinical presentation and multifaceted genetic and biological bases that translate into great patient heterogeneity. Identifying subgroups of patients that are homogeneous in both clinical presentation and biological causes could favour the development of effective treatments, healthcare, and clinical trials. We aimed to identify and characterise homogeneous clinical subgroups of ALS, examining whether they represent underlying biological trends.
Methods: Latent class clustering analysis, an unsupervised machine-learning method, was used to identify homogeneous subpopulations in 6,523 people with ALS from Project MinE, using widely collected ALS-related clinical variables. The clusters were validated using 7,829 independent patients from STRENGTH. We tested whether the identified subgroups were associated with biological trends in genetic variation across genes previously linked to ALS, in polygenic risk scores for ALS and related neuropsychiatric traits, and in gene expression data from post-mortem motor cortex samples.
Results: We identified five ALS subgroups based on patterns in clinical data that generalised across international datasets. Distinct genetic trends were observed for rare variants in the SOD1 and C9orf72 genes, and across genes implicated in biological processes relevant to ALS. Polygenic risk scores for ALS, schizophrenia and Parkinson's disease were also higher in distinct clusters with respect to controls. Gene expression analysis identified different altered biological processes across clusters, reflecting the genetic differences. We developed a machine-learning classifier based on our model to assign subgroup membership using clinical data available at first visit, and made it available on a public webserver at http://latentclusterals.er.kcl.ac.uk.
Conclusion: ALS subgroups characterised by highly distinct clinical presentations were discovered and validated in two large independent international datasets. These groups were also characterised by different underlying genetic architectures and biology. Our results show that data-driven stratification of patients into more clinically and biologically homogeneous subtypes of ALS is possible and could help develop more effective and targeted approaches to the biomedical and clinical study of ALS.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1 | Journal of Neurology | 26 papers in training set | Top 0.1% | 12.4%
2 | Brain Communications | 147 papers in training set | Top 0.1% | 10.0%
3 | Journal of Neurology, Neurosurgery & Psychiatry | 29 papers in training set | Top 0.1% | 8.3%
4 | Neuropathology and Applied Neurobiology | 14 papers in training set | Top 0.1% | 7.1%
5 | Annals of Neurology | 57 papers in training set | Top 0.2% | 6.3%
6 | European Journal of Neurology | 20 papers in training set | Top 0.1% | 6.3%
50% of probability mass above
7 | Brain | 154 papers in training set | Top 1% | 4.3%
8 | Scientific Reports | 3102 papers in training set | Top 38% | 3.6%
9 | Neurobiology of Disease | 134 papers in training set | Top 2% | 2.7%
10 | PLOS ONE | 4510 papers in training set | Top 45% | 2.6%
11 | NeuroImage: Clinical | 132 papers in training set | Top 2% | 2.1%
12 | Frontiers in Neurology | 91 papers in training set | Top 3% | 1.8%
13 | npj Parkinson's Disease | 89 papers in training set | Top 0.7% | 1.7%
14 | Journal of the Neurological Sciences | 17 papers in training set | Top 0.3% | 1.6%
15 | BMC Neurology | 12 papers in training set | Top 0.4% | 1.6%
16 | BMC Medicine | 163 papers in training set | Top 4% | 1.5%
17 | Movement Disorders | 62 papers in training set | Top 0.8% | 1.3%
18 | Annals of Clinical and Translational Neurology | 29 papers in training set | Top 0.8% | 1.2%
19 | Muscle & Nerve | 10 papers in training set | Top 0.2% | 1.2%
20 | Frontiers in Aging Neuroscience | 67 papers in training set | Top 3% | 0.9%
21 | Neurology | 44 papers in training set | Top 1% | 0.9%
22 | EBioMedicine | 39 papers in training set | Top 1.0% | 0.8%
23 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 5% | 0.7%
24 | Heliyon | 146 papers in training set | Top 7% | 0.7%
25 | Journal of Parkinson's Disease | 13 papers in training set | Top 0.4% | 0.7%
26 | Frontiers in Cellular Neuroscience | 79 papers in training set | Top 1% | 0.7%
27 | The Journal of Pathology | 22 papers in training set | Top 0.6% | 0.7%
28 | npj Genomic Medicine | 33 papers in training set | Top 1% | 0.7%
29 | Biomedicines | 66 papers in training set | Top 3% | 0.7%
30 | European Journal of Neuroscience | 168 papers in training set | Top 2% | 0.6%
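
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is only an illustration: the function name `journals_needed`, the 50% threshold parameter, and the "first rank whose cumulative sum reaches the threshold" rule are assumptions, since the page does not say how the cutoff is computed.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-10, copied from the list above.
probabilities = [12.4, 10.0, 8.3, 7.1, 6.3, 6.3, 4.3, 3.6, 2.7, 2.6]


def journals_needed(probs, threshold=50.0):
    """Return how many top-ranked entries are needed for the cumulative
    probability to reach `threshold` percent, or None if it never does.

    Hypothetical helper: the cutoff rule is an assumption, not the site's
    documented method.
    """
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


print(journals_needed(probabilities))
```

Under this assumed rule, ranks 1 to 5 sum to roughly 44.1% and rank 6 brings the total to roughly 50.4%, which is consistent with the cutoff line placed after the sixth journal.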