
Unsupervised machine learning identifies distinct molecular and phenotypic ALS subtypes in post-mortem motor cortex and blood expression data

Marriott, H.; Kabiljo, R.; Hunt, G. P.; Al Khleifat, A.; Jones, A. R.; Troakes, C.; Pfaff, A.; Quinn, J.; Koks, S.; Dobson, R.; Schwab, P.; Al-Chalabi, A.; Iacoangeli, A.

2023-04-25 neurology
10.1101/2023.04.21.23288942 medRxiv

Background: Amyotrophic lateral sclerosis (ALS) displays considerable clinical, genetic and molecular heterogeneity. Machine learning approaches have shown potential to disentangle complex disease landscapes and have been used for patient stratification in ALS. However, a lack of independent validation in different populations and in pre-mortem tissue samples has greatly limited their use in clinical and research settings. We overcame these issues by performing a large-scale study of over 600 post-mortem brain and blood samples of people with ALS from four independent datasets from the UK, Italy, the Netherlands and the US.

Methods: Hierarchical clustering was performed on the 5000 most variably expressed autosomal genes identified from post-mortem motor cortex expression data of people with sporadic ALS from the KCL BrainBank (N=112). The molecular architecture of each cluster was investigated with gene enrichment, network and cell composition analysis. Methylation and genetic data were also used to assess whether other omics measures differed between individuals. The clusters were validated by applying linear discriminant analysis models based on the KCL BrainBank to the TargetALS US motor cortex (N=93), as well as Italian (N=15) and Dutch (N=397) blood expression datasets. Phenotype analysis was also performed to assess cluster-specific differences in clinical outcomes.

Results: We identified three molecular phenotypes, which reflect the proposed major mechanisms of ALS pathogenesis: synaptic and neuropeptide signalling, excitotoxicity and oxidative stress, and neuroinflammation. Known ALS risk genes were identified among the informative genes of each cluster, suggesting potential for genetic profiling of the molecular phenotypes. Cell types known to be associated with specific molecular phenotypes were found in higher proportions in those clusters. These molecular phenotypes were validated in independent motor cortex and blood datasets. Phenotype analysis identified distinct cluster-related outcomes associated with progression, survival and age of death. We developed a public webserver (https://alsgeclustering.er.kcl.ac.uk) that allows users to stratify samples with our model by uploading their expression data.

Conclusions: We have identified three molecular phenotypes, driven by different cell types, which reflect the proposed major mechanisms of ALS pathogenesis. Our results support the hypothesis of biological heterogeneity in ALS, where different mechanisms underlie ALS pathogenesis in subgroups of patients that can each be identified by a specific expression signature. These molecular phenotypes show potential for stratification of clinical trials, the development of biomarkers and personalised treatment approaches.
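
The workflow described in the Methods (select the most variable genes, cluster the discovery cohort hierarchically, then carry the cluster labels to other cohorts with linear discriminant analysis) can be illustrated with a minimal sketch. This is not the authors' code. It runs on synthetic data, and the function names, the Ward linkage, the shrinkage-regularised LDA solver, the toy cohort sizes and the choice of 50 genes are all assumptions made for illustration; the abstract specifies only the 5000-gene selection, hierarchical clustering, three resulting clusters, and LDA-based validation.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def top_variable_genes(expr, n_genes):
    """Column indices of the n_genes highest-variance genes (expr: samples x genes)."""
    variances = expr.var(axis=0)
    return np.argsort(variances)[::-1][:n_genes]


def cluster_samples(expr, n_clusters):
    """Agglomerative clustering of samples, cut into at most n_clusters groups.

    Ward linkage is an assumption; the abstract does not name the linkage.
    """
    tree = linkage(expr, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def make_toy_cohort(rng, n_per_group, n_genes, n_signal):
    """Synthetic expression matrix: three groups, each shifted on its own gene block."""
    blocks = []
    for g in range(3):
        x = rng.normal(0.0, 1.0, size=(n_per_group, n_genes))
        x[:, g * n_signal:(g + 1) * n_signal] += 4.0
        blocks.append(x)
    return np.vstack(blocks)


rng = np.random.default_rng(0)
discovery = make_toy_cohort(rng, n_per_group=20, n_genes=200, n_signal=10)
validation = make_toy_cohort(rng, n_per_group=5, n_genes=200, n_signal=10)

# Step 1: keep the most variable genes (50 here; the study used 5000).
genes = top_variable_genes(discovery, n_genes=50)

# Step 2: hierarchical clustering of the discovery cohort into three groups.
labels = cluster_samples(discovery[:, genes], n_clusters=3)

# Step 3: fit LDA on the discovery clusters and assign validation samples.
# Shrinkage is an assumption that keeps LDA stable when genes outnumber samples.
lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
lda.fit(discovery[:, genes], labels)
predicted = lda.predict(validation[:, genes])

print("discovery cluster sizes:", np.bincount(labels)[1:])
print("validation assignments:", predicted)
```

In a real analysis the expression matrices would first be normalised and restricted to genes shared across platforms, steps that this sketch omits.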

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Neuropathology and Applied Neurobiology; 14 papers in training set; Top 0.1%; 14.8%
2. Journal of Neurology, Neurosurgery & Psychiatry; 29 papers in training set; Top 0.1%; 14.8%
3. Journal of Neurology; 26 papers in training set; Top 0.1%; 8.4%
4. Annals of Neurology; 57 papers in training set; Top 0.1%; 8.4%
5. European Journal of Neurology; 20 papers in training set; Top 0.1%; 6.8%

50% of probability mass above

6. Brain Communications; 147 papers in training set; Top 0.2%; 6.8%
7. Brain; 154 papers in training set; Top 1%; 4.9%
8. Neurobiology of Disease; 134 papers in training set; Top 1%; 3.6%
9. Scientific Reports; 3102 papers in training set; Top 40%; 3.3%
10. Muscle & Nerve; 10 papers in training set; Top 0.1%; 2.4%
11. Annals of Clinical and Translational Neurology; 29 papers in training set; Top 0.6%; 1.5%
12. Acta Neuropathologica; 51 papers in training set; Top 0.8%; 1.3%
13. Frontiers in Neurology; 91 papers in training set; Top 4%; 1.2%
14. Journal of the Neurological Sciences; 17 papers in training set; Top 0.4%; 1.2%
15. Acta Neuropathologica Communications; 81 papers in training set; Top 0.8%; 1.1%
16. Neurology; 44 papers in training set; Top 1%; 1.0%
17. EBioMedicine; 39 papers in training set; Top 0.8%; 0.9%
18. Heliyon; 146 papers in training set; Top 5%; 0.9%
19. NeuroImage: Clinical; 132 papers in training set; Top 3%; 0.9%
20. eBioMedicine; 130 papers in training set; Top 4%; 0.7%
21. Frontiers in Cellular Neuroscience; 79 papers in training set; Top 1%; 0.7%
22. PLOS ONE; 4510 papers in training set; Top 68%; 0.7%
23. npj Genomic Medicine; 33 papers in training set; Top 1.0%; 0.7%
24. The Journal of Pathology; 22 papers in training set; Top 0.6%; 0.7%
25. Nature Communications; 4913 papers in training set; Top 65%; 0.6%
26. Movement Disorders; 62 papers in training set; Top 1%; 0.6%