Ensemble Approaches to Screening, Diagnosis, and Subtyping of Multiple Sclerosis

Yang, I. Y.; Patil, A.; Jin, O.; Loud, S.; Buxhoeveden, S.; Zhang, D. Y.

2026-04-21 genetic and genomic medicine
10.64898/2026.04.19.26351230 medRxiv

Multiple sclerosis (MS) is a debilitating disease affecting more than 1 million Americans, and today it is assessed primarily through magnetic resonance imaging (MRI) and observational clinical symptoms. Given the autoimmune nature of MS, we hypothesized that high-dimensional gene expression data from peripheral blood mononuclear cells (PBMCs), when analyzed with the assistance of AI, may collectively serve as valuable biomarkers for the real-time risk and progression of MS. Here, we present PBMC RNA sequencing (RNAseq) results from N=997 samples, including 540 MS, 221 neuromyelitis optica (NMO), and 149 healthy controls. We constructed and optimized ensemble models for three clinical outcomes: (1) discrimination of early MS (EDSS ≤ 2.0) from healthy individuals with 74% AUC at 100% coverage, (2) differential diagnosis of MS from NMO with 91% AUC at 80% coverage, and (3) subtyping RRMS from progressive MS with 79% AUC at 80% coverage. To our knowledge, no prior molecular test has been reported for any of these three MS clinical tasks, and these results may have immediate impact on clinical management of MS patients. Two innovations improved the stratification accuracy of our models: selection of gene sets based on expression variance in disease states, and use of non-linear rank sort and conviction weighting in the ensemble score calculation.
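
The abstract reports each AUC at a stated coverage but does not define coverage. One common reading, assumed here and not confirmed by the source, is that the model abstains on its least confident samples and is scored only on the rest. The sketch below illustrates that reading on made-up toy data; the function names, the 0.5 midpoint, and the scores are all hypothetical and are not the authors' method or data.

```python
def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_at_coverage(labels, scores, coverage, midpoint=0.5):
    """Assumed definition: keep the `coverage` fraction of samples whose
    scores lie farthest from `midpoint`, abstain on the rest, then score."""
    keep = max(2, round(coverage * len(scores)))
    ranked = sorted(range(len(scores)),
                    key=lambda i: abs(scores[i] - midpoint),
                    reverse=True)
    kept = ranked[:keep]
    return auc([labels[i] for i in kept], [scores[i] for i in kept])


# Hypothetical toy data: 1 = case, 0 = control.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.45, 0.52, 0.05, 0.10, 0.20, 0.55, 0.48]

print(auc(labels, scores))                   # all samples scored
print(auc_at_coverage(labels, scores, 0.8))  # two least confident dropped
```

On this toy data, abstaining on the two scores nearest the midpoint raises the AUC, which is the trade-off a coverage figure would express under this assumed definition.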

Matching journals

The top 12 journals account for 50% of the predicted probability mass.

1. Brain (154 papers in training set; Top 0.5%; 10.2%)
2. Nature Communications (4913 papers in training set; Top 28%; 6.5%)
3. Nucleic Acids Research (1128 papers in training set; Top 4%; 4.9%)
4. Proceedings of the National Academy of Sciences (2130 papers in training set; Top 16%; 4.4%)
5. Scientific Reports (3102 papers in training set; Top 27%; 4.4%)
6. Nature Neuroscience (216 papers in training set; Top 2%; 4.0%)
7. Communications Biology (886 papers in training set; Top 2%; 3.6%)
8. Cell Genomics (162 papers in training set; Top 2%; 3.1%)
9. eLife (5422 papers in training set; Top 31%; 2.8%)
10. Cell (370 papers in training set; Top 9%; 2.4%)
11. Frontiers in Genetics (197 papers in training set; Top 3%; 2.1%)
12. The American Journal of Human Genetics (206 papers in training set; Top 2%; 1.9%)

50% of probability mass above

13. Nature Medicine (117 papers in training set; Top 2%; 1.9%)
14. Neurobiology of Disease (134 papers in training set; Top 2%; 1.9%)
15. Med (38 papers in training set; Top 0.2%; 1.8%)
16. Cell Systems (167 papers in training set; Top 7%; 1.7%)
17. Briefings in Bioinformatics (326 papers in training set; Top 4%; 1.7%)
18. Genome Medicine (154 papers in training set; Top 5%; 1.5%)
19. Annals of Clinical and Translational Neurology (29 papers in training set; Top 0.7%; 1.3%)
20. EBioMedicine (39 papers in training set; Top 0.5%; 1.2%)
21. iScience (1063 papers in training set; Top 23%; 1.1%)
22. NAR Molecular Medicine (18 papers in training set; Top 0.1%; 1.1%)
23. Neuron (282 papers in training set; Top 7%; 1.0%)
24. EMBO Molecular Medicine (85 papers in training set; Top 3%; 1.0%)
25. Frontiers in Neurology (91 papers in training set; Top 4%; 1.0%)
26. Genome Biology (555 papers in training set; Top 6%; 0.9%)
27. Cell Reports Medicine (140 papers in training set; Top 6%; 0.9%)
28. npj Digital Medicine (97 papers in training set; Top 3%; 0.8%)
29. Bioinformatics (1061 papers in training set; Top 9%; 0.8%)
30. Clinical Immunology (21 papers in training set; Top 0.5%; 0.8%)
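
The statement that the top 12 journals account for 50% of the predicted probability mass can be checked from the listed percentages. The short sketch below sums them in rank order and reports where the running total first reaches 50%; the helper name `journals_to_reach` is hypothetical, and the values are copied from the first 14 entries of the listing above.

```python
# Predicted probabilities (percent) for ranks 1-14, copied from the listing.
probs = [10.2, 6.5, 4.9, 4.4, 4.4, 4.0, 3.6, 3.1, 2.8, 2.4, 2.1, 1.9, 1.9, 1.9]


def journals_to_reach(probs, threshold):
    """Return (count, running_total) at the first rank where the
    cumulative probability reaches `threshold`, or (None, total) if never."""
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None, total


count, total = journals_to_reach(probs, 50.0)
print(count, round(total, 1))  # prints "12 50.3"
```

The first 11 entries sum to 48.4%, so the twelfth is the one that crosses the 50% line, consistent with the divider in the listing.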