Ensemble Approaches to Screening, Diagnosis, and Subtyping of Multiple Sclerosis
Yang, I. Y.; Patil, A.; Jin, O.; Loud, S.; Buxhoeveden, S.; Zhang, D. Y.
Multiple sclerosis (MS) is a debilitating disease affecting more than 1 million Americans, and today it is assessed primarily through magnetic resonance imaging (MRI) and observed clinical symptoms. Given the autoimmune nature of MS, we hypothesized that high-dimensional gene expression data from peripheral blood mononuclear cells (PBMCs), when analyzed with the assistance of AI, may collectively serve as valuable biomarkers for the real-time risk and progression of MS. Here, we present PBMC RNA sequencing (RNAseq) results from N=997 samples, including 540 MS, 221 neuromyelitis optica (NMO), and 149 healthy controls. We constructed and optimized ensemble models for three clinical outcomes: (1) discrimination of early MS (EDSS ≤ 2.0) from healthy individuals with 74% AUC at 100% coverage, (2) differential diagnosis of MS from NMO with 91% AUC at 80% coverage, and (3) subtyping of relapsing-remitting MS (RRMS) from progressive MS with 79% AUC at 80% coverage. To our knowledge, no prior molecular test has been reported for any of these three MS clinical tasks, and these results may have immediate impact on the clinical management of MS patients. Two innovations improved the stratification accuracy of our models: selection of gene sets based on expression variance in disease states, and use of non-linear rank sort and conviction weighting in the ensemble score calculation.
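The abstract reports AUC at a stated coverage without defining coverage. One common reading, assumed here and not confirmed by the source, is that the model abstains on its least confident samples and AUC is computed on the remaining fraction. The sketch below illustrates that reading only; the function names `auc` and `auc_at_coverage`, the `midpoint` parameter, and the toy data are all hypothetical and are not the authors' method.

```python
def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_at_coverage(labels, scores, coverage, midpoint=0.5):
    """AUC over the most confident `coverage` fraction of samples.

    Assumption: confidence is the distance of a score from `midpoint`,
    and samples nearest the midpoint are the ones left uncalled.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    keep = max(2, round(coverage * len(scores)))
    # Most confident samples first; the tail of the list is dropped.
    ranked = sorted(zip(labels, scores),
                    key=lambda pair: abs(pair[1] - midpoint),
                    reverse=True)
    kept = ranked[:keep]
    return auc([y for y, _ in kept], [s for _, s in kept])


# Toy data (made up for illustration): 1 = case, 0 = control.
labels = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.55, 0.9, 0.8, 0.7, 0.45, 0.52, 0.48]

full_auc = auc_at_coverage(labels, scores, coverage=1.0)
partial_auc = auc_at_coverage(labels, scores, coverage=0.8)
print(full_auc, partial_auc)
```

Under this reading, reducing coverage trades the number of patients who receive a call for higher accuracy among those who do, which is consistent with the abstract quoting AUC and coverage together.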