
A Novel Swarm Intelligence-Driven Feature Selection for Interpretable Machine Learning in GBM Overall Survival Analysis

Duman, A.; Sun, X.; Powell, J. R.; Spezi, E.

2025-04-17 oncology
10.1101/2025.04.16.25325927 medRxiv

Purpose: In this study, we develop and validate an interpretable machine learning (ML) model that integrates a hybrid Swarm Intelligence (SI)-based feature selection method with Magnetic Resonance Imaging (MRI)-derived radiomic features (RFs) to estimate overall survival (OS) in Glioblastoma Multiforme (GBM) patients. By emphasizing feature reproducibility and leveraging multi-institutional retrospective datasets, the study seeks to improve the generalizability of the prognostic model and its potential for clinical integration.

Methods: A cohort of 276 GBM patients with open-access pre-treatment MRI data (T1, T1ce, T2, and FLAIR sequences) was used for comprehensive radiomic analysis. The extraction protocol yielded 1980 RFs per patient, drawn from three tumor regions (enhancing tumor: ET, tumor core: TC, and whole tumor: WT). The prognostic framework was built step by step, starting with a model of up to 10 RFs and then improving prediction by adding a single clinical feature (age). In the training (discovery) dataset, we employed five-fold cross-validation combined with bootstrapping for robust methodological validation. Model evaluation covered the concordance index (C-index) with 95% confidence intervals (CI), and survival stratification into low- and high-risk groups for OS using Kaplan-Meier curves and the log-rank test.

Results: The final survival model integrates patient age and ten independent RFs, derived from three tumor contours and two MRI sequences (T1, FLAIR). On the holdout test dataset, the model achieved a C-index of 0.71 (95% CI: 0.61-0.79) with statistically significant risk stratification (p = 2 x 10-). On external validation, the model achieved a C-index of 0.64 and remained statistically significant (p = 1 x 10^-2). The research combined regularized Cox regression (Cox-LASSO), a traditional ML model, with a new SI-based LASSO-PSO method, yielding significant stratification. To our knowledge, this study is the first documented use of an interpretable ML model with an SI-based approach (LASSO-PSO) for successful risk stratification based on OS.

Conclusion: This study presents the development and validation of a clinical-radiomic model for time-to-event analysis in GBM patients. By leveraging multicenter retrospective datasets, the model enables effective risk stratification based on OS. A key direction for future work is to combine deep learning (DL)-based features with engineered features extracted via standardized convolutional filters, with the objective of improving OS prediction.
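The C-index reported in the abstract measures how often a model ranks patients correctly by risk. The following is a minimal sketch of Harrell's C-index in plain Python, not the authors' implementation: the function name `harrell_c_index` and the toy values are made up for illustration, and pairs with tied survival times are simply skipped, which is a simplification.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the risk score orders correctly.

    times:       follow-up time per patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output, higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair must be anchored on an observed death
        for j in range(n):
            # Comparable only if patient i died strictly before j's last follow-up.
            if i == j or not times[i] < times[j]:
                continue
            comparable += 1
            if risk_scores[i] > risk_scores[j]:
                concordant += 1.0   # earlier death had the higher risk score
            elif risk_scores[i] == risk_scores[j]:
                concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study): months, event flags, risk scores.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.1]
toy_c_index = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported 0.71 (holdout) and 0.64 (external validation).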

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Computers in Biology and Medicine; 120 papers in training set; Top 0.1%; 19.6%
2. Biology Methods and Protocols; 53 papers in training set; Top 0.1%; 18.5%
3. PLOS ONE; 4510 papers in training set; Top 13%; 14.5%
(50% of probability mass above)
4. Scientific Reports; 3102 papers in training set; Top 12%; 7.3%
5. Artificial Intelligence in Medicine; 15 papers in training set; Top 0.1%; 4.9%
6. BMC Medical Informatics and Decision Making; 39 papers in training set; Top 1.0%; 2.8%
7. Frontiers in Oncology; 95 papers in training set; Top 2%; 2.1%
8. Neuro-Oncology Advances; 24 papers in training set; Top 0.2%; 2.1%
9. Cancers; 200 papers in training set; Top 3%; 1.7%
10. Journal of Magnetic Resonance Imaging; 14 papers in training set; Top 0.4%; 1.7%
11. Journal of Translational Medicine; 46 papers in training set; Top 1%; 1.3%
12. Brain and Behavior; 37 papers in training set; Top 0.8%; 1.2%
13. Frontiers in Genetics; 197 papers in training set; Top 7%; 1.2%
14. FEBS Open Bio; 29 papers in training set; Top 0.3%; 1.0%
15. PeerJ; 261 papers in training set; Top 13%; 0.8%
16. Biomedicines; 66 papers in training set; Top 3%; 0.8%
17. JMIR Medical Informatics; 17 papers in training set; Top 1%; 0.8%
18. Diagnostics; 48 papers in training set; Top 2%; 0.8%
19. European Radiology; 14 papers in training set; Top 0.7%; 0.8%
20. Journal of Medical Internet Research; 85 papers in training set; Top 4%; 0.8%
21. International Journal of Molecular Sciences; 453 papers in training set; Top 15%; 0.8%
22. BMC Cancer; 52 papers in training set; Top 2%; 0.8%
23. Annals of Biomedical Engineering; 34 papers in training set; Top 1%; 0.7%
24. Heliyon; 146 papers in training set; Top 8%; 0.7%
25. BMC Bioinformatics; 383 papers in training set; Top 8%; 0.5%
26. Journal of the Neurological Sciences; 17 papers in training set; Top 1.0%; 0.5%
27. PLOS Computational Biology; 1633 papers in training set; Top 29%; 0.5%
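
The statement that the top 3 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. This is a small sketch of that arithmetic; the helper name `journals_covering` is hypothetical, and only the first five percentages from the list are used.

```python
def journals_covering(probabilities, threshold):
    """Return (count, cumulative mass) for the shortest ranked prefix reaching threshold."""
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return count, total
    return len(probabilities), total  # threshold never reached


# Predicted probabilities (in percent) for ranks 1 to 5, copied from the list.
top_probabilities = [19.6, 18.5, 14.5, 7.3, 4.9]
count, mass = journals_covering(top_probabilities, 50.0)
print(f"{count} journals cover {mass:.1f}% of the probability mass")
```

The first two entries sum to 38.1%, so the third entry is the one that carries the cumulative total past 50%.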