A Novel Swarm Intelligence-Driven Feature Selection for Interpretable Machine Learning in GBM Overall Survival Analysis

Duman, A.; Sun, X.; Powell, J. R.; Spezi, E.

2025-04-17 oncology

10.1101/2025.04.16.25325927 medRxiv


Purpose: In this study, we develop and validate an interpretable machine learning (ML) model that integrates a hybrid Swarm Intelligence (SI)-based feature selection method with Magnetic Resonance Imaging (MRI)-derived radiomic features (RFs) to estimate overall survival (OS) in glioblastoma multiforme (GBM) patients. The study aims to improve the generalizability of the prognostic model and its potential for clinical integration by emphasizing feature reproducibility and using multi-institutional retrospective datasets.

Methods: A cohort of 276 GBM patients with open-access pre-treatment MRI data (T1, T1ce, T2, and FLAIR sequences) was used for radiomic analysis. The extraction protocol yielded 1980 RFs per patient from three tumor regions (enhancing tumor: ET, tumor core: TC, and whole tumor: WT). The prognostic framework was built step by step, starting with a model of up to 10 RFs and then improving prediction by adding a single clinical feature (age). In the training (discovery) dataset, we used five-fold cross-validation combined with bootstrapping for robust internal validation. Model evaluation covered the concordance index (C-index) with 95% confidence intervals (CI), and survival stratification into low- and high-risk OS groups was assessed with Kaplan-Meier curves and the log-rank test.

Results: The final survival model integrates patient age and ten independent RFs, derived from three tumor contours and two MRI sequences (T1, FLAIR). On the holdout test dataset, the model achieved a C-index of 0.71 (95% CI: 0.61-0.79) with statistically significant risk stratification (p = 2 x 10-). On external validation, the model achieved a C-index of 0.64 and remained statistically significant (p = 1 x 10^-2). The research combined regularized Cox regression (Cox-LASSO), a traditional ML model, with a new SI-based LASSO-PSO method, yielding significant stratification. To our knowledge, this is the first documented use of an interpretable ML model with an SI-based approach (LASSO-PSO) for successful risk stratification based on OS.

Conclusion: This study presents the development and validation of a clinical-radiomic model for time-to-event analysis in GBM patients. By leveraging multicenter retrospective datasets, the model enables effective risk stratification based on OS. A key direction for future work is to combine deep learning (DL)-based features with engineered features extracted via standardized convolutional filters, with the objective of improving OS prediction.
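The abstract reports performance as a C-index with a 95% confidence interval. As a rough illustration of what that metric measures, the sketch below computes Harrell's C-index and a percentile bootstrap interval in plain Python. It is not the authors' pipeline: the function names, the toy survival times, event flags, and risk scores are all hypothetical, pairs with tied survival times are simply skipped, and the percentile bootstrap is an assumed choice of interval method.

```python
import random


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i has an observed event
    (events[i] == 1) and a strictly shorter time than patient j.
    The pair is concordant when i also has the higher risk score;
    tied risk scores count as half. Tied times are skipped (simplification).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_ci(times, events, risks, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-index (assumed method)."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        try:
            stats.append(concordance_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risks[k] for k in idx],
            ))
        except ValueError:
            continue  # resample had no comparable pairs; skip it
    if not stats:
        raise ValueError("no usable bootstrap resamples")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


# Hypothetical toy data: survival time in months, event flag (1 = death observed,
# 0 = censored), and a model risk score (higher = worse predicted prognosis).
times = [5, 8, 12, 14, 20, 25]
events = [1, 1, 0, 1, 1, 0]
risks = [2.1, 1.7, 1.5, 1.9, 0.4, 0.2]

c = concordance_index(times, events, risks)  # 11 of 12 comparable pairs are concordant
ci = bootstrap_ci(times, events, risks)
print(f"C-index = {c:.3f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.71 and 0.64. In practice an established survival-analysis library would normally be used instead of a hand-written quadratic loop.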

