Hybrid Stacking-Bagging Ensembles for Robust Multi-Omics Breast Cancer Prognosis
Bozorgpour, R.
Accurate breast cancer risk prediction remains a central challenge in precision oncology due to the complexity and heterogeneity of the underlying biological processes. While single-modality models based on clinical, gene expression, or copy number variation (CNV) data provide valuable prognostic insights, they often fail to capture complementary information across data sources. Conventional stacking ensembles improve predictive performance through multimodal integration but remain susceptible to variance and overfitting. In this study, we propose a heterogeneous hybrid ensemble framework that combines stacking and bagging to enhance robustness and accuracy in multi-omics breast cancer classification. The framework integrates clinical features, gene expression profiles, and CNV data through stacked multimodal representations, followed by parallel stacking and bagging meta-learning and weighted fusion of the two meta-learner outputs. Experiments on the METABRIC cohort show that the proposed hybrid model achieves a ROC AUC of 0.9355, outperforming unimodal models (AUC range: 0.80-0.88) and a conventional stacking ensemble (AUC = 0.919). At the operating point that maximizes Youden's J, the hybrid approach yields balanced sensitivity (0.8571) and specificity (0.8792), with an overall accuracy of 87.4% and an F1-score of 0.7706. These results highlight the effectiveness of hybrid ensemble learning for robust multimodal integration and demonstrate its potential as a scalable and reliable approach for breast cancer risk prediction. The proposed framework offers a practical pathway toward improved predictive stability and supports the broader application of ensemble-based strategies in precision medicine.
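The abstract's last two processing steps, weighted fusion of the stacking and bagging meta-learner outputs and selection of the operating point by Youden's J, can be illustrated with a minimal sketch. It is not the author's implementation. The function names `weighted_fusion` and `youden_threshold`, the fusion weight `w`, and the toy labels and scores are all assumptions made for illustration; the abstract does not state the weight used or how it was chosen, and none of the numbers below come from METABRIC.

```python
from typing import List, Sequence, Tuple


def weighted_fusion(p_stack: Sequence[float],
                    p_bag: Sequence[float],
                    w: float = 0.5) -> List[float]:
    """Convex combination of two predicted-probability vectors.

    `w` is a hypothetical weight on the stacking branch; the bagging
    branch receives 1 - w. The paper's actual weighting is not given.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    if len(p_stack) != len(p_bag):
        raise ValueError("probability vectors must have equal length")
    return [w * a + (1.0 - w) * b for a, b in zip(p_stack, p_bag)]


def youden_threshold(y_true: Sequence[int],
                     scores: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (threshold, J, sensitivity, specificity) maximizing
    Youden's J = sensitivity + specificity - 1.

    A sample is predicted positive when its score >= threshold.
    Ties in J are resolved in favor of the highest threshold.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    best = (float("inf"), -1.0, 0.0, 0.0)
    # Every distinct score is a candidate cut-off.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if j > best[1]:
            best = (t, j, sens, spec)
    return best


if __name__ == "__main__":
    # Toy data only (assumed for illustration).
    y = [0, 0, 0, 0, 1, 1, 1]
    p_stack = [0.1, 0.3, 0.2, 0.7, 0.4, 0.8, 0.9]
    p_bag = [0.2, 0.1, 0.4, 0.5, 0.6, 0.6, 0.9]
    fused = weighted_fusion(p_stack, p_bag, w=0.5)
    print(youden_threshold(y, fused))
```

Because Youden's J weights sensitivity and specificity equally, the selected threshold tends to give balanced values of the two, which is consistent with the sensitivity and specificity reported above. In practice the threshold and any fusion weight would be chosen on validation data rather than on the test set.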
Matching journals
The top 13 journals account for 50% of the predicted probability mass.