
RNAseq-Based Machine Learning Models for Prognostication of Multiple Myeloma

Shah, K. U.; Millan, K. A.; Pula, A. E.; Kubicki, T. F.; Cannova, J.; Wu, S.; Bhagwat, M.; Guenther, Q. C.; Cooperrider, J.; Roloff, G.; Venkat, A.; Derman, B. A.; Jakubowiak, A. J.; Drazer, M. W.

2025-02-06 oncology
10.1101/2025.01.31.25321495 medRxiv

Background: Multiple myeloma (MM) is characterized by abnormal plasma cell proliferation in the bone marrow, leading to symptoms like osteolytic lesions, anemia, hypercalcemia, and elevated serum creatinine. RNA-sequencing-based prognostic indicators for MM have shown promise in stratifying risk and assessing first-line treatment options. This study uses machine learning techniques and leverages RNA-sequencing, clinical, and biochemical data from the Multiple Myeloma Research Foundation (MMRF) CoMMpass cohort to predict patient prognosis.

Methods: RNAseq data of 60,623 genes from bone marrow samples of 708 MM patients were pre-processed for batch effect correction and split into training (70%) and testing (30%) sets. Feature selection involved MAD, mRMR, and iterative permutation importance filtering for predicting PFS and OS. Machine learning survival models like Random Survival Forest (RSF), Gradient Boosted (GB), and Component-wise Gradient Boosted (CGB) were developed and optimized. Performance was evaluated using C-index and integrated Brier score (IBS).

Results: The RSF and GB models showed the highest performance for predicting progression-free survival (PFS) and overall survival (OS) on the testing dataset. Significant features for PFS included stem cell transplant status, serum β2-microglobulin levels, germline mutational status, and expression of C12orf75 and ENSG00000256006. For OS, stem cell transplant status, age, serum β2-microglobulin levels, germline mutational status, and expression of NUTM2B-AS1 and ENSG00000287022 were prominent. Gene ontology analyses confirmed the biological relevance of enriched pathways related to cell division, protein localization, and cancer.

Conclusion: Integrating RNAseq and clinical data with advanced machine learning models presents a robust approach for predicting MM prognosis, highlighting gene expression programs, germline mutational status, and clinical markers as significant features. Future research should focus on independent validation to confirm findings and explore additional genomic data for enhanced prognostication.
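
To make two of the steps named in the abstract concrete, the sketch below shows a variance-style filter based on median absolute deviation (MAD) and a plain implementation of Harrell's concordance index (C-index). It is a minimal illustration only, not the authors' pipeline: the function names, the toy data, and the choice to keep a fixed number of top-MAD features are all assumptions, and the mRMR, permutation importance, and model-fitting steps are omitted.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of a sequence of numbers."""
    m = median(values)
    return median(abs(v - m) for v in values)


def top_mad_features(expr, k):
    """Return the names of the k features with the largest MAD.

    expr maps a feature name to its per-sample expression values.
    Keeping a fixed top-k is an assumption made for this sketch.
    """
    ranked = sorted(expr, key=lambda name: mad(expr[name]), reverse=True)
    return ranked[:k]


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i has the shorter time and
    an observed event. A higher risk score should go with the shorter
    time. Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: three "genes" measured in four samples.
toy_expr = {
    "gene_a": [1.0, 1.1, 0.9, 1.0],
    "gene_b": [0.0, 5.0, 10.0, 2.0],
    "gene_c": [3.0, 3.5, 2.5, 4.0],
}
selected = top_mad_features(toy_expr, 2)

# Hypothetical follow-up times, event flags (1 = event observed), and
# model risk scores for four patients.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The quadratic pair loop is fine for a toy example, but for a cohort-sized dataset a dedicated survival analysis library would be the more practical choice.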

Matching journals

The top 12 journals account for 50% of the predicted probability mass.

1. Blood Cancer Journal: 11 papers in training set; Top 0.1%; 7.3%
2. Blood Advances: 54 papers in training set; Top 0.2%; 6.4%
3. Frontiers in Oncology: 95 papers in training set; Top 0.5%; 6.4%
4. Scientific Reports: 3102 papers in training set; Top 23%; 4.9%
5. Computers in Biology and Medicine: 120 papers in training set; Top 0.5%; 4.4%
6. PLOS ONE: 4510 papers in training set; Top 36%; 4.0%
7. Frontiers in Immunology: 586 papers in training set; Top 2%; 3.6%
8. British Journal of Haematology: 15 papers in training set; Top 0.1%; 3.3%
9. Journal of Hematology & Oncology: 10 papers in training set; Top 0.1%; 2.8%
10. Cancers: 200 papers in training set; Top 2%; 2.6%
11. JCO Precision Oncology: 14 papers in training set; Top 0.1%; 2.4%
12. PeerJ: 261 papers in training set; Top 4%; 2.4%
50% of probability mass above
13. Leukemia: 39 papers in training set; Top 0.4%; 2.1%
14. JCO Clinical Cancer Informatics: 18 papers in training set; Top 0.3%; 2.1%
15. Biology Methods and Protocols: 53 papers in training set; Top 0.7%; 1.9%
16. Oncotarget: 15 papers in training set; Top 0.1%; 1.7%
17. Frontiers in Genetics: 197 papers in training set; Top 5%; 1.7%
18. Transplantation: 13 papers in training set; Top 0.3%; 1.3%
19. npj Precision Oncology: 48 papers in training set; Top 0.7%; 1.3%
20. FEBS Open Bio: 29 papers in training set; Top 0.2%; 1.3%
21. Journal of the Neurological Sciences: 17 papers in training set; Top 0.4%; 1.3%
22. Neuropathology and Applied Neurobiology: 14 papers in training set; Top 0.3%; 1.3%
23. Biomolecules: 95 papers in training set; Top 0.9%; 1.2%
24. British Journal of Cancer: 42 papers in training set; Top 1%; 1.2%
25. Heliyon: 146 papers in training set; Top 3%; 1.2%
26. British Journal of Clinical Pharmacology: 21 papers in training set; Top 0.5%; 1.1%
27. F1000Research: 79 papers in training set; Top 3%; 1.0%
28. Modern Pathology: 21 papers in training set; Top 0.3%; 1.0%
29. International Journal of Molecular Sciences: 453 papers in training set; Top 13%; 0.9%
30. European Journal of Cancer: 10 papers in training set; Top 0.5%; 0.8%
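
The statement that the top 12 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below does this for the first 14 entries; the function name is hypothetical, and because the displayed percentages are rounded, the running total is only approximate.

```python
def entries_to_reach(probabilities, threshold):
    """Count how many top-ranked entries are needed for the cumulative
    probability to reach the threshold (both given in percent).

    Returns None if the threshold is never reached.
    """
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return count
    return None


# Displayed (rounded) percentages for ranks 1 to 14, in rank order.
listed = [7.3, 6.4, 6.4, 4.9, 4.4, 4.0, 3.6, 3.3, 2.8, 2.6, 2.4, 2.4, 2.1, 2.1]
cutoff_rank = entries_to_reach(listed, 50.0)
```

With these rounded values the cumulative total after rank 11 is about 48.1% and after rank 12 about 50.5%, which matches the position of the 50% marker in the list.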