Integrative Bioinformatics Approach to Identify Prognostic Gene Signatures for Risk Stratification in Thyroid Carcinoma
Malik, S.; Raghava, G. P. S.
Thyroid cancer is a heterogeneous malignancy with variable outcomes, highlighting the need for reliable biomarkers and effective risk stratification. In this study, we implemented a multi-step integrative framework to identify distinct prognostic biomarker sets using transcriptomic data from 572 thyroid cancer patients. Correlation analysis followed by false discovery rate (FDR) correction identified genes significantly associated with overall survival (OS). Notably, MAFF (r = 0.25, p = 1.34x10-, FDR = 2.46x10-), NR4A3 (r = 0.24, p = 1.26x10-, FDR = 9.25x10-), and SRF showed strong positive correlations with OS, whereas LOC728264 (r = -0.21, p = 7.39x10-, FDR = 6.36x10-) and VAMP1 (r = -0.20, p = 1.20x10-, FDR = 1.3x10-) were negatively correlated with OS. Univariate Cox regression identified several survival-associated genes, including TMEM90B (HR = 10.66, p = 2.88x10-) and PTH1R (HR = 9.88, p = 5.55x10-). LASSO regression further identified 31 key prognostic genes, including 13 potential drug targets predominantly functioning as inhibitors. Machine learning models based on seven independent 20-gene biomarker sets classified patients into four survival classes, Class 0 (0-1 years), Class 1 (1-3 years), Class 2 (3-5 years), and Class 3 (>5 years), achieving AUC values of 0.91-0.94 and Kappa values up to 0.76. An ensemble model raised the AUC to 0.95 (Kappa = 0.72). Incorporating clinical variables (age, gender, stage) enhanced model performance (AUC = 0.96, Kappa = 0.80). Reduced 10- and 5-gene subsets showed consistent but slightly lower performance (AUC = 0.90 and 0.86, respectively). Collectively, the 20-gene set exhibited the strongest predictive and prognostic potential, highlighting the importance of integrating molecular and clinical features for risk stratification in thyroid cancer. All data and code are openly available (https://github.com/raghavagps/THCA_prognostic_biomarkers), supporting future research in thyroid cancer prediction.
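To make the four survival classes and the Kappa metric concrete, the following is a minimal Python sketch and not the authors' code (their implementation is in the repository linked above). The function names `survival_class` and `cohen_kappa` are hypothetical, the example survival times and predictions are invented for illustration, and the treatment of boundary values (exactly 1, 3, or 5 years assigned to the lower class) is an assumption, since the abstract does not specify it. Kappa is computed here as unweighted Cohen's kappa, which is also an assumption about the variant used.

```python
from collections import Counter

# Upper bounds (years) of Classes 0-2 as listed in the abstract; longer times are Class 3.
# Assumption: a value exactly on a boundary falls into the lower class.
CLASS_UPPER_BOUNDS = (1.0, 3.0, 5.0)


def survival_class(years):
    """Map a survival time in years to a class label 0-3."""
    if years < 0:
        raise ValueError("survival time must be non-negative")
    for label, upper in enumerate(CLASS_UPPER_BOUNDS):
        if years <= upper:
            return label
    return len(CLASS_UPPER_BOUNDS)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of samples where the labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1.0 - p_e)


# Invented example data: six survival times (years) and six predicted classes.
times = [0.5, 2.0, 4.0, 7.5, 1.0, 6.0]
classes = [survival_class(t) for t in times]
predicted = [0, 1, 2, 3, 1, 3]
kappa = cohen_kappa(classes, predicted)
print(classes, round(kappa, 3))
```

Unlike raw accuracy, kappa discounts agreement expected by chance, which matters when the survival classes are unevenly populated.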
Matching journals
The top 13 journals account for 50% of the predicted probability mass.