Integrated Multi-Omics Analysis for the Identification of Disease-Associated Variations and Prognostic Biomarkers in Triple-Negative Breast Cancer (TNBC)

MANNEKUNTA, N.; NATRAJAN, E.

2026-05-06 bioinformatics
10.64898/2026.05.03.722461 bioRxiv

Background: Triple-negative breast cancer (TNBC) exhibits substantial molecular heterogeneity and lacks targeted receptor therapies. Single-omic approaches inadequately capture its regulatory complexity, necessitating integrated multi-omic frameworks to identify stable prognostic signatures.

Methods: Matched transcriptomic and DNA methylation data from the TCGA-BRCA cohort were normalised and mathematically integrated to isolate disease-associated variations. A calibrated machine learning voting ensemble (comprising LightGBM, Random Forest, and Logistic Regression) was trained to predict clinical survival. Model generalisability was tested on an independent microarray cohort (GSE58812) using independent quantile normalisation. SHAP (SHapley Additive exPlanations) values provided biological interpretability.

Results: Differential and integrative analyses identified a 47-gene master prognostic signature. The ensemble classifier achieved an external validation accuracy of 74.77% (AUC 0.590) on unseen clinical patients. SHAP analysis confirmed the biological directionality of these specific biomarkers in driving mortality. Hypergeometric pathway enrichment highlighted targetable metabolic and signalling networks.

Conclusions: This multi-omic machine learning pipeline identifies a highly prognostic 47-gene signature for TNBC. The model demonstrates strong cross-platform generalisability and offers interpretable clinical utility for stratifying patient risk and guiding future therapeutic target development.
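The abstract mentions hypergeometric pathway enrichment without giving details. As a rough illustration only, the upper-tail hypergeometric test commonly used for this kind of over-representation analysis can be sketched as below. The function name and every number in the usage lines (background size, pathway size, overlap) are hypothetical placeholders, not values from the preprint; only the signature size of 47 comes from the abstract.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, signature_size, overlap):
    """Upper-tail probability P(X >= overlap) under a hypergeometric model.

    population     : number of background genes (assumed value in the example)
    pathway_size   : background genes annotated to the pathway
    signature_size : genes in the signature being tested
    overlap        : signature genes that fall in the pathway
    """
    max_overlap = min(pathway_size, signature_size)
    total = comb(population, signature_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, signature_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical inputs: 20000 background genes, a 150-gene pathway,
# the 47-gene signature, and 4 signature genes inside the pathway.
p_value = hypergeom_enrichment_p(20000, 150, 47, 4)
print(f"enrichment p-value: {p_value:.3g}")
```

A real analysis would also correct for testing many pathways at once (for example with a false discovery rate procedure), which this sketch leaves out.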

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Scientific Reports | 3102 | Top 10% | 8.3%
2 | Bioinformatics | 1061 | Top 3% | 8.3%
3 | PLOS Computational Biology | 1633 | Top 6% | 6.3%
4 | JNCI Cancer Spectrum | 10 | Top 0.1% | 6.3%
5 | BMC Bioinformatics | 383 | Top 2% | 6.3%
6 | Breast Cancer Research | 32 | Top 0.2% | 4.8%
7 | Computational and Structural Biotechnology Journal | 216 | Top 0.9% | 4.8%
8 | Genome Medicine | 154 | Top 2% | 3.6%
9 | npj Breast Cancer | 18 | Top 0.1% | 3.6%
50% of probability mass above
10 | PLOS ONE | 4510 | Top 40% | 3.6%
11 | International Journal of Cancer | 42 | Top 0.3% | 3.6%
12 | Cancers | 200 | Top 2% | 2.3%
13 | Frontiers in Bioinformatics | 45 | Top 0.1% | 2.1%
14 | Nature Communications | 4913 | Top 48% | 2.1%
15 | Bioinformatics Advances | 184 | Top 2% | 1.9%
16 | iScience | 1063 | Top 13% | 1.8%
17 | npj Systems Biology and Applications | 99 | Top 1% | 1.7%
18 | eBioMedicine | 130 | Top 1% | 1.7%
19 | Frontiers in Genetics | 197 | Top 6% | 1.5%
20 | JCO Clinical Cancer Informatics | 18 | Top 0.6% | 1.5%
21 | npj Precision Oncology | 48 | Top 0.8% | 1.3%
22 | Cancer Research Communications | 46 | Top 0.7% | 1.2%
23 | Frontiers in Artificial Intelligence | 18 | Top 0.7% | 0.8%
24 | Communications Biology | 886 | Top 27% | 0.7%
25 | The Journal of Clinical Endocrinology & Metabolism | 35 | Top 1% | 0.6%
26 | BMC Research Notes | 29 | Top 0.8% | 0.6%
27 | Frontiers in Immunology | 586 | Top 9% | 0.6%
28 | PeerJ | 261 | Top 17% | 0.6%
29 | International Journal of Epidemiology | 74 | Top 3% | 0.6%
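
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The sketch below does this with the 29 percentages shown in the listing; the function name is a hypothetical helper, and because the listed values are rounded to one decimal place the check is approximate.

```python
# Predicted probabilities (percent) for ranks 1 to 29, copied from the listing.
probabilities = [
    8.3, 8.3, 6.3, 6.3, 6.3, 4.8, 4.8, 3.6, 3.6, 3.6,
    3.6, 2.3, 2.1, 2.1, 1.9, 1.8, 1.7, 1.7, 1.5, 1.5,
    1.3, 1.2, 0.8, 0.7, 0.6, 0.6, 0.6, 0.6, 0.6,
]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed for the running
    total to reach the threshold, or None if it is never reached."""
    running_total = 0.0
    for count, probability in enumerate(probs, start=1):
        running_total += probability
        if running_total >= threshold:
            return count
    return None


print(journals_to_reach(probabilities, 50.0))  # prints 9
```

The running total is about 48.7% after eight journals and about 52.3% after nine, which matches the position of the 50% marker in the listing.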