
Combined inference of known and novel mutational signatures with ReDeNovo

Kesimoglu, Z. N.; Hodzic, E.; Hoinka, J.; Amgalan, B.; Hirsch, M. G.; Przytycka, T. M.

2026-02-06 genomics
10.64898/2026.02.05.703798 bioRxiv

Mutational signatures represent characteristic mutational patterns imprinted on the genome by mutagenic processes. They can provide information about the impact of environmental and endogenous cellular processes on tumor mutations and can suggest treatment. Analysis of the presence and strength of mutational signatures in cancer genomes has become a cornerstone of the analysis of new and legacy cancer data. However, precise inference of novel (de novo) signatures requires a large set of genomes, and methods that focus on estimating the presence of previously defined signatures cannot uncover novel signatures that might emerge in new data. Thus, reliable methods to address these challenges are needed. We formally define the Combined Mutational Signature Inference Problem (CMSI) for the identification of known signatures and the inference of novel signatures in cancer data. CMSI is a non-convex optimization problem, and we provide a heuristic algorithm, ReDeNovo, to solve it efficiently. We extensively validated ReDeNovo on simulated data, evaluating its ability to precisely estimate the presence of and exposure to known signatures and to discover novel signatures. On both tasks ReDeNovo outperformed existing approaches. In real biological data, ReDeNovo identified signatures missed by previous analyses and defined a new signature related to UV light exposure. The ReDeNovo method provides a new and powerful tool to uncover mutational signatures. ReDeNovo is available from https://github.com/ncbi/redenovo.
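
To make the idea of combined inference concrete, the following is a minimal sketch of one generic way to fit known and novel signatures together: non-negative matrix factorization in which the known signature columns are held fixed and only the novel columns and all exposures are updated. This is not the ReDeNovo algorithm and not the paper's formal CMSI definition; the function name `fit_known_plus_novel`, the Frobenius-norm loss, the multiplicative updates, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np

def fit_known_plus_novel(M, S_known, n_novel, n_iter=500, seed=0, eps=1e-12):
    """Hypothetical helper: approximate M ~ [S_known, S_novel] @ E with E >= 0.

    M        : (categories x samples) non-negative mutation count matrix
    S_known  : (categories x k) fixed known signatures, columns sum to 1
    n_novel  : number of free (novel) signature columns to learn
    Uses Lee-Seung multiplicative updates for the Frobenius loss (an assumption).
    """
    rng = np.random.default_rng(seed)
    n_cat, n_samples = M.shape
    k = S_known.shape[1]

    # Random non-negative start; novel signatures are normalized to sum to 1.
    S_novel = rng.random((n_cat, n_novel))
    S_novel /= S_novel.sum(axis=0, keepdims=True)
    E = rng.random((k + n_novel, n_samples)) * M.sum(axis=0, keepdims=True) / (k + n_novel)

    for _ in range(n_iter):
        S = np.hstack([S_known, S_novel])
        # Update all exposures (known and novel rows).
        E *= (S.T @ M) / (S.T @ S @ E + eps)
        # Update only the novel signature columns; known columns stay fixed.
        E_novel = E[k:]
        S_novel *= (M @ E_novel.T) / (S @ E @ E_novel.T + eps)
        # Renormalize novel signatures and rescale their exposures so that
        # the product S_novel @ E_novel is unchanged.
        scale = S_novel.sum(axis=0, keepdims=True) + eps
        S_novel /= scale
        E[k:] *= scale.T
    return S_novel, E

# Toy data (synthetic, for illustration only): 96 mutation categories,
# two "known" signatures plus one hidden signature, 40 samples.
rng = np.random.default_rng(1)
S_true = rng.dirichlet(np.full(96, 0.2), size=3).T   # shape (96, 3)
E_true = rng.gamma(2.0, 50.0, size=(3, 40))          # shape (3, 40)
M = S_true @ E_true

S_novel, E = fit_known_plus_novel(M, S_true[:, :2], n_novel=1)
```

Because the factorization is non-convex in the signatures and exposures jointly, a scheme like this only reaches a local optimum and depends on initialization, which is one reason a dedicated heuristic is needed in practice.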

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Probability |
|---|---|---|---|---|
| 1 | Bioinformatics | 1061 | Top 0.8% | 26.7% |
| 2 | Cell Systems | 167 | Top 1% | 8.7% |
| 3 | PLOS Computational Biology | 1633 | Top 5% | 6.6% |
| 4 | Nature Communications | 4913 | Top 31% | 5.0% |
| 5 | Genome Research | 409 | Top 0.9% | 3.7% |
| 6 | Genome Biology | 555 | Top 3% | 3.4% |
| 7 | Scientific Reports | 3102 | Top 49% | 2.1% |
| 8 | Proceedings of the National Academy of Sciences | 2130 | Top 27% | 2.1% |
| 9 | iScience | 1063 | Top 11% | 1.9% |
| 10 | Nature Biotechnology | 147 | Top 4% | 1.9% |
| 11 | BMC Bioinformatics | 383 | Top 4% | 1.9% |
| 12 | Nature Methods | 336 | Top 4% | 1.9% |
| 13 | Nature Computational Science | 50 | Top 0.4% | 1.9% |
| 14 | Communications Biology | 886 | Top 7% | 1.8% |
| 15 | Nature Genetics | 240 | Top 4% | 1.7% |
| 16 | eLife | 5422 | Top 44% | 1.5% |
| 17 | Frontiers in Genetics | 197 | Top 5% | 1.5% |
| 18 | IEEE Transactions on Computational Biology and Bioinformatics | 17 | Top 0.3% | 1.5% |
| 19 | Nucleic Acids Research | 1128 | Top 12% | 1.4% |
| 20 | Frontiers in Molecular Biosciences | 100 | Top 3% | 1.3% |
| 21 | Genome Medicine | 154 | Top 6% | 1.1% |
| 22 | Bioinformatics Advances | 184 | Top 4% | 1.0% |
| 23 | PLOS Genetics | 756 | Top 12% | 1.0% |
| 24 | PLOS ONE | 4510 | Top 62% | 1.0% |
| 25 | The American Journal of Human Genetics | 206 | Top 3% | 0.9% |
| 26 | GENETICS | 189 | Top 1% | 0.9% |
| 27 | Cancer Research | 116 | Top 3% | 0.8% |
| 28 | The Annals of Applied Statistics | 15 | Top 0.1% | 0.8% |
| 29 | Biometrics | 22 | Top 0.1% | 0.8% |
| 30 | Journal of Computational Biology | 37 | Top 0.5% | 0.8% |

Ranks 1 to 5 lie above the 50% probability-mass cutoff.
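
The 50% cutoff can be checked by accumulating the listed probabilities in rank order. The short sketch below does this for the first ten journals; the variable names are illustrative, and the values are copied from the listing above.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for ranks 1 to 10, as listed above.
probs = [26.7, 8.7, 6.6, 5.0, 3.7, 3.4, 2.1, 2.1, 1.9, 1.9]

# Running total of probability mass by rank.
cumulative = list(accumulate(probs))

# First rank at which the running total reaches at least 50%.
cutoff_rank = next(i for i, c in enumerate(cumulative, start=1) if c >= 50.0)

print(cutoff_rank, round(cumulative[cutoff_rank - 1], 1))  # prints: 5 50.7
```

The running total is 47.0% after four journals and 50.7% after five, so the cutoff falls after rank 5.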