Topographical archetypes of somatic mutagenesis in cancer

Lynch, A. W.; Lee, S. S.; Hummel, J. P.; Geiger, B.; Lawrence, M. S.; Jin, H.; Gulhan, D. C.; Park, P. J.

2026-04-21 bioinformatics
10.64898/2026.04.18.719374 bioRxiv
The genome of every cancer cell carries a record of the mutational processes that have acted throughout its history. Mutational signature analysis, which infers the activity of mutagenic processes from their characteristic base-change patterns, has become an indispensable tool for interpreting somatic mutations. However, this framework captures only which types of mutations a process generates and not where in the genome they occur -- a distribution influenced by replication timing, chromatin organization, transcription, DNA secondary structure, and other genomic features. Here, we present a generative probabilistic framework (MuTopia) that jointly infers mutational spectra and their genome-wide topography as nonlinear functions of genomic and epigenomic state. Applied to whole-genome sequencing data from 15 tumor types, MuTopia reveals that mutational processes fall into eight conserved topographic archetypes, or topotypes, shaped primarily by replication timing and chromatin state. Diverse mutational processes converge upon this limited repertoire, indicating that the genomic distribution of mutagenesis is constrained less by the source of damage than by how that damage is processed. Individual mutational processes exhibit state-dependent variation in their genomic distributions: the same signature can adopt distinct topotypes depending on repair proficiency and replication stress. For instance, SBS8 shifts from a canonical late-replicating profile in homologous recombination-proficient tumors to an early-replicating, stress-associated topotype in HR-deficient tumors, and replication stress similarly reshapes the genomic distribution of APOBEC editing. Topotypes, therefore, provide a classification of mutagenesis distinct from spectral signatures, capturing aspects of tumor biology that spectra alone cannot resolve.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

Rank  Journal                                          Papers in training set  Percentile  Probability
1     Cell Systems                                     167                     Top 0.3%    22.1%
2     Nature Communications                            4913                    Top 6%      18.3%
3     Nucleic Acids Research                           1128                    Top 3%      6.2%
4     Genome Biology                                   555                     Top 2%      4.8%
50% of probability mass above
5     Nature Biotechnology                             147                     Top 2%      4.2%
6     Nature Genetics                                  240                     Top 2%      4.1%
7     Proceedings of the National Academy of Sciences  2130                    Top 18%     3.9%
8     Science Advances                                 1098                    Top 8%      3.2%
9     Nature                                           575                     Top 8%      3.0%
10    PLOS Computational Biology                       1633                    Top 12%     2.7%
11    Cancer Research                                  116                     Top 2%      2.0%
12    eLife                                            5422                    Top 39%     1.9%
13    Science                                          429                     Top 13%     1.9%
14    Advanced Science                                 249                     Top 10%     1.9%
15    Genome Medicine                                  154                     Top 4%      1.7%
16    Cell Genomics                                    162                     Top 4%      1.3%
17    The American Journal of Human Genetics           206                     Top 3%      0.9%
18    Nature Methods                                   336                     Top 6%      0.7%
19    Nature Microbiology                              133                     Top 5%      0.7%
20    Molecular Cell                                   308                     Top 10%     0.7%
21    Scientific Reports                               3102                    Top 76%     0.7%
22    Nature Cell Biology                              99                      Top 5%      0.7%
23    Cell Reports                                     1338                    Top 35%     0.7%
24    PLOS ONE                                         4510                    Top 72%     0.6%
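
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, and the values are the six highest probabilities copied from the list above, not output from the prediction model itself.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the six top-ranked journals,
# copied from the list above in rank order.
probs = [22.1, 18.3, 6.2, 4.8, 4.2, 4.1]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed for the running
    sum to reach `threshold`, or None if it is never reached."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


print(journals_to_reach(probs, 50.0))
```

The first three journals sum to roughly 46.6%, so the 50% mark is crossed only when the fourth journal is included.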