
Coupling time-aware SNP thresholds with genetic markers to define bacterial transmission chains in hospital surveillance

Burgaya, J.; Erdmann, J.; Knegendorf, L.; Baier, C.; Strunin, D.; Marvig, R. L.; Nielsen, K. L.; Hertz, F. B.; Schlueter, D.; Haeussler, S.; Galardini, M.

2026-05-01 infectious diseases
10.64898/2026.04.27.26351816 medRxiv

Hospital settings can act as reservoirs for pathogens, with many nosocomial bacteria surviving for extended periods of time and spreading via contaminated environments, healthcare workers, or medical equipment. Infection prevention and control in hospital settings relies on accurately identifying transmission events to prevent further spread. Whole-genome sequencing (WGS) offers the possibility to accurately identify bacterial strains and define transmission routes, yet traditional analyses often rely on fixed single nucleotide polymorphism (SNP) thresholds that fail to account for the accumulation of variation over extended periods of time. We analyzed WGS surveillance data from bacterial isolates collected over five years in two hospitals in Germany and Denmark. By leveraging longitudinal sampling and repeated isolates from persistent infections, we estimated in vivo evolutionary rates from different strains belonging to three species: Escherichia coli, Klebsiella pneumoniae and Pseudomonas aeruginosa. Using species-level mean molecular clock rates, we developed a time-aware framework that defines transmission thresholds based on expected SNP accumulation over time. Using this approach, transmission clusters were detected in 0.3% of E. coli, 9.3% of K. pneumoniae and 3.5% of P. aeruginosa isolates. To understand the genetic factors underlying epidemic strain potential, we compared epidemic lineages that were part of transmission chains with sporadic lineages. We found that epidemic lineages of K. pneumoniae and E. coli had higher virulence scores than sporadic strains, with enrichment of siderophore and adhesion genes, while resistance scores were similar. Genome-wide association analyses revealed hundreds of variants associated with epidemic status, particularly in replication, recombination and repair mechanisms, as well as metabolism-related functions. Using the predicted virulence and resistance scores, we could readily observe predicted phenotypic changes within transmission clusters, including the loss of virulence factors in an outbreak and the emergence of resistance after antibiotic treatment.
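
The abstract does not spell out how the time-aware threshold is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that SNPs accumulate as a Poisson process at a constant species-level clock rate, and that two isolates are considered compatible with transmission when their SNP distance does not exceed an upper quantile of the expected count for the time between sampling dates. The function names, the rate of 2 SNPs per genome per year, and the 95% quantile are all hypothetical choices made for illustration.

```python
import math


def poisson_quantile(mean: float, q: float) -> int:
    """Return the smallest k such that P(X <= k) >= q for X ~ Poisson(mean)."""
    if mean < 0 or not 0.0 < q < 1.0:
        raise ValueError("mean must be >= 0 and q must lie in (0, 1)")
    if mean > 700:
        # exp(-mean) would underflow to 0 and the loop below would not end.
        raise ValueError("mean too large for this simple implementation")
    k = 0
    pmf = math.exp(-mean)  # P(X = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= mean / k  # recurrence: P(X = k) = P(X = k - 1) * mean / k
        cdf += pmf
    return k


def time_aware_threshold(rate_per_year: float, days_apart: float,
                         q: float = 0.95) -> int:
    """SNP threshold that grows with the time between two samples.

    Assumption (not from the source): expected SNPs = rate * elapsed years,
    ignoring shared ancestry before the earlier sample.
    """
    expected_snps = rate_per_year * abs(days_apart) / 365.25
    return poisson_quantile(expected_snps, q)


def is_plausible_link(snp_distance: int, rate_per_year: float,
                      days_apart: float, q: float = 0.95) -> bool:
    """True if the observed SNP distance is within the time-aware threshold."""
    return snp_distance <= time_aware_threshold(rate_per_year, days_apart, q)


if __name__ == "__main__":
    hypothetical_rate = 2.0  # SNPs per genome per year, illustrative only
    for days in (0, 90, 365.25, 730.5):
        print(days, time_aware_threshold(hypothetical_rate, days))
```

Under these assumptions, the threshold is 0 SNPs for isolates sampled on the same day and rises as the sampling interval lengthens, which is the behaviour a fixed cutoff cannot capture. A real analysis would also need to account for within-host diversity and sequencing error, which this sketch omits.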

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

Rank. Journal; papers in training set; percentile; predicted probability

1. Nature Communications; 4913 papers in training set; Top 0.1%; 45.5%
2. Nature Genetics; 240 papers in training set; Top 1%; 6.5%
(50% of probability mass above)
3. Science Advances; 1098 papers in training set; Top 3%; 4.4%
4. Science; 429 papers in training set; Top 7%; 4.4%
5. Nature Microbiology; 133 papers in training set; Top 0.9%; 4.0%
6. Nature Biotechnology; 147 papers in training set; Top 3%; 2.8%
7. eLife; 5422 papers in training set; Top 35%; 2.1%
8. Genome Medicine; 154 papers in training set; Top 3%; 2.1%
9. Cell; 370 papers in training set; Top 10%; 1.9%
10. Nature Computational Science; 50 papers in training set; Top 0.6%; 1.7%
11. Scientific Reports; 3102 papers in training set; Top 56%; 1.7%
12. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 32%; 1.7%
13. Nature; 575 papers in training set; Top 12%; 1.5%
14. Genome Biology; 555 papers in training set; Top 6%; 1.3%
15. Cell Host & Microbe; 113 papers in training set; Top 4%; 1.3%
16. Cell Reports; 1338 papers in training set; Top 28%; 1.3%
17. Nucleic Acids Research; 1128 papers in training set; Top 14%; 1.1%
18. Cell Systems; 167 papers in training set; Top 10%; 1.1%
19. PLOS Biology; 408 papers in training set; Top 18%; 0.8%
20. Molecular Biology and Evolution; 488 papers in training set; Top 4%; 0.8%
21. Nature Medicine; 117 papers in training set; Top 5%; 0.7%
22. Science Translational Medicine; 111 papers in training set; Top 6%; 0.7%
23. PLOS Computational Biology; 1633 papers in training set; Top 27%; 0.7%
24. iScience; 1063 papers in training set; Top 36%; 0.7%
25. Cell Reports Medicine; 140 papers in training set; Top 10%; 0.5%
26. PLOS Pathogens; 721 papers in training set; Top 10%; 0.5%
27. Cell Reports Methods; 141 papers in training set; Top 7%; 0.5%