
Paired wastewater and clinical genomics across metropolitan and hospital catchments reveals SARS-CoV-2 relevant mutations

Ruiz-Rodriguez, P.; Sanz-Carbonell, A.; Perez-Cataluna, A.; Cano-Jimenez, P.; Ruiz-Roldan, L.; Alandes, R.; Valiente-Mullor, C.; Gimeno, C.; Comas, I.; Sanchez, G.; Gonzalez-Candelas, F.; Coscolla, M.

2026-04-06 epidemiology
10.64898/2026.03.31.26346553 medRxiv

Wastewater (WW) genomics can track SARS-CoV-2 circulation beyond clinical testing, but its ability to reflect clinical diversity and capture severity-linked mutations remains unclear. Here, we integrated 845 clinical genomes and 22 wastewater genomes from Valencia, Spain, across matched metropolitan and hospital catchments. We compared matched WW and clinical sequencing for lineage and mutation surveillance at two levels: metropolitan and hospital. We then tested the sensitivity of WW to detect mutations statistically associated with hospitalisation status in regional (n = 4,843), national (n = 10,052) and supranational (n = 39,099) clinical datasets. WW surveillance captured the dominant Omicron background when lineages were collapsed into parental lineage constellations, but had limited sensitivity for fine-scale sublineage diversity. Performance was strongly catchment dependent: metropolitan wastewater best represented broader community circulation, whereas hospital wastewater was noisier but detected KP.3 months before its appearance in routine metropolitan clinical surveillance. Across clinical datasets, hospitalisation-associated substitutions showed limited reproducibility, although the national and supranational analyses converged on the receptor-binding-domain (RBD) substitutions D405N, K417N and R408S. Interaction networks showed coupling between G252V in the NTD and those RBD substitutions, which are involved in immune escape and receptor engagement. Finally, integrating regional to supranational GWAS with interaction networks and wastewater detection prioritised mutations supported by at least two independent association layers. These include mutations in Spike, especially in the RBD, and the wastewater-exclusive candidate S:V445P, which was missed by contemporaneous clinical sequencing.
Overall, WW genomics preferentially recovers the common mutational backbone of SARS-CoV-2 circulation and can highlight important changes missed by clinical sampling, making it a complementary tool for real-time prioritisation of viral evolutionary change.

We found partial overlap in lineage composition between WW and clinical samples, with higher overlap at the metropolitan level (50%) than at the hospital level (30%). Conversely, the overlap of individual mutations between WW and clinical samples was slightly higher at the hospital level (20%) than at the metropolitan level (16%), and the mutations shared by both datasets were enriched in the Spike gene. Clade composition did not differ between 216 hospitalised and 528 non-hospitalised cases at the regional level. Using GWAS and Hierarchical Lasso analysis, we detected mutations associated with hospitalisation status in three different datasets (regional, national and worldwide), with little overlap between them. Although few variants replicated across cohorts, the overlap between the Spain and global analyses was statistically enriched and centred on RBD substitutions (D405N, K417N, R408S). Integrating the multiple genomic association results prioritised 34/191 wastewater mutations (16 in Spike), including one mutation detected only in wastewater and missed by routine clinical surveillance. Wastewater sequencing tracked dominant Omicron waves, but performance varied by catchment; integrating clinical association results with interaction network modelling helped prioritise and interpret wastewater-detected mutations.

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Nature Communications (4913 papers in training set): Top 2%, 26.4%
2. Genome Medicine (154 papers in training set): Top 0.1%, 22.9%
3. Nature Genetics (240 papers in training set): Top 1%, 6.4%

50% of probability mass above

4. Molecular Systems Biology (142 papers in training set): Top 0.1%, 4.9%
5. Nature Biotechnology (147 papers in training set): Top 2%, 3.7%
6. Genome Biology (555 papers in training set): Top 2%, 3.7%
7. Cell Reports Medicine (140 papers in training set): Top 2%, 3.3%
8. Nature (575 papers in training set): Top 9%, 2.1%
9. Nature Medicine (117 papers in training set): Top 2%, 1.9%
10. mSystems (361 papers in training set): Top 5%, 1.4%
11. Cell (370 papers in training set): Top 13%, 1.4%
12. Microbial Genomics (204 papers in training set): Top 1%, 1.4%
13. eBioMedicine (130 papers in training set): Top 2%, 1.1%
14. PLOS Computational Biology (1633 papers in training set): Top 21%, 1.0%
15. Communications Medicine (85 papers in training set): Top 0.6%, 1.0%
16. Science Translational Medicine (111 papers in training set): Top 5%, 0.9%
17. eLife (5422 papers in training set): Top 53%, 0.9%
18. Scientific Reports (3102 papers in training set): Top 72%, 0.8%
19. Journal of Infection (71 papers in training set): Top 3%, 0.8%
20. Cell Genomics (162 papers in training set): Top 6%, 0.8%
21. Science Advances (1098 papers in training set): Top 30%, 0.7%
22. Med (38 papers in training set): Top 1%, 0.7%
23. Cell Reports Methods (141 papers in training set): Top 6%, 0.7%
24. Science (429 papers in training set): Top 22%, 0.5%
25. BMC Medicine (163 papers in training set): Top 9%, 0.5%
26. Patterns (70 papers in training set): Top 3%, 0.5%
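
The 50% cut-off in the journal ranking above can be checked with a running sum of the predicted probabilities. The following is a minimal sketch, not the page's own method: the function name `journals_to_reach` is hypothetical, and the values are the first five percentages copied from the ranking.

```python
from itertools import accumulate

# Predicted probabilities (%) of the five top-ranked journals,
# copied from the ranking above (illustrative subset, not the full list).
probabilities = [26.4, 22.9, 6.4, 4.9, 3.7]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed for the running
    sum of probabilities to reach the threshold, or None if it never does.

    Hypothetical helper written for this illustration.
    """
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


top_n = journals_to_reach(probabilities, 50.0)
print(f"Journals needed to reach 50% of the probability mass: {top_n}")
```

With these values the first two journals sum to just under 50%, so the third is needed to cross the threshold, which agrees with the statement that the top 3 journals account for 50% of the predicted probability mass.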