Paired wastewater and clinical genomics across metropolitan and hospital catchments reveals SARS-CoV-2 relevant mutations
Ruiz-Rodriguez, P.; Sanz-Carbonell, A.; Perez-Cataluna, A.; Cano-Jimenez, P.; Ruiz-Roldan, L.; Alandes, R.; Valiente-Mullor, C.; Gimeno, C.; Comas, I.; Sanchez, G.; Gonzalez-Candelas, F.; Coscolla, M.
Wastewater (WW) genomics can track SARS-CoV-2 circulation beyond clinical testing, but its ability to reflect clinical diversity and capture severity-linked mutations remains unclear. Here, we integrated 845 clinical genomes and 22 wastewater genomes from Valencia, Spain, across matched metropolitan and hospital catchments. We compared matched WW and clinical sequencing for lineage and mutation surveillance at two levels: metropolitan and hospital. We then tested the sensitivity of WW for detecting mutations statistically associated with hospitalisation status in regional (n = 4,843), national (n = 10,052) and supranational (n = 39,099) clinical datasets. WW surveillance captured the dominant Omicron background when lineages were collapsed into parental lineage constellations, but had limited sensitivity for fine-scale sublineage diversity. Performance was strongly catchment dependent: metropolitan wastewater best represented broader community circulation, whereas hospital wastewater was noisier but detected KP.3 months before its appearance in routine metropolitan clinical surveillance. Across clinical datasets, hospitalisation-associated substitutions showed limited reproducibility, although the national and supranational analyses converged on the receptor-binding-domain (RBD) substitutions D405N, K417N and R408S. Interaction networks showed coupling between G252V in the NTD and those RBD substitutions involved in immune escape and receptor engagement. Finally, integrating regional to supranational GWAS with interaction networks and wastewater detection prioritised mutations supported by at least two independent association layers, including mutations in Spike, especially in the RBD, and the wastewater-exclusive candidate S:V445P, which was missed by contemporaneous clinical sequencing.
Overall, WW genomics preferentially recovers the common mutational backbone of SARS-CoV-2 circulation and can highlight important changes missed by clinical sampling, making it a complementary tool for real-time prioritisation of viral evolutionary change.

We found partial overlap in lineage composition between WW and clinical samples, with higher overlap at the metropolitan level (50%) than at the hospital level (30%). Conversely, the overlap of individual mutations between WW and clinical samples was slightly higher at the hospital level (20%) than at the metropolitan level (16%); in both datasets, shared mutations were enriched in the Spike gene. Clade composition did not differ between 216 hospitalised and 528 non-hospitalised cases at the regional level. Using GWAS and Hierarchical Lasso analysis, we detected mutations associated with hospitalisation status in three different datasets (regional, national and worldwide), with little overlap between them. Although few variants replicated across cohorts, the overlap between the Spanish and global analyses was statistically enriched and centred on RBD substitutions (D405N, K417N, R408S). Integrating multiple genomic association results prioritised 34/191 wastewater mutations (16 in Spike), including one mutation detected only in wastewater and missed by routine clinical surveillance. Wastewater sequencing tracked dominant Omicron waves, but performance varied by catchment; integrating clinical association results with interaction network modelling helped prioritise and interpret wastewater-detected mutations.
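
The rule of keeping wastewater mutations that are supported by at least two independent association layers can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `prioritise`, the layer names, and the placeholder mutation labels (`S:mutA` and so on) are all hypothetical and carry no real results.

```python
from collections import Counter


def prioritise(wastewater_mutations, association_layers, min_layers=2):
    """Return wastewater mutations supported by >= min_layers layers.

    association_layers maps a layer name (hypothetical here) to the set
    of mutations flagged by that layer.
    """
    support = Counter()
    for layer_hits in association_layers.values():
        # Count each mutation at most once per layer.
        for mutation in set(layer_hits):
            support[mutation] += 1
    return sorted(
        m for m in set(wastewater_mutations) if support[m] >= min_layers
    )


# Toy, made-up inputs (placeholders, not data from the study).
layers = {
    "regional_gwas": {"S:mutA", "S:mutB"},
    "national_gwas": {"S:mutA", "N:mutC"},
    "supranational_gwas": {"S:mutA", "N:mutC"},
    "interaction_network": {"S:mutB", "ORF1ab:mutD"},
}
wastewater = {"S:mutA", "S:mutB", "N:mutC", "S:mutE"}

prioritised = prioritise(wastewater, layers)
print(prioritised)
```

In this toy input, `S:mutE` is seen in wastewater but flagged by no layer, and `ORF1ab:mutD` is flagged by one layer but absent from wastewater, so neither is retained. Raising `min_layers` tightens the filter.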