Decomposing Participatory Surveillance Symptom Time Series to Track Respiratory Infections: A Cross-Country Evaluation Using Non-Negative Matrix Factorization
Carstens, G.; Mazzoli, M.; Gozzi, N.; van Hoek, A. J.; Paolotti, D.
Background: The annual respiratory season in Europe is marked by the co-circulation of multiple respiratory pathogens, such as influenza viruses, rhinoviruses, and coronaviruses. Effective surveillance is necessary but is hampered by heterogeneous case definitions and limited pathogen specificity in existing systems. This study aims to detect pathogen-specific signals in participatory surveillance data from the Netherlands, using a subset of samples with virological detection. Additionally, we explore a method for using the Dutch findings to enhance the virological interpretation of participatory surveillance data in Italy.
Methods: We analyzed symptom data collected through a participatory surveillance platform in the Netherlands and Italy over five years (2020-2025). Symptom-by-week matrices from the Dutch cohort were decomposed into syndromes and their associated time series using Non-negative Matrix Factorization (NMF). We compared the time series of each syndrome with the incidence of influenza virus, SARS-CoV-2, seasonal coronaviruses, RSV, and rhinovirus, estimated from nose and throat swabs of a subsample of symptomatic participants on the Dutch platform. We tested the transferability of these components by applying the Dutch-derived components to Italian symptom data and extracting the corresponding incidences.
Results: NMF identified eight symptom clusters in the Dutch cohort: one aligned with SARS-CoV-2, one with rhinovirus, and a third with the influenza virus, RSV, and seasonal coronavirus incidences estimated from the collected nose and throat swabs. Transferring the Dutch-derived symptom clusters to Italian data showed consistency in key components between the Dutch and Italian cohorts, particularly those associated with SARS-CoV-2.
Conclusion: This study demonstrates that unsupervised symptom decomposition can identify trends of co-circulating respiratory pathogens from syndromic surveillance data, providing timely insight into pathogen circulation.
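To make the Methods concrete, the following is a minimal sketch of the two steps described there: factoring a symptom-by-week matrix into symptom profiles and weekly activations, then reusing the profiles from one cohort to describe another. It rests on several assumptions not stated in the abstract: the solver (plain multiplicative updates with a Frobenius objective), the matrix orientation (symptoms in rows, weeks in columns), the absence of any normalization, and all function and variable names, which are hypothetical. The data are synthetic and use two components rather than the eight reported in the study.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero in the updates


def nmf(V, k, n_iter=500, seed=0):
    """Factor V (symptoms x weeks) into W (symptoms x k) and H (k x weeks)."""
    rng = np.random.default_rng(seed)
    n_symptoms, n_weeks = V.shape
    W = rng.random((n_symptoms, k)) + EPS
    H = rng.random((k, n_weeks)) + EPS
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def project(V_new, W, n_iter=500, seed=0):
    """Hold the symptom profiles W fixed and fit only the weekly activations."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_new.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V_new) / (W.T @ W @ H + EPS)
    return H


def rel_error(V, W, H):
    """Relative Frobenius reconstruction error of the factorization."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)


# Synthetic toy data (NOT from the study): 6 symptoms, 2 latent syndromes.
rng = np.random.default_rng(42)
true_W = rng.random((6, 2))
V_nl = true_W @ rng.random((2, 52))  # stand-in for the Dutch matrix
V_it = true_W @ rng.random((2, 52))  # stand-in for the Italian matrix

W_nl, H_nl = nmf(V_nl, k=2)      # syndromes and their time series
H_it = project(V_it, W_nl)       # Dutch syndromes applied to Italian data
```

In this sketch each column of `W_nl` plays the role of a syndrome and each row of `H_nl` its time series; a row of `H_it` is the corresponding time series in the second cohort, which could then be compared against virological incidence curves.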