
Decomposing Participatory Surveillance Symptom Time Series to Track Respiratory Infections: A Cross-Country Evaluation Using Non-Negative Matrix Factorization

Carstens, G.; Mazzoli, M.; Gozzi, N.; van Hoek, A. J.; Paolotti, D.

2026-03-31 infectious diseases
10.64898/2026.03.30.26349719 medRxiv

Background: The annual respiratory season in Europe is marked by the co-circulation of multiple respiratory pathogens, such as influenza viruses, rhinoviruses, and coronaviruses. Effective surveillance is necessary but is hampered by heterogeneous case definitions and limited pathogen specificity in existing systems. This study aims to detect pathogen-specific signals in participatory surveillance data from the Netherlands using a subset of samples with virological detection. Additionally, we explore a method for using the Dutch findings to enhance the virological interpretation of participatory surveillance data in Italy.
Methods: We analyzed symptom data collected through a participatory surveillance platform in the Netherlands and Italy over five years (2020-2025). Symptom-by-week matrices from the Dutch cohort were decomposed into syndromes and their associated time series using Non-negative Matrix Factorization (NMF). We compared the time series of each syndrome with the incidence of influenza virus, SARS-CoV-2, seasonal coronaviruses, RSV, and rhinovirus, estimated from nose and throat swabs of a subsample of symptomatic participants of the Dutch platform. We tested the transferability of these components by applying the Dutch-derived components to Italian symptom data and extracting the corresponding incidences.
Results: NMF identified eight symptom clusters in the Dutch cohort: one aligning with SARS-CoV-2, one with rhinovirus, and a third with influenza virus, RSV, and seasonal coronavirus incidences estimated from the collected nose and throat swabs. Transferring the Dutch-derived symptom clusters to Italian data showed consistency in key components between the Dutch and Italian cohorts, particularly those associated with SARS-CoV-2.
Conclusion: This study demonstrates that unsupervised symptom decomposition can identify trends in co-circulating respiratory pathogens from syndromic surveillance data, providing timely insights into pathogen circulation.
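
The decomposition and transfer steps described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a symptoms-by-weeks matrix, uses plain Lee-Seung multiplicative updates for the Frobenius loss, and treats "transfer" as holding the Dutch symptom profiles fixed while fitting only the time series to a second country's data. The function names `nmf` and `transfer`, the matrix sizes, and the synthetic data are all hypothetical.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (symptoms x weeks) as W @ H
    with multiplicative updates minimizing the Frobenius error."""
    rng = np.random.default_rng(seed)
    n_symptoms, n_weeks = V.shape
    W = rng.random((n_symptoms, k)) + eps  # symptom profile of each component
    H = rng.random((k, n_weeks)) + eps     # weekly activity of each component
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def transfer(V_new, W_fixed, n_iter=500, seed=0, eps=1e-9):
    """Assumed transfer step: keep the symptom profiles W_fixed and
    fit only non-negative time series H for a new symptoms x weeks matrix."""
    rng = np.random.default_rng(seed)
    H = rng.random((W_fixed.shape[1], V_new.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W_fixed.T @ V_new) / (W_fixed.T @ W_fixed @ H + eps)
    return H

# Synthetic stand-in data (hypothetical): 12 symptoms, 60 weeks, 3 latent syndromes.
rng = np.random.default_rng(42)
W_true = rng.random((12, 3))
V_nl = W_true @ rng.random((3, 60))  # "source" country
V_it = W_true @ rng.random((3, 60))  # "target" country sharing the same profiles

W, H = nmf(V_nl, k=3)
H_it_hat = transfer(V_it, W)

rel_err_nl = np.linalg.norm(V_nl - W @ H) / np.linalg.norm(V_nl)
rel_err_it = np.linalg.norm(V_it - W @ H_it_hat) / np.linalg.norm(V_it)
print(f"relative reconstruction error, source: {rel_err_nl:.3f}")
print(f"relative reconstruction error, target: {rel_err_it:.3f}")
```

In this sketch each row of `H` plays the role of a syndrome time series that could then be compared with swab-based incidence estimates. The number of components is fixed by hand here; the abstract reports eight for the Dutch cohort but does not describe how that number was chosen.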

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Scientific Reports: 3102 papers in training set; Top 2%; 14.4%
2. PLOS ONE: 4510 papers in training set; Top 19%; 10.1%
3. Epidemics: 104 papers in training set; Top 0.1%; 8.4%
4. Journal of Medical Internet Research: 85 papers in training set; Top 0.9%; 4.9%
5. PLOS Computational Biology: 1633 papers in training set; Top 10%; 3.6%
6. European Respiratory Journal: 54 papers in training set; Top 0.5%; 3.1%
7. BMC Infectious Diseases: 118 papers in training set; Top 1%; 3.1%
8. BMC Medicine: 163 papers in training set; Top 2%; 3.1%

50% of probability mass above

9. Influenza and Other Respiratory Viruses: 44 papers in training set; Top 0.1%; 2.5%
10. Computers in Biology and Medicine: 120 papers in training set; Top 2%; 1.7%
11. Wellcome Open Research: 57 papers in training set; Top 0.9%; 1.7%
12. Nature Communications: 4913 papers in training set; Top 51%; 1.7%
13. BMC Bioinformatics: 383 papers in training set; Top 5%; 1.5%
14. International Journal of Environmental Research and Public Health: 124 papers in training set; Top 5%; 1.3%
15. Clinical Infectious Diseases: 231 papers in training set; Top 3%; 1.3%
16. Epidemiology and Infection: 84 papers in training set; Top 2%; 1.3%
17. iScience: 1063 papers in training set; Top 21%; 1.2%
18. Viruses: 318 papers in training set; Top 4%; 1.0%
19. BMC Medical Research Methodology: 43 papers in training set; Top 1.0%; 1.0%
20. Frontiers in Microbiology: 375 papers in training set; Top 8%; 0.9%
21. Journal of Infection: 71 papers in training set; Top 2%; 0.9%
22. Frontiers in Public Health: 140 papers in training set; Top 8%; 0.8%
23. Journal of Translational Medicine: 46 papers in training set; Top 2%; 0.8%
24. Heliyon: 146 papers in training set; Top 6%; 0.8%
25. Journal of Clinical Medicine: 91 papers in training set; Top 6%; 0.8%
26. Journal of Biomedical Informatics: 45 papers in training set; Top 1%; 0.8%
27. GigaScience: 172 papers in training set; Top 3%; 0.7%
28. International Journal of Infectious Diseases: 126 papers in training set; Top 3%; 0.7%
29. Eurosurveillance: 80 papers in training set; Top 2%; 0.7%
30. Journal of The Royal Society Interface: 189 papers in training set; Top 5%; 0.7%