
Development of an original algorithm to characterize serological antibody response that improve infectious diseases surveillance

RAZAFIMAHATRATRA, S. L.; RASOLOHARIMANANA, L. T.; ANDRIAMARO, T. M.; RANAIVOMANANA, P.; SCHOENHALS, M.

2026-04-24 epidemiology
10.64898/2026.04.16.26350925 medRxiv

Interpreting serological data remains challenging, particularly in low-prevalence or cross-reactive contexts, where antibody responses often show substantial overlap between exposed and unexposed individuals and may depart from normal distributional assumptions. Conventional cutoff-based approaches often yield inconsistent or biased estimates of seroprevalence. Here, we present a decisional framework based on finite mixture models (FMMs) that enhances the robustness and interpretability of serological analyses. Beyond simply applying mixture models, our framework integrates multiple methodological innovations: (i) systematic comparison of Gaussian and skew-normal mixture models to accommodate asymmetric antibody distributions; (ii) rigorous model selection using the Cramér-von Mises test (p > 0.01) combined with a parsimonious score (APS) to prioritize models with well-separated clusters; and (iii) hierarchical clustering of posterior probabilities to collapse latent components into biologically meaningful seronegative and seropositive groups. Applied to chikungunya virus (CHIKV) data from Bangladesh, the framework produced prevalence estimates consistent with ROC-based methods while probabilistically identifying borderline cases. Validation on SARS-CoV-2 and dengue datasets further demonstrated its generalizability: for SARS-CoV-2, the approach identified up to five latent clusters with high sensitivity (up to 100%) and specificity (up to 100%), enabling discrimination by disease severity. For dengue, it revealed interpretable subgrouping consistent with background exposure and subclinical infection, despite limited confirmed cases. By integrating distributional flexibility, robust goodness-of-fit testing, and biologically guided cluster consolidation, this decisional FMM framework provides a reproducible and scalable method for serological interpretation across pathogens and epidemiological settings, addressing key limitations of threshold-based classification.
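To make the core idea concrete, the following is a minimal sketch of a two-component Gaussian finite mixture fitted by expectation-maximization, using only the Python standard library. It is not the authors' implementation: it omits the skew-normal comparison, the Cramér-von Mises test, the APS score, and the hierarchical clustering step described in the abstract. The function names (`normal_pdf`, `fit_two_gaussians`), the simulated antibody signals, and the 0.1 to 0.9 posterior band used to flag borderline samples are all assumptions chosen for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, n_iter=200):
    """Fit a two-component Gaussian mixture by EM (illustrative helper).

    Returns (weights, means, sds, posteriors), where posteriors[i] is the
    probability that values[i] belongs to the second (upper) component,
    as computed in the last E-step.
    """
    n = len(values)
    ordered = sorted(values)
    half = n // 2
    # Initialise the two means from the lower and upper halves of the sorted data.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    overall_mean = sum(values) / n
    overall_sd = math.sqrt(sum((v - overall_mean) ** 2 for v in values) / n)
    sds = [overall_sd, overall_sd]
    weights = [0.5, 0.5]
    posteriors = []
    for _ in range(n_iter):
        # E-step: posterior probability of the upper component for each value.
        posteriors = []
        for v in values:
            low = weights[0] * normal_pdf(v, means[0], sds[0])
            high = weights[1] * normal_pdf(v, means[1], sds[1])
            posteriors.append(high / (low + high))
        # M-step: update weight, mean and standard deviation of each component.
        for k in (0, 1):
            resp = [p if k == 1 else 1.0 - p for p in posteriors]
            total = sum(resp)
            weights[k] = total / n
            means[k] = sum(r * v for r, v in zip(resp, values)) / total
            var = sum(r * (v - means[k]) ** 2 for r, v in zip(resp, values)) / total
            sds[k] = max(math.sqrt(var), 1e-6)  # floor avoids a collapsed component
    return weights, means, sds, posteriors


# Simulated (hypothetical) log-scale antibody signals: 300 unexposed, 100 exposed.
rng = random.Random(0)
signals = [rng.gauss(1.0, 0.3) for _ in range(300)]
signals += [rng.gauss(3.0, 0.4) for _ in range(100)]

weights, means, sds, post = fit_two_gaussians(signals)

# The weight of the upper component serves as the seroprevalence estimate.
seroprevalence = weights[1]
# Samples with an intermediate posterior are flagged as borderline (assumed band).
borderline = [v for v, p in zip(signals, post) if 0.1 < p < 0.9]

print(f"estimated seroprevalence: {seroprevalence:.3f}")
print(f"component means: {means[0]:.2f}, {means[1]:.2f}")
print(f"borderline samples: {len(borderline)}")
```

In this sketch, each sample receives a posterior probability rather than a hard positive or negative call, which is what allows borderline cases to be identified probabilistically instead of being forced to one side of a fixed cutoff.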

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology; 1633 papers in training set; Top 0.9%; 19.5%
2. Nature Communications; 4913 papers in training set; Top 16%; 10.5%
3. Epidemics; 104 papers in training set; Top 0.1%; 10.1%
4. Cell Reports Methods; 141 papers in training set; Top 0.3%; 6.8%
5. Scientific Reports; 3102 papers in training set; Top 18%; 6.3%

50% of probability mass above

6. American Journal of Epidemiology; 57 papers in training set; Top 0.2%; 4.9%
7. PLOS ONE; 4510 papers in training set; Top 39%; 3.6%
8. Communications Biology; 886 papers in training set; Top 2%; 3.6%
9. eLife; 5422 papers in training set; Top 30%; 2.9%
10. Science Advances; 1098 papers in training set; Top 12%; 2.1%
11. Viruses; 318 papers in training set; Top 2%; 2.1%
12. International Journal of Epidemiology; 74 papers in training set; Top 1%; 1.9%
13. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 29%; 1.9%
14. PLOS Biology; 408 papers in training set; Top 11%; 1.5%
15. Patterns; 70 papers in training set; Top 1%; 1.3%
16. Bioinformatics; 1061 papers in training set; Top 8%; 1.2%
17. eBioMedicine; 130 papers in training set; Top 3%; 0.9%
18. Genome Medicine; 154 papers in training set; Top 7%; 0.8%
19. NAR Genomics and Bioinformatics; 214 papers in training set; Top 4%; 0.7%
20. Science Translational Medicine; 111 papers in training set; Top 6%; 0.7%
21. Microbial Genomics; 204 papers in training set; Top 2%; 0.6%
22. Advanced Science; 249 papers in training set; Top 22%; 0.6%
23. BMC Bioinformatics; 383 papers in training set; Top 8%; 0.6%