Epidemics
○ Elsevier BV
Preprints posted in the last 90 days, ranked by how well they match Epidemics's content profile, based on 104 papers previously published here. The average preprint has a 0.07% match score for this journal, so anything above that is already an above-average fit.
Song, P.; de Vlas, S. J.; Emery, T.; Coffeng, L. E.
A concern in infectious disease modelling is how accurately population mixing is incorporated, as it shapes the type and frequency of contacts through which infection spreads and, consequently, estimated intervention effectiveness. Although synthesizing mixing patterns from diary-based surveys is an established framework, geographical information is poorly or sparsely captured. Here we propose a generalizable workflow to quantify geographical connectivity from job registry data covering a Dutch working population of over 8 million. The derived colleague connectedness shows heterogeneous spatial patterns, quantified from the number of connections per municipality triplet: two residential municipalities and one shared workplace municipality. We demonstrate the utility of this spatial connectivity in signalling regions with elevated outbreak risks. Using SARS-CoV-2 Omicron as an example, a ten-fold increase in within-province connections is associated with a 12-day earlier (95% CI: 2 to 22 days) Omicron onset, and a ten-fold increase in between-province connections with an 8-day earlier (95% CI: -4 to 21 days) onset. These results suggest that the impact of regional interventions shifting spatial connectivity patterns should be expected to vary by region and type of intervention. Together, our findings draw attention to the use of this highly fine-grained spatial connectivity to enable more regionally tailored and network-targeted policy measures.
Schmid, N.; Zacharias, N.; Höser, C.; Bracher, J.; Arruda, J.; Papan, C.; Mutters, N. T.; Hasenauer, J.
Wastewater-based epidemiology provides a low-cost, scalable view of community infection dynamics, but converting these signals into actionable epidemiological insights remains difficult. Mechanistic models offer interpretability, yet assumptions such as a constant transmission rate limit realism over long simulation horizons and heterogeneous settings. We present a susceptible-exposed-infectious-recovered (SEIR) universal differential equation (UDE) that links wastewater viral loads to case counts and embeds neural networks to represent time-varying parameters. Parameter and prediction uncertainties are quantified using an ensemble method. We assessed the method using newly collected data for Bonn, Germany, as well as published data for five cities in Rhineland-Palatinate, Germany. The proposed approach produces realistic out-of-sample projections of case counts over a test horizon of up to 50 weeks, and it learns city-specific mappings to prevalence that generalise within each location. Compared to SEIR models with fixed transmission rates, the UDE captures non-stationary drivers (policy, behaviour, seasonality) without sacrificing epidemiological structure, while propagating observation and model uncertainty into the projections. Accordingly, the approach facilitates a scalable interpretation and exploitation of wastewater data for the monitoring of infectious diseases.
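To make the contrast with a fixed transmission rate concrete, the following is a minimal sketch of an SEIR model whose transmission rate is a function of time. It is not the authors' UDE: the neural network is replaced by a hand-written seasonal function (`seasonal_beta`), and all parameter values, the population size, and the forward-Euler integrator are assumptions chosen for illustration.

```python
import math

def simulate_seir(beta_fn, sigma, gamma, s0, e0, i0, rec0, days, dt=0.1):
    """Forward-Euler integration of an SEIR model with a time-varying beta(t)."""
    s, e, i, r = float(s0), float(e0), float(i0), float(rec0)
    n = s + e + i + r
    trajectory = [(0.0, s, e, i, r)]
    steps_per_day = round(1 / dt)
    for day in range(1, days + 1):
        for k in range(steps_per_day):
            t = (day - 1) + k * dt
            new_exposed = beta_fn(t) * s * i / n * dt   # S -> E
            new_infectious = sigma * e * dt             # E -> I
            new_recovered = gamma * i * dt              # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        trajectory.append((float(day), s, e, i, r))
    return trajectory

def seasonal_beta(t):
    # Hypothetical stand-in for the learned time-varying transmission rate.
    return 0.3 * (1 + 0.2 * math.cos(2 * math.pi * t / 365))

traj = simulate_seir(seasonal_beta, sigma=1 / 3, gamma=1 / 5,
                     s0=99990, e0=0, i0=10, rec0=0, days=200)
peak_day, peak_infectious = max(((row[0], row[3]) for row in traj), key=lambda p: p[1])
print(f"peak infectious: {peak_infectious:.0f} on day {peak_day:.0f}")
```

In a UDE, `beta_fn` would be a small neural network trained jointly with the mechanistic parameters; passing the rate in as a function keeps the compartmental structure unchanged either way.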
Colman, E.; Chatzilena, A.; Prasse, B.; Danon, L.; Brooks-Pollock, E.
The basic reproduction number of an infectious disease is known to depend on the structure of contacts between individuals in a population. This relationship has been explored mathematically through two well-known models: one which depends on a matrix of contact rates between different demographic groups, and another which depends on the variability of contact rates over the population. Here we introduce a model that combines and generalises these two approaches. We derive a formula for the basic reproduction number and validate it through comparisons to simulated outbreaks. Applying this method to contact survey data collected in Belgium between 2020 and 2022, we find that our model produces higher estimates of the basic reproduction number and larger relative changes over periods when social contact behaviour was changing during the COVID-19 pandemic. Our analysis suggests some practical considerations when using contact data in models of infectious disease transmission.
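The two well-known approaches mentioned above can be sketched as follows, under textbook assumptions rather than the authors' combined model. With a contact matrix, the basic reproduction number is proportional to the dominant eigenvalue of the matrix; with heterogeneous contact rates k, it is proportional to E[k²]/E[k]. The matrix, the contact counts, and the function names are hypothetical.

```python
def dominant_eigenvalue(matrix, iterations=200):
    """Power iteration for the largest eigenvalue of a positive square matrix."""
    n = len(matrix)
    v = [1.0] * n
    eig = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        eig = max(w)               # with max(v) == 1, max(w) converges to the eigenvalue
        v = [x / eig for x in w]
    return eig

def heterogeneity_factor(contacts):
    """E[k^2] / E[k], equal to the mean plus variance over mean of contact rates."""
    mean = sum(contacts) / len(contacts)
    mean_sq = sum(k * k for k in contacts) / len(contacts)
    return mean_sq / mean

# Hypothetical two-group contact matrix (contacts per day between groups).
contact_matrix = [[4.0, 1.0], [2.0, 3.0]]
print(dominant_eigenvalue(contact_matrix))

# Same mean number of contacts (4), with and without a highly connected individual.
print(heterogeneity_factor([4, 4, 4, 4, 4]))
print(heterogeneity_factor([1, 1, 1, 1, 16]))
```

Multiplying either quantity by a per-contact transmission probability and the infectious period gives an R0 estimate; the second example shows how variability alone raises the estimate for a fixed mean.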
Murray Kearney, L.; Davis, E. L.; Keeling, M. J.
Capturing the structured mixing within a population is key to the reliable projection of infectious disease dynamics and hence informed control. Both heterogeneity in the number of epidemiologically-relevant contacts and age-structured mixing have been repeatedly demonstrated as fundamental, yet are rarely combined. Networks provide a powerful and intuitive method to realise these two elements of population structure, and simulate infection dynamics. While there are a few key examples of contact networks being measured explicitly, this is not scalable to larger populations, where representative networks must be constructed from more ubiquitous individual-level data. Here, using data from social contact surveys, we develop a generalisable and robust algorithm utilizing machine learning to generate a surrogate population-scale network that preserves both age-structured mixing and heterogeneity of contacts. For different datasets and network construction assumptions we simulate the spread of infection, considering how the epidemic size varies over basic reproduction number (R0) scenarios - mirroring the process of determining public health impact from early epidemic growth. Our approach shows that both age structure and degree heterogeneity substantially reduce the epidemic size (for a given R0) compared to simpler models. We also demonstrate that these simulations more accurately re-capture the heterogeneity in secondary cases that has been observed, when transmission is scaled by contact duration to dampen the effect of highly connected nodes ("super-spreaders"). By using survey data collected during 2020-2022, these network models also inform about the impacts of control and targeting of public health interventions: quantifying the non-linear reduction in transmission opportunities that occurred during lockdowns, and the ages and contact types most responsible for onward transmission. 
Our robust methodology therefore allows for the inclusion of the full wealth of data commonly collected by surveys but frequently overlooked to be incorporated into more realistic transmission models of infectious diseases.
Asplin, P.; Mancy, R.; Keeling, M. J.; Hill, E. M.
Symptom propagation occurs when the symptoms of secondary cases are related to those of the primary case as a result of epidemiological mechanisms. Determining whether - and to what extent - symptom propagation occurs requires data-driven methods. Here we quantify the strength of symptom propagation as the increase in risk of a secondary case developing severe symptoms if the primary case has severe symptoms. We first used synthetic results to determine the data requirements to robustly estimate the strength of symptom propagation and to investigate the effect of severity-dependent reporting bias. Categorising symptom severity into two groups (mild or severe; asymptomatic or symptomatic), our estimation requires only four summary statistics - the number of primary-secondary case pairs of each combination of symptom presentations. Our analysis showed that a relatively small number (100) of synthetic primary-secondary case pairs was sufficient to obtain a reasonable estimate of the strength of symptom propagation and 1,000 pairs meant errors were consistently small across replicates. Our estimates were robust to severity-dependent reporting bias. We also explored how symptom propagation can be separated from other individual-level factors affecting severity, using age dependence as an example. Although synthetic data generated from an age-structured model led to overestimations of the strength of symptom propagation, allowing disease severity to be age-dependent restored the accuracy of parameter estimation. Finally, we applied our methodology to estimate the strength of symptom propagation from three publicly available data sets collected during the COVID-19 pandemic with data on presence or absence of symptoms: England households, Israel households, and Norway contact tracing. Our age-free methodology indicated a 12-17% increase in the risk of being symptomatic if infected by someone symptomatic.
Our positive estimates for the strength of symptom propagation persisted when applying our age-dependent methodology to the two household data sets with age-structured information (England and Israel). These findings demonstrate evidence for symptom propagation of SARS-CoV-2 and provide consistent estimates for its strength. Our synthetic data analysis supports the conclusion that these correlations are not a result of reporting bias or age-dependent effects. This work provides a practical tool for estimating the strength of symptom propagation that has minimal data requirements, enabling application across a wide range of pathogens and epidemiological settings.
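As an illustration of how little data the four summary statistics require, the sketch below computes a naive risk difference from a 2x2 table of primary-secondary pairs. This is one simple reading of "increase in risk" and is not necessarily the authors' estimator; the counts are invented for the example.

```python
def risk_difference(n_sev_sev, n_sev_mild, n_mild_sev, n_mild_mild):
    """Difference in the risk of a severe secondary case by primary severity.

    Arguments are pair counts named primary_secondary, for example
    n_sev_mild = severe primary case with a mild secondary case.
    """
    p_given_severe = n_sev_sev / (n_sev_sev + n_sev_mild)
    p_given_mild = n_mild_sev / (n_mild_sev + n_mild_mild)
    return p_given_severe - p_given_mild

# Hypothetical counts: 100 pairs with a severe primary, 100 with a mild primary.
print(risk_difference(n_sev_sev=60, n_sev_mild=40, n_mild_sev=45, n_mild_mild=55))
```

A positive value indicates that secondary cases are more often severe when the primary case was severe; it does not by itself rule out confounding by age or reporting bias, which the abstract addresses separately.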
Xiao, W. F.; Wang, Y.; Goel, N.; Wolfe, M.; Koelle, K.
Wastewater is increasingly being recognized as an important data stream that can contribute to infectious disease surveillance and forecasting. With this recognition, a growing number of statistical inference approaches are being developed to use wastewater data to provide quantitative insights into epidemiological dynamics. However, few existing approaches have allowed for systematic integration of data streams for inference, for example by combining case incidence data and/or serological data with wastewater data. Furthermore, only a subset of existing approaches have been able to handle missing data without imputation and to handle datasets with different sampling times or intervals. Here, we develop a statistically rigorous, yet lightweight, approach to infer and forecast time-varying effective reproduction numbers (Rt values) using longitudinal wastewater virus concentrations either alone or jointly with additional data streams including case incidence data and serological data. Our approach relies on a state-space modeling approach for inference and forecasting, within the context of a simple bootstrap particle filter. We first describe the structure of our underlying disease transmission process model as well as our observation models. Using a mock dataset, we then show that Rt can be accurately estimated by interfacing this model with case incidence data, wastewater data, or a combination of these two data streams using the bootstrap particle filter. Of note, we show that these data streams alone do not allow for reconstruction of underlying infection dynamics due to structural parameter unidentifiability. We then apply our particle filter to a previously analyzed SARS-CoV-2 dataset from Zurich that includes case data and wastewater data. 
Our analyses of these real-world datasets indicate that incorporation of process noise (in the form of environmental stochasticity) into the state space model greatly improves our ability to reconstruct the latent variables of the model. We further show that underlying infection dynamics can be made identifiable through the incorporation of serological data and that the bootstrap particle filter can be used to make forecasts of Rt, case incidence, and wastewater virus concentrations. We hope that the inference approach presented here will lead to greater reliance on wastewater data for disease surveillance and forecasting that will aid public health practitioners in responding to infectious disease threats.
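The propagate, weight, resample cycle of a bootstrap particle filter can be sketched in a few lines. The process model below is a deliberate simplification and an assumption, not the authors' model: Rt follows a geometric random walk, latent incidence is multiplied by Rt at each step (a one-step generation interval), and case counts are Poisson distributed around latent incidence. Wastewater and serological observation models are omitted.

```python
import math
import random

def poisson_logpmf(k, lam):
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def bootstrap_filter(cases, n_particles=2000, rt_sd=0.1, seed=1):
    """Return the filtered mean of Rt for each time step after the first."""
    rng = random.Random(seed)
    # Each particle carries (Rt, latent incidence).
    particles = [(rng.uniform(0.5, 2.0), max(float(cases[0]), 1.0))
                 for _ in range(n_particles)]
    rt_means = []
    for y in cases[1:]:
        # 1. Propagate every particle through the process model.
        proposed = []
        for rt, incidence in particles:
            rt_new = rt * math.exp(rng.gauss(0.0, rt_sd))
            proposed.append((rt_new, rt_new * incidence))
        # 2. Weight particles by the likelihood of the observed count.
        log_w = [poisson_logpmf(y, incidence) for _, incidence in proposed]
        max_log_w = max(log_w)
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        # 3. Resample in proportion to the weights.
        particles = rng.choices(proposed, weights=weights, k=n_particles)
        rt_means.append(sum(rt for rt, _ in particles) / n_particles)
    return rt_means

# Hypothetical case counts growing by roughly 20% per step.
cases = [100, 120, 144, 173, 207, 249, 299, 358, 430]
rt_means = bootstrap_filter(cases)
print([round(x, 2) for x in rt_means])
```

Because the proposal is the process model itself, adding a second data stream only changes step 2: the log-likelihoods of the additional observations are summed into `log_w`, and a missing observation simply contributes nothing at that step.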
Autoriello, A.; Averga, S.; Buonomo, B.; Della Marca, R.; Guarino, A.; Moracas, C.; Penitente, E.; Poeta, M.
We introduce PerTexP (Pertussis Time Exploration), an interactive modelling tool designed to investigate pertussis transmission dynamics and to support the evaluation of vaccination strategies and short-term projections. PerTexP allows users to explore and compare maternal, infant, and non-infant booster vaccination scenarios and to assess their potential impact on disease transmission, with a particular focus on the Italian epidemiological context. The tool is based on a discrete-time, stage-structured compartmental model with two age classes. By enabling rapid scenario-based analyses, PerTexP supports evidence-informed decision-making and provides transparent insights into how alternative vaccination strategies may shape pertussis dynamics. Combining accessibility, flexibility, and methodological rigour, PerTexP offers a practical resource for researchers and public health practitioners interested in exploring and comparing pertussis control strategies.
Domenech de Celles, M.; Kramer, S. C.
Parameter estimation is often necessary to inform transmission models of infectious diseases. This estimation requires choosing an observation model that links the model outputs to the observed data. Although potentially consequential, this choice has received little attention in the literature. Here, we aimed to compare eight observation models, including common distributions such as the Poisson, binomial, negative binomial, and normal (equivalent to least-squares estimation). Using Bayesian inference methods, we fit an SIR-like model to daily case reports during the first wave of COVID-19 in Belgium, Finland, Germany, and the UK. We found considerable differences in the log-likelihoods of the observation models, spanning three orders of magnitude between the best and the worst. Compared with the best models, the binomial, Poisson, and normal models received no support due to their rigid variance structures. Additionally, the binomial and Poisson models produced overly narrow prediction and confidence intervals, especially for key parameters such as the basic reproduction number. The other five models--each with a free dispersion parameter scaling the variance to the mean--performed significantly better, with the negative binomial model ranking first in three countries. We conclude that flexible observation models are essential for transmission models to accurately capture all sources of uncertainty.
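The effect of a rigid variance structure can be seen by scoring the same data under a Poisson and a negative binomial observation model. The sketch below is illustrative only: the observed counts, the model-predicted means, and the dispersion value `k` are invented, and the negative binomial uses the common mean-dispersion parameterisation with variance m + m²/k.

```python
import math

def poisson_loglik(observed, expected):
    """Log-likelihood under a Poisson model (variance equal to the mean)."""
    return sum(y * math.log(m) - m - math.lgamma(y + 1)
               for y, m in zip(observed, expected))

def negbin_loglik(observed, expected, k):
    """Log-likelihood under a negative binomial with mean m, variance m + m**2 / k."""
    total = 0.0
    for y, m in zip(observed, expected):
        total += (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
                  + k * math.log(k / (k + m)) + y * math.log(m / (k + m)))
    return total

# Hypothetical overdispersed daily counts around a model prediction of 100.
observed = [60, 150, 95, 40, 170, 110]
expected = [100.0] * len(observed)

print("Poisson:          ", round(poisson_loglik(observed, expected), 1))
print("Negative binomial:", round(negbin_loglik(observed, expected, k=5.0), 1))
```

With counts this variable the Poisson model is penalised heavily for each large deviation, whereas the free dispersion parameter lets the negative binomial absorb them; as `k` grows large the two log-likelihoods coincide.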
Bahig, S.; Oughton, M.; Vandesompele, J.; Brukner, I.
In dense urban settings, delays between diagnostic sampling and effective isolation can sustain transmission during peak infectiousness. We define a waiting-window transmission externality arising when infectious individuals remain mobile while awaiting results, formalized as E = N·P·TR·D, where N is daily testing volume, P test positivity, TR transmission during the waiting period, and D turnaround time. Using Monte Carlo simulation and a susceptible-infectious-recovered (SIR) framework, we quantify excess infections per 1,000 tests/day under multiple diagnostic workflows. A surge scenario incorporates positive coupling between TR and D (ρ = 0.45), reflecting co-occurrence of laboratory saturation and elevated contacts during system stress. Under centralized 48-hour workflows, excess infections reach ~80 at P = 10% and ~401 at P = 50%, increasing to ~628 under surge conditions. In contrast, near-patient rapid testing and home sampling reduce this to ~5 and ~25-26, respectively. Workflows that eliminate the waiting window--either through immediate isolation at sampling or through home-based PCR that returns results at the point of collection--effectively collapse the transmission term. These findings identify diagnostic delay as a modifiable driver of epidemic dynamics. Operational redesign of testing workflows, including decentralized sampling and home-based molecular diagnostics, offers a scalable pathway to improve epidemic controllability and reduce inequities in dense urban environments.
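The product formula can be checked against the reported figures with simple arithmetic. The abstract does not state a value for TR, so the sketch below back-calculates one from the reported ~80 excess infections at P = 10% and treats it as an assumption, along with expressing D in days. The reported numbers come from Monte Carlo simulation, so a deterministic product is only expected to match them approximately.

```python
def excess_infections(n_tests, positivity, tr_per_day, delay_days):
    """Waiting-window externality E = N * P * TR * D."""
    return n_tests * positivity * tr_per_day * delay_days

N = 1000          # tests per day, as in the abstract
D = 2.0           # 48-hour centralized workflow, expressed in days (assumed unit)

# Back-calculate TR from the reported ~80 excess infections at P = 10%.
implied_tr = 80 / (N * 0.10 * D)
print("implied TR per positive per day:", implied_tr)

# Apply the same TR at P = 50% and compare with the reported ~401.
print("E at P = 50%:", excess_infections(N, 0.50, implied_tr, D))
```

Because E is linear in each factor, shortening D has the same proportional effect as an equal relative reduction in TR, and a workflow with no waiting window drives the product to zero.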
Romeijnders, M. C.; van Boven, M.; Panja, D.
Background: Human-to-human transmission of pathogens fundamentally depends on interactions among infectious and susceptible individuals, yet traditional population-scale models often overlook the stochastic, behaviour-driven, and highly heterogeneous nature of these interactions. Methods: Here, we develop a large-scale actor-based model capturing early epidemic dynamics of a novel respiratory pathogen on dynamic contact networks. We build these networks by explicitly integrating detailed demographic and residential registry data from the Netherlands. The model simulates the Dutch population characterised by age, residency and mobility patterns, with actors interacting stochastically across households, workplaces and schools. Results: We show how the geographic and demographic profiles of initial cases impact transmission trajectories, with densely populated municipalities in the country's western core acting as key hubs driving epidemic spread. The framework enables rigorous assessment of intervention strategies incorporating behavioural adaptations. As case studies, we quantify the effects of symptomatic self-isolation and travel restrictions to and from major urban centres, highlighting their potential to modulate epidemic outcomes. Conclusions: Our findings underscore the necessity of integrating fine-scale human-to-human contact realism and population scale in epidemic forecasting and control. Plain-language summary: Mathematical modelling of infectious diseases is a cornerstone for understanding and predicting how pathogens spread in populations. Current models of disease spread, despite their widespread use, rely on one-size-fits-all assumptions that fail to capture the dynamic and adaptive nature of real-world human interactions. Network models have the fine detail needed to represent these complexities, but face challenges in scalability and generalisability.
Here, we introduce a novel hybrid model that combines the realism of network models with the adaptability of population-level models, enabling a more accurate overall analysis. Our framework advances epidemic modelling by bridging detailed interpersonal behaviour and large-scale generalisability.
Chen, J.; Lambe, T.; Kamau, E.; Donnelly, C.; Lambert, B.; Bajaj, S.
Serological surveys measure the presence of antibodies in a population to infer past exposure to an infectious pathogen. If study participants' ages are known, serocatalytic models can be used to retrace the historical transmission strength of a pathogen within that population, quantified by the force of infection (FOI). These models rely on age information as a key variable since infection risks are interpreted in relation to how long individuals have been at risk. However, due to data constraints, participants' ages may be provided only within "age bins". A common approach is then to assign individuals' ages to be midpoints of their respective age bins, ignoring uncertainty in this quantity. In this study, we quantify the bias introduced by this midpoint approach and develop a Bayesian framework that explicitly accounts for uncertainty in age. By comparing inference under constant, age-dependent, and time-dependent FOI scenarios, we show that incorporating uncertainty in age in serocatalytic models yields more reliable FOI estimates without sacrificing computational complexity. These improvements support the interpretation of serological data and inform public health decisions, such as estimating disease burden and identifying targeted vaccination groups.
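Why the midpoint approach introduces bias can be shown with the simplest serocatalytic model, in which a constant FOI λ gives a seroprevalence of 1 − exp(−λa) at age a. The sketch below compares the value at a bin midpoint with the average over the bin, assuming ages are uniformly distributed within the bin. It illustrates the source of the bias only and is not the authors' Bayesian framework; the FOI and bin limits are hypothetical.

```python
import math

def seroprev_at_age(foi, age):
    """Seroprevalence at an exact age under a constant force of infection."""
    return 1.0 - math.exp(-foi * age)

def seroprev_over_bin(foi, lower, upper):
    """Mean seroprevalence for ages uniformly spread over [lower, upper]."""
    width = upper - lower
    return 1.0 - (math.exp(-foi * lower) - math.exp(-foi * upper)) / (foi * width)

foi = 0.1                  # hypothetical force of infection per year
lower, upper = 0.0, 20.0   # hypothetical age bin in years

midpoint_value = seroprev_at_age(foi, (lower + upper) / 2)
bin_average = seroprev_over_bin(foi, lower, upper)
print(f"midpoint: {midpoint_value:.4f}, bin average: {bin_average:.4f}")
```

Because seroprevalence is a concave function of age, the midpoint value always exceeds the bin average, so a model fitted with midpoint ages needs a lower FOI to reproduce the same observed seroprevalence; the gap widens with wider bins and higher FOI.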
Pefura-Yone, E. W.; Pefura-Yone, E. H.; Pefura-Yone, H. L. N.; Djenabou, A.; Balkissou, A. D.
Tuberculosis (TB) remains a leading cause of death globally, with early mortality often driven by severe malnutrition and human immunodeficiency virus (HIV) co-infection. Traditional survival analyses identify risk factors but remain associative, failing to capture the dynamic physiological collapse preceding death. In a novel interdisciplinary adaptation, we applied the Merton jump-diffusion structural framework from quantitative finance to model survival as a state of biological solvency, in which mortality occurs when a stochastic health trajectory crosses a critical failure threshold. We analysed a retrospective cohort of 15,182 TB patients in Cameroon over two decades. Adjusted body mass index (BMI) was conceptualized as a proxy for health capital and modeled using a stochastic process accounting for individual recovery trends, physiological instability, and acute clinical shocks. The study included predominantly young adult males (median age: 33 years) with a median BMI of 20.7 kg/m². HIV co-infection was present in 35% of patients. The overall mortality rate during the 240-day follow-up period was 7.0%, with 55.1% of deaths occurring within the first 30 days. The model identified a critical failure threshold at BMI 17.329 kg/m². HIV co-infection emerged as a key driver of metabolic instability, significantly increasing physiological volatility. Statistical validation confirmed that sudden clinical shocks were necessary to explain observed mortality patterns. The resulting Distance-to-Death (DtD) metric slightly outperformed standard associative extended Cox models in predicting survival, achieving a higher discriminative ability in the testing set (Harrell's C-index: 0.781 vs. 0.772; p = 0.038).
Patients stratified into the highest-risk category showed a mortality rate of 16.7%, compared with 1.6% in the most stable group. This study bridges financial engineering and clinical epidemiology, offering a mechanistic understanding of how physiological reserves and metabolic instability determine survival. To support clinical application, we developed an interactive digital triage tool enabling identification of high-risk patients in resource-limited settings. Author summary: Tuberculosis remains a major cause of death worldwide, particularly in people with poor nutrition or co-infection with HIV. In this study, we explored a new way to understand why some patients survive while others do not. We adapted a method originally used in finance to track the "health reserves" of patients over time, using body weight and related measures to estimate how close someone is to a critical health threshold. Our approach captures both gradual health decline and sudden medical complications, such as severe infections or rapid deterioration. By applying this method to a large group of patients in Cameroon, we found that a very low body weight is a strong warning sign for impending death and that HIV infection makes health outcomes less predictable. We also created a simple scoring tool that can help doctors identify patients at greatest risk, so that life-saving interventions and closer monitoring can be prioritized. This work bridges mathematical modeling and clinical care, offering a new way to assess patient vulnerability and improve outcomes in resource-limited settings.
Danon, L.; Brooks-Pollock, E.
Background: Social contact surveys, which measure who-contacts-whom, are widely used to inform infectious disease transmission models and estimate the reproduction number (R), a key metric for assessing epidemic risk. Despite their widespread use, sample size calculations are not routinely performed. Aims: To assess the impact of sample size on estimates of R and determine a practical target sample size for social contact surveys used in epidemic modelling. Methods: We conducted a review of social contact surveys (2008-2025) to characterise current practice. We characterised the impact of survey size on epidemic metrics using two social contact surveys, the UK Social Contact Survey and POLYMOD (Europe), and two methods. For each dataset and approach, we generated repeated subsamples and calculated the resulting reproduction numbers, characterised their distributions and measured uncertainty. Results: We identified 107 unique social contact surveys from 57 studies. Sample sizes ranged from 30 to more than 10,000 participants, with a median of 1,438. One quarter of surveys contained fewer than 1,000 participants. From our simulations, we find that sample sizes below 200 individuals can result in highly variable reproduction numbers. Increasing sample size increases precision, and the most meaningful gains are up to 1,300 individuals. Increasing sample sizes over 3,000 individuals leads to smaller gains. Conclusions: A minimum sample size of approximately 1,200-1,300 participants appears sufficient for general-purpose use. These findings support the inclusion of sample size considerations in the design, reporting and interpretation of social contact surveys used for epidemic intelligence and public health decision-making.
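The repeated-subsampling procedure described in the Methods can be sketched as below. Everything specific here is an assumption: the "survey" is synthetic (log-normally distributed contact counts), and the reproduction number is replaced by a proxy proportional to E[k²]/E[k], since the abstract does not specify the two methods used.

```python
import random
import statistics

def r_proxy(contacts):
    """Quantity proportional to R when transmission scales with contact rate."""
    mean = sum(contacts) / len(contacts)
    mean_sq = sum(k * k for k in contacts) / len(contacts)
    return mean_sq / mean

def subsample_spread(survey, size, replicates, rng):
    """Standard deviation of the R proxy across repeated subsamples of a given size."""
    estimates = [r_proxy(rng.sample(survey, size)) for _ in range(replicates)]
    return statistics.pstdev(estimates)

rng = random.Random(42)
# Synthetic survey of 5,000 respondents with heavy-tailed daily contact counts.
survey = [int(rng.lognormvariate(2.0, 0.8)) + 1 for _ in range(5000)]

for size in (100, 200, 500, 1300, 3000):
    spread = subsample_spread(survey, size, replicates=200, rng=rng)
    print(f"n = {size:>4}: sd of R proxy = {spread:.3f}")
```

Plotting the spread against sample size shows the diminishing returns the abstract describes; where the curve flattens depends on how heavy-tailed the contact distribution is, which is why the exact thresholds should be taken from the real surveys rather than from synthetic data like this.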
Hounsell, R. A.; Norman, J.; Muloiwa, R.; Silal, S. P.
Pertussis remains an endemic and periodically resurgent vaccine-preventable disease despite long-standing immunisation programmes, reflecting complex interactions between transmission, waning immunity, vaccination history, and heterogeneous clinical presentation. We present a comprehensive age-structured mathematical model of pertussis transmission that explicitly represents infection heterogeneity, immunity dynamics, and detailed vaccination schedules across the life course. The model stratifies the population into 56 age groups and 29 epidemiological states, capturing four distinct infection types that differ by severity, symptoms, and infectiousness, including asymptomatic infection. Both naturally acquired and vaccine-derived immunity are modelled as non-lifelong, incorporating waning, partial protection, reinfection, and immune boosting following exposure without transmissible infection. Vaccination is represented at high resolution, including dose-specific primary series vaccination, booster doses in early childhood, childhood, and adolescence, and maternal immunisation during pregnancy, with differentiation between whole-cell and acellular pertussis vaccine formulations and historical changes in vaccine use and coverage. Periodicity and stochasticity are incorporated to reproduce observed multi-year epidemic cycles. A global sensitivity analysis using Latin hypercube sampling and partial rank correlation coefficients identifies immunity waning rates, immune boosting, and recovery from severe infection as key drivers of modelled incidence, mortality, and population protection. By integrating detailed immune processes with realistic vaccination histories, this model provides a flexible framework for evaluating pertussis epidemiology and assessing the population-level impact of alternative vaccination strategies, including booster and maternal immunisation policies.
Bardsley, K.; de Pablo, L. X.; Keppler Canada, E.; Ormaza Zulueta, N.; Mehrabi, Z.; Kissler, S. M.
Emerging respiratory disease outbreaks pose a major threat to food production systems. Agricultural workers live in larger, more crowded households than the general population, amplifying their potential exposure to respiratory pathogens, yet the consequences for worker health and food production remain poorly understood. We developed a household-structured susceptible-infectious-recovered (SIR) transmission model to compare disease dynamics between agricultural workers and the general U.S. population across six regions. We simulated outbreaks across a range of epidemiological scenarios and assessed productivity losses in California for three labor-intensive crops (oranges, iceberg lettuce, strawberries) with different harvest seasonalities. For a baseline reproduction number of R0 = 1.5, peak disease prevalence among agricultural workers was 1.23-1.45 times higher than that of the general population across regions, and final outbreak sizes were 1.15-1.28 times higher. Peak productivity losses ranged from 0.50%-0.62% across crops, translating to millions in lost revenue. At higher transmissibility and severity (R0 = 3 and assuming all infections are symptomatic), losses were over 2.5 times higher. Household crowding may lead to disproportionate respiratory disease burden among agricultural workers, highlighting the need for targeted outbreak preparedness and mitigation strategies in the agricultural sector to maintain food system resilience and support public health in these communities.
Suez, E.; Fox, S. J.
Over the past decade, outbreak forecasting has become an increasingly used tool to assist public health decision-making during epidemics. Collaborative forecast hubs, where multiple teams submit predictions in real-time, are the gold standard for such efforts. For each hub, a Baseline model is used as a performance benchmark for other models. Although the Baseline is understood as a naive forecast, its design is subjective, and the impact of model design decisions remains understudied. We evaluated how three Baseline specification decisions influence forecast performance on trend models that forecast based on historically observed dynamics: (1) the amount of historical data used for training, (2) whether the data are transformed, and (3) whether forecasts follow a flatline variant (constant predictions) or a drift variant (allowing a slope). Retrospective forecasts were generated for multiple years across four surveillance targets: COVID-19, influenza and RSV hospital admissions, and weighted influenza-like illness percentage (wILI). For wILI, we additionally compared trend baselines with a seasonal baseline model leveraging long-term historical patterns. Model specification significantly altered performance. The best-performing model across targets was a flatline model that used the most recent 6-12 transformed observations. This model outperformed the current standard Baseline used in many forecast hubs by an average of 9.6% (range: 3.7-12.9%) across forecast targets, and it outperformed the seasonal baseline model by 32.3% across nine influenza seasons. Our results demonstrate that subjective Baseline design decisions can materially influence forecast accuracy and, consequently, the perceived rankings of models within collaborative forecast hubs.
Based on the varying approaches and their performance differences, these findings highlight the need for increased transparency in Baseline model specifications and support the routine inclusion of multiple benchmark models within collaborative forecast hubs.
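The three specification decisions (training window, transformation, flatline versus drift) can be expressed as arguments of a single function. The sketch below covers point forecasts only and is not the hubs' Baseline implementation, which also produces predictive quantiles; the `log(x + 1)` transform, the default window, and the example series are assumptions.

```python
import math

def baseline_forecast(history, horizon, window=10, drift=False, transform=True):
    """Point forecasts from a naive trend baseline.

    window:    number of most recent observations used for training
    transform: work on the log(x + 1) scale and back-transform the forecasts
    drift:     add the mean one-step change per step ahead (otherwise flatline)
    """
    forward = (lambda x: math.log(x + 1)) if transform else (lambda x: float(x))
    inverse = (lambda z: math.exp(z) - 1) if transform else (lambda z: z)
    recent = [forward(x) for x in history[-window:]]
    diffs = [b - a for a, b in zip(recent, recent[1:])]
    slope = sum(diffs) / len(diffs) if drift else 0.0
    return [inverse(recent[-1] + slope * h) for h in range(1, horizon + 1)]

# Hypothetical weekly hospital admissions.
admissions = [100, 110, 121, 133, 146]
print(baseline_forecast(admissions, horizon=3))                                # flatline
print(baseline_forecast(admissions, horizon=3, drift=True, transform=False))  # drift
```

For point forecasts the flatline variant ignores the window and the transform entirely; those choices matter once uncertainty is added, because the spread of the predictive distribution is typically derived from the recent one-step differences on the chosen scale.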
Anderegg, N.; Egger, M.; Buthlezi, K.; Sinqu, Y.; Slabbert, M.; Johnson, L. F.
Female sex workers (FSW) in sub-Saharan Africa experience disproportionately high risks of HIV infection. Mathematical models are widely used to assess the contribution of sex workers and other key populations to HIV transmission dynamics and to inform targeted programmes. However, many rely on simplifying assumptions, such as stable sex worker characteristics and constant HIV transmission risk over time. These assumptions may be unrealistic and could bias modelled estimates. We used the South African Thembisa model to assess how alternative assumptions about FSW age, duration of sex work, and client-to-FSW transmission risk affect modelled HIV outcomes. We compared six scenarios that combined constant and increasing FSW age and sex work duration with constant and early-epidemic declining (exponentially or exposure-dependent) transmission risk. Each scenario was calibrated to HIV prevalence data from population-based and sex worker-specific surveys. Scenarios that allowed both FSW characteristics and transmission risk to vary over time showed the best agreement with external data, most closely reproducing HIV incidence, prevalence, and viral suppression estimates from a 2019 national sex worker survey (incidence ~5 per 100 person-years, prevalence 61-62%, viral suppression ~60%), and producing incidence rate ratios more consistent with estimates from the broader eastern and southern Africa region. By contrast, the scenario assuming constant FSW characteristics and transmission risk overestimated HIV incidence and underestimated prevalence and viral suppression. At the same time, this time-invariant specification attributed a much larger share of new HIV infections to sex work, with commercial sex work accounting for more than 20% of new infections in 2025, compared with 9-13% under time-varying assumptions. Overall, our findings show that HIV model estimates for sex workers are highly sensitive to modelling assumptions.
Incorporating time-varying FSW parameters yields estimates that are more consistent with empirical data and support more reliable programme planning and evaluation. Author summary: Female sex workers in sub-Saharan Africa face much higher risks of HIV infection than other women. Mathematical models are often used to understand why and to guide prevention programmes. Yet many of these models make simple assumptions about sex workers - for example, that their average age stays the same over time, that they spend a fixed number of years in sex work, or that the chance of HIV passing from a client to a sex worker never changes. In reality, these factors changed over time. In this study, we used South Africa's national HIV model to test how changing these assumptions affects the results. We compared different versions of the model and checked which ones best matched national sex worker survey data. We found that the model worked better when we allowed sex workers to become older over time, to spend longer in sex work, and allowed the risk of passing on HIV to decline. Our findings show that mathematical models can give very different answers depending on how they represent the lives and experiences of sex workers. More realistic assumptions lead to more accurate estimates and can help ensure that programmes focus support where it is most needed.
RAZAFIMAHATRATRA, S. L.; RASOLOHARIMANANA, L. T.; ANDRIAMARO, T. M.; RANAIVOMANANA, P.; SCHOENHALS, M.
Show abstract
Interpreting serological data remains challenging, particularly in low-prevalence or cross-reactive contexts, where antibody responses often show substantial overlap between exposed and unexposed individuals and may depart from normal distributional assumptions. Conventional cutoff-based approaches often yield inconsistent or biased estimates of seroprevalence. Here, we present a decisional framework based on finite mixture models (FMMs) that enhances the robustness and interpretability of serological analyses. Beyond simply applying mixture models, our framework integrates multiple methodological innovations: (i) systematic comparison of Gaussian and skew-normal mixture models to accommodate asymmetric antibody distributions; (ii) rigorous model selection using the Cramér-von Mises test (p > 0.01) combined with a parsimonious score (APS) to prioritize models with well-separated clusters; and (iii) hierarchical clustering of posterior probabilities to collapse latent components into biologically meaningful seronegative and seropositive groups. Applied to chikungunya virus (CHIKV) data from Bangladesh, the framework produced prevalence estimates consistent with ROC-based methods while probabilistically identifying borderline cases. Validation on SARS-CoV-2 and dengue datasets further demonstrated its generalizability: for SARS-CoV-2, the approach identified up to five latent clusters with high sensitivity (up to 100%) and specificity (up to 100%), enabling discrimination by disease severity. For dengue, it revealed interpretable subgrouping consistent with background exposure and subclinical infection, despite limited confirmed cases. By integrating distributional flexibility, robust goodness-of-fit testing, and biologically guided cluster consolidation, this decisional FMM framework provides a reproducible and scalable method for serological interpretation across pathogens and epidemiological settings, addressing key limitations of threshold-based classification.
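As a rough illustration of the mixture-model idea in this abstract, the sketch below fits a two-component Gaussian mixture to antibody readouts by expectation-maximisation and returns a posterior probability of seropositivity, which is what allows borderline samples to be flagged probabilistically rather than by a hard cutoff. It is not the authors' framework: the skew-normal components, the Cramér-von Mises test, the APS score, and the hierarchical clustering step are all omitted, and the function names (`fit_two_gaussians`, `posterior_positive`) and starting values are assumptions made for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, n_iter=200, min_sigma=1e-3):
    """Fit a two-component Gaussian mixture by EM (hypothetical helper).

    Returns (weights, means, sigmas), each a list of length 2.
    """
    xs = sorted(values)
    half = len(xs) // 2
    # Assumed initialisation: means of the lower and upper halves of the data.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    sigma = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            p = [w[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: update weights, means, and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component has collapsed; leave it unchanged
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), min_sigma)
            w[k] = nk / len(xs)
    return w, mu, sigma


def posterior_positive(x, w, mu, sigma):
    """Posterior probability that x belongs to the higher-mean component."""
    p = [w[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
    high = 0 if mu[0] > mu[1] else 1
    return p[high] / sum(p)
```

Calling `fit_two_gaussians` on a list of (for example, log-transformed) readouts and then `posterior_positive` on each sample gives a probability per sample; values far from both 0 and 1 would correspond to the borderline cases the abstract mentions.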
Smah, M. L.; Seale, A. C.; Rock, K. S.
Show abstract
Network-based epidemic models have been instrumental in understanding how contact structure shapes infectious disease dynamics, yet widely used frameworks such as Erdős-Rényi, configuration-model, and stochastic block networks do not explicitly capture the combination of fully accessible (saturated) within-group interactions and constrained between-group connectivity characteristic of many real-world settings. Here, we introduce the Multi-Clique (MC) network model, a generative framework in which individuals are organised into fully connected cliques representing stable contact groups (e.g., households, classrooms, or workplaces), with a limited number of external connections governing inter-group transmission. Using stochastic susceptible-infectious-recovered (SIR) simulations on degree-matched networks, we compare epidemic dynamics on MC networks with those on classical random graph models. Despite having an identical mean degree, MC networks exhibit systematically distinct behaviour, including slower epidemic growth, reduced peak prevalence, increased fade-out probability, and delayed time to peak. These effects arise from rapid within-clique but constrained between-clique transmission, creating structural bottlenecks that standard models do not capture. The MC framework provides an interpretable, data-driven representation of recurrent contact structure, with parameters that map directly to observable quantities such as household and classroom sizes. By isolating the role of intergroup connectivity, the model offers a basis for evaluating targeted intervention strategies that reduce between-group mixing while preserving within-group interactions. Our results highlight the importance of explicitly representing the real-life clique-based network structure in epidemic models and suggest that classical degree-matched networks may systematically overestimate epidemic speed and intensity in structured populations.
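The clique-plus-bridges structure described in this abstract can be sketched in a few lines. The code below is a minimal illustration, not the authors' generator: it assumes equal-sized cliques, draws external links uniformly at random, and uses a simple discrete-time stochastic SIR update. The names `build_multi_clique` and `simulate_sir`, and the parameters `external_per_node`, `beta`, and `gamma`, are hypothetical.

```python
import random
from itertools import combinations


def build_multi_clique(n_cliques, clique_size, external_per_node, rng):
    """Build an undirected network of fully connected cliques with sparse bridges.

    Returns (adj, clique_of): adjacency sets keyed by node, and each node's clique.
    """
    n = n_cliques * clique_size
    adj = {i: set() for i in range(n)}
    clique_of = [i // clique_size for i in range(n)]
    # Within-clique contacts: every pair in the same clique is connected.
    for c in range(n_cliques):
        members = range(c * clique_size, (c + 1) * clique_size)
        for u, v in combinations(members, 2):
            adj[u].add(v)
            adj[v].add(u)
    # Between-clique contacts: each node makes up to `external_per_node` draws;
    # a draw that lands in the node's own clique is discarded (assumed rule).
    for u in range(n):
        for _ in range(external_per_node):
            v = rng.randrange(n)
            if clique_of[v] != clique_of[u]:
                adj[u].add(v)
                adj[v].add(u)
    return adj, clique_of


def simulate_sir(adj, beta, gamma, seed_node, rng, max_steps=1000):
    """Discrete-time stochastic SIR on a network.

    beta: per-contact, per-step infection probability.
    gamma: per-step recovery probability.
    Returns (prevalence, state): infectious counts per step and final node states.
    """
    state = {i: "S" for i in adj}
    state[seed_node] = "I"
    infected = {seed_node}
    prevalence = [1]
    for _ in range(max_steps):
        if not infected:
            break
        newly_infected = set()
        for u in infected:
            for v in adj[u]:
                if state[v] == "S" and rng.random() < beta:
                    newly_infected.add(v)
        # Recoveries are drawn only among nodes infectious at the start of the step.
        recovered = {u for u in infected if rng.random() < gamma}
        for v in newly_infected:
            state[v] = "I"
        for u in recovered:
            state[u] = "R"
        infected = (infected | newly_infected) - recovered
        prevalence.append(len(infected))
    return prevalence, state
```

Setting `external_per_node` to zero isolates every clique, so an outbreak cannot leave the seed clique; raising it loosens the bottleneck. Comparing against a random graph with the same mean degree, as the abstract describes, would need a separate generator that is not shown here.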
O'Reilly, K.; Hay, J. A.; Lindesmith, L.; Allen, D.; Hue, S.; Debbink, K.; Kucharski, A.; Baric, R.; Breuer, J.; Edmunds, W. J.
Show abstract
Norovirus in humans is highly contagious, causing diarrhoea and vomiting, and is especially common in young children. Winter incidence varies annually, and previous research indicates that the change of dominant norovirus variant is followed by high incidence, but having a clear mechanism to explain this observation could support better prediction of epidemics. Here we analyse unique norovirus serology blockade data from 656 children in England, collected via opportunistic sampling between 2007 and 2012, using a mathematical model of multi-variant antibody kinetics to infer metrics such as annual attack rates and age-specific infection rates. Analysis reveals that overall infection rates were 204 infections per 1000 person-years (posterior median; 95% credible interval: 188-221). Infection rates were lowest in children aged under 1 year at 164 infections per 1000 person-years (95% CrI: 121-209) and highest in children aged 5 years and older, at 252 infections per 1000 person-years (95% CrI: 212-288). The annual attack rate was highest in 2002, coincident with transition of the dominant variant to Farmington Hills, and high attack rates are frequently, but not always, observed with the emergence of new variants. Parameter estimates indicate moderate evidence for the immune imprinting hypothesis: a stronger antibody response to variants encountered earliest in life. Infection rates estimated here from serology are higher than incidence reported within similar settings based on disease only, which is consistent with considerable asymptomatic infection. The combined use of multi-variant antibody data and a mathematical model provides key insights on the natural history of norovirus variants which can inform epidemic planning.