Epidemiology

26 training papers 2019-06-25 – 2026-03-07

Top medRxiv preprints most likely to be published in this journal, ranked by match strength.

1
Novel Representations of Vaccine Protection Against Progression to Severe Disease Over Time
2026-02-14 epidemiology 10.64898/2026.02.12.26346197
#1 (8.4%)

Background: Vaccines can prevent severe disease by preventing infection or by reducing progression among those who become infected. Vaccine effectiveness against progression given infection is often used to quantify this second mechanism, but it conditions on infection, which is itself affected by vaccination. As a result, this estimand lacks a clear causal interpretation and may behave non-intuitively over time. Methods: We introduce a conceptual framework that models protection against infection ...

2
An E-value-Informed Sensitivity Analysis Framework for Hybrid Controlled Trials
2026-03-06 epidemiology 10.64898/2026.03.05.26347653
#1 (5.1%)

Hybrid controlled trials (HCTs) incorporate real-world data into randomized controlled trials (RCTs) by augmenting the internal control arm with patients receiving the same treatment in routine care. Beyond increasing power, HCTs may improve recruitment by supporting unequal randomization ratios that increase patient access to experimental treatments. However, HCT validity is threatened by bias from unmeasured confounding due to lack of randomization of external controls, leading to outcome non-...

3
Constructing and analyzing a synthetic life course cohort based on pooling two data sources: A case study of early adulthood depression symptomatology and late-life cognition
2026-02-27 epidemiology 10.64898/2026.02.25.26347113
Top 0.2% (1.8%)

Background: Synthetic cohorts created by combining two cohorts can be useful when no single data set includes both the exposure and outcome data of interest. We estimate the effects of depression in early adulthood on later-life memory outcome using two nationally representative cohorts separately and in a synthetic sample. Methods: We used the National Longitudinal Survey of Youth 1979 (NLSY; N=5,747) and the Health and Retirement Study (HRS; N=6,846) and a synthetic cohort combining exposure data ...

4
A bootstrap particle filter for viral Rt inference and forecasting using wastewater data
2026-03-06 epidemiology 10.64898/2026.03.06.26347747
Top 0.3% (1.5%)

Wastewater is increasingly being recognized as an important data stream that can contribute to infectious disease surveillance and forecasting. With this recognition, a growing number of statistical inference approaches are being developed to use wastewater data to provide quantitative insights into epidemiological dynamics. However, few existing approaches have allowed for systematic integration of data streams for inference, for example by combining case incidence data and/or serological data ...

5
Integrating stakeholder perspectives in modeling routine data for therapeutic decision-making
2026-02-18 epidemiology 10.64898/2026.02.18.26346074
Top 0.3% (1.5%)

Background: Routinely collected health data are increasingly used to generate real-world evidence for therapeutic decision-making. Yet, stakeholders, including clinicians, pharmaceutical industry representatives, patient advocacy groups, and statisticians, prioritize different aspects of data quality, analysis, and interpretation. Without explicit consideration of these perspectives, analyses risk being fragmented, misaligned with end-user needs, or lacking transparency. Methods: We developed a sta...

6
Accelerating vaccine trials during an outbreak of Disease-X: the effect of pathogen super-spreading on ring-trial design
2026-02-18 epidemiology 10.64898/2026.02.17.26346480
Top 0.3% (1.5%)

The prospective design of vaccine efficacy trials for deployment in outbreaks requires advance consideration of plausible outbreak scenarios, anticipated vaccine characteristics, and logistical and ethical constraints. As part of CEPI's 100 Days Mission to accelerate vaccine development against a novel Disease X, we evaluated trial designs for a hypothetical Nipah-X outbreak. We assumed Nipah-X would share key features with Nipah, including high case fatality rates and substantial super-spreading...

7
Spatial Clustering of School Susceptibles Drives Divergent US Measles Outbreaks
2026-02-27 epidemiology 10.64898/2026.02.25.26347103
Top 0.4% (1.4%)

The two largest US measles outbreaks in over two decades (2025 Gaines County, Texas: 414 cases, contained; 2025-2026 Spartanburg County, South Carolina: 923+ cases, ongoing) occurred in counties with similar sub-threshold K-12 MMR coverage (85.1% vs 88.8%), yet their trajectories diverged dramatically. Using kernel density estimation with a common bandwidth and bootstrap uncertainty quantification, we compared sub-county vaccination data at the district level for Texas (3 districts, 3,560 studen...

8
Methodological Guidance for Predictor Variable Selection for Adolescent Smoking Outcomes in Global Youth Tobacco Survey Using R and Python
2026-02-17 epidemiology 10.64898/2026.02.14.26346305
Top 0.4% (1.4%)

Background: The Global Youth Tobacco Survey (GYTS) is widely used to monitor tobacco use among adolescents worldwide. However, inconsistent analytical approaches, particularly in handling complex survey designs and predictor selection, limit comparability across countries, survey waves, and software platforms. Although much of the GYTS literature relies on proprietary tools such as SAS and SPSS, practical and transparent guidance on implementing reproducible, theory-informed analyses remains limited...

9
Characterizing the impact of the COVID-19 pandemic on HIV testing among Medicaid beneficiaries
2026-02-14 epidemiology 10.64898/2026.02.12.26346199
Top 0.5% (1.3%)

Objectives: Estimate the HIV testing, diagnoses, and test positivity rates among Medicaid beneficiaries in 2016-2021 and assess the impact of the COVID-19 pandemic on these outcomes. Design: Prospective observational study of Medicaid enrollment, inpatient, and outpatient claims data from 27 states, 2016-2021. Methods: We assessed Medicaid claims from adult beneficiaries with full benefits whose first continuous enrollment was ≥6 months without dual enrollment in other insurance, and without pr...

10
Revised estimates of the types and durations of long Covid symptoms based on claims records from 245 Million US patients
2026-02-18 epidemiology 10.64898/2026.02.17.26346448
Top 0.8% (1.1%)

COVID-19 has been shown to cause a range of harmful long-term effects on nearly every organ system [1-3]. These findings are based on retrospective studies comparing COVID-19 patients to patients with similar medical histories and demographics but no COVID-19 diagnosis [4-16]. However, concerns have emerged that these comparisons may be biased if COVID-19 patients had unrelated health conditions or other factors not recorded in their medical records [17-21]. Here, using a massive dataset of 14.4 billion ...

11
Infrequent Cannabis Use and Increased Overdose Risk Among People Who Use Unregulated Drugs: Revealing Frequency-Dependent Effects Through Secondary Analysis
2026-02-14 epidemiology 10.64898/2026.02.11.26346111
Top 0.8% (1.1%)

Background: Cannabis use is highly prevalent among people who use unregulated drugs. While daily cannabis use has been hypothesized to provide protective effects through substitution or tolerance mechanisms, the relationship between cannabis use frequency and overdose risk remains poorly understood, particularly for infrequent users. Methods: We conducted a secondary analysis of cross-sectional interview data from people who use unregulated drugs in Vancouver, British Columbia, collected during the...

12
An intuitive sampling framework for setting-specific decision-making in soil-transmitted helminthiasis control programs
2026-02-14 epidemiology 10.64898/2026.02.11.26346062
Top 0.9% (1.0%)

Background: We recently developed a general egg count framework to support cost-efficient survey design choices to inform soil-transmitted helminthiasis (STH) control programs. Yet, its interpretation and application were not always intuitive for program managers. Methods: We first adapted the existing framework to make the interpretation of risks of incorrect decision making more intuitive and to allow for prior information. Then, we assessed the impact of the allowable risk of incorrect decisi...

13
Comparison of methods for assessing effects of risk factors on disease progression in Mendelian randomization under index event bias
2026-03-02 epidemiology 10.64898/2026.02.26.26347193
Top 0.9% (1.0%)

Mendelian randomization has emerged as a transformative approach for inferring causal relationships between risk factors and disease outcomes. However, applying Mendelian randomization to disease progression - a critical step in validating pharmacological targets - is hampered by index event bias. This form of selection bias occurs because analyses of disease progression are necessarily restricted to individuals who have already experienced the disease event. Here, we present a comprehensive eva...

14
Standardisation of terminology, calculation and reporting for assigning exposure duration to drug utilisation records from healthcare data sources: the CreateDoT framework
2026-02-19 epidemiology 10.64898/2026.02.18.26346576
Top 1.0% (1.0%)

Background: In pharmacoepidemiological studies, days of treatment (DoT) duration associated with individual electronic drug utilization records (DUR) are usually missing. Researcher-defined duration (RDD) calculation approaches, as opposed to data-driven approaches, can be used to estimate DoT based on the specific choices and assumptions made by investigators. These are usually underreported or even undocumented. We aimed to develop a framework for the standardization of terminology, formulas, im...

15
The Effect Of Smokers Transitioning To E-Cigarettes On Physical And Mental Health: An Emulated Trial Using Longitudinal Data.
2026-02-22 epidemiology 10.64898/2026.02.12.26345898
Top 1% (0.9%)

Introduction: Tobacco smoking remains a leading cause of preventable death in the UK. Although e-cigarettes are promoted as a harm-reduction option, longitudinal evidence on short-term health outcomes across different smoking transition pathways is limited. This study examined short-term associations between transitions to exclusive e-cigarette use, dual use, or cessation and physical health, mental health, and health-related quality of life, compared with continued smoking. Methods: A target trial...

16
Joint modelling of PSA dynamics and prostate cancer risks: A population-based study
2026-02-22 epidemiology 10.64898/2026.02.15.26346131
Top 1% (0.9%)

While the prostate-specific antigen (PSA) test is a widely used prostate cancer screening tool, its application remains controversial. Opportunistic PSA testing generates complex data in which testing intensities, PSA levels, and prostate cancer diagnosis are interdependent. Conventional analyses rarely model these processes jointly. The objective of this study was to develop a population-based joint model to analyse PSA dynamics, retesting patterns, and prostate cancer risk. We used the Stockho...

17
Secondary Prevention of Cardiovascular Events in Patients with Overweight/Obesity in Routine Clinical Practice
2026-02-20 epidemiology 10.64898/2026.02.18.26346594
Top 1% (0.9%)

Background and Aims: The glucagon-like peptide-1 receptor agonist (GLP-1 RA) semaglutide has demonstrated efficacy for the secondary prevention of cardiovascular disease among patients with overweight/obesity without diabetes mellitus. However, the comparative effectiveness of GLP-1 RA versus other antiobesity medications (e.g. phentermine-topiramate) has not been evaluated. Methods: This was a retrospective, observational cohort study using target trial emulation methodology and the Truveta electro...

18
Evaluating Spatially Targeted HIV Interventions and Harm Reduction Services Among People Who Inject Drugs in a High-Burden Setting
2026-02-09 epidemiology 10.64898/2026.02.07.26345824
Top 1% (0.9%)

People who inject drugs (PWID) in India continue to experience high HIV incidence while coverage of HIV and harm reduction services within this population remains suboptimal in many settings, highlighting the need to identify novel service delivery points. To evaluate the effectiveness of spatially focused upscaling of interventions at observed venues where PWID injected drugs together, we developed an individual-based dynamic transmission model of HIV informed by detailed injection network, ser...

19
Life-course comorbidity patterns and integrated prediction of postpartum depression, multimorbidity, and symptom progression
2026-02-18 epidemiology 10.64898/2026.02.18.26346535
Top 2% (0.9%)

Perinatal depression (PD) is common and disabling, yet its longitudinal comorbidity patterns and predictability remain poorly understood. This study leveraged 8,804 women with delivery records in the All of Us cohort, including 438 with clinically diagnosed postpartum depression (PPD), to characterize multimorbidity trajectories and develop integrated prediction models. Comorbidities were grouped into 38 conditions across psychiatric, autoimmune, metabolic, neurological/pain, and reproductive/gy...

20
Regularity in occurrence of respiratory-related events in sleep predicts cardiovascular disease and mortality
2026-03-03 epidemiology 10.64898/2026.02.25.26347037
Top 2% (0.9%)

Background: Obstructive sleep apnea (OSA), as measured by the Apnea Hypopnea Index (AHI), is associated with adverse outcomes. Measures that characterize the temporal variability in events may provide information over and beyond a simple summary of event frequency as measured by the AHI. Research Question: To assess whether temporal variability in the occurrence of obstructive apnea/hypopneas during the night is associated with all-cause mortality or incident cardiovascular disease (CVD). Study De...