Epidemiology
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
Background: Vaccines can prevent severe disease by preventing infection or by reducing progression among those who become infected. Vaccine effectiveness against progression given infection is often used to quantify this second mechanism, but it conditions on infection, which is itself affected by vaccination. As a result, this estimand lacks a clear causal interpretation and may behave non-intuitively over time. Methods: We introduce a conceptual framework that models protection against infection ...
Hybrid controlled trials (HCTs) incorporate real-world data into randomized controlled trials (RCTs) by augmenting the internal control arm with patients receiving the same treatment in routine care. Beyond increasing power, HCTs may improve recruitment by supporting unequal randomization ratios that increase patient access to experimental treatments. However, HCT validity is threatened by bias from unmeasured confounding due to lack of randomization of external controls, leading to outcome non-...
Background: Synthetic cohorts created by combining two cohorts can be useful when no single data set includes both the exposure and outcome data of interest. We estimate the effects of depression in early adulthood on later-life memory outcome using two nationally representative cohorts separately and in a synthetic sample. Methods: We used the National Longitudinal Study of Youth 1979 (NLSY; N=5,747) and the Health and Retirement Study (HRS; N=6,846) and a synthetic cohort combining exposure data ...
Wastewater is increasingly being recognized as an important data stream that can contribute to infectious disease surveillance and forecasting. With this recognition, a growing number of statistical inference approaches are being developed to use wastewater data to provide quantitative insights into epidemiological dynamics. However, few existing approaches have allowed for systematic integration of data streams for inference, for example by combining case incidence data and/or serological data ...
Background: Routinely collected health data are increasingly used to generate real-world evidence for therapeutic decision-making. Yet, stakeholders, including clinicians, pharmaceutical industry representatives, patient advocacy groups, and statisticians, prioritize different aspects of data quality, analysis, and interpretation. Without explicit consideration of these perspectives, analyses risk being fragmented, misaligned with end-user needs, or lacking transparency. Methods: We developed a sta...
The prospective design of vaccine efficacy trials for deployment in outbreaks requires advance consideration of plausible outbreak scenarios, anticipated vaccine characteristics, and logistical and ethical constraints. As part of CEPI's 100 Days Mission to accelerate vaccine development against a novel Disease X, we evaluated trial designs for a hypothetical Nipah-X outbreak. We assumed Nipah-X would share key features with Nipah, including high case fatality rates and substantial super-spreading...
The two largest US measles outbreaks in over two decades (2025 Gaines County, Texas: 414 cases, contained; 2025-2026 Spartanburg County, South Carolina: 923+ cases, ongoing) occurred in counties with similar sub-threshold K-12 MMR coverage (85.1% vs 88.8%), yet their trajectories diverged dramatically. Using kernel density estimation with a common bandwidth and bootstrap uncertainty quantification, we compared sub-county vaccination data at the district level for Texas (3 districts, 3,560 studen...
Background: The Global Youth Tobacco Survey (GYTS) is widely used to monitor tobacco use among adolescents worldwide. However, inconsistent analytical approaches, particularly in handling complex survey designs and predictor selection, limit comparability across countries, survey waves, and software platforms. Although much of the GYTS literature relies on proprietary tools such as SAS and SPSS, practical and transparent guidance on implementing reproducible, theory-informed analyses remains limited...
Objectives: Estimate the HIV testing, diagnoses, and test positivity rates among Medicaid beneficiaries in 2016-2021 and assess the impact of the COVID-19 pandemic on these outcomes. Design: Prospective observational study of Medicaid enrollment, inpatient, and outpatient claims data from 27 states, 2016-2021. Methods: We assessed Medicaid claims from adult beneficiaries with full benefits whose first continuous enrollment was ≥6 months without dual enrollment in other insurance, and without pr...
COVID-19 has been shown to cause a range of harmful long-term effects on nearly every organ system [1-3]. These findings are based on retrospective studies comparing COVID-19 patients to patients with similar medical histories and demographics but no COVID-19 diagnosis [4-16]. However, concerns have emerged that these comparisons may be biased if COVID-19 patients had unrelated health conditions or other factors not recorded in their medical records [17-21]. Here, using a massive dataset of 14.4 billion ...
Background: Cannabis use is highly prevalent among people who use unregulated drugs. While daily cannabis use has been hypothesized to provide protective effects through substitution or tolerance mechanisms, the relationship between cannabis use frequency and overdose risk remains poorly understood, particularly for infrequent users. Methods: We conducted a secondary analysis of cross-sectional interview data from people who use unregulated drugs in Vancouver, British Columbia, collected during the...
Background: We recently developed a general egg count framework to support cost-efficient survey design choices to inform soil-transmitted helminthiasis (STH) control programs. Yet, its interpretation and application were not always intuitive for program managers. Methods: We first adapted the existing framework to make the interpretation of risks of incorrect decision making more intuitive and to allow for prior information. Then, we assessed the impact of the allowable risk of incorrect decisi...
Mendelian randomization has emerged as a transformative approach for inferring causal relationships between risk factors and disease outcomes. However, applying Mendelian randomization to disease progression - a critical step in validating pharmacological targets - is hampered by index event bias. This form of selection bias occurs because analyses of disease progression are necessarily restricted to individuals who have already experienced the disease event. Here, we present a comprehensive eva...
Background: In pharmacoepidemiological studies, days of treatment (DoT) duration associated with individual electronic drug utilization records (DUR) are usually missing. Researcher-defined duration (RDD) calculation approaches, as opposed to data-driven approaches, can be used to estimate DoT based on the specific choices and assumptions made by investigators. These are usually underreported or even undocumented. We aimed to develop a framework for the standardization of terminology, formulas, im...
Introduction: Tobacco smoking remains a leading cause of preventable death in the UK. Although e-cigarettes are promoted as a harm-reduction option, longitudinal evidence on short-term health outcomes across different smoking transition pathways is limited. This study examined short-term associations between transitions to exclusive e-cigarette use, dual use, or cessation and physical health, mental health, and health-related quality of life, compared with continued smoking. Methods: A target trial...
While the prostate-specific antigen (PSA) test is a widely used prostate cancer screening tool, its application remains controversial. Opportunistic PSA testing generates complex data in which testing intensities, PSA levels, and prostate cancer diagnosis are interdependent. Conventional analyses rarely model these processes jointly. The objective of this study was to develop a population-based joint model to analyse PSA dynamics, retesting patterns, and prostate cancer risk. We used the Stockho...
Background and Aims: The glucagon-like peptide-1 receptor agonist (GLP-1 RA) semaglutide has demonstrated efficacy for the secondary prevention of cardiovascular disease among patients with overweight/obesity without diabetes mellitus. However, the comparative effectiveness of GLP-1 RA versus other antiobesity medications (e.g. phentermine-topiramate) has not been evaluated. Methods: This was a retrospective, observational, cohort study using target trial emulation methodology using the Truveta electro...
People who inject drugs (PWID) in India continue to experience high HIV incidence while coverage of HIV and harm reduction services within this population remains suboptimal in many settings, highlighting the need to identify novel service delivery points. To evaluate the effectiveness of spatially focused upscaling of interventions at observed venues where PWID injected drugs together, we developed an individual-based dynamic transmission model of HIV informed by detailed injection network, ser...
Perinatal depression (PD) is common and disabling, yet its longitudinal comorbidity patterns and predictability remain poorly understood. This study leveraged 8,804 women with delivery records in the All of Us cohort, including 438 with clinically diagnosed postpartum depression (PPD), to characterize multimorbidity trajectories and develop integrated prediction models. Comorbidities were grouped into 38 conditions across psychiatric, autoimmune, metabolic, neurological/pain, and reproductive/gy...
Background: Obstructive sleep apnea (OSA), as measured by the Apnea Hypopnea Index (AHI), is associated with adverse outcomes. Measures that characterize the temporal variability in events may provide information over and beyond a simple summary of event frequency as measured by the AHI. Research Question: To assess whether temporal variability in the occurrence of obstructive apnea/hypopneas during the night is associated with all-cause mortality or incident cardiovascular disease (CVD). Study De...