Hidden in Plain Sight: Epidemiological Signals in Routine Laboratory Data
Hoffmann, T.; Mugahid, D.; Olejarz, J.; Neale, A.; Zapf, A.; Molinaro, R.; Lipsitch, M.; Atun, R.; Grad, Y.; Fortune, S.; Sampath, R.; Onnela, J.-P.
Public health monitoring traditionally relies on active reporting from diverse data sources, including clinical and administrative data, disease registries, and population-based surveys. Yet these surveillance methods often face challenges such as incomplete reporting, time lags, and variable population coverage. Meanwhile, diagnostic laboratories routinely generate vast volumes of operational data that remain untapped for public health monitoring. Because these data are not collected for scientific inquiry or population-level surveillance, they often lack formal validation and may contain sensitive information. We developed a Bayesian hierarchical model to decompose aggregated laboratory assay volume data for 1.1 billion clinician-ordered assays across the U.S. from October 2019 to March 2023 into interpretable epidemiological and health system signals. The signals generated by this model were compared with known perturbations to health systems, such as the COVID-19 pandemic. The method does not rely on assay outcomes or individual-level data, and it provides quantitative signals of epidemiological trends and health system responses while protecting both the privacy of patients and commercially sensitive information. Temporal analysis reveals qualitatively different responses of assay volumes to major public health events and identifies assays whose use paralleled the surges in hospitalization rates during the COVID-19 pandemic that were documented through traditional public health reporting structures. This framework suggests that routine operational data can augment traditional surveillance by identifying anomalous patterns for expert epidemiological investigation. To be truly effective, data from multiple vendors must be integrated to create a comprehensive real-time national or supranational public health surveillance platform.
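To make the idea of "identifying anomalous patterns" in aggregated assay volumes concrete, the following is a minimal sketch, not the authors' Bayesian hierarchical model. It assumes a much simpler stand-in technique: a z-score of log weekly volume against a pre-event baseline. The function name `flag_anomalies`, the threshold, and the weekly counts are all hypothetical and chosen only for illustration.

```python
import math
import statistics


def flag_anomalies(volumes, baseline_weeks, threshold=3.0):
    """Return indices of weeks whose log volume deviates from the baseline.

    Simplified stand-in for the paper's model (an assumption, not the
    authors' method): the first `baseline_weeks` entries define a mean and
    standard deviation of log volume, and later weeks are flagged when
    their z-score exceeds `threshold` in absolute value.
    Volumes must be positive so that the logarithm is defined.
    """
    logs = [math.log(v) for v in volumes]
    baseline = logs[:baseline_weeks]
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return [
        i
        for i, x in enumerate(logs)
        if i >= baseline_weeks and abs(x - mu) / sigma > threshold
    ]


# Hypothetical weekly order counts for a single assay (not real data).
weekly = [1000, 1020, 990, 1010, 1005, 995, 1500, 1008]
flagged = flag_anomalies(weekly, baseline_weeks=6)
print(flagged)
```

Like the approach described in the abstract, this sketch uses only aggregated order counts, with no assay outcomes or individual-level records. A flagged week is a prompt for expert review, not a conclusion in itself.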