Characterizing spatiotemporal patterns of case reporting backfill: a case study of COVID-19 reporting in Michigan, 2020-24

Niu, Y.; Brouwer, A. F.; Martin, E. T.; Coyle, J. R.; Eisenberg, M. C.

2025-11-27 epidemiology
10.1101/2025.11.25.25340977 medRxiv
Backfill is the process of revising case data, often by retrospectively assigning or reassigning newly reported cases to earlier symptom onset dates. Temporally and spatially varying delays in the backfill process may compromise real-time surveillance and forecasting efforts by obscuring the true underlying transmission patterns. Using Michigan COVID-19 case data, we developed a statistical mixture model to describe backfill and its geographical and temporal variation. The model combined an exponential process (case reporting delay) and a gamma-distributed process (date reassignment). Parameters were estimated by maximum likelihood with lasso regularization, and the Akaike Information Criterion was used to determine whether the reassignment component was needed for each date. We estimated the exponential reporting speed over time and space and, where appropriate, the transient peak and timing of case reassignment. We found that case reporting improved over the pandemic: reporting speed increased over time (with substantial day-to-day variation), and case reassignments were processed faster. We also identified potential regional disparities: rural regions with population densities below 50 people/km² had slower backfill speeds. These findings provide critical insights into the evolution of case reporting and backfill dynamics that can be leveraged by "nowcasting" models to complete real-time surveillance data, ultimately improving outbreak preparedness and response.
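
The abstract names three ingredients: an exponential-plus-gamma mixture, penalized maximum likelihood, and an AIC check on whether the gamma (reassignment) component is needed. The Python sketch below illustrates one possible reading of that idea and is not the authors' implementation. It assumes the reporting delay for a single onset date follows a weighted mixture density, applies an L1 (lasso-style) penalty to the mixture weight only, and replaces a proper optimizer with a coarse grid search. All function names, the parameterization, the grid values, and the example delays are hypothetical.

```python
import itertools
import math


def exp_pdf(t, rate):
    """Exponential density with the given rate, for t >= 0."""
    return rate * math.exp(-rate * t)


def gamma_pdf(t, shape, scale):
    """Gamma density (shape/scale parameterization); requires t > 0."""
    log_pdf = ((shape - 1) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def mixture_pdf(t, weight, rate, shape, scale):
    """(1 - weight) * exponential + weight * gamma. Assumed model form."""
    return ((1 - weight) * exp_pdf(t, rate)
            + weight * gamma_pdf(t, shape, scale))


def neg_log_lik(delays, weight, rate, shape, scale):
    """Negative log-likelihood of strictly positive delays."""
    return -sum(math.log(mixture_pdf(t, weight, rate, shape, scale))
                for t in delays)


def aic(nll, n_params):
    """Akaike Information Criterion: 2k - 2 log L = 2k + 2 * nll."""
    return 2 * n_params + 2 * nll


def fit_exponential(delays):
    """Closed-form exponential MLE: rate = n / sum(delays)."""
    rate = len(delays) / sum(delays)
    return rate, neg_log_lik(delays, 0.0, rate, 1.0, 1.0)


def fit_mixture(delays, lam=1.0):
    """Coarse grid search minimizing nll + lam * weight (L1 penalty).

    The grid is illustrative only; a real fit would use a numerical
    optimizer over continuous parameters.
    """
    base_rate, _ = fit_exponential(delays)
    grid = itertools.product(
        [0.0, 0.1, 0.2, 0.3, 0.4],                    # mixture weight
        [0.5 * base_rate, base_rate, 2 * base_rate],  # exponential rate
        [2.0, 5.0, 10.0],                             # gamma shape
        [1.0, 2.0, 4.0],                              # gamma scale
    )
    best = min(grid, key=lambda p: neg_log_lik(delays, *p) + lam * p[0])
    return best, neg_log_lik(delays, *best)


# Hypothetical delays (days): early reports plus a late cluster that
# could correspond to a batch of date reassignments.
delays = [0.5, 1, 1, 2, 2, 3, 4, 14, 15, 15, 16]

rate, nll_exp = fit_exponential(delays)
params, nll_mix = fit_mixture(delays)

# Exponential-only has 1 free parameter; the mixture has 4.
aic_exp = aic(nll_exp, 1)
aic_mix = aic(nll_mix, 4)
keep_gamma_component = aic_mix < aic_exp
```

Under these assumptions, `keep_gamma_component` plays the role of the per-date AIC decision described in the abstract: the gamma term is retained only when the improvement in fit outweighs the cost of its three extra parameters.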

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Epidemics: 104 papers in training set; Top 0.1%; 29.4%
2. PLOS Computational Biology: 1633 papers in training set; Top 3%; 11.1%
3. Scientific Reports: 3102 papers in training set; Top 4%; 10.7%
50% of probability mass above
4. PLOS ONE: 4510 papers in training set; Top 30%; 5.2%
5. Journal of The Royal Society Interface: 189 papers in training set; Top 1%; 3.3%
6. Infectious Disease Modelling: 50 papers in training set; Top 0.5%; 2.8%
7. Nature Communications: 4913 papers in training set; Top 46%; 2.2%
8. mSystems: 361 papers in training set; Top 4%; 2.0%
9. American Journal of Epidemiology: 57 papers in training set; Top 0.6%; 2.0%
10. GeoHealth: 10 papers in training set; Top 0.3%; 1.6%
11. Clinical Infectious Diseases: 231 papers in training set; Top 3%; 1.4%
12. Epidemiology and Infection: 84 papers in training set; Top 2%; 1.3%
13. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 37%; 1.3%
14. Royal Society Open Science: 193 papers in training set; Top 3%; 1.3%
15. Scientific Data: 174 papers in training set; Top 2%; 1.0%
16. BMC Infectious Diseases: 118 papers in training set; Top 4%; 1.0%
17. eLife: 5422 papers in training set; Top 51%; 1.0%
18. PeerJ: 261 papers in training set; Top 11%; 0.9%
19. The American Journal of Tropical Medicine and Hygiene: 60 papers in training set; Top 4%; 0.8%
20. npj Digital Medicine: 97 papers in training set; Top 3%; 0.8%
21. Frontiers in Physics: 20 papers in training set; Top 0.9%; 0.8%
22. mSphere: 281 papers in training set; Top 7%; 0.5%
23. Epidemiology: 26 papers in training set; Top 0.7%; 0.5%
24. Viruses: 318 papers in training set; Top 6%; 0.5%
25. Science Advances: 1098 papers in training set; Top 34%; 0.5%
26. PLOS Global Public Health: 293 papers in training set; Top 6%; 0.5%
27. JMIR Public Health and Surveillance: 45 papers in training set; Top 4%; 0.5%