A Reproducible Pipeline for Processing Commercial Wearable Step-Count Data in Aging Cohorts: Application and Evaluation in the STRRIDE-PD Reunion Study

Bo, N.; Sudnick, A. M.; Counts, J. D.; Kennedy, K. G.; Saldana, A. A.; Collins-Bennett, K. A.; Bennett, W. C.; Johnson, J. L.; Huffman, K. M.; Paluch, A. E.; Ashner, M. C.; Kraus, W. E.; Peskoe, S. B.; Ross, L. M.

2026-05-19 epidemiology
10.64898/2026.05.14.26353213 medRxiv
Wearable devices can objectively characterize free-living physical activity; however, raw step-count data generated by commercial devices require systematic processing before they can support rigorous inference. We describe a transparent, reproducible standard operating procedure (SOP) for transforming epoch-level step-count data from commercial Garmin devices into participant-level analytic variables and demonstrate its application in the STRRIDE-PD Reunion study, a long-term follow-up of older adults originally enrolled in a supervised exercise intervention trial. The pipeline standardizes timestamps, reconstructs daily epoch grids, infers wear time from observed step patterns, and applies a prespecified valid-day threshold (≥10 hours of inferred wear time) to generate participant-level summaries. Among 67 participants (mean age 71.4 years, 65.7% women), the median valid-day count was 10 days, median average daily steps were 5,794, and participant-level estimates were identical across the ≥10-hour and ≥6-hour valid-day thresholds. Wearable-derived step counts were significantly associated with 9 of 16 cardiometabolic and fitness outcomes, including cardiorespiratory fitness, body composition, and lipid profiles. By contrast, self-reported exercise, assessed via a frequency-by-duration composite ranked into deciles, was not significantly associated with any outcome. A regression calibration framework applied to the full sample quantified the attenuation underlying this discrepancy: the naive self-report model systematically underestimated associations relative to both the observed Garmin model and calibration-corrected estimates. These findings demonstrate that measurement approach is a determinant of scientific conclusions in physical activity research, and that reproducible wearable data pipelines are essential infrastructure for aging epidemiology.
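
The processing steps named in the abstract (timestamp standardization, wear-time inference, valid-day filtering, participant-level summary) can be illustrated with a minimal Python sketch. It is not the authors' SOP. The function name `summarize_participant`, the input layout (pairs of ISO 8601 timestamp and step count), the fixed UTC offset, and especially the wear rule (a clock hour counts as worn if any epoch in it has at least one step) are assumptions made for illustration; the abstract does not specify how wear time is inferred. Epochs absent from the input are treated as non-wear rather than filled into an explicit daily grid.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

VALID_DAY_HOURS = 10  # valid-day threshold stated in the abstract


def summarize_participant(epochs, utc_offset_hours=0,
                          min_wear_hours=VALID_DAY_HOURS):
    """Summarize one participant's epoch-level step data.

    epochs: iterable of (iso_timestamp, steps) pairs; timestamps without
    an offset are assumed to be UTC (an assumption of this sketch).
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    steps_by_day = defaultdict(int)
    worn_hours_by_day = defaultdict(set)

    for stamp, steps in epochs:
        ts = datetime.fromisoformat(stamp)
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)
        ts = ts.astimezone(local_tz)  # standardize to one local clock
        day = ts.date()
        steps_by_day[day] += steps
        # Hypothetical wear rule: any step in a clock hour marks it as worn.
        if steps > 0:
            worn_hours_by_day[day].add(ts.hour)

    # Keep only days that meet the inferred wear-time threshold.
    valid_days = sorted(
        day for day in steps_by_day
        if len(worn_hours_by_day[day]) >= min_wear_hours
    )
    daily_totals = [steps_by_day[day] for day in valid_days]
    return {
        "valid_days": len(valid_days),
        "mean_daily_steps": mean(daily_totals) if daily_totals else None,
    }
```

Calling the function twice on the same data, once with `min_wear_hours=10` and once with `min_wear_hours=6`, gives the kind of threshold sensitivity comparison the abstract reports.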

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Nature Communications: 4913 papers in training set, Top 6%, 18.6%
2. npj Digital Medicine: 97 papers in training set, Top 0.5%, 10.1%
3. Scientific Reports: 3102 papers in training set, Top 10%, 8.4%
4. JMIR mHealth and uHealth: 10 papers in training set, Top 0.1%, 7.2%
5. eLife: 5422 papers in training set, Top 13%, 6.3%

50% of probability mass above

6. Aging Cell: 144 papers in training set, Top 1%, 2.9%
7. European Journal of Epidemiology: 40 papers in training set, Top 0.2%, 2.9%
8. GeroScience: 97 papers in training set, Top 0.7%, 2.7%
9. PLOS ONE: 4510 papers in training set, Top 44%, 2.6%
10. Nature Medicine: 117 papers in training set, Top 1%, 2.4%
11. Nature Aging: 51 papers in training set, Top 0.8%, 2.1%
12. PLOS Computational Biology: 1633 papers in training set, Top 14%, 1.9%
13. International Journal of Epidemiology: 74 papers in training set, Top 1%, 1.9%
14. Science Advances: 1098 papers in training set, Top 17%, 1.7%
15. The Journals of Gerontology, Series A: Biological Sciences and Medical Sciences: 22 papers in training set, Top 0.2%, 1.7%
16. Nature Human Behaviour: 85 papers in training set, Top 2%, 1.7%
17. American Journal of Epidemiology: 57 papers in training set, Top 0.8%, 1.5%
18. IEEE Journal of Biomedical and Health Informatics: 34 papers in training set, Top 2%, 0.9%
19. International Journal of Behavioral Nutrition and Physical Activity: 15 papers in training set, Top 0.4%, 0.9%
20. PLOS Digital Health: 91 papers in training set, Top 3%, 0.8%
21. Aging: 69 papers in training set, Top 3%, 0.8%
22. Journal of Medical Internet Research: 85 papers in training set, Top 4%, 0.8%
23. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 45%, 0.7%
24. Scientific Data: 174 papers in training set, Top 2%, 0.7%
25. Science Translational Medicine: 111 papers in training set, Top 6%, 0.7%
26. Bioinformatics Advances: 184 papers in training set, Top 5%, 0.7%
27. Methods in Ecology and Evolution: 160 papers in training set, Top 2%, 0.7%
28. Sensors: 39 papers in training set, Top 2%, 0.7%
29. npj Aging: 15 papers in training set, Top 1.0%, 0.7%
30. eBioMedicine: 130 papers in training set, Top 5%, 0.7%