A Reproducible Pipeline for Processing Commercial Wearable Step-Count Data in Aging Cohorts: Application and Evaluation in the STRRIDE-PD Reunion Study
Bo, N.; Sudnick, A. M.; Counts, J. D.; Kennedy, K. G.; Saldana, A. A.; Collins-Bennett, K. A.; Bennett, W. C.; Johnson, J. L.; Huffman, K. M.; Paluch, A. E.; Ashner, M. C.; Kraus, W. E.; Peskoe, S. B.; Ross, L. M.
Wearable devices offer the ability to objectively characterize free-living physical activity; however, raw step-count data generated by commercial devices require systematic processing before they can support rigorous inference. We describe a transparent, reproducible standard operating procedure (SOP) for transforming epoch-level step-count data from commercial Garmin devices into participant-level analytic variables and demonstrate its application in the STRRIDE-PD Reunion study, a long-term follow-up of older adults originally enrolled in a supervised exercise intervention trial. This data pipeline standardizes timestamps, reconstructs daily epoch grids, infers wear time from observed step patterns, and applies a prespecified valid-day threshold (≥10 hours inferred wear time) to generate participant-level summaries. Among 67 participants (mean age 71.4 years, 65.7% women), the median valid-day count was 10 days, median average daily steps were 5,794, and participant-level estimates were identical across ≥10-hour and ≥6-hour valid-day thresholds. Wearable-derived step counts were significantly associated with 9 of 16 cardiometabolic and fitness outcomes, including cardiorespiratory fitness, body composition, and lipid profiles. By contrast, self-reported exercise, assessed via a frequency-by-duration composite ranked into deciles, was not significantly associated with any outcome. A regression calibration framework applied to the full sample quantified the attenuation underlying this discrepancy: the naive self-report model systematically underestimated associations relative to both the observed Garmin model and calibration-corrected estimates. These findings demonstrate that measurement approach is a determinant of scientific conclusions in physical activity research, and that reproducible wearable data pipelines are essential infrastructure for aging epidemiology.
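The four pipeline steps named in the abstract (timestamp standardization, daily epoch grids, inferred wear time, valid-day threshold) can be illustrated with a minimal sketch. This is not the authors' SOP. The 15-minute epoch length, the fixed UTC-5 study offset, the rule that an epoch counts as worn when it records at least one step, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

EPOCH_MINUTES = 15                        # assumed epoch length
EPOCHS_PER_DAY = 24 * 60 // EPOCH_MINUTES
LOCAL_TZ = timezone(timedelta(hours=-5))  # assumed fixed study offset


def standardize(ts: str) -> datetime:
    """Parse an ISO-8601 stamp, convert to study time, floor to the epoch start."""
    dt = datetime.fromisoformat(ts)
    if dt.tzinfo is None:                 # assumption: naive stamps are UTC
        dt = dt.replace(tzinfo=timezone.utc)
    dt = dt.astimezone(LOCAL_TZ)
    floored_minute = dt.minute - dt.minute % EPOCH_MINUTES
    return dt.replace(minute=floored_minute, second=0, microsecond=0)


def daily_grids(records):
    """Return {date: [steps per epoch]}; epochs with no record are filled with 0."""
    grids = defaultdict(lambda: [0] * EPOCHS_PER_DAY)
    for ts, steps in records:
        dt = standardize(ts)
        slot = (dt.hour * 60 + dt.minute) // EPOCH_MINUTES
        grids[dt.date()][slot] += steps
    return dict(grids)


def wear_hours(grid):
    """Infer wear time: an epoch is treated as worn if it has any steps (assumption)."""
    worn_epochs = sum(1 for steps in grid if steps > 0)
    return worn_epochs * EPOCH_MINUTES / 60


def summarize(records, min_wear_hours=10.0):
    """Participant-level summary over days meeting the valid-day threshold."""
    grids = daily_grids(records)
    valid = [g for g in grids.values() if wear_hours(g) >= min_wear_hours]
    totals = [sum(g) for g in valid]
    return {
        "valid_days": len(valid),
        "mean_daily_steps": mean(totals) if totals else None,
    }


def demo_records():
    """Synthetic data: one day with 11 h of stepping epochs, one day with 5 h."""
    records = []
    for day, n_epochs in ((1, 44), (2, 20)):
        start = datetime(2024, 3, day, 12, 0, tzinfo=timezone.utc)  # 07:00 study time
        for i in range(n_epochs):
            stamp = start + timedelta(minutes=EPOCH_MINUTES * i)
            records.append((stamp.isoformat(), 100))
    return records


if __name__ == "__main__":
    print(summarize(demo_records()))                      # default 10-hour threshold
    print(summarize(demo_records(), min_wear_hours=6.0))  # sensitivity threshold
```

On the synthetic data, only the first day passes either threshold, so both calls return the same summary. This mirrors the kind of threshold sensitivity check reported in the abstract, though agreement in real data depends on participants' actual wear patterns.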