Transportability of missing data models across study sites for research synthesis

Thiesmeier, R.; Madley-Dowd, P.; Ahlqvist, V.; Orsini, N.

2026-03-10 epidemiology

10.64898/2026.03.09.26347913 medRxiv


Introduction: Systematically missing covariates are a common challenge in medical research synthesis of quantitative data, particularly when individual participant data cannot be shared across study sites. Imputing covariate values in studies where they are systematically unobserved using information from sites where the covariate is observed implicitly assumes similarity of associations across studies. The behaviour of this assumption, and the bias arising from violating it, remains difficult to qualitatively reason about. Here, we evaluated a two-stage imputation approach for handling systematically missing covariates using simulations across a range of statistical and causal heterogeneity scenarios.

Methods: We conducted a simulation study with varying degrees of between-study heterogeneity and systematic differences in model parameters. A binary confounder was set to systematically missing in half of the studies. Study-specific effect estimates were combined using a two-stage meta-analytic model. The performance of the imputation approach was evaluated with the primary estimand being the pooled conditional confounding-adjusted exposure effect across all studies.

Results: Bias in the pooled adjusted effect estimate was small across scenarios with low to substantial between-study heterogeneity. Bias increased monotonically with increasingly pronounced differences in causal structures across study sites. Coverage remained close to the nominal level under low to substantial between-study heterogeneity, but deteriorated markedly as differences in causal structures between study sites became more severe.

Conclusion: The two-stage cross-site imputation approach produced valid pooled effect estimates across a wide range of simulated scenarios but showed monotonic sensitivity to differences in causal structures across studies. The results provide insight into the conditions under which cross-site imputation may be appropriate for handling systematically missing covariates in research synthesis.
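
To make the second stage of a two-stage meta-analysis concrete, the following is a minimal sketch of pooling study-specific effect estimates. The abstract does not state which pooling estimator was used, so the DerSimonian-Laird random-effects estimator is an assumption here, and the function name `pool_random_effects` and the input values are hypothetical illustrations rather than the authors' code or data.

```python
import math


def pool_random_effects(estimates, variances):
    """Pool study-specific estimates with a DerSimonian-Laird random-effects model.

    Assumption: the abstract only says "two-stage meta-analytic model";
    DerSimonian-Laird is one common choice, used here for illustration.
    Returns (pooled estimate, standard error, between-study variance tau^2).
    """
    k = len(estimates)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q statistic measures excess between-study variation.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate of tau^2, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical study-specific log odds ratios and their variances.
pooled, se, tau2 = pool_random_effects([0.42, 0.55, 0.31, 0.60],
                                       [0.02, 0.03, 0.025, 0.04])
print(f"pooled={pooled:.3f}, se={se:.3f}, tau2={tau2:.4f}")
```

In a cross-site imputation setting, the estimates passed to such a function would be the confounder-adjusted exposure effects from each study, including those obtained after imputing the systematically missing confounder.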
