Distributional and Centile Calibration of Diffusion Tensor Imaging Normative Models as Training Sample Sizes Increase to 40,000 Subjects

Villalon Reina, J. E.; Feng, Y.; Nabulsi, L.; Nir, T. M.; Thomopoulos, S. I.; Lawrence, K. E.; Jahanshad, N.; Kia, S. M.; Marquand, A. F.; Thompson, P. M.

2026-02-06 bioinformatics
10.64898/2026.02.04.703901 bioRxiv
Normative modeling (NM) is a powerful framework for quantifying individual deviations in brain structure and function relative to a population reference. However, its clinical utility depends on well-calibrated models trained on heterogeneous datasets such as those found in neuroimaging. Here, we systematically examine the effect of training sample size on the distributional and centile calibration of hierarchical Bayesian regression (HBR)-based NMs. Using multisite 3D diffusion MRI scans of the brain from 54,583 subjects, spanning almost the entire lifespan (age: 4-91 years), we trained NMs of white matter fractional anisotropy, a key microstructural metric, on subsamples ranging from 5,000 to 40,000 subjects. HBR was modeled with a Sinh-Arcsinh likelihood. Model calibration was evaluated using Kernelized Stein Discrepancy (KSD) to assess distributional agreement of Z-scores with the standard normal distribution; we also used Mean Absolute Centile Error (MACE) to quantify centile accuracy. Both metrics showed consistent and substantial improvements as the training sample size increased, indicating reduced posterior uncertainty and improved estimation of distributional parameters, particularly at the centile extremes. These results demonstrate that large training cohorts are essential for well-calibrated NMs derived from heterogeneous neuroimaging data and highlight the importance of large-scale data aggregation for reliable individual-level inference.
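The two calibration metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions used, so everything below is an assumption: the centile grid, the RBF kernel and its bandwidth, the V-statistic estimator, and the function names `mace` and `ksd_standard_normal` are all hypothetical choices made for illustration. The sketch takes a list of Z-scores, compares the empirical coverage of each nominal centile with its target (MACE), and computes a Stein discrepancy against the standard normal, whose score function is s(z) = -z (KSD).

```python
import math
import random
from statistics import NormalDist


def mace(z, centiles=(0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99)):
    """Mean absolute difference between nominal and empirical centile coverage.

    Assumed definition: for each nominal centile c, take the fraction of
    Z-scores at or below the standard normal quantile of c, then average
    |fraction - c| over the centile grid (the grid itself is an assumption).
    """
    std_normal = NormalDist()
    n = len(z)
    errors = []
    for c in centiles:
        threshold = std_normal.inv_cdf(c)
        empirical = sum(1 for v in z if v <= threshold) / n
        errors.append(abs(empirical - c))
    return sum(errors) / len(errors)


def ksd_standard_normal(z, bandwidth=1.0):
    """V-statistic estimate of the kernelized Stein discrepancy vs. N(0, 1).

    Assumes an RBF kernel k(x, y) = exp(-(x - y)^2 / (2 h^2)). With the
    standard normal score s(x) = -x, the Stein kernel reduces to
    k * (x*y - d^2/h^2 + 1/h^2 - d^2/h^4), where d = x - y.
    """
    h2 = bandwidth ** 2
    n = len(z)
    total = 0.0
    for x in z:
        for y in z:
            d = x - y
            k = math.exp(-d * d / (2.0 * h2))
            total += k * (x * y - d * d / h2 + 1.0 / h2 - d * d / (h2 * h2))
    # The V-statistic is non-negative in theory; clamp for rounding safety.
    return math.sqrt(max(total / (n * n), 0.0))


# Synthetic Z-scores, not data from the study.
random.seed(0)
calibrated = [random.gauss(0.0, 1.0) for _ in range(300)]
miscalibrated = [1.5 * v + 0.5 for v in calibrated]  # shifted and overdispersed

print("MACE:", mace(calibrated), mace(miscalibrated))
print("KSD: ", ksd_standard_normal(calibrated), ksd_standard_normal(miscalibrated))
```

Under these assumptions, both quantities shrink toward zero as the Z-scores approach a standard normal distribution, which is the sense in which lower values indicate better calibration. The pairwise loop is quadratic in the number of Z-scores, so a vectorized or subsampled version would be needed at the cohort sizes described above.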

Matching journals

The top 2 journals account for 50% of the predicted probability mass.