A Unified Form of Batch Harmonization Equation for Normative Modeling: A Location Scale Framework
Li, M.; Wang, Y.; Shen, Y.; Jia, G.
Normative modeling quantifies individual deviation from population norms by estimating the conditional mean and variance of brain-derived measures as functions of clinically relevant parameters such as age. The rapid growth of multicenter consortia has created an urgent need for normative models that incorporate batch harmonization. Several harmonization methods based on linear mixed models, namely ComBat, GAMLSS, HBR, and Generalized Normative Modeling (GNM), offer explicit formulations of the mean and variance, making them natural candidates for batch-harmonized normative modeling; yet the absence of a unified theoretical framework leaves it unclear whether and how these methods support the computation of batch-harmonized z-scores. We bridge this gap by writing existing harmonization methods as special cases of a single location-scale equation, $y = m(x, \Theta) + \sigma(x, \Theta)\,\varepsilon$, which we term the unified form of batch harmonization equation for normative modeling. The methods differ only in the functional forms of $m$ and $\sigma$, how batch parameters enter $\Theta$, and how $\Theta$ is estimated. This unified form yields both harmonized data $y^*$ and site-invariant z-scores from the same model, providing a common theoretical language for harmonized normative modeling. Building on this framework, we evaluate the underlying regression engines (parametric, spline, Gaussian process, kernel, deep learning), sensitivity to outliers, computational scalability, and federated decomposability for privacy-preserving multi-center computation. By clarifying what each method assumes, what it delivers, and where the boundaries of current methodology lie, the unified equation establishes a principled foundation for method selection and charts a path toward reliable, scalable, and privacy-aware normative modeling across multi-center neuroimaging.
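As a rough illustration of how a single location-scale model can return both site-invariant z-scores and harmonized data, the Python sketch below simulates two batches and inverts the equation. Everything specific in it is an assumption made for illustration and is not taken from the paper: the linear age trend, the ComBat-style additive batch offset (gamma) and multiplicative batch scale (delta), the parameter values, and the function names. The parameters are treated as known rather than estimated, so the sketch shows only the algebra, not any particular fitting procedure.

```python
import numpy as np

def location(x, batch, theta):
    """m(x, Theta): linear age trend plus additive batch offset (assumed form)."""
    return theta["beta0"] + theta["beta1"] * x + theta["gamma"][batch]

def scale(x, batch, theta):
    """sigma(x, Theta): constant SD times a per-batch factor (assumed form)."""
    return theta["sigma0"] * theta["delta"][batch] * np.ones_like(x, dtype=float)

def z_scores(y, x, batch, theta):
    """Invert y = m + sigma * eps to recover the standardized deviation."""
    return (y - location(x, batch, theta)) / scale(x, batch, theta)

def harmonize(y, x, batch, theta):
    """Map each z-score back through the batch-free location and scale."""
    z = z_scores(y, x, batch, theta)
    m_ref = theta["beta0"] + theta["beta1"] * x  # batch offset removed
    return m_ref + theta["sigma0"] * z           # batch scale removed

# Hypothetical parameters for two batches (values chosen arbitrarily).
theta = {
    "beta0": 100.0,
    "beta1": -0.5,
    "sigma0": 4.0,
    "gamma": np.array([0.0, 6.0]),  # additive batch effects
    "delta": np.array([1.0, 1.8]),  # multiplicative batch effects
}

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(20.0, 80.0, size=n)   # e.g. age in years
batch = rng.integers(0, 2, size=n)    # batch label, 0 or 1
eps = rng.standard_normal(n)          # standardized noise

# Simulate observations from the location-scale equation.
y = location(x, batch, theta) + scale(x, batch, theta) * eps

z = z_scores(y, x, batch, theta)
y_star = harmonize(y, x, batch, theta)
```

With the true parameters plugged in, z reproduces eps up to floating-point error, and y_star follows the batch-free trend with a common standard deviation in both batches. In practice the parameters would have to be estimated, and the estimation strategy is one of the points on which the methods differ.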