Whom Does Algorithmic Risk Stratification Miss? A Fairness Audit of Machine Learning Targeting for Concurrent Maternal-Child Double Burden of Malnutrition Across 30 Low- and Middle-Income Countries
Wu, X.; Zheng, B.
Background: Concurrent maternal-child double burden of malnutrition (DBM) affects a growing share of mother-child dyads in low- and middle-income countries (LMICs). Nutrition programmes often use maternal education as an eligibility proxy, but whether algorithmic alternatives would do better, and at what equity cost, has not been directly tested. We evaluated whether machine learning (ML)-based targeting improves recall over a proxy-based rule while preserving fairness across social strata for two concurrent DBM subtypes: overweight mother with stunted or wasted child (Subtype A) and underweight mother with stunted or wasted child (Subtype B).

Methods: We pooled Phase 7-8 Demographic and Health Surveys from 30 LMICs (181,636 mother-child dyads). We first estimated subtype-specific social gradients with multilevel logistic regression. We then trained XGBoost prediction models with strict label-leakage safeguards and leave-one-country-out cross-validation, and compared ML-based targeting against random and education-based rules at budget constraints of 10%, 20%, and 30%. Fairness was audited along six social strata using equalized-odds, demographic-parity, calibration, and predictive-value gaps. A full-India sensitivity analysis (354,691 dyads) assessed robustness to down-sampling.

Findings: Overall weighted any-DBM prevalence was 12.52% (Subtype A: 8.21%; Subtype B: 4.31%). Subtype A showed an inverted-U gradient on wealth (adjusted odds ratio peaking at 1.22 for Richer versus Poorest) and on maternal education (peaking at 1.23 for Primary versus None); Subtype B declined monotonically on both (adjusted odds ratio 0.25 at Richest wealth; 0.49 at higher education). Mean leave-one-country-out area under the curve was 0.615 for Subtype A and 0.652 for Subtype B. At a 20% budget, ML captured 35.3% of Subtype A cases versus 18.4% for education-based targeting (a relative gain of 92%); for Subtype B the corresponding values were 37.5% and 32.1% (a relative gain of 17%).
Equalized-odds gaps reached 0.57 on country income and 0.59 on maternal education; true-positive rates were lowest in the highest-wealth and highest-education strata. Results were stable under the full-India sensitivity analysis.

Conclusions: ML is useful principally for Subtype A, where the education proxy is no better than random. For Subtype B it mostly changes who gets reached rather than how many, which is a policy choice rather than an accuracy upgrade. The households the algorithm most often misses are not the poor but the rare positives in high-resource strata, which is the expected behaviour of a fixed-budget rule that ranks households across strata with heterogeneous base rates. Programmes should decide whether their priority is total capture or the distribution of capture before adopting such a rule.

Author Summary: In many low- and middle-income countries, mothers who are overweight often live in the same household as children who are too short or too thin for their age. Nutrition programmes that try to reach such families have limited resources, so they must choose which households to prioritise. Most programmes use maternal education level as a rough filter, but whether this is actually a good way to find affected families has rarely been tested. We used surveys of 181,636 mother-child pairs from 30 low- and middle-income countries to compare three ways of identifying at-risk households: random selection, selection by low maternal education, and selection by a machine-learning model. Machine learning was much better at finding families where an overweight mother lives with an undernourished child, nearly doubling the capture rate compared with the education rule. For a different combination (underweight mother with an undernourished child), machine learning did not clearly outperform education on total recall; instead it reached different households, mostly shifting attention toward the rural poor.
An unexpected finding was that the households the algorithm was most likely to miss were not the poor ones, but the wealthier and better-educated ones, where this type of malnutrition is rarer. This is not bias against the poor; it is what happens when any ranking rule operates under a fixed budget. Programmes that want to reach everyone at risk, regardless of how rare risk is in a given group, may need more than one rule.
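
The fixed-budget argument can be illustrated with a small sketch. This is not the authors' pipeline: the ten households, their scores, and the stratum names below are invented for illustration, and the helper functions (select_top, capture, tpr_by_group, relative_gain) are hypothetical names. It only shows how ranking all households on a single score under a budget can leave the rare positive in a low-base-rate stratum unreached, and how the relative gains quoted in the Findings follow from the capture rates.

```python
from collections import defaultdict


def select_top(scores, budget):
    """Indices of the top `budget` fraction of households by score."""
    k = int(round(budget * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def capture(selected, labels):
    """Share of all true cases that fall inside the selected set (recall)."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    return sum(i in selected for i in positives) / len(positives)


def tpr_by_group(selected, labels, groups):
    """True-positive rate within each stratum that has at least one case."""
    hit, pos = defaultdict(int), defaultdict(int)
    for i, (y, g) in enumerate(zip(labels, groups)):
        if y == 1:
            pos[g] += 1
            hit[g] += i in selected
    return {g: hit[g] / pos[g] for g in pos}


def relative_gain(ml_capture, proxy_capture):
    """Relative improvement of one capture rate over another."""
    return (ml_capture - proxy_capture) / proxy_capture


# Invented toy data: six households in a high-base-rate stratum and four in a
# low-base-rate stratum whose single true case has only a middling score.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.45, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
groups = ["low_resource"] * 6 + ["high_resource"] * 4

selected = select_top(scores, budget=0.3)
overall = capture(selected, labels)
tprs = tpr_by_group(selected, labels, groups)
tpr_gap = max(tprs.values()) - min(tprs.values())

print("overall capture:", overall)
print("TPR by stratum:", tprs)
print("TPR gap:", round(tpr_gap, 3))

# Relative gains recomputed from the capture rates reported in the abstract.
print("Subtype A gain (%):", round(100 * relative_gain(35.3, 18.4)))
print("Subtype B gain (%):", round(100 * relative_gain(37.5, 32.1)))
```

In this toy case the rule captures half of all true cases overall, yet the true-positive rate is zero in the low-base-rate stratum. The gap computed here is only the true-positive-rate component; a full equalized-odds gap would also compare false-positive rates across strata.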