
Context-Dependent Age-Group performance hierarchies limit fairness interventions in PPG-based heart rate prediction

Panchumarthi, L. Y.; Kataria, S.; Wu, Y.; Hu, X.; Fedorov, A.; Kwak, H. G.

2026-06-05 health informatics
10.64898/2026.06.04.26352929 medRxiv

Background. Fairness-aware machine learning increasingly targets demographic performance disparities in clinical prediction, yet whether standard bias mitigation strategies genuinely improve equity in physiological signal analysis remains unclear. Age-based disparities in photoplethysmography (PPG)-based heart rate prediction present a particular challenge, as age-related performance differences may reflect context-dependent physiological structure rather than correctable artifacts. Methods. We evaluated three fairness interventions: inverse-frequency weighting (IF), Group Distributionally Robust Optimization (GroupDRO), and adversarial debiasing (ADV), each applied via fine-tuning of a PPG foundation model across three clinical datasets spanning intensive care unit, laboratory, and consumer wearable contexts. Outcomes were assessed using a 2x2 framework that classifies each intervention-dataset combination by the joint direction of change in mean absolute error (MAE) and fairness gap (FG) across age groups, yielding four outcome types: genuine improvement (G), leveling down (L), selective benefit (S), and both worse (W). Results. Across nine intra-domain conditions, no intervention simultaneously improved both MAE and FG (0/9 genuine improvement). The dominant pattern was leveling down (5/9): FG decreased while MAE degraded, indicating that apparent fairness gains came at the cost of overall predictive performance. Age-group difficulty ordering varied across clinical contexts at baseline and was not preserved under intervention. In 18 cross-domain transfer conditions, genuine improvement was rare (4/18) and observed exclusively in non-MIMIC source configurations; models fine-tuned on MIMIC-sourced data yielded no genuine improvements (0/6). Embedding-level representation changes following fine-tuning did not reliably predict fairness outcomes. Conclusions. Age-based fairness interventions in PPG heart rate prediction predominantly produce leveling down rather than genuine equity improvement, suggesting that age-related performance gaps reflect context-dependent physiological structure not fully addressable through standard bias mitigation. Cross-domain transfer further amplifies this instability. These findings suggest that fairness evaluation frameworks for age-stratified physiological prediction should account for context-dependent performance structure rather than treating observed gaps as correctable bias.
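The 2x2 outcome framework described in the abstract can be illustrated with a short sketch. This is not the authors' code, and several details are assumptions: the fairness gap is taken here to be the difference between the largest and smallest per-age-group MAE (the abstract does not define it), "selective benefit" is inferred by elimination to mean that MAE improves while FG worsens, and an unchanged metric is counted as not improved. The function names are hypothetical.

```python
from collections import defaultdict


def per_group_mae(y_true, y_pred, groups):
    """Mean absolute error computed separately for each group label."""
    errors = defaultdict(list)
    for t, p, g in zip(y_true, y_pred, groups):
        errors[g].append(abs(t - p))
    return {g: sum(e) / len(e) for g, e in errors.items()}


def fairness_gap(group_mae):
    """Assumed definition: worst-group MAE minus best-group MAE."""
    values = list(group_mae.values())
    return max(values) - min(values)


def classify_outcome(mae_base, fg_base, mae_new, fg_new):
    """Map the joint direction of change in MAE and FG to G, L, S, or W.

    Lower is better for both metrics. Ties count as "not improved"
    (an assumption; the abstract does not say how ties are handled).
    """
    mae_improved = mae_new < mae_base
    fg_improved = fg_new < fg_base
    if mae_improved and fg_improved:
        return "G"  # genuine improvement
    if fg_improved:
        return "L"  # leveling down: smaller gap, worse overall error
    if mae_improved:
        return "S"  # selective benefit (inferred): better error, larger gap
    return "W"      # both worse


# Toy heart-rate values (made up for illustration, in beats per minute).
mae_by_group = per_group_mae(
    y_true=[60, 70, 80, 90],
    y_pred=[62, 70, 85, 90],
    groups=["young", "young", "old", "old"],
)
gap = fairness_gap(mae_by_group)
label = classify_outcome(mae_base=5.0, fg_base=2.0, mae_new=5.5, fg_new=1.0)
```

In the toy call at the end, the gap shrinks from 2.0 to 1.0 while overall MAE rises from 5.0 to 5.5, so the combination is labeled leveling down, the pattern the abstract reports as dominant.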

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1 | npj Digital Medicine | 97 papers in training set | Top 0.3% | 14.8%
2 | Scientific Reports | 3102 papers in training set | Top 2% | 14.8%
3 | PLOS Digital Health | 91 papers in training set | Top 0.5% | 4.9%
4 | eBioMedicine | 130 papers in training set | Top 0.1% | 4.9%
5 | Human Brain Mapping | 295 papers in training set | Top 2% | 4.0%
6 | NeuroImage: Clinical | 132 papers in training set | Top 1% | 4.0%
7 | PLOS Computational Biology | 1633 papers in training set | Top 9% | 3.6%
50% of probability mass above
8 | IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.6% | 2.9%
9 | Journal of Medical Internet Research | 85 papers in training set | Top 2% | 2.8%
10 | European Heart Journal - Digital Health | 15 papers in training set | Top 0.2% | 2.8%
11 | GeroScience | 97 papers in training set | Top 0.7% | 2.6%
12 | Physiological Measurement | 12 papers in training set | Top 0.2% | 2.1%
13 | PLOS ONE | 4510 papers in training set | Top 47% | 2.1%
14 | Frontiers in Digital Health | 20 papers in training set | Top 0.5% | 1.9%
15 | Neurophotonics | 37 papers in training set | Top 0.3% | 1.7%
16 | Nature Biomedical Engineering | 42 papers in training set | Top 0.8% | 1.7%
17 | Biology Methods and Protocols | 53 papers in training set | Top 1.0% | 1.7%
18 | Computers in Biology and Medicine | 120 papers in training set | Top 2% | 1.7%
19 | Journal of Biomedical Informatics | 45 papers in training set | Top 0.9% | 1.5%
20 | Frontiers in Artificial Intelligence | 18 papers in training set | Top 0.3% | 1.5%
21 | Journal of the American Heart Association | 119 papers in training set | Top 4% | 0.8%
22 | Communications Biology | 886 papers in training set | Top 23% | 0.8%
23 | Frontiers in Cardiovascular Medicine | 49 papers in training set | Top 3% | 0.7%
24 | Patterns | 70 papers in training set | Top 3% | 0.7%
25 | Neurobiology of Aging | 95 papers in training set | Top 2% | 0.7%
26 | Journal of the American Medical Informatics Association | 61 papers in training set | Top 2% | 0.6%
27 | NeuroImage | 813 papers in training set | Top 6% | 0.6%
28 | BMC Medicine | 163 papers in training set | Top 9% | 0.5%
29 | Circulation | 66 papers in training set | Top 3% | 0.5%