
Reliable Uncertainty Under Class Imbalance and Distribution Shift: Class-Conditional Conformal Prediction of Multiple Sclerosis

Millar, A. S.; Roman, C.; Gouripeddi, R.; Facelli, J. C.

2026-05-15 health informatics
10.64898/2026.05.12.26353057 medRxiv

Objectives: To evaluate whether class-conditional conformal prediction (CP) can provide reliable uncertainty quantification (UQ) under severe class imbalance and distribution shift, using multiple sclerosis (MS) diagnosis from magnetic resonance imaging (MRI) as a clinical exemplar.

Methods: We evaluated marginal and class-conditional CP using 720 T2-weighted MRI scans (142 MS, 578 controls). A convolutional neural network trained on 3 T data was evaluated under distribution shift (1.5 T acquisitions and synthetic image degradations). Through 100 Monte Carlo experiments, we assessed coverage guarantees, class-specific performance, and relationships between calibration set size, coverage variance, and uncertainty.

Results: Marginal CP severely under-covered the minority MS class (16.9% mean coverage at 1.5 T vs. 95.2% for controls) despite valid population-level guarantees. Class-conditional CP improved MS coverage to 77.5% at 1.5 T and 85.8% at 3 T, significantly reducing the frequency of severe undercoverage (<80%) while maintaining >89% control coverage. Minority-class coverage variance increased due to limited calibration samples, matching theoretical Beta-binomial predictions. CP maintained validity under distribution shift; prediction set sizes scaled monotonically with shift severity, yielding clinically interpretable UQ.

Conclusions: Class-conditional CP mitigates systematic undercoverage of minority disease classes while maintaining validity under distribution shift. The approach offers a practical, model-agnostic solution for uncertainty quantification applicable across clinical AI systems, though the increased coverage variance for less represented conditions reflects fundamental statistical constraints. By characterizing these variance trade-offs, the framework enables more reliable deployment of diagnostic AI in heterogeneous clinical environments and in other medical domains where detecting a minority disease class is critical.
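The difference between marginal and class-conditional calibration described in the abstract can be illustrated with a minimal split-conformal sketch. It is not the authors' implementation: the nonconformity score (one minus the softmax probability of the true class), the function names, and the synthetic data are all assumptions made for illustration. The helper `coverage_std` uses the standard result that, for continuous scores, coverage conditional on a calibration set of size n follows a Beta(k, n + 1 - k) distribution with k = ceil((n + 1)(1 - alpha)), which is why a small minority-class calibration set gives more variable coverage.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic
    if k > n:
        return np.inf  # too few calibration points: every label is included
    return np.sort(scores)[k - 1]

def true_class_scores(probs, labels):
    """Assumed nonconformity score: 1 - predicted probability of the true class."""
    return 1.0 - probs[np.arange(len(labels)), labels]

def marginal_threshold(probs, labels, alpha):
    """One threshold calibrated on all classes pooled together."""
    return conformal_quantile(true_class_scores(probs, labels), alpha)

def class_conditional_thresholds(probs, labels, alpha):
    """One threshold per class, each calibrated only on that class's samples."""
    scores = true_class_scores(probs, labels)
    return np.array([conformal_quantile(scores[labels == c], alpha)
                     for c in range(probs.shape[1])])

def prediction_sets(probs, thresholds):
    """Boolean matrix; entry [i, c] is True when class c is in sample i's set.
    `thresholds` may be a scalar (marginal) or a per-class vector."""
    return (1.0 - probs) <= thresholds

def coverage_std(n, alpha):
    """Std. dev. of coverage across calibration draws, from Beta(k, n + 1 - k)."""
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return 0.0  # threshold is infinite, so coverage is always 1
    a, b = k, n + 1 - k
    return float(np.sqrt(a * b / ((a + b) ** 2 * (a + b + 1))))

if __name__ == "__main__":
    # Synthetic, imbalanced two-class calibration data (hypothetical numbers).
    rng = np.random.default_rng(0)
    labels = (rng.random(300) < 0.2).astype(int)
    p1 = np.clip(0.3 * labels + rng.normal(0.3, 0.2, 300), 0.01, 0.99)
    probs = np.column_stack([1.0 - p1, p1])
    print("marginal threshold:", marginal_threshold(probs, labels, 0.1))
    print("per-class thresholds:", class_conditional_thresholds(probs, labels, 0.1))
    print("coverage std, n=20 vs n=200:", coverage_std(20, 0.1), coverage_std(200, 0.1))
```

Because the pooled threshold is dominated by the majority class, a poorly separated minority class can fall well below the nominal coverage level; calibrating per class restores the guarantee for each class at the cost of the wider spread that `coverage_std` reports for small n.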

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. NeuroImage | 813 papers in training set | Top 1.0% | 12.5%
2. npj Digital Medicine | 97 papers in training set | Top 0.5% | 10.6%
3. NeuroImage: Clinical | 132 papers in training set | Top 0.3% | 10.3%
4. Scientific Reports | 3102 papers in training set | Top 12% | 7.3%
5. Medical Image Analysis | 33 papers in training set | Top 0.2% | 6.4%
6. Human Brain Mapping | 295 papers in training set | Top 1% | 4.9%
50% of probability mass above
7. NMR in Biomedicine | 24 papers in training set | Top 0.1% | 4.9%
8. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.6% | 2.8%
9. Medical Physics | 14 papers in training set | Top 0.4% | 1.7%
10. Nature Communications | 4913 papers in training set | Top 51% | 1.7%
11. Communications Biology | 886 papers in training set | Top 9% | 1.7%
12. PLOS ONE | 4510 papers in training set | Top 56% | 1.5%
13. Communications Medicine | 85 papers in training set | Top 0.4% | 1.4%
14. Magnetic Resonance in Medicine | 72 papers in training set | Top 0.4% | 1.4%
15. IEEE Transactions on Biomedical Engineering | 38 papers in training set | Top 0.6% | 1.4%
16. Biology Methods and Protocols | 53 papers in training set | Top 1% | 1.4%
17. Nature Machine Intelligence | 61 papers in training set | Top 3% | 1.2%
18. PLOS Digital Health | 91 papers in training set | Top 2% | 1.2%
19. Computers in Biology and Medicine | 120 papers in training set | Top 3% | 1.0%
20. Patterns | 70 papers in training set | Top 2% | 1.0%
21. Science Translational Medicine | 111 papers in training set | Top 5% | 0.9%
22. Nature Biomedical Engineering | 42 papers in training set | Top 2% | 0.9%
23. Imaging Neuroscience | 242 papers in training set | Top 3% | 0.8%
24. European Radiology | 14 papers in training set | Top 0.6% | 0.8%
25. PLOS Computational Biology | 1633 papers in training set | Top 23% | 0.8%
26. IEEE Access | 31 papers in training set | Top 0.9% | 0.8%
27. Aperture Neuro | 18 papers in training set | Top 0.4% | 0.8%
28. Frontiers in Neuroinformatics | 38 papers in training set | Top 0.8% | 0.8%
29. Frontiers in Neurology | 91 papers in training set | Top 5% | 0.8%
30. Frontiers in Neuroscience | 223 papers in training set | Top 7% | 0.8%