Hierarchical Barycentric Multimodal Representation Learning for Medical Image Analysis

Qiu, P.; An, Z.; Ha, S.; Kumar, S.; Yu, X.; Sotiras, A.

2026-04-06 neurology
10.64898/2026.04.05.26350202 medRxiv
Multimodal medical image analysis exploits complementary information from multiple data sources (e.g., multi-contrast Magnetic Resonance Imaging (MRI), Diffusion Tensor Imaging (DTI), and Positron Emission Tomography (PET)) to enhance diagnostic accuracy and support clinical decision-making. Central to this process is the learning of robust representations that capture both modality-invariant and modality-specific features, which can then be leveraged for downstream tasks such as MRI segmentation and normative modeling of population-level variation and individual deviations. However, learning robust and generalizable representations becomes particularly challenging in the presence of missing modalities and heterogeneous data distributions. Most existing methods address this challenge primarily from a statistical perspective, yet they lack a theoretical understanding of the underlying geometric behavior, such as how probability mass is allocated across modalities. In this paper, we introduce a generalized geometric perspective for multimodal representation learning grounded in the concept of barycenters, which unifies a broad class of existing methods under a common theoretical framework. Building on this barycentric formulation, we propose a novel approach that leverages generalized Wasserstein barycenters with hierarchical modality-specific priors to better preserve the geometry of unimodal distributions and enhance representation quality. We evaluated our framework on two key multimodal tasks, brain tumor MRI segmentation and normative modeling, demonstrating consistent improvements over a variety of multimodal approaches. Our results highlight the potential of scalable, theoretically grounded approaches to advance robust and generalizable representation learning in medical imaging applications.

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Medical Image Analysis: 33 papers in training set; Top 0.1%; 33.6%
2. Human Brain Mapping: 295 papers in training set; Top 0.7%; 8.6%
3. IEEE Transactions on Medical Imaging: 18 papers in training set; Top 0.1%; 8.6%

50% of probability mass above

4. NeuroImage: 813 papers in training set; Top 2%; 6.5%
5. IEEE Transactions on Biomedical Engineering: 38 papers in training set; Top 0.2%; 4.0%
6. Imaging Neuroscience: 242 papers in training set; Top 1%; 3.1%
7. Nature Computational Science: 50 papers in training set; Top 0.3%; 2.6%
8. Nature Medicine: 117 papers in training set; Top 1%; 2.1%
9. IEEE Access: 31 papers in training set; Top 0.3%; 1.9%
10. Nature Communications: 4913 papers in training set; Top 49%; 1.8%
11. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 31%; 1.7%
12. Journal of Medical Imaging: 11 papers in training set; Top 0.1%; 1.7%
13. Scientific Reports: 3102 papers in training set; Top 57%; 1.7%
14. PLOS Computational Biology: 1633 papers in training set; Top 16%; 1.7%
15. PLOS ONE: 4510 papers in training set; Top 56%; 1.5%
16. PLOS Digital Health: 91 papers in training set; Top 2%; 1.0%
17. NeuroImage: Clinical: 132 papers in training set; Top 3%; 1.0%
18. European Journal of Nuclear Medicine and Molecular Imaging: 19 papers in training set; Top 0.2%; 0.9%
19. Artificial Intelligence in Medicine: 15 papers in training set; Top 0.5%; 0.9%
20. Network Neuroscience: 116 papers in training set; Top 1%; 0.8%
21. eLife: 5422 papers in training set; Top 55%; 0.8%
22. npj Digital Medicine: 97 papers in training set; Top 4%; 0.7%
23. Brain Communications: 147 papers in training set; Top 3%; 0.7%