
Accounting for Structured Missingness in Canonical Correlation Analysis

Radosavljevic, L.; Smith, S. M.; Nichols, T. E.

2025-10-10 epidemiology
10.1101/2025.10.09.25337581 medRxiv

A particularly challenging form of missing data is structured missingness, where sets of subjects and variables consistently have missing data. For tabular data from sub-studies or modalities, structured missingness can come from non-participation in follow-up studies, which creates large blocks of missing data. Canonical Correlation Analysis (CCA) is a multivariate modelling tool commonly used to link two different sets of variables; in neuroimaging, it has typically been used to find associations between imaging and non-imaging variables. Motivated by CCA, we propose a new method for covariance estimation from incomplete data that handles a mix of structured and unstructured missingness, assuming the data are Missing at Random (MAR). The proposed method is compared to existing methodology on simulated data and on real data from subjects in the UK Biobank brain imaging cohort.
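The abstract frames covariance estimation as the route to CCA. One reason this works is that canonical correlations depend on the data only through the joint covariance matrix of the two variable sets. The sketch below illustrates that standard fact; it is not the authors' estimator. The function name `cca_from_cov`, the simulated data, and all parameter values are hypothetical choices made for illustration. Any positive-definite covariance estimate, including one obtained from incomplete data, could in principle be passed in the same way.

```python
import numpy as np

def cca_from_cov(S, p):
    """Canonical correlations from a joint covariance matrix S.

    The first p rows/columns of S belong to the X variables and the
    rest to the Y variables. S must be positive definite, otherwise
    the Cholesky factorisations below raise LinAlgError.
    """
    Sxx, Sxy, Syy = S[:p, :p], S[:p, p:], S[p:, p:]
    Lx = np.linalg.cholesky(Sxx)          # Sxx = Lx @ Lx.T
    Ly = np.linalg.cholesky(Syy)          # Syy = Ly @ Ly.T
    # Whitened cross-covariance K = Lx^{-1} Sxy Ly^{-T}
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    # The singular values of K are the canonical correlations
    return np.linalg.svd(K, compute_uv=False)

# Illustrative complete-data example (simulated, not from the paper):
# X and Y share one latent signal z through their first columns.
rng = np.random.default_rng(0)
n = 5000
z = rng.standard_normal(n)
X = rng.standard_normal((n, 2))
Y = rng.standard_normal((n, 2))
X[:, 0] = z + 0.5 * rng.standard_normal(n)
Y[:, 0] = z + 0.5 * rng.standard_normal(n)

S = np.cov(np.hstack([X, Y]), rowvar=False)  # 4 x 4 joint covariance
rho = cca_from_cov(S, p=2)
print(rho)  # first value estimates the population correlation 1 / 1.25 = 0.8
```

With missing data, the sample covariance in the second-to-last step is what would be replaced by an estimate suited to the missingness pattern; the CCA step itself stays the same.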

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. NeuroImage (813 papers in training set): Top 0.4%, 22.6%
2. Human Brain Mapping (295 papers in training set): Top 0.1%, 22.6%
3. Aperture Neuro (18 papers in training set): Top 0.1%, 12.4%
50% of probability mass above
4. PLOS ONE (4510 papers in training set): Top 31%, 4.9%
5. Scientific Data (174 papers in training set): Top 0.4%, 4.0%
6. Scientific Reports (3102 papers in training set): Top 36%, 3.6%
7. Statistics in Medicine (34 papers in training set): Top 0.1%, 3.6%
8. Medical Image Analysis (33 papers in training set): Top 0.5%, 2.4%
9. NeuroImage: Clinical (132 papers in training set): Top 2%, 1.9%
10. Communications Biology (886 papers in training set): Top 6%, 1.9%
11. Journal of Medical Imaging (11 papers in training set): Top 0.1%, 1.8%
12. PLOS Computational Biology (1633 papers in training set): Top 18%, 1.3%
13. Genetic Epidemiology (46 papers in training set): Top 0.6%, 1.2%
14. American Journal of Epidemiology (57 papers in training set): Top 1%, 1.0%
15. Nature Communications (4913 papers in training set): Top 61%, 0.8%
16. Biology Methods and Protocols (53 papers in training set): Top 2%, 0.8%
17. Frontiers in Physics (20 papers in training set): Top 0.9%, 0.8%
18. Imaging Neuroscience (242 papers in training set): Top 3%, 0.8%
19. IEEE Journal of Biomedical and Health Informatics (34 papers in training set): Top 2%, 0.8%
20. Bioinformatics (1061 papers in training set): Top 10%, 0.7%
21. Frontiers in Genetics (197 papers in training set): Top 12%, 0.5%
22. Developmental Cognitive Neuroscience (81 papers in training set): Top 0.7%, 0.5%
23. BMC Medical Research Methodology (43 papers in training set): Top 2%, 0.5%