Supervised restricted data fusion with common, local & distinct components
White, F.; van der Ploeg, G. R.; Heintz-Buschart, A.; Dong, L.; Bouwmeester, H.; Smilde, A.; Westerhuis, J.
In multi-block data, the dominant sources of variation are not always the most relevant to a response of interest, so purely exploratory decompositions may fail to recover subtle but important response-associated structure. We introduce PESCAR, a supervised extension of Penalised Exponential Simultaneous Component Analysis (PESCA) that incorporates response information directly into the estimation of common, local, and distinct (CLD) structure across multiple data blocks. This allows the multiblock decomposition and the response-guided recovery of latent structure to be carried out simultaneously. Through simulation studies, we show that PESCAR can detect weak response-related components across a range of settings, including different noise levels and model-rank mis-specification. Applied to a real multi-omics dataset, PESCAR recovers biologically meaningful response-associated patterns and retains interpretable block structure. We further demonstrate that sparsity in the fitted loading matrices admits a hypergraph-based interpretability layer that summarises overlapping support patterns across components and blocks. These results show that incorporating response information directly into multiblock decomposition can improve detection of subtle relevant signal and facilitate interpretation in complex systems.
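To make the common, local, and distinct (CLD) terminology concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual convention that a component is common when it has non-zero loadings in every block, local when it is active in more than one but not all blocks, and distinct when it is active in a single block. The function name `classify_components`, the tolerance `tol`, and the toy loading matrices are all hypothetical and chosen only for illustration. The set of active blocks returned for each component can also be read as one hyperedge over the blocks, which is one possible way to picture an overlapping-support summary; the abstract does not specify how the hypergraph is actually built.

```python
import numpy as np

def classify_components(loadings, tol=1e-8):
    """Label each component by the set of blocks in which it is active.

    loadings: list of arrays, one per block, each of shape
              (n_variables_in_block, n_components).
    Returns a list of (label, active_block_indices), one per component.
    """
    n_blocks = len(loadings)
    # active[k, r] is True when component r has non-negligible loadings in block k
    active = np.array([np.linalg.norm(B, axis=0) > tol for B in loadings])
    result = []
    for r in range(active.shape[1]):
        blocks = tuple(int(k) for k in np.flatnonzero(active[:, r]))
        if len(blocks) == n_blocks:
            label = "common"      # active in every block
        elif len(blocks) > 1:
            label = "local"       # active in a proper subset of several blocks
        elif len(blocks) == 1:
            label = "distinct"    # active in exactly one block
        else:
            label = "inactive"    # shrunk to zero everywhere
        result.append((label, blocks))
    return result

# Toy example: three blocks, three components (made-up numbers, not from the paper)
B1 = np.array([[1.0, 0.5, 0.0], [0.2, 0.0, 0.0]])
B2 = np.array([[0.3, 0.4, 0.0], [0.1, 0.1, 0.0]])
B3 = np.array([[0.7, 0.0, 0.9]])
labels = classify_components([B1, B2, B3])
print(labels)
```

In this toy case the first component is active in all three blocks, the second only in the first two, and the third only in the last block, so they are labelled common, local, and distinct respectively.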
Matching journals
The top 6 journals account for 50% of the predicted probability mass.