Faithful Supervised Dimensionality Reduction for Biomedical Data via Decision Geometry
Wang, Z.; Zhou, Z.; Zhan, Q.; Shen, L.
Unsupervised dimensionality reduction methods aim to preserve intrinsic data geometry by maintaining local neighborhoods and approximate global relationships in low-dimensional embeddings, but they do not use label information and therefore may fail to reflect task-relevant class structure in biomedical and health applications. Supervised dimensionality reduction (SDR) incorporates labels to improve class organization, yet existing approaches often face a trade-off between discrimination and geometric faithfulness. Linear supervised methods are stable and interpretable but are limited in their ability to capture nonlinear structure, whereas many nonlinear methods impose supervision directly in the embedding space, which can over-separate classes and distort the underlying manifold. In biomedical applications, labels such as cell types in single-cell data or patient status in clinical cohorts provide meaningful biological signal, and supervised dimensionality reduction can use this information to produce more informative low-dimensional representations. Here we propose a new framework, DG-UMAP (Decision-Geometry UMAP), for faithful supervised dimensionality reduction via decision geometry. We first fit a classifier in the original feature space and use its boundary-local decision geometry to construct a low-rank metric deformation that emphasizes discriminative directions while limiting geometric distortion. Parametric UMAP is then applied to the transformed space, so supervision acts through the ambient geometry rather than by directly forcing class separation in the embedding. Across synthetic and multiple real-world biomedical datasets, our method yields embeddings with improved agreement with class structure and global organization while preserving local neighborhood quality.
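The pipeline described in the abstract (fit a classifier, derive a low-rank metric deformation from its decision geometry near the boundary, then embed the transformed data) can be illustrated with a minimal sketch. The abstract does not specify the construction, so everything below is an assumption made for illustration: the stand-in logistic-regression classifier, the boundary weighting p(1 - p), the gradient outer-product matrix, the deformation form M = I + alpha * U U^T, and the hypothetical names `fit_logistic`, `decision_deformation`, `rank`, and `alpha`. The final embedding step is omitted; under these assumptions a parametric UMAP implementation would be run on `X_def`.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=500):
    """Gradient-descent logistic regression (stand-in classifier, an assumption)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / n
        b -= lr * np.mean(p - y)
    return w, b

def decision_deformation(X, w, b, rank=1, alpha=2.0):
    """Return (M, U) with M = I + alpha * U U^T (assumed form, not the paper's).

    U spans the top-`rank` eigenvectors of the boundary-weighted
    gradient outer-product matrix of the classifier probability.
    """
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    s = p * (1.0 - p)                      # largest near the decision boundary
    grads = s[:, None] * w[None, :]        # gradient of p(x) with respect to x
    G = grads.T @ grads / len(X)           # gradient outer-product matrix
    _, eigvecs = np.linalg.eigh(G)         # eigenvalues in ascending order
    U = eigvecs[:, -rank:]                 # most discriminative directions
    M = np.eye(X.shape[1]) + alpha * (U @ U.T)
    return M, U

# Synthetic two-class data that differ only along the first coordinate.
rng = np.random.default_rng(0)
shift = np.array([2.0, 0.0, 0.0, 0.0, 0.0])
X = np.vstack([rng.normal(size=(100, 5)) - shift,
               rng.normal(size=(100, 5)) + shift])
y = np.concatenate([np.zeros(100), np.ones(100)])

w, b = fit_logistic(X, y)
M, U = decision_deformation(X, w, b, rank=1, alpha=2.0)
X_def = X @ M  # M is symmetric; distances along U are scaled by 1 + alpha
```

In this sketch the deformation stretches only the selected directions and leaves the orthogonal complement untouched, which is one simple way supervision could act on the ambient geometry instead of on the embedding itself. With a linear classifier the gradient direction is the same everywhere, so the result is rank one; a nonlinear classifier would be needed for the boundary-local geometry to vary across the data.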
Matching journals
The top 8 journals account for 50% of the predicted probability mass.