
Shortkit-ML: A Unified Multi-Perspective Framework for Detecting Shortcut Learning in Medical Imaging Embeddings

Cajas, S.; Marzullo, A.; Kapadia, S.; Santos, F.; Ocampo Osorio, F.; Kong, Q.; Quarta, A.; Kuo, P.-C.; Patel, M.; Rojas Sillery, R. I.; Celi, L. A.

2026-04-30 health informatics

10.64898/2026.04.29.26352053 medRxiv


Abstract: Shortcut learning poses a significant challenge in clinical artificial intelligence, as models may rely on spurious signals rather than clinically relevant features, leading to biased predictions and poor generalization. Existing detection methods are fragmented and lack systematic evaluation across datasets and model architectures. To address this issue, we propose ShortKit-ML, an open-source Python framework for unified shortcut analysis in embedding spaces. The framework integrates over 20 detection methods and six mitigation strategies within a modular pipeline, encompassing embedding analysis, fairness metrics, training dynamics, causal methods, explainability, and representation analysis. We evaluate the framework on chest X-ray datasets (CheXpert and MIMIC-CXR), synthetic benchmarks, and an out-of-domain dataset (CelebA). Experimental results demonstrate that multi-method auditing provides more stable and interpretable evidence than individual methods, while detector disagreement reveals meaningful representational differences. The proposed framework offers automated reporting and interactive visualization, and is available as a pip-installable package. The source code and documentation are publicly available at https://github.com/criticaldata/ShortKit-ML and https://criticaldata.github.io/ShortKit-ML/.

