
Harnessing Behavioural Analysis for Unpacking the Bio-Interpretability of Pathology Foundation Models

Hu, Y.; Batchkala, G.; Gaitskell, K.; Domingo, E.; Li, B.; Zhang, T.; Li, Z.; Friedrich, M.; Woodcock, D.; Verrill, C.; Rittscher, J.

2026-01-02 pathology
10.64898/2025.12.31.25343151

Computational-pathology foundation models (PFMs) have demonstrated remarkable accuracy across a wide range of whole-slide image (WSI) analyses, yet their morphological reasoning and potential biases remain opaque. Here we introduce an attention-shift monitoring framework that tracks tissue-level attention influx and efflux before and after fine-tuning a slide-level aggregator. We apply our interpretable framework across five clinically relevant tasks (lymph-node metastasis detection, lung-cancer subtyping, ovarian-cancer drug-response prediction, colorectal-cancer molecular classification and Marsh grading of colitis). We compare two market-validated PFMs, UNI and prov-GigaPath, using dynamically pooled, compressed embeddings under identical run conditions. Although both models achieve comparable ROC-AUC and balanced-accuracy scores, their attention-shift trajectories diverge sharply: each exhibits broad attention efflux from most tissue regions and highly concentrated, yet minimally overlapping, influx into distinct phenotypic zones. The attention heterogeneity in zero-shot mode and the inconsistency of post-tuning attention shifts indicate that interpretability depends primarily on each model's intrinsic feature priors rather than on accuracy or fine-tuning. Our findings uncover a systemic stability gap in PFM interpretability, masked by high performance metrics, and underscore the need for richer explanation tools, bias-monitoring protocols and diversified pre-training strategies to ensure safe clinical deployment.
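The influx/efflux bookkeeping described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes hypothetical per-region attention maps (region id → normalised attention weight) taken from a slide-level aggregator in zero-shot mode and again after fine-tuning, and all names are illustrative.

```python
def attention_shift(pre, post):
    """Return per-region attention deltas and the influx/efflux split.

    pre, post: dicts mapping region id -> attention weight (each map
    summing to ~1 over the slide). A positive delta means attention
    flowed into the region after fine-tuning (influx); a negative
    delta means attention flowed out (efflux).
    """
    regions = set(pre) | set(post)
    delta = {r: post.get(r, 0.0) - pre.get(r, 0.0) for r in regions}
    influx = {r: d for r, d in delta.items() if d > 0}
    efflux = {r: d for r, d in delta.items() if d < 0}
    return delta, influx, efflux


# Toy example (made-up numbers): broad efflux from stroma,
# concentrated influx into the tumour region after fine-tuning.
pre = {"tumour": 0.2, "stroma": 0.5, "immune": 0.3}
post = {"tumour": 0.6, "stroma": 0.1, "immune": 0.3}
delta, influx, efflux = attention_shift(pre, post)
```

Comparing these delta maps across models (e.g. UNI vs. prov-GigaPath) at fixed task accuracy is what exposes the minimally overlapping influx zones reported in the abstract.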
