Unsupervised identification of low-frequency antigen-specific TCRs using distance-based anomaly scoring
Kinoshita, K.; Kobayashi, T. J.
Identifying antigen-specific T cell receptors (TCRs) within the diverse human repertoire remains challenging due to their extremely low frequencies, often as rare as one per million cells. Here, we propose a novel unsupervised approach that detects low-frequency antigen-specific TCRs through distance-based anomaly detection in TCR sequence space. Our method is based on the observation that antigen-specific TCRs preferentially localize at the periphery of V gene clusters rather than cluster centers. Using TCRdist3 to quantify sequence distances, we identify query TCRs that are anomalous compared to reference repertoires within their V-J gene combinations. We validated this approach across three immunological contexts: COVID-19 infection, influenza vaccination, and yellow fever vaccination. For SARS-CoV-2-specific TCR detection in a COVID-19 patient, our method demonstrated 34.3% accuracy, significantly outperforming similarity-based (ALICE: 8.0%) and frequency-based methods (edgeR: 5.8%, the Pogorelyy method: 6.3%), and uniquely detected low-frequency antigen-specific TCRs at clone count one. The minimal overlap with conventional approaches (≤6.7%) indicates our method captures distinct TCR clones overlooked by existing analyses. This spatial distribution-based paradigm provides a complementary strategy for TCR specificity detection, particularly valuable for identifying rare antigen-specific clones essential for understanding immune responses.
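The general idea of scoring query TCRs against a reference repertoire within the same V-J combination can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring rule, so the mean distance to the k nearest reference neighbours is assumed here as one common distance-based anomaly score, and a plain CDR3 edit distance stands in for the TCRdist3 metric. The function names, the parameter `k`, and the `(v_gene, j_gene, cdr3)` tuple layout are all hypothetical.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Levenshtein distance between two strings (stand-in for a TCRdist-style metric)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def knn_anomaly_scores(queries, references, k=3):
    """Score each query TCR by its mean distance to the k nearest reference
    TCRs that share its V-J combination (assumed scoring rule).

    queries, references: iterables of (v_gene, j_gene, cdr3) tuples.
    Returns a list with one score per query; None when the reference
    repertoire has no TCR with the same V-J combination.
    """
    # Group reference CDR3 sequences by V-J gene combination.
    by_vj = defaultdict(list)
    for v, j, cdr3 in references:
        by_vj[(v, j)].append(cdr3)

    scores = []
    for v, j, cdr3 in queries:
        pool = by_vj.get((v, j), [])
        if not pool:
            scores.append(None)
            continue
        nearest = sorted(edit_distance(cdr3, ref) for ref in pool)[:k]
        # A larger mean distance means the query sits further from the
        # dense part of its V-J group, i.e. it is more "peripheral".
        scores.append(sum(nearest) / len(nearest))
    return scores
```

In this sketch, queries with the highest scores would be the candidates flagged as anomalous; any threshold on the score, and the handling of V-J groups with few reference sequences, would have to be chosen separately.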
Matching journals
The top 9 journals account for 50% of the predicted probability mass.