
Identifying anaphylaxis using weakly-supervised prediction models and natural language processing

Williamson, B. D.; Cronkite, D. J.; Yu, O.; Ramaprasan, A.; Fuller, S.; Covey, J.; Kiniry, E.; Park, D.; Winter, R.; Whitaker, J.; McLemore, M. F.; Wittayanukorn, S.; Stojanovic, D.; Zhao, Y.; Dutcher, S.; Carrell, D. S.; Jackson, L. A.; Nelson, J. C.; Smith, J. C.

2026-06-17 epidemiology

10.64898/2026.06.09.26355005 medRxiv


Objectives: Scalable computable phenotyping algorithms are critical for conducting high-throughput disease-outcome research in large, distributed electronic health record (EHR) and claims data settings. We developed and evaluated a claims- and EHR-based computable phenotyping algorithm for anaphylaxis, a rare acute condition that is challenging to identify accurately using claims data alone.

Materials and Methods: Potential anaphylaxis events came from two healthcare systems, Kaiser Permanente Washington (KPWA) and Vanderbilt University Medical Center (VUMC). We engineered features from clinical text using automated natural language processing (NLP) methods. We then developed a phenotyping algorithm using four NLP- and diagnosis code-based silver labels (proxies for the gold-standard labels). Gold-standard abstracted outcomes were used to evaluate algorithm performance.

Results: The largest area under the receiver operating characteristic curve (AUC) was 0.931, for an NLP-based silver-label model at KPWA. At the predicted-probability threshold that maximized the F1 score, positive predictive value (PPV) ranged from 0.52 to 0.77 and sensitivity ranged from 0.78 to 1, depending on the model and healthcare system.

Discussion: NLP-based silver-label models achieved high AUC at KPWA but not at VUMC. This may be because clinical text at KPWA is only available for outpatient encounters and secure messaging. High sensitivity for identifying anaphylaxis can be obtained using our best-performing models.

Conclusion: The best-performing models had better PPV and sensitivity tradeoffs than prior bespoke anaphylaxis models with costly, manually curated features. The simplicity of the approach compared to traditional phenotyping methods allows it to be deployed easily at multiple healthcare systems.
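The Results report PPV and sensitivity at the predicted-probability threshold that maximizes the F1 score. The sketch below illustrates how such a threshold could be selected against gold-standard labels. It is not the authors' code: the function names (`ppv_sensitivity_f1`, `best_f1_threshold`) and the toy labels and probabilities are hypothetical, and ties are broken in favor of the lowest threshold, which is an assumption.

```python
from typing import List, Tuple


def ppv_sensitivity_f1(
    y_true: List[int], y_prob: List[float], threshold: float
) -> Tuple[float, float, float]:
    """PPV, sensitivity, and F1 when events with y_prob >= threshold are called positive."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * ppv * sens / (ppv + sens) if (ppv + sens) else 0.0
    return ppv, sens, f1


def best_f1_threshold(
    y_true: List[int], y_prob: List[float]
) -> Tuple[float, float, float, float]:
    """Scan every observed probability as a candidate threshold; keep the first F1 maximizer."""
    best = None
    for t in sorted(set(y_prob)):
        ppv, sens, f1 = ppv_sensitivity_f1(y_true, y_prob, t)
        if best is None or f1 > best[3]:
            best = (t, ppv, sens, f1)
    return best


# Toy data (invented for illustration): gold-standard labels and model probabilities.
gold = [1, 1, 1, 0, 0, 0, 1, 0]
prob = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]

threshold, ppv, sens, f1 = best_f1_threshold(gold, prob)
print(f"threshold={threshold}, PPV={ppv:.2f}, sensitivity={sens:.2f}, F1={f1:.3f}")
```

In this sketch the model is assumed to have been trained on silver labels, while the threshold scan uses gold-standard labels, mirroring the evaluation described in the abstract.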
