Quantifying the severity of patient safety events via statistical natural language processing

Bhadra, S.; Fong, A.; Sengupta, S.

2025-12-27 health informatics
10.64898/2025.12.22.25342876 medRxiv
Medical errors are among the leading causes of death in the United States. Several public databases have been built to record patient safety events across healthcare systems in order to better understand and mitigate safety hazards. These reports typically include both structured fields (e.g., event type, device, manufacturer) and unstructured data elements (a free-text narrative of what happened). The structured fields are usually restricted to a limited number of categories, whereas the unstructured fields allow the reporter to freely describe the event details. Thus, analyzing the unstructured text, rather than the structured fields, can reveal rich insights that help improve patient safety. However, manual analysis of these databases is impractical due to their large size and the inherent subjectivity of manual interpretation, so new statistical algorithms are needed to automate the process. In this paper, we develop a novel statistical technique to predict the severity level of a patient safety event from its free-text description. Using NLP techniques, we first express the raw event descriptions as numeric feature vectors and then model the severity of the events based on those feature vectors. We consider and compare three statistical approaches: multiclass (one-shot), ordinal, and hierarchical (two-step) models. To illustrate the proposed method, we analyzed a large text corpus of more than 7.7 million patient safety reports from the FDA's MAUDE (Manufacturer and User Facility Device Experience) database. The proposed techniques correctly predicted the reported outcome of the events with over 94% accuracy. Furthermore, our techniques helped identify critical terms/phrases and provided a continuous-scale harm score, which can be more useful than a discrete severity level. Inspecting the misclassified reports, we discovered some likely mislabeled reports that our proposed approach nonetheless classified correctly.
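The abstract describes a pipeline of TF-IDF-style featurization followed by a hierarchical (two-step) severity model: first decide harm vs. no harm, then grade severity only among harm cases. The paper's actual models and fitted weights are not given here, so the sketch below is a minimal pure-Python illustration under assumed placeholder weights and thresholds; the outcome labels mirror MAUDE's death / serious injury / malfunction categories, but the scoring functions are hypothetical, not the authors' method.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute sparse TF-IDF feature vectors (as dicts) for tokenized reports."""
    n = len(docs)
    df = Counter()                       # document frequency of each term
    for d in docs:
        df.update(set(d))
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: (tf[t] / len(d)) * math.log(n / df[t]) for t in tf})
    return vecs

def score(vec, weights, bias=0.0):
    """Linear score: dot product of a sparse vector and a weight dict."""
    return bias + sum(w * weights.get(t, 0.0) for t, w in vec.items())

def predict_hierarchical(vec, harm_weights, severity_weights):
    """Two-step model: (1) harm vs. no harm; (2) severity among harm cases.
    The continuous severity score doubles as a harm score on a real-valued scale.
    Thresholds (0.5, 0.0, 1.0) are illustrative, not fitted values."""
    p_harm = 1.0 / (1.0 + math.exp(-score(vec, harm_weights)))
    if p_harm < 0.5:
        return "no harm", p_harm
    s = score(vec, severity_weights)
    label = "death" if s > 1.0 else "serious injury" if s > 0.0 else "malfunction"
    return label, p_harm

# Toy reports; real MAUDE narratives are full free-text paragraphs.
docs = [["pump", "failed"],
        ["patient", "died", "pump"],
        ["patient", "injured"]]
vecs = tfidf_vectors(docs)
# Placeholder weights standing in for a fitted model.
harm_w = {"died": 3.0, "injured": 2.0}
sev_w = {"died": 2.0, "injured": 0.5}
label, p = predict_hierarchical(vecs[1], harm_w, sev_w)
```

An ordinal model would instead compare one cumulative score against an increasing sequence of cutpoints, and a multiclass (one-shot) model would score all severity levels at once; the two-step factorization shown here is what lets a borderline report carry a continuous harm score even when its discrete label is uncertain.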

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Probability
1 | JAMIA Open | 37 | Top 0.1% | 22.9%
2 | Journal of Biomedical Informatics | 45 | Top 0.1% | 14.6%
3 | BMC Medical Informatics and Decision Making | 39 | Top 0.3% | 9.3%
4 | Journal of the American Medical Informatics Association | 61 | Top 0.3% | 9.3%
(50% of probability mass above this line)
5 | Scientific Reports | 3102 | Top 16% | 6.5%
6 | JMIR Medical Informatics | 17 | Top 0.2% | 4.0%
7 | npj Digital Medicine | 97 | Top 1% | 3.6%
8 | IEEE Journal of Biomedical and Health Informatics | 34 | Top 0.6% | 2.8%
9 | Journal of Medical Internet Research | 85 | Top 2% | 2.8%
10 | International Journal of Medical Informatics | 25 | Top 0.6% | 2.1%
11 | PLOS ONE | 4510 | Top 47% | 2.1%
12 | Artificial Intelligence in Medicine | 15 | Top 0.4% | 1.4%
13 | Computers in Biology and Medicine | 120 | Top 3% | 1.2%
14 | Computer Methods and Programs in Biomedicine | 27 | Top 0.5% | 1.2%
15 | Informatics in Medicine Unlocked | 21 | Top 0.9% | 0.9%
16 | iScience | 1063 | Top 26% | 0.9%
17 | Frontiers in Digital Health | 20 | Top 1% | 0.8%
18 | Cureus | 67 | Top 5% | 0.8%
19 | Patterns | 70 | Top 2% | 0.8%
20 | JMIR Public Health and Surveillance | 45 | Top 4% | 0.7%
21 | Bioinformatics | 1061 | Top 10% | 0.7%
22 | JMIR Formative Research | 32 | Top 2% | 0.5%
23 | BMJ Health & Care Informatics | 13 | Top 1% | 0.5%