Machine Learning in Psychiatric Health Records: A Gold Standard Approach to Trauma Annotation

Atwood, B.; Holderness, E.; Verhagen, M.; Shinn, A. K.; Cawkwell, P.; Cerruti, H.; Pustejovsky, J.; Hall, M.-H.

2025-03-11 psychiatry and clinical psychology

10.1101/2025.03.09.25323272 medRxiv


Psychiatric electronic health records present unique challenges for machine learning due to their unstructured, complex, and variable nature. This study aimed to create a publicly available gold standard dataset by identifying a cohort of patients with psychotic disorders and posttraumatic stress disorder (PTSD), developing clinically informed guidelines for annotating traumatic events in their health records, and demonstrating the dataset's suitability for training machine learning models to detect indicators of symptoms, substance use, and trauma in new records. We compiled a representative corpus of 200 narrative-heavy health records (470,489 tokens) from a centralized database and developed a detailed annotation scheme with a team of clinical experts and computational linguists. Clinicians annotated the corpus for trauma-related events and relevant clinical information with high inter-annotator agreement (0.715 for entity/span tags and 0.874 for attributes). Additionally, machine learning models were developed to demonstrate the practical viability of the gold standard corpus for machine learning applications, achieving micro F1 scores of 0.76 and 0.82 for spans and attributes, respectively, indicative of their predictive reliability. This study established the first gold standard dataset for the complex task of labelling traumatic features in psychiatric health records. High inter-annotator agreement and model performance illustrate its utility in advancing the application of machine learning in psychiatric healthcare to better understand disease heterogeneity and treatment implications.
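For readers unfamiliar with the micro F1 metric reported above, the following is a minimal sketch of how a micro-averaged F1 over annotated spans can be computed. It is not the authors' evaluation code: the abstract does not state the matching criterion, so exact match on (start, end, label) is an assumption here, and the function name `micro_f1` and the labels in the example are hypothetical.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over span annotations.

    gold, predicted: lists with one set per document, each set holding
    (start, end, label) tuples. Exact-match scoring is an assumption;
    the paper may use a different matching rule.
    """
    tp = fp = fn = 0
    for gold_spans, pred_spans in zip(gold, predicted):
        tp += len(gold_spans & pred_spans)   # spans found in both
        fp += len(pred_spans - gold_spans)   # predicted but not in gold
        fn += len(gold_spans - pred_spans)   # in gold but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: two documents, illustrative labels only.
gold = [
    {(0, 5, "Trauma"), (10, 15, "Symptom")},
    {(3, 8, "SubstanceUse")},
]
predicted = [
    {(0, 5, "Trauma"), (20, 25, "Symptom")},
    {(3, 8, "SubstanceUse")},
]
score = micro_f1(gold, predicted)
```

Micro-averaging pools true positives, false positives, and false negatives across all documents and labels before computing precision and recall, so frequent labels weigh more heavily than rare ones.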

