Linguistic Effects of Ambient AI on Clinical Documentation: A Matched Pre-Post Study
Li, Y.; Zhou, H.; Blackley, S.; Plasek, J. M.; Lyu, Z.; Zhang, W.; You, J.; Centi, A.; Mishuris, R.; Yang, J.; Zhou, L.
Ambient intelligence-based systems are increasingly used for clinical documentation. To quantify linguistic differences associated with ambient documentation, we conducted a matched pre-post analysis of 6,026 outpatient clinical notes from Mass General Brigham following implementation of two ambient AI documentation systems (Nuance Dragon Ambient eXperience [DAX] and Abridge). Within-clinician comparisons focused on the History of Present Illness (HPI) and Assessment and Plan (A&P) sections and evaluated syntactic complexity, lexical ambiguity, linguistic variability, discourse coherence, and readability. Manual review of 50 paired notes was performed to validate findings from the automated linguistic analyses. Our analyses indicate that the linguistic effects of ambient documentation are both vendor-dependent and section-specific. Across both vendors, ambient HPI sections were longer and exhibited greater syntactic complexity (longer sentences and clauses, increased dependency distance), lower lexical ambiguity, lower language-model perplexity, and higher local and global discourse coherence. These findings indicate that ambient systems systematically restructure conversational input into more syntactically elaborated and linguistically predictable narratives; the lower perplexity, observed under both general-domain and biomedical language models, reflects increased standardization. In contrast, changes in A&P were smaller and more heterogeneous, consistent with that section's more structured, templated nature. Readability analyses further showed increased length and lexical complexity in ambient HPI, whereas A&P readability differences were minimal. Overall, our findings demonstrate that ambient documentation changes how clinical information is linguistically expressed and organized, with effects varying by note section, vendor, and provider role/specialty. Evaluation should therefore extend beyond efficiency to consider effects on communication, cognitive load, clinical inference, and downstream analytics.
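To make the within-clinician comparison concrete, the following is a minimal sketch of how one syntactic-complexity proxy (mean sentence length in tokens) could be compared between paired pre-implementation and ambient notes. It is not the authors' pipeline: the function names, the naive regex sentence splitter, whitespace tokenization, and the toy note texts are all assumptions made for illustration, and the study's actual metrics (dependency distance, perplexity, coherence) would require a parser and language models.

```python
import re
from statistics import mean


def split_sentences(text):
    """Naive sentence splitter (assumption): break after ., !, or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def mean_sentence_length(text):
    """Mean number of whitespace-separated tokens per sentence; 0.0 for empty text."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return mean(len(s.split()) for s in sentences)


def within_clinician_differences(pairs):
    """Average (ambient - pre) difference in mean sentence length per clinician.

    `pairs` is an iterable of (clinician_id, pre_text, ambient_text) tuples,
    a hypothetical layout for matched notes.
    """
    per_clinician = {}
    for clinician_id, pre_text, ambient_text in pairs:
        diff = mean_sentence_length(ambient_text) - mean_sentence_length(pre_text)
        per_clinician.setdefault(clinician_id, []).append(diff)
    return {cid: mean(diffs) for cid, diffs in per_clinician.items()}


# Hypothetical toy pair, not taken from the study data.
example_pairs = [
    (
        "c1",
        "Pt has cough. No fever.",
        "The patient reports a cough for three days. She denies any fever.",
    ),
]
example_result = within_clinician_differences(example_pairs)
```

A positive value for a clinician means that clinician's ambient notes have longer sentences on average than their matched pre-implementation notes; averaging within clinician first keeps prolific authors from dominating the comparison.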
Matching journals
The top 4 journals account for 50% of the predicted probability mass.