
Acoustic Analysis of Primary Care Patient-provider Conversations to Screen for Cognitive Impairment

Colonel, J. T.; Becker, J.; Chan, L.; Faherty, C.; Van Vleck, T. T.; Curtis, L.; Wisnivesky, J. P.; Federman, A.; Lin, B.

2025-12-29 primary care research
10.64898/2025.12.27.25343088

Importance: Cognitive impairment (CI) is often underdetected in primary care due to time and resource constraints. Passive analysis of clinical dialogue may offer an accessible approach for screening.

Objective: To assess whether audio recordings of patient-physician dialogue during routine primary care visits can be used to identify CI using acoustic speech features and machine learning (ML).

Design: This observational study conducted among older primary care patients involved audio recording primary care visits using a microphone and portable device. An external validation cohort was recruited in a separate city to assess reproducibility of findings.

Setting: The study was conducted in primary care practices in New York City, with additional participants recruited from primary care practices in Chicago, Illinois, for validation.

Participants: The study included 787 English-speaking patients aged 55 years and older, without documented history of dementia or mild CI. Eligible patients were recruited from primary care practices during routine visits. For validation, 179 patients meeting the same eligibility criteria were recruited from primary care practices in Chicago.

Exposures: Multiple thirty-second speech segments were extracted from recordings. Acoustic features were derived using foundation models (Whisper, HuBERT, Wav2Vec 2.0) and expert-defined methods (eGeMAPS, prosody).

Main Outcomes and Measures: CI was defined as Montreal Cognitive Assessment score ≥1.0 standard deviations below age- and education-adjusted norms. ML classifiers were trained to predict CI status from audio recordings. We calculated area under the receiver operating characteristic curve (AUC-ROC) and maximum F1 score (Fmax) for identifying CI participants.

Results: The mean age was 66.8 years and 21% had CI. Models using Whisper-derived acoustic features performed best (AUC-ROC=0.733, 95% confidence interval [95%CI]=0.714-0.752; Fmax(CI)=0.504, 95%CI=0.474-0.534). Results generalized to the external site with similar performance (AUC-ROC=0.727, 95%CI=0.714-0.740; Fmax(CI)=0.459, 95%CI=0.442-0.476). Model interpretation identified pitch, timing, and variability features as key predictors. When used for screening, the algorithm achieved positive predictive value of 30.4% (95%CI=28.7%-32.1%), sensitivity of 68.2% (95%CI=61.8%-74.6%), and specificity of 63.6% (95%CI=59.8%-67.4%) on the holdout cohort.

Conclusions and Relevance: ML models trained on acoustic features from brief clinical conversations identified CI with high accuracy. These findings support the feasibility of passive, speech-based screening during routine primary care.

Key Points

Question: Can acoustic features extracted from audio recordings of patient-physician conversations during routine primary care visits be used to screen for cognitive impairment?

Findings: In this study including 787 older adults without diagnosis of cognitive problems, machine learning models trained on acoustic features from audio segments of recordings of primary care visits achieved area under the receiver operating characteristic curve values of 0.72 for predicting cognitive impairment. The algorithm achieved a sensitivity of 83%, specificity of 44%, and positive predictive value of 28%, identifying a subset of primary care patients at higher risk for cognitive impairment. Models performed similarly on an external validation dataset of 179 participants. Interpretability analyses highlighted patient pause duration and energy-related features as salient indicators of cognition status.

Meaning: These findings suggest that short segments of naturalistic clinical dialogue may contain useful acoustic signals for passively screening patients for cognitive impairment.
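
The screening figures in the Key Points can be related to one another through Bayes' rule. The sketch below is illustrative only and is not the authors' code: the function name `positive_predictive_value` is hypothetical, and it assumes the 21% CI prevalence reported in the Results applies to the cohort on which the 83% sensitivity and 44% specificity were measured.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Positive predictive value from Bayes' rule.

    PPV = P(CI | positive screen)
        = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Key Points operating point; the prevalence value is an assumption
# taken from the "21% had CI" figure in the Results.
ppv = positive_predictive_value(sensitivity=0.83,
                                specificity=0.44,
                                prevalence=0.21)
print(f"PPV = {ppv:.3f}")  # PPV = 0.283
```

Under that assumption the computed value rounds to the 28% positive predictive value stated in the Key Points. Because PPV depends on prevalence, the same sensitivity and specificity would give a different PPV in a population where CI is more or less common.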

Matching journals

The top 4 journals account for 50% of the predicted probability mass.