Automated transcription in primary progressive aphasia: Accuracy and effects on classification
Clarke, N.; Morin, B.; Bedetti, C.; Bogley, R.; Pellerin, S.; Houze, B.; Ramkrishnan, S.; Ezzes, Z.; Miller, Z.; Gorno Tempini, M. L.; Vonk, J. M. J.; Brambati, S. M.
INTRODUCTION: Connected speech analyses can help characterize linguistic impairments in primary progressive aphasia (PPA) and classify variants; however, manual transcription of speech samples is time-consuming and expensive. Automated speech recognition (ASR) may be effective for transcribing PPA speech.
METHODS: Transcripts of picture descriptions (109 PPA, 32 healthy controls (HC)) were generated using a manual approach, an automated approach (Whisper), or a semi-automated approach that added a quality control (QC) step to the Whisper output. We evaluated transcript accuracy, the reliability of ASR-derived linguistic features, and classification performance.
RESULTS: Whisper demonstrated the lowest error rates for HC, followed by the semantic, logopenic, and non-fluent PPA variants. Errors correlated with overall disease severity for the semantic and logopenic variants. QC of Whisper outputs reduced errors and improved the reliability of linguistic features. Overall, ASR-derived features achieved better classification performance than features derived from manual transcription.
DISCUSSION: Results support the use of off-the-shelf ASR for scalable, cost-efficient transcription of PPA speech and classification.
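The abstract does not name the accuracy metric. As an illustration only, the sketch below assumes the error rate is a word error rate (WER): the word-level edit distance between a manual reference transcript and an ASR transcript, divided by the number of reference words. The function name `word_error_rate`, the lowercase and whitespace normalization, and the example sentences are hypothetical and are not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: simple lowercase + whitespace tokenization; the study's
    actual text normalization is not described in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion (word missing from ASR output)
                curr[j - 1] + 1,                  # insertion (extra word in ASR output)
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical manual (reference) and ASR (hypothesis) transcripts.
manual = "the boy is taking a cookie"
asr = "the boy taking the cookie"
wer = word_error_rate(manual, asr)
```

In this made-up example the ASR output drops one word and substitutes another, so two of the six reference words are in error. Under the same assumption, a QC pass that corrects either error would lower the value.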
Matching journals
The top 11 journals account for 50% of the predicted probability mass.