Automated transcription in primary progressive aphasia: Accuracy and effects on classification

Clarke, N.; Morin, B.; Bedetti, C.; Bogley, R.; Pellerin, S.; Houze, B.; Ramkrishnan, S.; Ezzes, Z.; Miller, Z.; Gorno Tempini, M. L.; Vonk, J. M. J.; Brambati, S. M.

2026-02-26 neurology

10.64898/2026.02.24.26346981 medRxiv

INTRODUCTION: Connected speech analyses can help characterize linguistic impairments in primary progressive aphasia (PPA) and classify variants; however, manual transcription of speech samples is time-consuming and expensive. Automated speech recognition (ASR) may be efficacious for transcribing PPA speech. METHODS: Transcripts of picture descriptions (109 PPA, 32 healthy controls [HC]) were generated using a manual, automated (Whisper), or semi-automated approach including a quality control (QC) step. We evaluated transcript accuracy, the reliability of ASR-derived linguistic features, and classification performance. RESULTS: Whisper demonstrated the lowest error rates for HC, followed by the semantic, logopenic, and non-fluent PPA variants. Errors correlated with overall disease severity for the semantic and logopenic variants. QC of Whisper outputs reduced errors and improved the reliability of linguistic features. Overall, ASR-derived features achieved better classification performance than manual transcription features. DISCUSSION: Results support the use of off-the-shelf ASR for scalable, cost-efficient transcription of PPA speech and classification.

Matching journals

● Non-profit ◐ University press ○ Commercial

The top 11 journals account for 50% of the predicted probability mass.

Journal of Speech, Language, and Hearing Research

● 13 papers in training set

● 5266 papers in training set

Journal of Alzheimer’s Disease

○ 50 papers in training set

Frontiers in Neurology

○ 102 papers in training set

Scientific Reports

○ 3612 papers in training set

Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring

○ 42 papers in training set

Frontiers in Digital Health

○ 24 papers in training set

Behavior Research Methods

○ 30 papers in training set

Journal of NeuroEngineering and Rehabilitation

○ 36 papers in training set

Frontiers in Neuroscience

○ 256 papers in training set

Neurorehabilitation and Neural Repair

○ 21 papers in training set

50% of probability mass above

Brain Communications

◐ 166 papers in training set

○ 55 papers in training set

European Journal of Neurology

○ 22 papers in training set

Scientific Data

○ 209 papers in training set

Journal of Alzheimer's Disease

○ 48 papers in training set

○ 50 papers in training set

Alzheimer's Research & Therapy

○ 57 papers in training set

BMJ Health & Care Informatics

● 15 papers in training set

Orphanet Journal of Rare Diseases

○ 21 papers in training set

Neuropsychologia

○ 85 papers in training set

Hearing Research

○ 54 papers in training set

○ 14 papers in training set

Computers in Biology and Medicine

○ 128 papers in training set

Pediatric Neurology

○ 11 papers in training set

Frontiers in Psychology

○ 56 papers in training set

Neuroscience & Biobehavioral Reviews

○ 43 papers in training set

○ 17 papers in training set

● 601 papers in training set

● 35 papers in training set