A computational approach for measuring sentence information via surprisal: theoretical implications in nonfluent primary progressive aphasia

Rezaii, N.; Michaelov, J.; Josephy-Hernandez, S.; Ren, B.; Hochberg, D.; Quimby, M.; Dickerson, B. C.

2022-11-29 neurology

10.1101/2022.11.25.22282630 medRxiv


Nonfluent aphasia is a language disorder characterized by simplified sentence structures as well as word-level abnormalities such as a reduced use of verbs and function words. According to the predominant account of the disorder, both structural and word-level features are caused by a core deficit in the processing of syntax. Under this account, however, it remains unclear why nonfluent patients choose semantically richer verbs and may have intact comprehension of verbs and function words. Here, we propose and test the hypothesis that the word-level features of nonfluency reflect a process that selects lexically richer words to increase the information content of sentences. We use a computational linguistic method to measure the information content of sentences in the language of patients with nonfluent primary progressive aphasia (nfvPPA) (n = 36) and healthy controls (n = 133). We measure sentence information using surprisal, a metric calculated as the average negative log probability of the words in a sentence given their preceding context. We found that by packaging their structurally simple sentences with lower-frequency words, nfvPPA patients produce sentences with surprisal similar to that of healthy speakers. Furthermore, we found that higher sentence surprisal in nfvPPA correlates with a lower function-to-all-word ratio, a lower verb-to-noun ratio, and a higher heavy-to-all-verb ratio. Surprisal is an effective quantitative index of sentence information. Using surprisal allows for testing an account of nonfluent aphasia that regards word-level features of nonfluency as adaptive rather than defective symptoms, a finding that may entail revisions in therapeutic approaches to nonfluent speech.
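To make the surprisal measure concrete, the following is a minimal sketch of how mean sentence surprisal can be computed. The abstract does not say which language model the authors used, so the add-one-smoothed bigram model here is only a stand-in chosen for illustration, and the names `train_bigram`, `mean_surprisal`, and the toy corpus are hypothetical. Surprisal of a word is taken as the negative base-2 log of its probability given the preceding word, and sentence surprisal is the average over the words in the sentence.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Collect bigram statistics from a list of tokenized sentences.

    Returns (context counts, bigram counts, vocabulary size). The vocabulary
    size includes one extra slot reserved for unseen words (an assumption of
    this sketch, needed for add-one smoothing).
    """
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = ["<s>"] + sentence          # <s> marks the sentence start
        contexts.update(tokens[:-1])         # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(sentence)
    return contexts, bigrams, len(vocab) + 1


def mean_surprisal(sentence, contexts, bigrams, vocab_size):
    """Average of -log2 P(word | previous word) over the words of a sentence."""
    if not sentence:
        raise ValueError("sentence must contain at least one word")
    tokens = ["<s>"] + sentence
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one (Laplace) smoothing keeps unseen bigrams at nonzero probability.
        p = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
        total += -math.log2(p)
    return total / len(sentence)


# Toy corpus, invented purely for illustration.
corpus = [
    ["the", "dog", "runs"],
    ["the", "dog", "sleeps"],
    ["the", "cat", "runs"],
]
contexts, bigrams, vocab_size = train_bigram(corpus)

frequent = mean_surprisal(["the", "dog", "runs"], contexts, bigrams, vocab_size)
rare = mean_surprisal(["the", "cat", "sleeps"], contexts, bigrams, vocab_size)
print(f"frequent wording: {frequent:.3f} bits/word")
print(f"rarer wording:    {rare:.3f} bits/word")
```

In this toy setting, the sentence built from less expected word sequences receives the higher mean surprisal, which is the sense in which lower-frequency words raise the information content of a structurally simple sentence. A study on real speech samples would need a language model trained on a large corpus rather than a three-sentence bigram table.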

