
Domain-adapted language model using reinforcement learning for various dementias

Kowshik, S. S.; Jasodanand, V. H.; Bellitti, M.; Puducheri, S.; Xu, L.; Liu, Y.; Saichandran, K. S.; Dwyer, B. C.; Gabelle, A.; Hao, H.; Kedar, S.; Murman, D. L.; O'Shea, S.; Saint-Hilaire, M.-H.; Samudra, N. P.; Sartor, E. A.; Swaminathan, A.; Taraschenko, O.; Yuan, J.; Au, R.; Kolachalama, V. B.

medRxiv preprint · neurology · 2026-03-23 · doi:10.64898/2026.03.17.26348154

Large language models excel at processing complex clinical data and advanced reasoning, yet domain-specific adaptation is essential to realize their full potential in fields such as Alzheimer's disease and related dementias (ADRD). Here, we present a generative language model for ADRD fine-tuned via reinforcement learning with verifiable rewards using a self-certainty-aware advantage. Model development and validation leveraged data from five ADRD cohorts, totaling 54,535 participants. Our framework integrates demographics, personal and family medical histories, medication use, neuropsychological test results, functional assessments, physical and neurological examination findings, laboratory data and multimodal neuroimaging to construct comprehensive clinical profiles. On held-out testing data involving 36,688 participants, our model achieved robust performance on syndromic classification, primary etiological diagnosis and biomarker prediction. Model predictions were validated against postmortem-confirmed diagnoses, and clinical utility was demonstrated in a controlled within-subjects crossover study in which board-certified neurologists reviewed cases with and without model assistance, showing that exposure to model responses improved diagnostic performance. These results demonstrate that targeted domain adaptation with reinforcement learning can enable language models to deliver accurate, reasoning-driven support in ADRD evaluation. Prospective validation will be essential to translate these advances into improved patient outcomes.
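The training recipe named in the abstract — reinforcement learning with verifiable rewards (RLVR) using a self-certainty-aware advantage — can be sketched generically. The paper's exact formulation is not given here, so everything below is an assumption for illustration: the KL-to-uniform definition of self-certainty, the shaping coefficient `beta`, and the GRPO-style group-normalized advantage are common choices in the RLVR literature, not the authors' confirmed method.

```python
import numpy as np

def self_certainty(token_log_dists):
    """Average KL(model || uniform) over a response's tokens (assumed measure).

    token_log_dists: list of (V,)-shaped arrays of log-probabilities, one per
    generated token. Higher values mean a more confident (peaked) model.
    """
    kls = []
    for logp in token_log_dists:
        p = np.exp(logp)
        V = logp.size
        # KL(p || U) = sum_v p_v * log(p_v * V)
        kls.append(np.sum(p * (logp + np.log(V))))
    return float(np.mean(kls))

def group_advantages(rewards, certainties, beta=0.1):
    """GRPO-style advantage over a group of sampled responses to one prompt,
    shaped by a self-certainty bonus (hypothetical combination).

    rewards: verifiable 0/1 rewards (e.g., diagnosis label matches ground truth).
    certainties: self_certainty score for each sampled response.
    """
    r = np.asarray(rewards, dtype=float)
    c = np.asarray(certainties, dtype=float)
    # Shift rewards by how much more certain each response is than the group mean.
    shaped = r + beta * (c - c.mean())
    # Group-normalize so advantages are zero-mean, unit-scale within the prompt.
    return (shaped - shaped.mean()) / (shaped.std() + 1e-8)

# Usage: among two correct responses, the more self-certain one gets the
# larger advantage; incorrect responses get negative advantages.
adv = group_advantages(rewards=[1, 0, 0, 1], certainties=[2.0, 1.0, 1.0, 1.5])
```

The design intuition behind such schemes is that a verifiable reward alone is sparse and binary; adding a self-certainty term lets the policy gradient prefer confident correct reasoning over lucky guesses.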

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

Rank  Journal                                             Papers in training set  Percentile  Probability
1     Nature Medicine                                      117                    Top 0.1%    22.1%
2     npj Digital Medicine                                  97                    Top 0.3%    17.2%
3     Nature Communications                               4913                    Top 34%      4.8%
4     Brain                                                154                    Top 1%       3.9%
5     Alzheimer's Research & Therapy                        52                    Top 0.7%     3.5%
----- 50% of probability mass above -----
6     Advanced Science                                     249                    Top 6%       3.5%
7     eBioMedicine                                         130                    Top 0.4%     3.0%
8     Alzheimer's & Dementia                               143                    Top 2%       2.6%
9     Med                                                   38                    Top 0.2%     2.0%
10    Nature Biomedical Engineering                         42                    Top 0.6%     2.0%
11    Communications Medicine                               85                    Top 0.1%     2.0%
12    Scientific Reports                                  3102                    Top 60%      1.6%
13    Genome Medicine                                      154                    Top 5%       1.6%
14    Computers in Biology and Medicine                    120                    Top 2%       1.6%
15    Nature Computational Science                          50                    Top 0.8%     1.5%
16    Computational and Structural Biotechnology Journal   216                    Top 6%       1.3%
17    PLOS ONE                                            4510                    Top 59%      1.3%
18    Nature Machine Intelligence                           61                    Top 3%       1.2%
19    The Lancet Digital Health                             25                    Top 0.7%     1.2%
20    Briefings in Bioinformatics                          326                    Top 6%       0.9%
21    Human Brain Mapping                                  295                    Top 4%       0.8%
22    Frontiers in Digital Health                           20                    Top 1%       0.8%
23    Nucleic Acids Research                              1128                    Top 18%      0.7%
24    Frontiers in Aging Neuroscience                       67                    Top 3%       0.7%
25    Brain Communications                                 147                    Top 3%       0.7%
26    Proceedings of the National Academy of Sciences     2130                    Top 45%      0.7%
27    Medical Image Analysis                                33                    Top 1%       0.6%
28    The Innovation                                        12                    Top 1%       0.6%
29    npj Parkinson's Disease                               89                    Top 1%       0.6%
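As a quick sanity check on the "top 5 journals account for 50% of the predicted probability mass" claim, the listed probabilities can be summed directly (values taken from the table above):

```python
# Predicted probabilities (%) for the 29 listed journals, in rank order.
probs = [22.1, 17.2, 4.8, 3.9, 3.5, 3.5, 3.0, 2.6, 2.0, 2.0, 2.0,
         1.6, 1.6, 1.6, 1.5, 1.3, 1.3, 1.2, 1.2, 0.9, 0.8, 0.8,
         0.7, 0.7, 0.7, 0.7, 0.6, 0.6, 0.6]

# Rank 5 is the first point at which cumulative mass crosses 50%.
print(f"top-4 mass:  {sum(probs[:4]):.1f}%")   # 48.0%
print(f"top-5 mass:  {sum(probs[:5]):.1f}%")   # 51.5%

# The 29 listed journals cover ~85% of the mass; the remainder is
# spread over journals below the display cutoff.
print(f"listed mass: {sum(probs):.1f}%")       # 85.0%
```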