Brain-Informed Fine-Tuning for Improved Multilingual Understanding in Language Models

Negi, A.; Oota, S. R.; Gupta, M.; Deniz, F.

2025-07-10 neuroscience
10.1101/2025.07.07.662360 bioRxiv

Recent studies have demonstrated that fine-tuning language models with brain data can improve their semantic understanding, although these findings have so far been limited to English. Interestingly, similar to the shared multilingual embedding space of pretrained multilingual language models, human studies provide strong evidence for a shared semantic system in bilingual individuals. Here, we investigate whether fine-tuning language models with bilingual brain data changes model representations in a way that improves them across multiple languages. To test this, we fine-tune monolingual and multilingual language models using brain activity recorded while bilingual participants read stories in English and Chinese. We then evaluate how well these representations generalize to the bilingual participants' first language, their second language, and several other languages in which the participants are not fluent. We assess the fine-tuned language models on brain encoding performance and downstream NLP tasks. Our results show that bilingual brain-informed fine-tuned language models outperform their vanilla (pretrained) counterparts in both brain encoding performance and most downstream NLP tasks across multiple languages. These findings suggest that brain-informed fine-tuning improves multilingual understanding in language models, offering a bridge between cognitive neuroscience and NLP research. We make our code publicly available.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 5%; 10.3%
2. Neurobiology of Language; 28 papers in training set; Top 0.1%; 9.0%
3. PLOS Computational Biology; 1633 papers in training set; Top 4%; 8.3%
4. Nature Human Behaviour; 85 papers in training set; Top 0.2%; 8.3%
5. NeuroImage; 813 papers in training set; Top 2%; 6.2%
6. Imaging Neuroscience; 242 papers in training set; Top 0.9%; 3.9%
7. Nature Communications; 4913 papers in training set; Top 37%; 3.9%
8. Human Brain Mapping; 295 papers in training set; Top 2%; 3.5%
50% of probability mass above
9. Scientific Reports; 3102 papers in training set; Top 38%; 3.5%
10. Communications Biology; 886 papers in training set; Top 3%; 3.0%
11. Neuron; 282 papers in training set; Top 5%; 2.0%
12. eNeuro; 389 papers in training set; Top 5%; 1.9%
13. eLife; 5422 papers in training set; Top 40%; 1.8%
14. Journal of Cognitive Neuroscience; 119 papers in training set; Top 0.9%; 1.7%
15. Cerebral Cortex; 357 papers in training set; Top 0.9%; 1.7%
16. Frontiers in Computational Neuroscience; 53 papers in training set; Top 1%; 1.6%
17. Cell Reports; 1338 papers in training set; Top 25%; 1.6%
18. Communications Psychology; 20 papers in training set; Top 0.1%; 1.6%
19. Frontiers in Neuroscience; 223 papers in training set; Top 4%; 1.5%
20. Medical Image Analysis; 33 papers in training set; Top 0.7%; 1.3%
21. The Journal of Neuroscience; 928 papers in training set; Top 7%; 1.2%
22. Neural Computation; 36 papers in training set; Top 0.5%; 1.2%
23. Neural Networks; 32 papers in training set; Top 0.6%; 1.2%
24. Nature; 575 papers in training set; Top 13%; 1.1%
25. Cognition; 44 papers in training set; Top 0.4%; 0.9%
26. iScience; 1063 papers in training set; Top 27%; 0.9%
27. Nature Computational Science; 50 papers in training set; Top 1%; 0.9%
28. Nature Neuroscience; 216 papers in training set; Top 6%; 0.8%
29. Advanced Science; 249 papers in training set; Top 19%; 0.7%
30. Network Neuroscience; 116 papers in training set; Top 1%; 0.7%