Scalable, non-invasive depression monitoring with smartphone speech: a multimodal benchmark and topic analysis

Emden, D.; Gutfleisch, L.; Herpertz, J.; Leenings, R.; Blitz, R.; Holstein, V. L.; Goltermann, J.; Richter, M.; Chevance, A.; Fleuchaus, A.; Winter, N. R.; Spanagel, J.; Meinert, S.; Borgers, T.; Flinkenflugel, K.; Stein, F.; Alexander, N.; Jamalabadi, H.; Leehr, E. J.; Redlich, R.; Ebner-Priemer, U.; Nenadic, I.; Kircher, T.; Dannlowski, U.; Hahn, T.; Opel, N.

2025-07-18 psychiatry and clinical psychology
10.1101/2025.07.17.25331744 medRxiv
Objective, scalable biomarkers are needed for continuous monitoring of major depressive disorder (MDD). Smartphone-collected speech is promising, yet extracting clinically useful signals remains difficult. We analysed 3 151 weekly voice diaries from 284 German-speaking adults (128 MDD, 156 controls) and regressed Beck Depression Inventory (BDI) scores. Sentence embeddings from the open-source 8-billion-parameter Qwen3-8B model predicted scores with MAE = 4.45 and R² = 0.35, explaining 16 percentage points more variance than the best traditional feature set (TF-IDF). Adding lexical-prosodic or TF-IDF features provided only marginal improvement (best MAE = 4.39). To interpret the embeddings, we applied BERTopic and uncovered ten coherent themes; BDI scores peaked for "Persistent Low Mood" and "Pain Distress", confirming clinical relevance. Large-language-model embeddings therefore capture the dominant signal of depression severity in everyday speech and, paired with interpretable topic analysis, offer a privacy-preserving, scalable route to digital mental-health phenotyping.
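For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how mean absolute error (MAE) and the coefficient of determination (R²) are conventionally computed for a regression on BDI scores. The function names and the toy score lists are hypothetical and chosen purely for illustration; they are not the study's data, and the preprint may compute its metrics under a cross-validation scheme not shown here.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by y_pred (1 - SS_res / SS_tot)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical BDI scores, invented for illustration only (not study data).
bdi_true = [2, 10, 18, 30]
bdi_pred = [4, 9, 15, 26]

mae = mean_absolute_error(bdi_true, bdi_pred)
r2 = r_squared(bdi_true, bdi_pred)
print(f"MAE = {mae:.2f}, R2 = {r2:.2f}")
```

Read this way, an MAE of 4.45 means predictions were off by about 4.5 BDI points on average, and a gain of 16 percentage points of variance corresponds to a difference of 0.16 in R².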

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Nature Medicine (117 papers in training set; Top 0.1%; 25.6%)
2. npj Digital Medicine (97 papers in training set; Top 0.2%; 18.6%)
3. Nature Communications (4913 papers in training set; Top 18%; 10.1%)
50% of probability mass above
4. Nature Neuroscience (216 papers in training set; Top 1%; 6.8%)
5. Nature (575 papers in training set; Top 4%; 6.4%)
6. Science Advances (1098 papers in training set; Top 14%; 1.9%)
7. Genome Medicine (154 papers in training set; Top 4%; 1.9%)
8. Nature Genetics (240 papers in training set; Top 4%; 1.8%)
9. eLife (5422 papers in training set; Top 40%; 1.8%)
10. Journal of Medical Internet Research (85 papers in training set; Top 3%; 1.5%)
11. Translational Psychiatry (219 papers in training set; Top 3%; 1.3%)
12. Scientific Reports (3102 papers in training set; Top 64%; 1.3%)
13. eBioMedicine (130 papers in training set; Top 2%; 1.3%)
14. Nature Human Behaviour (85 papers in training set; Top 3%; 1.2%)
15. Science Translational Medicine (111 papers in training set; Top 4%; 1.2%)
16. Proceedings of the National Academy of Sciences (2130 papers in training set; Top 39%; 1.1%)
17. Communications Medicine (85 papers in training set; Top 0.6%; 1.1%)
18. Frontiers in Psychiatry (83 papers in training set; Top 3%; 0.8%)
19. Acta Psychiatrica Scandinavica (10 papers in training set; Top 0.4%; 0.8%)
20. Biological Psychiatry (119 papers in training set; Top 2%; 0.7%)
21. PLOS ONE (4510 papers in training set; Top 68%; 0.7%)
22. Imaging Neuroscience (242 papers in training set; Top 3%; 0.7%)
23. NeuroImage: Clinical (132 papers in training set; Top 4%; 0.6%)
24. Science (429 papers in training set; Top 21%; 0.6%)
25. Nature Biomedical Engineering (42 papers in training set; Top 2%; 0.6%)
26. Frontiers in Digital Health (20 papers in training set; Top 2%; 0.6%)
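
The statement that the top 3 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order until the running total reaches the threshold. The sketch below does this for the first five listed values; the function name `top_k_for_mass` is hypothetical, and how the page itself derives the cutoff is an assumption.

```python
def top_k_for_mass(probabilities, threshold):
    """Return (k, cumulative) for the smallest k whose running total
    reaches the threshold, or None if the threshold is never reached."""
    total = 0.0
    for k, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return k, total
    return None


# Predicted probabilities (in percent) for the five highest-ranked journals.
listed = [25.6, 18.6, 10.1, 6.8, 6.4]

result = top_k_for_mass(listed, 50.0)
print(result)
```

With these values the first two journals sum to 44.2%, below the threshold, and adding the third brings the total to 54.3%, so the cutoff falls after rank 3.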