An independent supervisory safety agent improves reaction of large language models to suicidal ideation

Trivedi, S.; Simons, N. W.; Tyagi, A.; Ramaswamy, A.; Nadkarni, G. N.; Charney, A. W.

2026-04-15 psychiatry and clinical psychology
10.64898/2026.04.13.26350757 medRxiv
Background: Large language models (LLMs) are increasingly used in mental health contexts, yet their detection of suicidal ideation is inconsistent, raising patient safety concerns. Objective: To evaluate whether an independent safety monitoring system improves detection of suicide risk compared with native LLM safeguards. Methods: We conducted a cross-sectional evaluation using 224 paired suicide-related clinical vignettes presented in a single-turn format under two conditions (with and without structured clinical information). Native LLM safeguard responses were compared with an independent supervisory safety architecture with asynchronous monitoring. The primary outcome was detection of suicide risk requiring intervention. Results: The supervisory system detected suicide risk in 205 of 224 evaluations (91.5%) versus 41 of 224 (18.3%) for native LLM safeguards. Among 168 discordant evaluations, 166 favored the supervisory system and 2 favored the LLM (matched odds ratio ≈ 83.0). Both systems detected risk in 39 evaluations, and neither in 17. Detection was highest in scenarios with explicit suicidal ideation and lower in more ambiguous presentations. Conclusions: Native LLM safeguards frequently failed to detect suicide risk in this structured evaluation. An independent monitoring approach substantially improved detection, supporting the role of external safety systems in high-risk mental health applications of LLMs.
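
The figures in the Results can be cross-checked from the four paired counts alone. The sketch below rebuilds the paired 2x2 table from the abstract and recomputes the two detection rates and the matched odds ratio (the ratio of the discordant counts). The exact McNemar p-value at the end is an added illustration, not a statistic reported in the abstract, and the function and variable names are hypothetical.

```python
from math import comb

# Paired counts as reported in the abstract (224 evaluations in total).
BOTH_DETECTED = 39       # supervisory system and native LLM both detected risk
SUPERVISORY_ONLY = 166   # discordant pairs favoring the supervisory system
LLM_ONLY = 2             # discordant pairs favoring the native LLM
NEITHER = 17             # neither system detected risk


def paired_summary(both, first_only, second_only, neither):
    """Summarise a paired 2x2 table for two detectors run on the same cases."""
    total = both + first_only + second_only + neither
    first_rate = (both + first_only) / total
    second_rate = (both + second_only) / total
    # The matched odds ratio uses only the discordant pairs.
    matched_or = first_only / second_only
    # Exact two-sided McNemar test (assumption: not reported in the source).
    # Under the null, discordant pairs split 50/50 between the two systems.
    n = first_only + second_only
    k = min(first_only, second_only)
    tail = sum(comb(n, i) for i in range(k + 1))
    p_value = min(1.0, 2 * tail / 2**n)
    return total, first_rate, second_rate, matched_or, p_value


total, sup_rate, llm_rate, matched_or, p_value = paired_summary(
    BOTH_DETECTED, SUPERVISORY_ONLY, LLM_ONLY, NEITHER
)
print(f"total evaluations: {total}")
print(f"supervisory detection rate: {sup_rate:.1%}")
print(f"native LLM detection rate: {llm_rate:.1%}")
print(f"matched odds ratio: {matched_or:.1f}")
print(f"exact McNemar p-value: {p_value:.3g}")
```

The recomputed totals, rates, and odds ratio agree with the abstract: 39 + 166 + 2 + 17 = 224 evaluations, 205/224 and 41/224 detections, and 166/2 = 83.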

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1 | npj Digital Medicine | 97 papers in training set | Top 0.5% | 10.1%
2 | Acta Psychiatrica Scandinavica | 10 papers in training set | Top 0.1% | 8.4%
3 | Frontiers in Psychiatry | 83 papers in training set | Top 0.4% | 8.2%
4 | PLOS ONE | 4510 papers in training set | Top 24% | 7.2%
5 | Journal of Medical Internet Research | 85 papers in training set | Top 0.6% | 6.8%
6 | JAMA Network Open | 127 papers in training set | Top 0.4% | 6.3%
7 | European Psychiatry | 10 papers in training set | Top 0.1% | 3.6%
50% of probability mass above
8 | Frontiers in Digital Health | 20 papers in training set | Top 0.2% | 3.6%
9 | BMC Medical Informatics and Decision Making | 39 papers in training set | Top 0.9% | 3.2%
10 | Psychiatry Research | 35 papers in training set | Top 0.6% | 2.7%
11 | JMIR Formative Research | 32 papers in training set | Top 0.5% | 2.4%
12 | Frontiers in Artificial Intelligence | 18 papers in training set | Top 0.2% | 2.1%
13 | Journal of General Internal Medicine | 20 papers in training set | Top 0.4% | 2.1%
14 | Scientific Reports | 3102 papers in training set | Top 51% | 2.1%
15 | BJPsych Open | 25 papers in training set | Top 0.4% | 1.7%
16 | Computational Psychiatry | 12 papers in training set | Top 0.1% | 1.7%
17 | Nature Medicine | 117 papers in training set | Top 3% | 1.5%
18 | BioData Mining | 15 papers in training set | Top 0.4% | 1.3%
19 | JMIRx Med | 31 papers in training set | Top 1.0% | 1.3%
20 | Journal of the American Medical Informatics Association | 61 papers in training set | Top 2% | 1.2%
21 | The British Journal of Psychiatry | 21 papers in training set | Top 0.8% | 0.9%
22 | JAMIA Open | 37 papers in training set | Top 1% | 0.9%
23 | Journal of Affective Disorders Reports | 10 papers in training set | Top 0.3% | 0.8%
24 | JAMA Pediatrics | 10 papers in training set | Top 0.2% | 0.7%
25 | Acta Neuropsychiatrica | 12 papers in training set | Top 1% | 0.7%
26 | JMIR Public Health and Surveillance | 45 papers in training set | Top 4% | 0.6%
27 | Healthcare | 16 papers in training set | Top 2% | 0.6%
28 | BMJ Open | 554 papers in training set | Top 13% | 0.6%
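
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. This is a minimal sketch using the 28 values shown on this page; the helper name `journals_to_reach` is hypothetical, and the listed values are rounded to one decimal place, so the running total is approximate.

```python
# Predicted probabilities (in percent) for ranks 1 to 28, as listed above.
PROBABILITIES = [
    10.1, 8.4, 8.2, 7.2, 6.8, 6.3, 3.6,   # ranks 1-7
    3.6, 3.2, 2.7, 2.4, 2.1, 2.1, 2.1,    # ranks 8-14
    1.7, 1.7, 1.5, 1.3, 1.3, 1.2, 0.9,    # ranks 15-21
    0.9, 0.8, 0.7, 0.7, 0.6, 0.6, 0.6,    # ranks 22-28
]


def journals_to_reach(probabilities, threshold):
    """Return (count, cumulative) for the first rank where the running
    total of probabilities meets or exceeds the threshold, else None."""
    cumulative = 0.0
    for count, probability in enumerate(probabilities, start=1):
        cumulative += probability
        if cumulative >= threshold:
            return count, cumulative
    return None


count, cumulative = journals_to_reach(PROBABILITIES, 50.0)
print(f"{count} journals reach {cumulative:.1f}% of the probability mass")
```

The first six entries sum to 47.0%, so the seventh is the one that crosses the 50% line, which matches the position of the divider in the list.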