
Towards Richer AI-Assisted Psychotherapy Note-Making and Performance Benchmarking

Adhikary, P. K.; Singh, S.; Singh, S.; Sharma, P.; Soni, P.; Choudhary, R.; Saxena, C.; Chauhan, P.; Gupta, S. K.; Deb, K. S.; Singh, S. M.; Chakraborty, T.

2025-06-25 psychiatry and clinical psychology
10.1101/2025.06.25.25330252 medRxiv

Psychotherapy note-making is crucial for effective patient care. However, traditional formats such as SOAP (Subjective, Objective, Assessment, and Plan) and BIRP (Behavior, Intervention, Response, and Plan) often fail to capture the nuanced complexities of therapeutic sessions, as they primarily focus on surface-level details and lack a comprehensive understanding of the patient's history, mental status, and therapeutic process. While recent advances in Artificial Intelligence (AI) and Large Language Models (LLMs) show promise in clinical documentation, their application in psychotherapy note summarisation remains unexplored. We present iCARE (identifiers, Chief Concerns and Clinical History, Assessment and Analysis, Risk and Crisis, Engagement and Next Steps), a comprehensive framework for AI-assisted psychotherapy documentation that addresses these limitations. iCARE comprises 17 clinically relevant aspects, developed collaboratively with mental health professionals and aligned with established guidelines. We further introduce PATH (Psychotherapy Aspects and Treatment History summary), a novel dataset of annotated therapy sessions. Through extensive benchmarking with 11 LLMs, including both open and closed-source models, we evaluate their performance across different note-taking aspects using automatic and human evaluation metrics. Our results show that closed-source models like Gemini Pro and GPT4o-mini excel in various aspects, with Gemini Pro achieving superior human evaluation scores. Notably, all models struggle with temporal reasoning and complex therapeutic interpretations. The findings suggest that current LLMs can assist in basic documentation but require improvements in handling longitudinal therapeutic relationships and aspects that require deeper clinical understanding and interpretative reasoning. This work advances mental health care documentation while emphasising the need for continued clinical expertise in psychotherapy note summarisation.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Frontiers in Psychiatry | 83 papers in training set | Top 0.1% | 22.6%
2. Acta Psychiatrica Scandinavica | 10 papers in training set | Top 0.1% | 10.1%
3. npj Digital Medicine | 97 papers in training set | Top 0.6% | 8.4%
4. Journal of Medical Internet Research | 85 papers in training set | Top 0.7% | 6.4%
5. Frontiers in Digital Health | 20 papers in training set | Top 0.1% | 6.4%
50% of probability mass above
6. BMC Medical Informatics and Decision Making | 39 papers in training set | Top 0.4% | 6.3%
7. PLOS ONE | 4510 papers in training set | Top 39% | 3.6%
8. Scientific Reports | 3102 papers in training set | Top 41% | 3.1%
9. Nature Medicine | 117 papers in training set | Top 1% | 2.5%
10. Bioengineering | 24 papers in training set | Top 0.3% | 1.9%
11. Scientific Data | 174 papers in training set | Top 1.0% | 1.8%
12. Psychiatry Research | 35 papers in training set | Top 0.9% | 1.7%
13. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 1% | 1.7%
14. IEEE Access | 31 papers in training set | Top 0.5% | 1.3%
15. JMIRx Med | 31 papers in training set | Top 1% | 0.9%
16. BJPsych Open | 25 papers in training set | Top 0.6% | 0.9%
17. Nature Communications | 4913 papers in training set | Top 62% | 0.7%
18. European Psychiatry | 10 papers in training set | Top 0.7% | 0.7%
19. International Journal of Medical Informatics | 25 papers in training set | Top 2% | 0.7%
20. PLOS Computational Biology | 1633 papers in training set | Top 25% | 0.7%
21. Cureus | 67 papers in training set | Top 5% | 0.7%
22. Computer Methods and Programs in Biomedicine | 27 papers in training set | Top 1% | 0.7%
23. JAMA Pediatrics | 10 papers in training set | Top 0.2% | 0.7%
24. BioData Mining | 15 papers in training set | Top 1.0% | 0.7%
25. Acta Neuropsychiatrica | 12 papers in training set | Top 1% | 0.6%
26. European Journal of Human Genetics | 49 papers in training set | Top 2% | 0.6%
27. JMIR Formative Research | 32 papers in training set | Top 2% | 0.6%
28. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 47% | 0.6%
29. Journal of Affective Disorders | 81 papers in training set | Top 2% | 0.5%
30. Healthcare | 16 papers in training set | Top 3% | 0.5%
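
The 50% cutoff in the list above can be checked with a running sum over the predicted probabilities. The sketch below is only an illustration: it assumes the cutoff is the smallest number of top-ranked journals whose cumulative probability reaches 50%, which the page does not state explicitly, and the function name `journals_to_reach` is hypothetical. The values are the percentages listed for the ten highest-ranked journals.

```python
# Predicted probabilities (percent) of the ten highest-ranked journals,
# copied from the list above, in rank order.
probs = [22.6, 10.1, 8.4, 6.4, 6.4, 6.3, 3.6, 3.1, 2.5, 1.9]


def journals_to_reach(probs, threshold):
    """Return (count, running_total) for the first rank at which the
    cumulative probability reaches the threshold.

    Hypothetical helper: assumes the cutoff rule is "smallest prefix of
    the ranked list whose sum is at least the threshold".
    Returns (None, total) if the threshold is never reached.
    """
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None, total


count, total = journals_to_reach(probs, 50.0)
print(count, round(total, 1))  # prints: 5 53.9
```

Under this assumption the first four journals sum to 47.5%, just short of the threshold, and adding the fifth brings the total to 53.9%, consistent with the marker placed after rank 5.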