Elder-Sim: A Psychometrically Validated Platform for Personality-Stable Elderly Digital Twins
Wang, J.; Yang, Z.; Zhu, Z.; Zhu, X.; Huang, Z.; Wang, H.; Tian, L.; Cao, Y.; Qu, X.; Qi, X.; Wu, B.
Background: Large language models (LLMs) enable patient-facing conversational agents, creating a pathway toward digital twins that capture older adults' lived experiences and behavioral responses over time. A central barrier is personality drift (inconsistent trait expression across repeated interactions), which undermines the reliability of generated trajectories and of intervention-response simulation in geriatric care. Objective: To develop ELDER-SIM, a multi-role elderly-care conversational platform for building personality-stable digital twin agents, and to propose a psychometric validation framework for quantifying personality consistency in LLM-based agents. Methods: ELDER-SIM was implemented via n8n workflow orchestration with local LLM inference (Ollama/vLLM), integrating (1) Big Five (OCEAN) trait specifications, (2) a Cognitive Conceptualization Diagram (CCD) grounded in Beck's CBT framework, and (3) a MySQL-based long-term memory module. Ablation studies across four conditions (Baseline, +Memory, +CCD, and +LoRA, the last fine-tuned on 19,717 instruction pairs from CHARLS) were evaluated via Cronbach's $\alpha$, the intraclass correlation coefficient (ICC), and role discrimination accuracy. Results: Personality measurement reliability was acceptable to excellent across conditions (Cronbach's $\alpha$: 0.70-0.94), with consistently high test-retest stability (ICC: 0.85-0.96). Role discrimination improved stepwise from 83.3% (Baseline) to 88.9% (+Memory), 94.4% (+CCD), and 97.2% (+LoRA). CCD produced the largest gain in internal consistency (mean $\alpha$ from 0.702 to 0.892), while LoRA achieved the highest overall internal consistency ($\alpha$ = 0.940) and ICC (0.958). Conclusions: ELDER-SIM provides a psychometrically validated approach for constructing personality-consistent elderly digital twin agents. Structured cognitive modeling and domain adaptation reduce personality drift, supporting reliable longitudinal simulation for elderly mental health care and reproducible in silico evaluation before clinical deployment.
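For readers unfamiliar with the internal-consistency metric named in the Methods, the following is a minimal sketch of the standard Cronbach's $\alpha$ formula, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)$, where $k$ is the number of items, $\sigma_i^2$ the variance of item $i$, and $\sigma_T^2$ the variance of the summed score. It is not the authors' evaluation code: the function name `cronbach_alpha`, the row layout (one row per questionnaire administration), and the `demo_scores` values are all illustrative assumptions, not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows, one per administration (hypothetical layout);
            each row holds the k item scores for one trait scale.
    """
    if len(scores) < 2:
        raise ValueError("need at least two administrations")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")

    # Sample variance of each item across administrations.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total score has zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up 5-point Likert responses: 5 administrations x 3 items.
demo_scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
demo_alpha = cronbach_alpha(demo_scores)
print(f"alpha = {demo_alpha:.3f}")
```

Values near 1 indicate that the items of a scale move together across administrations; the 0.70 lower bound reported in the abstract corresponds to the conventional "acceptable" threshold.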
Matching journals
The top 7 journals account for 50% of the predicted probability mass.