Reproducible symptom subtypes of depression identified using unsupervised machine learning
Howard, D. M.; Rabelo-da-Ponte, F. D.; Viejo-Romero, M.; Vassos, E.; Lewis, C. M.
Depression is a heterogeneous disorder, often diagnosed based on symptom co-occurrence. However, individuals may present with markedly different symptom profiles, potentially reflecting distinct underlying mechanisms. Identifying common patterns of symptoms using data-driven approaches could help clarify the heterogeneity of depression. Furthermore, examining the sociodemographic and lifestyle characteristics, health status, and polygenic scores of individuals with specific symptom profiles may offer insights into underlying risk factors. Unsupervised machine learning models were applied to large-scale data from the UK Biobank. Independent groups of individuals were assessed at two time points (the Mental Health Questionnaire: Q1; and the Mental Well-being Questionnaire: Q2) and reported on historical or current episodes of depression. Two machine learning models, multivariate Bernoulli mixtures and agglomerative hierarchical clustering, were used to identify common sets of symptoms and cluster individuals by symptom similarity. Consistency of results was examined between Q1 and Q2 and between clustering models. Associations between cluster membership probabilities and sociodemographic and lifestyle factors (sex, age, body mass index, smoking status, ethnicity, and deprivation), eight health conditions, and polygenic scores for bipolar disorder, schizophrenia, and attention-deficit/hyperactivity disorder (ADHD) were examined using regression models. Symptom clusters were highly consistent across Q1 and Q2 (mean correlation > 0.81) and between machine learning models (Rand Index > 0.83). Clusters aligned with the existing clinical subtypes of atypical and melancholic depression, alongside other potentially novel clusters reflecting a range of different symptom profiles. Atypical clusters (hypersomnia with weight gain) appeared in both Q1 and Q2 and were associated with younger age and higher body mass index.
Distinct clusters combining insomnia, weight gain, and thoughts of death were associated with asthma, suggesting potential inflammatory dysregulation. Further clusters were characterised by psychomotor changes and showed strong associations with Parkinson's disease, both before and after the mental health questionnaire was conducted. These findings highlight robust and clinically meaningful symptom subtypes within depression and support the use of data-driven approaches to improve diagnostic refinement and inform personalised treatment strategies.
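To make the clustering approach in the abstract more concrete, the following is a minimal sketch of a multivariate Bernoulli mixture fitted by expectation-maximisation, followed by a Rand Index computation. It is not the authors' implementation. The symptom names, the two simulated profiles, the sample size, the smoothing constants, and the function names fit_bernoulli_mixture and rand_index are all assumptions made for illustration, and the data are simulated rather than drawn from the UK Biobank. The sketch also compares cluster assignments against the simulated labels, whereas the study reports the Rand Index between two clustering models.

```python
import math
import random


def fit_bernoulli_mixture(X, k, n_iter=50, seed=0):
    """Fit a k-component mixture of independent Bernoullis by EM.

    X is a list of binary rows (1 = symptom endorsed). Returns the mixing
    weights, the per-cluster symptom probabilities, and the membership
    probabilities from the final E-step.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    weights = [1.0 / k] * k
    # Random starting probabilities, kept away from 0 and 1 (assumed choice).
    theta = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster membership per person, in log space.
        resp = []
        for x in X:
            logp = []
            for c in range(k):
                ll = math.log(weights[c])
                for j in range(d):
                    ll += math.log(theta[c][j] if x[j] else 1.0 - theta[c][j])
                logp.append(ll)
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: smoothed updates keep every parameter strictly inside (0, 1).
        for c in range(k):
            nc = sum(r[c] for r in resp)
            weights[c] = (nc + 1.0) / (n + k)
            for j in range(d):
                hits = sum(r[c] * x[j] for r, x in zip(resp, X))
                theta[c][j] = (hits + 1.0) / (nc + 2.0)
    return weights, theta, resp


def rand_index(a, b):
    """Share of pairs on which two partitions agree (together vs apart)."""
    n = len(a)
    agree = sum(
        (a[i] == a[j]) == (b[i] == b[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return agree / (n * (n - 1) / 2)


# Hypothetical symptom order: hypersomnia, weight gain, insomnia,
# weight loss, anhedonia, fatigue. Both profiles are invented.
profiles = [
    [0.9, 0.9, 0.1, 0.1, 0.8, 0.8],  # "atypical-like" simulated profile
    [0.1, 0.1, 0.9, 0.9, 0.8, 0.8],  # contrasting simulated profile
]
rng = random.Random(1)
labels = [i % 2 for i in range(200)]
X = [[1 if rng.random() < p else 0 for p in profiles[g]] for g in labels]

weights, theta, resp = fit_bernoulli_mixture(X, k=2)
# Hard assignment: the cluster with the highest membership probability.
hard = [max(range(2), key=lambda c: r[c]) for r in resp]
ri = rand_index(labels, hard)
print("Rand Index vs simulated labels:", round(ri, 3))
```

The membership probabilities in resp are the kind of quantity the abstract describes regressing on sociodemographic factors, health conditions, and polygenic scores. The Rand Index is invariant to cluster relabelling, which is why it suits comparisons between models whose cluster numbering is arbitrary.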
Matching journals
The top 5 journals account for 50% of the predicted probability mass.