Statistical analysis for the development of a deep learning model for classification of images with TDP-43 pathology
Munoz, A.; Oliveira, V.; Vallejo, M.
Diagnosing Amyotrophic Lateral Sclerosis (ALS) remains challenging due to its inherent heterogeneity. Cytoplasmic aggregation of TDP-43, observed in approximately 95% of ALS cases, has emerged as a key pathological hallmark. In this observational study, we investigated the feasibility of training deep learning models to classify TDP-43 proteinopathic samples versus healthy controls, with a particular focus on understanding how dataset limitations affect model performance. The dataset comprised super-resolution immunofluorescence images in which cytoplasmic and nuclear TDP-43 deposits were quantified using red and pink pixel counts. We formulated three classification tasks: TDP-43 pathology (binary), TDP-43 pathology grades (multiclass), and ALS diagnosis (binary). Initial deep learning experiments yielded inconclusive results, prompting dataset curation and the removal of problematic samples. Subsequent statistical analyses using t-tests, ANOVA, and hierarchical clustering revealed significant differences between healthy and pathological samples in terms of pixel distributions, total protein levels, and TDP-43 compartmentalisation. These findings suggest that classification based on TDP-43 proteinopathy provides a more reliable framework for deep learning compared to ALS diagnosis, underscoring the importance of data quality and task stratification in model performance.
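The group comparisons named in the abstract (t-tests and ANOVA on pixel-derived measures) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the values are synthetic, the "cytoplasmic fraction" feature and the group names `control`, `mild`, and `severe` are hypothetical, and the helper functions `welch_t` and `anova_f` are not the authors' code. The sketch uses Welch's variant of the t-test; the abstract does not state which variant was used.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def anova_f(*groups):
    """One-way ANOVA F statistic across two or more groups."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Synthetic, made-up cytoplasmic pixel fractions per image (NOT study data).
control = [0.10, 0.12, 0.09, 0.11, 0.13]
mild = [0.20, 0.22, 0.19, 0.24, 0.21]
severe = [0.35, 0.40, 0.38, 0.33, 0.37]

# Binary task: healthy versus any pathology grade.
t_stat, dof = welch_t(control, mild + severe)
# Multiclass task: differences across pathology grades.
f_stat = anova_f(control, mild, severe)
print(f"Welch t = {t_stat:.2f} (df ~ {dof:.1f}), ANOVA F = {f_stat:.2f}")
```

The sketch stops at the test statistics; turning them into p-values requires the t and F distributions, which a statistics library would normally supply.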
Matching journals
The top 4 journals account for 50% of the predicted probability mass.