
Comparison of Deep Learning Tools for Optic Nerve Axon Quantification Finds Limited Generalizability on Independent Validation

Chuter, B.; Emmert, N.; Kim, M. Y.; Dave, N.; Herrin, J.; Zhou, Z.; Wall, G.; Palmer, A.; Chen, H.; Hollingsworth, T. J.; Jablonski, M. M.

2026-03-13 bioengineering
10.64898/2026.03.11.710915 bioRxiv

Purpose: Machine learning approaches for automated quantification of optic nerve histology have emerged as potential tools for objective assessment of axonal injury in experimental glaucoma models. However, the generalizability of these models to independent datasets remains unclear. Guided by a scoping review of the literature, this study performed independent validation testing of publicly available models on a novel rat optic nerve dataset to assess their generalizability.

Methods: We conducted a scoping review following PRISMA-ScR guidelines. PubMed, EMBASE, Scopus, and Cochrane CENTRAL were searched from 2000 through 2025. Two reviewers independently screened records and extracted data on model characteristics and performance metrics. Additionally, we performed independent validation of three models (AxoNet, AxonDeepSeg, AxoNet 2.0) on a novel rat optic nerve dataset comprising 57 images with 9,514 manually annotated axons. Because AxonDeep is not publicly available, we instead evaluated AxonDeepSeg, a separate publicly available deep learning tool that, while not previously applied to optic nerve tissue, is widely used for nerve fiber segmentation.

Results: From 2,036 records, four manuscripts describing three deep learning models met inclusion criteria. Published correlation coefficients between model predictions and reference counts ranged from 0.959 to 0.99. On independent validation, performance was reduced: AxoNet 2.0 achieved the highest correlation (r = 0.89), followed by AxonDeepSeg (r = 0.86) and AxoNet (r = 0.79). Segmentation quality metrics revealed high precision (>0.94) but low recall (0.18 to 0.27), with Dice coefficients of 0.29 to 0.40, substantially below the published benchmark of 0.81.

Conclusions: Deep learning models for optic nerve histology demonstrate strong within-study performance but show meaningful performance decrements when applied to independent datasets. The observed generalizability gap (correlations 0.07 to 0.182 points below published values) demonstrates the need for standardized validation datasets and multi-center testing before widespread adoption of these tools.
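The precision, recall, and Dice values reported above follow the standard definitions for binary segmentation masks. A minimal NumPy sketch of how such metrics are computed (the function name `segmentation_metrics` is illustrative, not taken from any of the evaluated tools):

```python
import numpy as np

def segmentation_metrics(pred, ref):
    """Precision, recall, and Dice for two binary segmentation masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    tp = np.logical_and(pred, ref).sum()      # predicted axon pixels that are true axon pixels
    fp = np.logical_and(pred, ~ref).sum()     # predicted axon pixels with no reference axon
    fn = np.logical_and(~pred, ref).sum()     # reference axon pixels the model missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return dice, precision, recall

# Tiny worked example (synthetic 2x3 masks, not study data):
pred = np.array([[1, 1, 0], [0, 0, 0]])
ref = np.array([[1, 0, 0], [1, 0, 0]])
dice, precision, recall = segmentation_metrics(pred, ref)
# tp=1, fp=1, fn=1, so precision = 0.5, recall = 0.5, dice = 0.5
```

Because Dice is the harmonic mean of precision and recall, 2PR/(P+R), the reported pattern of high precision (>0.94) with low recall (0.18 to 0.27) necessarily yields low Dice: for example, 2(0.95)(0.22)/(0.95+0.22) ≈ 0.36, consistent with the observed 0.29 to 0.40 range.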

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

Rank  Journal                                         Papers in training set  Percentile  Probability
 1    Translational Vision Science & Technology       35                      Top 0.1%    14.4%
 2    Scientific Reports                              3102                    Top 6%      10.1%
 3    Experimental Eye Research                       30                      Top 0.1%    10.1%
 4    Investigative Ophthalmology & Visual Science    22                      Top 0.1%    6.8%
 5    Ophthalmology Science                           20                      Top 0.1%    6.4%
 6    Annals of Clinical and Translational Neurology  29                      Top 0.1%    4.9%
      ---- 50% of probability mass above ----
 7    PLOS ONE                                        4510                    Top 34%     4.3%
 8    Investigative Opthalmology & Visual Science     37                      Top 0.2%    4.0%
 9    Frontiers in Neurology                          91                      Top 2%      3.3%
10    British Journal of Ophthalmology                14                      Top 0.1%    2.9%
11    Journal of Neural Engineering                   197                     Top 0.9%    2.5%
12    Journal of Vision                               92                      Top 0.3%    1.7%
13    Nature Communications                           4913                    Top 53%     1.5%
14    npj Digital Medicine                            97                      Top 2%      1.3%
15    Annals of Biomedical Engineering                34                      Top 0.8%    1.3%
16    Journal of Biomedical Optics                    25                      Top 0.5%    1.1%
17    European Radiology                              14                      Top 0.6%    0.9%
18    Computers in Biology and Medicine               120                     Top 4%      0.9%
19    Bioengineering                                  24                      Top 1%      0.7%
20    Bioinformatics                                  1061                    Top 10%     0.6%
21    Biomedical Optics Express                       84                      Top 1%      0.6%
22    PLOS Computational Biology                      1633                    Top 29%     0.5%
23    F1000Research                                   79                      Top 7%      0.5%
24    Journal of The Royal Society Interface          189                     Top 6%      0.5%
25    Eye                                             11                      Top 0.4%    0.5%