Development and validation of a lesion-supervised deep learning system for diabetic retinopathy grading according to UK national screening criteria

Chowdhury, P. N.; Akter, Y.; Chowdhury, P.; Kaur, A.; Uddin, M.; Chowdhury, A.; Chowdhury, P. K.; Muqit, M.

2026-04-28 health informatics
10.64898/2026.04.27.26351799 medRxiv

Background: Diabetic retinopathy (DR) is the leading cause of preventable blindness among working-age adults worldwide, yet screening coverage remains inadequate, particularly in low- and middle-income countries. Automated deep learning systems offer potential to address the global shortage of expert graders, but most existing models lack lesion-level interpretability and are not aligned with established clinical referral frameworks. We developed and validated DRAGS (Diabetic Retinopathy Automated Grading System), a hybrid deep learning model that grades DR according to the UK Diabetic Eye Screening Programme (DESP) classification and provides lesion-level explainability.

Methods: We trained and validated a DenseNet-201-based convolutional neural network on 20,281 anonymised fundus images from two tertiary eye care institutions in Bangladesh. Images were graded by fellowship-trained retinal specialists using the UK DESP framework, resulting in 10 clinically interpretable classes that combine retinopathy grade (R0-R3) and maculopathy status (M0/M1). A companion dataset of 2,936 pixel-level lesion masks spanning nine pathological categories was used to train a parallel multi-label lesion-detection head. The dataset was partitioned 70:15:15 (patient-stratified). Performance was evaluated using macro-averaged AUROC (DeLong estimator), sensitivity, specificity, F1 score, quadratically weighted Cohen's κ, and expected calibration error (ECE), with 95% CIs from 2000 bootstrap resamples. Grad-CAM spatial alignment with ground-truth lesion masks was assessed using Dice and IoU. This study follows the TRIPOD+AI reporting guidelines.

Findings: On the held-out test set (Component I: n = 3,044; Component II: n ≈ 440), DRAGS achieved class-wise precision, recall, and F1 scores ranging from 0·88 to 0·99 across all ten UK DESP grades, with advanced proliferative stages (R3-M0, R3-M1) consistently exceeding 0·95. Overall accuracy was approximately 91·1% and quadratically weighted Cohen's κ was approximately 0·90. For referable versus non-referable DR, sensitivity was 90·7% and specificity was 91·9%. The companion lesion-detection head achieved macro-averaged sensitivity of 93·9%, specificity of 99·5%, and AUC of 0·997 across nine lesion classes; seven of nine classes achieved AUC = 1·00. Grad-CAM activations showed progressive spatial shift from diffuse (normal) to lesion-dense peripheral patterns (proliferative DR), with maximal agreement for microaneurysms and exudates. Mean inference time was 110-160 ms per image.

Interpretation: DRAGS demonstrates high diagnostic accuracy for nine-class UK DESP-aligned DR grading, with clinically interpretable lesion-level explainability on a large real-world LMIC dataset. External validation and prospective clinical evaluation are warranted before deployment.

Funding: The present study received no funding.
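The agreement and overlap metrics named in the Methods can be illustrated with a minimal NumPy sketch. It is not the authors' code. The function names are hypothetical, and several choices are assumptions the abstract does not state: the ten grades are encoded as ordered integers, the interval is a percentile bootstrap that resamples images rather than patients, and the Grad-CAM map has already been thresholded into a binary mask before Dice and IoU are computed.

```python
import numpy as np


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic weights for ordinal labels 0..n_classes-1."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    # Confusion matrix of observed (true, predicted) counts.
    observed = np.zeros((n_classes, n_classes))
    np.add.at(observed, (y_true, y_pred), 1)
    # Quadratic penalty: disagreement grows with squared grade distance.
    idx = np.arange(n_classes)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n_classes - 1) ** 2
    # Counts expected by chance from the row and column marginals.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
    denom = (weights * expected).sum()
    if denom == 0:
        # Assumed convention: all labels fall in one class, so agreement is total.
        return 1.0
    return 1.0 - (weights * observed).sum() / denom


def bootstrap_ci(metric, y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, y_pred)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        sample = rng.integers(0, n, size=n)  # draw n indices with replacement
        stats.append(metric(y_true[sample], y_pred[sample]))
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


def dice_iou(pred_mask, true_mask):
    """Dice coefficient and intersection-over-union of two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    total = pred.sum() + true.sum()
    # Two empty masks are treated as a perfect match (assumed convention).
    dice = 2 * intersection / total if total else 1.0
    iou = intersection / union if union else 1.0
    return float(dice), float(iou)
```

Because the split is described as patient-stratified, a faithful interval would probably resample whole patients so that correlated images from the same eye or person stay together; the image-level resampling here is a simplification.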

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Ophthalmology Science | 20 | Top 0.1% | 10.0%
2 | Frontiers in Medicine | 113 | Top 0.3% | 8.3%
3 | BMJ Open | 554 | Top 3% | 6.3%
4 | npj Digital Medicine | 97 | Top 0.9% | 6.3%
5 | Scientific Reports | 3102 | Top 19% | 6.3%
6 | PLOS Digital Health | 91 | Top 0.5% | 4.8%
7 | BMJ Health & Care Informatics | 13 | Top 0.2% | 3.6%
8 | eBioMedicine | 130 | Top 0.4% | 3.6%
9 | PLOS ONE | 4510 | Top 42% | 3.0%
50% of probability mass above
10 | Nature Communications | 4913 | Top 44% | 2.7%
11 | eClinicalMedicine | 55 | Top 0.2% | 2.4%
12 | The Lancet Digital Health | 25 | Top 0.3% | 2.1%
13 | Frontiers in Neurology | 91 | Top 2% | 2.1%
14 | JAMA | 17 | Top 0.1% | 1.7%
15 | BMC Medicine | 163 | Top 4% | 1.5%
16 | Trials | 25 | Top 1.0% | 1.5%
17 | Journal of Infection | 71 | Top 2% | 1.5%
18 | Annals of Internal Medicine | 27 | Top 0.5% | 1.5%
19 | BMJ | 49 | Top 0.7% | 1.3%
20 | European Heart Journal - Digital Health | 15 | Top 0.4% | 1.3%
21 | Clinical and Translational Science | 21 | Top 0.6% | 1.3%
22 | European Respiratory Journal | 54 | Top 1% | 1.1%
23 | Cell Reports Medicine | 140 | Top 6% | 0.9%
24 | Scientific Data | 174 | Top 2% | 0.8%
25 | Diagnostics | 48 | Top 2% | 0.7%
26 | Journal of the American College of Cardiology | 12 | Top 0.7% | 0.7%
27 | NeuroImage: Clinical | 132 | Top 4% | 0.7%
28 | JAMA Network Open | 127 | Top 4% | 0.7%
29 | Annals of Neurology | 57 | Top 2% | 0.7%
30 | Communications Medicine | 85 | Top 1% | 0.7%
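
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below does that for the first ten entries; the helper name `journals_to_reach` is hypothetical, and because the listed values are rounded to one decimal place the running total is only approximate.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-10, copied from the list above.
probabilities = [10.0, 8.3, 6.3, 6.3, 6.3, 4.8, 3.6, 3.6, 3.0, 2.7]


def journals_to_reach(values, target):
    """Return (rank, cumulative total) at which the running sum first reaches target."""
    for rank, total in enumerate(accumulate(values), start=1):
        if total >= target:
            return rank, total
    return None  # target never reached with the values given


rank, total = journals_to_reach(probabilities, 50.0)
print(rank, round(total, 1))
```

With these values the running sum is about 49.2% after eight journals and about 52.2% after nine, which matches the position of the 50% marker in the list.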