
Diagnostic test accuracy of artificial intelligence in screening for referable diabetic retinopathy in real-world settings: A systematic review and meta-analysis

Uy, H.; Fielding, C.; Hohlfeld, A.; Ochodo, E.; Opare, A.; Mukonda, E.; Minnies, D.; Engel, M. E.

2023-06-22 ophthalmology
10.1101/2023.06.20.23291687 medRxiv

Studies on artificial intelligence (AI) in screening for diabetic retinopathy (DR) have shown promising results in addressing the mismatch between the capacity to implement DR screening and the increasing DR incidence; however, most of these studies were done retrospectively. This review sought to evaluate the diagnostic test accuracy (DTA) of AI in screening for referable diabetic retinopathy (RDR) in real-world settings. We searched CENTRAL, PubMed, CINAHL, Scopus, and Web of Science on 9 February 2023. We included prospective DTA studies assessing AI against trained human graders (HGs) in screening for RDR in patients living with diabetes. Two reviewers independently extracted data and assessed methodological quality against QUADAS-2 criteria. We used the hierarchical summary receiver operating characteristic (HSROC) model to pool estimates of sensitivity and specificity, and forest plots and SROC plots to visually examine heterogeneity in accuracy estimates. Finally, we conducted sensitivity analyses to explore the effects of studies deemed to possibly affect the overall quality of the included studies. We included 15 studies (17 datasets: 10 patient-level analyses (N=45,785) and 7 eye-level analyses (N=15,390)). Meta-analyses revealed a pooled sensitivity of 95.33% (95% CI: 90.60-100%) and specificity of 92.01% (95% CI: 87.61-96.42%) for the patient-level analysis; for the eye-level analysis, pooled sensitivity was 91.24% (95% CI: 79.15-100%) and specificity was 93.90% (95% CI: 90.63-97.16%). Subgroup analyses did not show variations in diagnostic accuracy by country classification or DR classification criteria; however, a moderate increase in diagnostic accuracy was observed in primary-level healthcare settings and a minimal decrease in tertiary-level settings. Sensitivity analyses did not show any variations in studies that included diabetic macular edema in the RDR definition, nor in studies with ≥3 HGs.
This review provides evidence, for the first time from prospective studies, for the effectiveness of AI in screening for RDR in real-world settings.
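
As a rough illustration of the per-study quantities that feed a pooled DTA analysis, the sketch below computes sensitivity and specificity from a single 2x2 table of AI results against human graders, with Wilson score intervals. It is not the HSROC model used in the review. The counts and the function names (`wilson_interval`, `accuracy_from_2x2`) are hypothetical and chosen only for this example.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity of an index test against a reference standard."""
    return {
        "sensitivity": tp / (tp + fn),          # true positives among all RDR cases
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),          # true negatives among all non-RDR cases
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for one study; these are NOT taken from the review.
result = accuracy_from_2x2(tp=90, fp=40, fn=10, tn=860)
print(f"sensitivity = {result['sensitivity']:.3f}")
print(f"specificity = {result['specificity']:.3f}")
```

A hierarchical model such as HSROC would take one such table per study and estimate sensitivity and specificity jointly across studies, rather than averaging these per-study values directly.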

Matching journals

The top 4 journals account for 50% of the predicted probability mass.
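
The statement above can be checked by accumulating the probabilities listed for the top-ranked journals below. This is a minimal sketch; the helper name `journals_to_reach` is hypothetical, and only the first five listed values are included.

```python
# Predicted probabilities (in percent) for the five top-ranked journals listed.
probabilities = [
    ("Eye", 14.7),
    ("British Journal of Ophthalmology", 14.7),
    ("PLOS ONE", 14.3),
    ("PLOS Digital Health", 8.4),
    ("Scientific Reports", 6.3),
]


def journals_to_reach(probs, threshold=50.0):
    """Return (count, cumulative mass) at the first rank where the running
    total meets the threshold, or None if it is never reached."""
    cumulative = 0.0
    for count, (_, p) in enumerate(probs, start=1):
        cumulative += p
        if cumulative >= threshold:
            return count, round(cumulative, 1)
    return None


count, mass = journals_to_reach(probabilities)
print(count, mass)
```

The first three journals sum to 43.7%, so the fourth is the one that carries the running total past 50%.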

1. Eye: 11 papers in training set; Top 0.1%; 14.7%
2. British Journal of Ophthalmology: 14 papers in training set; Top 0.1%; 14.7%
3. PLOS ONE: 4510 papers in training set; Top 13%; 14.3%
4. PLOS Digital Health: 91 papers in training set; Top 0.2%; 8.4%

50% of probability mass above

5. Scientific Reports: 3102 papers in training set; Top 19%; 6.3%
6. Journal of Medical Internet Research: 85 papers in training set; Top 1%; 4.3%
7. Journal of Clinical Medicine: 91 papers in training set; Top 1%; 3.6%
8. Translational Vision Science & Technology: 35 papers in training set; Top 0.3%; 3.6%
9. Annals of Translational Medicine: 17 papers in training set; Top 0.3%; 3.6%
10. Ophthalmology Science: 20 papers in training set; Top 0.1%; 3.1%
11. F1000Research: 79 papers in training set; Top 0.6%; 3.1%
12. npj Digital Medicine: 97 papers in training set; Top 2%; 2.4%
13. BMJ Open: 554 papers in training set; Top 8%; 2.1%
14. Frontiers in Endocrinology: 53 papers in training set; Top 2%; 1.2%
15. Diabetes, Obesity and Metabolism: 17 papers in training set; Top 0.4%; 1.1%
16. JMIR Formative Research: 32 papers in training set; Top 1%; 0.9%
17. International Journal of Environmental Research and Public Health: 124 papers in training set; Top 6%; 0.9%
18. Cureus: 67 papers in training set; Top 5%; 0.7%
19. Frontiers in Neuroscience: 223 papers in training set; Top 8%; 0.7%
20. Computers in Biology and Medicine: 120 papers in training set; Top 5%; 0.7%
21. European Journal of Neuroscience: 168 papers in training set; Top 2%; 0.6%
22. British Journal of Cancer: 42 papers in training set; Top 2%; 0.6%