Diagnostic test accuracy of artificial intelligence in screening for referable diabetic retinopathy in real-world settings: A systematic review and meta-analysis

Uy, H.; Fielding, C.; Hohlfeld, A.; Ochodo, E.; Opare, A.; Mukonda, E.; Minnies, D.; Engel, M. E.

2023-06-22 ophthalmology

10.1101/2023.06.20.23291687 medRxiv


Studies on artificial intelligence (AI) in screening for diabetic retinopathy (DR) have shown promising results in addressing the mismatch between the capacity to implement DR screening and the increasing DR incidence; however, most of these studies were done retrospectively. This review sought to evaluate the diagnostic test accuracy (DTA) of AI in screening for referable diabetic retinopathy (RDR) in real-world settings. We searched CENTRAL, PubMed, CINAHL, Scopus, and Web of Science on 9 February 2023. We included prospective DTA studies assessing AI against trained human graders (HGs) in screening for RDR in patients living with diabetes. Two reviewers independently extracted data and assessed methodological quality against QUADAS-2 criteria. We used the hierarchical summary receiver operating characteristics (HSROC) model to pool estimates of sensitivity and specificity, and forest plots and SROC plots to visually examine heterogeneity in accuracy estimates. Finally, we conducted sensitivity analyses to explore the effect of studies judged likely to affect overall study quality. We included 15 studies (17 datasets: 10 with patient-level analysis (N=45,785) and 7 with eye-level analysis (N=15,390)). Meta-analyses revealed a pooled sensitivity of 95.33% (95% CI: 90.60-100%) and specificity of 92.01% (95% CI: 87.61-96.42%) for patient-level analysis; for the eye-level analysis, pooled sensitivity was 91.24% (95% CI: 79.15-100%) and specificity was 93.90% (95% CI: 90.63-97.16%). Subgroup analyses showed no variation in diagnostic accuracy by country classification or DR classification criteria; however, diagnostic accuracy increased moderately in primary-level healthcare settings and decreased minimally in tertiary-level settings. Sensitivity analyses did not show any variation in studies that included diabetic macular edema in the RDR definition, nor in studies with ≥3 HGs.
This review provides, for the first time, evidence from prospective studies for the effectiveness of AI in screening for RDR in real-world settings.
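
For readers less familiar with the accuracy measures reported above, the following minimal Python sketch shows how sensitivity and specificity are defined for a single study's 2x2 table (AI result against the human-grader reference). It is an illustration only: it is not the HSROC model used in the review, the Wilson score interval is a swapped-in, commonly used single-proportion interval rather than the review's method, and the function names and counts are hypothetical.

```python
from math import sqrt


def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from one study's 2x2 counts.

    tp/fn: reference-positive cases the AI flagged / missed.
    tn/fp: reference-negative cases the AI cleared / flagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a single proportion (z=1.96 for ~95%).

    Assumption: used here purely for illustration; the review pooled
    estimates across studies with a hierarchical model instead.
    """
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts, not taken from any included study.
tp, fp, fn, tn = 90, 40, 10, 360
sens, spec = sensitivity_specificity(tp, fp, fn, tn)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
print(f"sensitivity={sens:.3f}, CI=({sens_ci[0]:.3f}, {sens_ci[1]:.3f})")
print(f"specificity={spec:.3f}, CI=({spec_ci[0]:.3f}, {spec_ci[1]:.3f})")
```

Unlike a simple normal approximation, the Wilson interval stays within 0 and 1 even when the observed proportion is close to 100%.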
