Purpose: To compare two artificial intelligence (AI)-based Automated Diabetic Retinopathy Image Assessment (ARIA) software packages in terms of concordance with specialist human graders and diagnostic capacity for referable diabetic retinopathy (DR).
Methods: Retrospective comparative study including 750 consecutive patients with diabetes mellitus who underwent non-mydriatic fundus photography. For each patient, four images (45° field of view) were captured, centered on the optic disc and macula. Images were manually graded for DR severity as no DR; any DR (mild non-proliferative diabetic retinopathy [NPDR] or worse); referable DR (RDR; more than mild DR); or sight-threatening DR (STDR; severe NPDR or worse and/or clinically significant diabetic macular edema [CSDME]). IDx-DR and MONA DR output was compared with manual grading and with each other.
Results: The total sample comprised 750 patients, of whom 55 were excluded due to ungradable images. Of the remaining 695 patients, 522 (75%) were considered to have no DR by manual consensus grading, and 106 (15%) to have RDR. Agreement between raters ranged from moderate to substantial: IDx-DR showed moderate agreement with human grading (κ = 0.43), while MONA DR showed substantial agreement (κ = 0.68). Of the 106 patients with a ground truth of RDR, IDx-DR identified 105 and MONA DR identified 99. The sensitivity and specificity of IDx-DR for RDR detection were 99.1% and 71.5%, compared with 93.4% and 89.3% for MONA DR, respectively. Of note, both ARIAs had 100% sensitivity for the detection of STDR.
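The reported sensitivities follow directly from the stated counts (105/106 for IDx-DR, 99/106 for MONA DR). A minimal sketch of that arithmetic, assuming only the figures given in the abstract (the specificity calculation would additionally require true-negative counts, which are not reported):

```python
# Sketch of the sensitivity arithmetic behind the reported RDR results.
# True-positive / false-negative counts come from the abstract; any
# other quantity here is derived from those reported figures.

def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity = TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Specificity = TN / (TN + FP); TN counts are not reported here."""
    return tn / (tn + fp)

RDR_POSITIVE = 106                    # patients with ground-truth RDR
NON_RDR = 695 - RDR_POSITIVE          # gradable patients without RDR (589)

idx_sens = sensitivity(tp=105, fn=1)  # IDx-DR: 105 of 106 RDR flagged
mona_sens = sensitivity(tp=99, fn=7)  # MONA DR: 99 of 106 RDR flagged

print(f"IDx-DR sensitivity:  {idx_sens:.1%}")   # 99.1%
print(f"MONA DR sensitivity: {mona_sens:.1%}")  # 93.4%
```

This reproduces the 99.1% and 93.4% sensitivity figures reported above.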
Conclusion: Both ARIAs performed well in this study population, each with sensitivity for RDR screening above 90%; IDx-DR showed higher sensitivity and MONA DR higher specificity. MONA DR also showed superior agreement with certified human graders.
Keywords: artificial intelligence; diabetic retinopathy; machine learning; ophthalmology.
© 2024 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.