Background/objectives: Glaucoma is the leading cause of irreversible blindness, with a significant proportion of cases remaining undiagnosed globally. The interpretation of optic disc and retinal nerve fibre layer images poses challenges for optometrists and ophthalmologists, often leading to misdiagnosis. AI has the potential to improve diagnosis. This study aims to validate an AI system (a convolutional neural network based on the Inception-v3 architecture) for detecting glaucomatous optic neuropathy (GON) using colour fundus photographs from a UK population and to compare its performance against Australian optometrists.
Methods: A retrospective external validation study compared the AI's performance with that of 11 AHPRA-registered optometrists in Australia on colour retinal photographs, with both evaluated against a reference (gold) standard established by a panel of glaucoma specialists. Diagnostic performance was summarised using sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC).
Results: For referable GON, the sensitivity of the AI (33.3% [95%CI: 32.4-34.3]) was significantly lower than that of optometrists (65.1% [95%CI: 64.1-66.0]; p < 0.0001), whereas its specificity was significantly higher (AI: 97.4% [95%CI: 97.0-97.7]; optometrists: 85.5% [95%CI: 84.8-86.2]; p < 0.0001). The optometrists demonstrated a significantly higher AUROC (0.753 [95%CI: 0.744-0.762]) than the AI (0.654 [95%CI: 0.645-0.662]; p < 0.0001).
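The relationship between the three reported metrics can be sketched in a few lines. This is a minimal illustration, not the study's analysis code: it assumes each grader (AI or optometrist) produces a binary referable/non-referable decision, in which case the ROC curve has a single operating point and its area reduces to the mean of sensitivity and specificity. The abstract does not state how AUROC was computed, so that assumption is ours, and the function names are hypothetical.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of reference-standard positives correctly flagged."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of reference-standard negatives correctly cleared."""
    return tn / (tn + fp)


def single_point_auroc(sens: float, spec: float) -> float:
    """Area under an ROC 'curve' with one operating point.

    Assumption: decisions are binary, so the curve is the polyline
    (0, 0) -> (1 - spec, sens) -> (1, 1). Summing the two trapezoids
    under it gives (sens + spec) / 2.
    """
    fpr = 1.0 - spec
    left = 0.5 * fpr * sens                    # triangle from (0, 0) to the point
    right = 0.5 * (1.0 - fpr) * (sens + 1.0)   # trapezoid from the point to (1, 1)
    return left + right


if __name__ == "__main__":
    # Rates taken from the Results paragraph above.
    print(f"AI:           {single_point_auroc(0.333, 0.974):.4f}")
    print(f"Optometrists: {single_point_auroc(0.651, 0.855):.4f}")
```

Under this assumption the reported rates give approximately 0.6535 for the AI and 0.753 for the optometrists, which agrees with the reported AUROCs of 0.654 and 0.753 to within rounding.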
Conclusion: The AI system performed worse than optometrists in detecting referable glaucoma. Our findings suggest that while AI can serve as a screening tool, both AI and optometrists perform suboptimally in the nuanced diagnosis of glaucoma from fundus photographs alone. Training the AI on more diverse populations is essential for improving GON detection and addressing the significant challenge of undiagnosed cases.
Keywords: artificial intelligence; deep learning; glaucoma detection; primary care.