Background: Medical artificial intelligence (AI) has entered the clinical implementation phase, although real-world performance of deep-learning systems (DLSs) for screening fundus disease remains unsatisfactory. Our study aimed to train a clinically applicable DLS for fundus diseases using data derived from the real world, and externally test the model using fundus photographs collected prospectively from the settings in which the model would most likely be adopted.
Methods: In this national real-world evidence study, we trained a DLS, the Comprehensive AI Retinal Expert (CARE) system, to identify the 14 most common retinal abnormalities using 207 228 colour fundus photographs derived from 16 clinical settings with different disease distributions. CARE was internally validated using 21 867 photographs and externally tested using 18 136 photographs prospectively collected from 35 real-world settings across China where CARE might be adopted, including eight tertiary hospitals, six community hospitals, and 21 physical examination centres. The performance of CARE was further compared with that of 16 ophthalmologists and tested using datasets with non-Chinese ethnicities and previously unused camera types. This study was registered with ClinicalTrials.gov, NCT04213430, and is currently closed.
Findings: The area under the receiver operating characteristic curve (AUC) in the internal validation set was 0·955 (SD 0·046). AUC values in the external test set were 0·965 (0·035) in tertiary hospitals, 0·983 (0·031) in community hospitals, and 0·953 (0·042) in physical examination centres. The performance of CARE was similar to that of ophthalmologists. Large variations in sensitivity were observed among the ophthalmologists in different regions and with varying experience. The system retained strong identification performance when tested using the non-Chinese dataset (AUC 0·960, 95% CI 0·957-0·964 in referable diabetic retinopathy).
Interpretation: Our DLS (CARE) showed satisfactory performance for screening multiple retinal abnormalities in real-world settings using prospectively collected fundus photographs, and so could allow the system to be implemented and adopted for clinical care.
Funding: This study was funded by the National Key R&D Programme of China, the Science and Technology Planning Projects of Guangdong Province, the National Natural Science Foundation of China, the Natural Science Foundation of Guangdong Province, and the Fundamental Research Funds for the Central Universities.
Translation: For the Chinese translation of the abstract see Supplementary Materials section.
Copyright © 2021 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY-NC-ND 4.0 license. Published by Elsevier Ltd.. All rights reserved.