Purpose: To assess the diagnostic capabilities of the most recent chatbot releases, GPT-4o and Gemini Advanced, when presented with different retinal diseases.
Methods: Exploratory analysis of 50 cases with different surgical (n=27) and medical (n=23) retinal pathologies, whose optical coherence tomography/angiography (OCT/OCTA) scans were uploaded into the ChatGPT and Gemini interfaces. We then asked "Please describe this image" and classified the diagnosis as: 1) Correct; 2) Partially correct; 3) Wrong; 4) Unable to assess exam type; and 5) Diagnosis not given.
Results: ChatGPT indicated the correct diagnosis in 31/50 cases (62%), significantly higher than Gemini Advanced's 16/50 cases (32%; p=0.0048). In 24% of cases, Gemini Advanced was not able to produce any answer, stating "That's not something I'm able to do yet". For both chatbots, the most frequent misdiagnosis was macular edema, given erroneously in 16% and 14% of cases, respectively. ChatGPT-4o showed higher rates of correct diagnoses in both surgical (52% vs 30%) and medical retina (78% vs 43%). Notably, when OCTA scans were presented without the corresponding structural image, Gemini was unable to recognize them in any case, mistaking the images for artworks.
Conclusion: ChatGPT-4o outperformed Gemini Advanced in diagnostic accuracy when interpreting OCT/OCTA images, although the range of recognized diagnoses remains limited.
Copyright © 2024 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the Ophthalmic Communications Society, Inc.