Retinal Imaging Analysis Performed By ChatGPT-4o And Gemini Advanced: The Turning Point Of The Revolution?

Retina. 2024 Dec 11. doi: 10.1097/IAE.0000000000004351. Online ahead of print.

Abstract

Purpose: To assess the diagnostic capabilities of the most recent chatbot releases, ChatGPT-4o and Gemini Advanced, when presented with images of different retinal diseases.

Methods: Exploratory analysis of 50 cases with different surgical (n=27) and medical (n=23) retinal pathologies, whose optical coherence tomography/angiography (OCT/OCTA) scans were dragged into the ChatGPT and Gemini interfaces. We then prompted each chatbot with "Please describe this image" and classified the response as: 1) Correct; 2) Partially correct; 3) Wrong; 4) Unable to assess exam type; or 5) Diagnosis not given.

Results: ChatGPT-4o indicated the correct diagnosis in 31/50 cases (62%), significantly more than Gemini Advanced (16/50 cases, 32%; p=0.0048). In 24% of cases, Gemini Advanced was not able to produce any answer, stating "That's not something I'm able to do yet". For both chatbots, the most common misdiagnosis was macular edema, given erroneously in 16% and 14% of cases, respectively. ChatGPT-4o showed higher rates of correct diagnoses in both surgical (52% vs 30%) and medical (78% vs 43%) retina cases. Notably, when OCTA scans were presented without the corresponding structural image, in no case was Gemini Advanced able to recognize them, mistaking the images for artworks.
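The headline comparison can be checked with a few lines of arithmetic. The sketch below recomputes the two accuracy rates from the reported counts and runs a two-sided Fisher's exact test on the resulting 2x2 table. The abstract does not name the statistical test used, so the choice of Fisher's exact test is an assumption made here for illustration, and the computed p-value need not match the reported p=0.0048 exactly. The function and variable names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables (with the same margins) that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Counts taken from the Results paragraph.
n_cases = 50
chatgpt_correct = 31
gemini_correct = 16

chatgpt_pct = 100 * chatgpt_correct / n_cases
gemini_pct = 100 * gemini_correct / n_cases

# Assumed test: rows are chatbots, columns are correct / not correct.
p_value = fisher_exact_two_sided(
    chatgpt_correct, n_cases - chatgpt_correct,
    gemini_correct, n_cases - gemini_correct,
)

print(f"ChatGPT-4o: {chatgpt_pct:.0f}% correct")
print(f"Gemini Advanced: {gemini_pct:.0f}% correct")
print(f"Two-sided Fisher's exact p-value: {p_value:.4f}")
```

Because both chatbots were shown the same 50 cases, a paired test such as McNemar's would also be a reasonable choice, but it requires per-case agreement counts that the abstract does not report.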

Conclusion: ChatGPT-4o outperformed Gemini Advanced in diagnostic accuracy on OCT/OCTA images, although the range of diagnoses is still limited.