Background: With the rise of artificial intelligence (AI) in medical education, tools like OpenAI's ChatGPT-4 and DALL·E 3 have potential applications in enhancing learning materials. This study aims to evaluate ChatGPT-4o's proficiency in recognizing bariatric surgical procedures from illustrations and assess DALL·E 3's effectiveness in generating accurate surgical illustrations.
Methods: Illustrations of six bariatric surgical procedures (One Anastomosis Gastric Bypass, Roux-en-Y Gastric Bypass, Single Anastomosis Duodeno-Ileal Bypass with Sleeve Gastrectomy, Sleeve Gastrectomy, Biliopancreatic Diversion, and Adjustable Gastric Banding) were sourced from the IFSO Atlas of Metabolic and Bariatric Surgery. ChatGPT-4 was tasked with identifying each procedure from its illustration to evaluate classification accuracy. In parallel, DALL·E 3 was prompted with the name of each procedure to generate a corresponding medical illustration.
Results: ChatGPT-4 correctly identified only the Adjustable Gastric Banding illustration (one of six) and misclassified the other five procedures. DALL·E 3 did not produce an accurate illustration for any of the six procedures.
Conclusion: The study underscores the need for further evaluation of AI in bariatric surgery. Both ChatGPT-4 and DALL·E 3, while promising, have significant limitations in recognizing and generating accurate illustrations of bariatric surgical procedures. These findings call for continued research and development to make AI models suitable for medical education applications in bariatric surgery.
Keywords: Bariatric surgery; ChatGPT; DALLE; Generative artificial intelligence; Image.
© 2024. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.