Artificial Intelligence Language Model Performance for Rapid Intraoperative Queries in Plastic Surgery: ChatGPT and the Deep Inferior Epigastric Perforator Flap

J Clin Med. 2024 Feb 4;13(3):900. doi: 10.3390/jcm13030900.

Abstract

Background: The integration of artificial intelligence in healthcare has led to the development of large language models that can address various medical queries, including intraoperatively. This study investigates the potential of ChatGPT in addressing intraoperative questions during the deep inferior epigastric perforator flap procedure. Methods: A series of six intraoperative questions specific to the DIEP flap procedure, derived from real-world clinical scenarios, were proposed to ChatGPT. A panel of four experienced board-certified plastic surgeons evaluated ChatGPT's performance in providing accurate, relevant, and comprehensible responses. Results: The Likert scale demonstrated to be medically accurate, systematic in presentation, and logical when providing alternative solutions. The mean readability score of the Flesch Reading Ease Score was 28.7 (±0.8), the Flesch-Kincaid Grade Level was 12.4 (±0.5), and the Coleman-Liau Index was 14.5 (±0.5). Suitability-wise, the DISCERN score of ChatGPT was 48 (±2.5) indicating suitable and comprehensible language for experts. Conclusions: Generative AI tools such as ChatGPT can serve as a supplementary tool for surgeons to offer valuable insights and foster intraoperative problem-solving abilities. However, it lacks consideration of individual patient factors and surgical nuances. Nevertheless, further refinement of its training data and rigorous scrutiny under experts to ensure the accuracy and up-to-date nature of the information holds the potential for it to be utilized in the surgical field.

Keywords: ChatGPT; DIEP; artificial intelligence; intraoperative; large language model; plastic surgery.

Grants and funding

This research received no external funding.