Objective: Chat-based artificial intelligence programs such as ChatGPT are reshaping how patients seek medical information. This study aims to evaluate the quality and accuracy of ChatGPT-generated answers to common patient questions about lung cancer surgery.
Methods: Thirty common patient questions about lung cancer surgery were posed to ChatGPT in July 2023. The ChatGPT-generated responses were presented to 9 thoracic surgeons at 4 academic institutions, who rated the quality of each answer on a 5-point Likert scale. They also evaluated whether each response contained any inaccuracies and were prompted to submit free-text comments. Responses were analyzed in aggregate.
Results: The average quality of ChatGPT-generated answers ranged from 3.1 to 4.2 out of 5.0, indicating they were generally "good" or "very good." No answer received a unanimous 1-star (poor quality) or 5-star (excellent quality) score. At least 1 surgeon identified a minor inaccuracy in 100% of the answers and a major inaccuracy in 36.6%. Overall, 66.7% of surgeons considered ChatGPT an accurate source of information for patients. However, only 55.6% thought its answers were comparable with those given by experienced thoracic surgeons, and only 44.4% would recommend it to their patients. Common criticisms of the ChatGPT-generated answers included lengthiness, lack of specificity regarding surgical care, and lack of references.
Conclusions: Chat-based artificial intelligence programs have the potential to become a useful information tool for patients undergoing lung cancer surgery. However, the quality and accuracy of ChatGPT-generated answers need improvement before thoracic surgeons would consider this method a primary education source for their patients.
Keywords: artificial intelligence; education; lung cancer; perioperative care.
Copyright © 2024 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.