Evaluating AI-generated informed consent documents in oral surgery: A comparative study of ChatGPT-4, Bard Gemini Advanced, and human-written consents

J Craniomaxillofac Surg. 2024 Oct 25:S1010-5182(24)00283-X. doi: 10.1016/j.jcms.2024.10.002. Online ahead of print.

Abstract

This study evaluates the quality and readability of informed consent documents generated by the AI platforms ChatGPT-4 and Bard Gemini Advanced compared with those written by a first-year oral surgery resident for common oral surgery procedures. Eighteen experienced oral and maxillofacial surgeons assessed the consents for accuracy, completeness, readability, and overall quality. ChatGPT-4 consistently outperformed both the Bard and the human-written consents. ChatGPT-4 consents had a median accuracy score of 4 [IQR 4-4], compared with Bard's 3 [IQR 3-4] and the human-written consents' 4 [IQR 3-4]. Completeness scores were higher for ChatGPT-4 (4 [IQR 4-5]) than for Bard (3 [IQR 3-4]) and the human-written consents (4 [IQR 3-4]). Readability was also rated highest for ChatGPT-4, with a median score of 4 [IQR 4-5], compared with 4 [IQR 4-4] for Bard and 4 [IQR 3-4] for the human-written consents. The Gunning Fog Index, for which lower values indicate text requiring fewer years of formal education to understand, was 17.2 [IQR 16.5-18.2] for ChatGPT-4, better than Bard's 23.1 [IQR 20.5-24.7] and the human-written consents' 20 [IQR 19.2-20.9]. Overall, ChatGPT-4's consents received the highest quality ratings, underscoring AI's potential to enhance patient communication and the informed consent process. The study suggests that AI can reduce the risk of misinformation and improve patient understanding, but continuous evaluation, professional oversight, and the integration of patient feedback remain essential to ensure that AI-generated content is effective and appropriate in clinical practice.
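For context, the Gunning Fog Index cited above is computed as 0.4 × (average words per sentence + 100 × complex words / total words), where "complex" words are those with three or more syllables; the result approximates the years of formal education needed to understand the text on first reading. The sketch below is a minimal illustration of that formula only; the paper does not state which tool the authors used, and the simple vowel-group syllable heuristic here is an assumption for demonstration purposes.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels.
    # Production readability tools use dictionaries or finer rules.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def gunning_fog(text: str) -> float:
    # GFI = 0.4 * (avg words per sentence + % of complex words)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_len = len(words) / len(sentences)
    pct_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_len + pct_complex)

# Hypothetical consent-style sentence, not taken from the study materials.
sample = ("The extraction of an impacted tooth may cause temporary "
          "swelling, discomfort, and restricted mouth opening.")
print(round(gunning_fog(sample), 1))
```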

Keywords: AI in healthcare; Artificial intelligence; Consent accuracy; Document quality; Informed consent; Large language models; Maxillofacial surgery; Oral surgery; Patient education.