An Observational Study to Evaluate Readability and Reliability of AI-Generated Brochures for Emergency Medical Conditions

Cureus. 2024 Aug 31;16(8):e68307. doi: 10.7759/cureus.68307. eCollection 2024 Aug.

Abstract

Introduction: This study assesses the readability and reliability of AI-generated brochures for common emergency medical conditions such as heart attack, anaphylaxis, and syncope. It compares patient information guides for these conditions generated by ChatGPT and Google Gemini.

Methodology: Brochures for each condition were created with both AI tools. Readability was assessed using the Flesch-Kincaid Calculator, which reported word count, sentence count, and ease of understanding. Reliability was measured using the Modified DISCERN Score. The similarity between AI outputs was determined using Quillbot. Statistical analysis was performed with R (v4.3.2).

Results: ChatGPT and Gemini produced brochures with no statistically significant differences in word count (p=0.2119), sentence count (p=0.1276), readability (p=0.3796), or reliability (p=0.7407). However, ChatGPT provided more detailed content, with 32.4% more words (582.80 vs. 440.20) and 51.6% more sentences (67.00 vs. 44.20), while Gemini's brochures were slightly easier to read, with a higher ease score (50.62 vs. 41.88). Reliability varied by topic: ChatGPT scored higher for heart attack (4 vs. 3) and choking (3 vs. 2), while Google Gemini scored higher for anaphylaxis (4 vs. 3) and drowning (4 vs. 3), highlighting the need for topic-specific evaluation.

Conclusions: AI-generated brochures from ChatGPT and Gemini are comparable in readability and reliability for patient information on emergency medical conditions; the study found no statistically significant difference between the responses generated by the two AI tools.
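
The relative differences quoted in the results can be checked with simple arithmetic, and the ease score can be illustrated with the standard Flesch Reading Ease formula. The sketch below is only an illustration: the function names are hypothetical, the abstract does not say that the Flesch-Kincaid Calculator uses exactly this formula, and the word, sentence, and syllable counts in the last line are invented for demonstration and are not study data.

```python
def percent_more(a: float, b: float) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100.0


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


if __name__ == "__main__":
    # Values reported in the abstract (ChatGPT vs. Gemini).
    print(f"words: {percent_more(582.80, 440.20):.1f}% more")
    print(f"sentences: {percent_more(67.00, 44.20):.1f}% more")
    # Hypothetical counts, for illustration only (not data from the study).
    print(f"ease: {flesch_reading_ease(100, 5, 150):.2f}")
```

Taking the counts as inputs, rather than parsing text, avoids committing to a particular syllable-counting heuristic, which differs between readability tools.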

Keywords: patient education; ai-generated brochures; chatgpt; discern score; flesch-kincaid; google gemini; heart attack; life threatening anaphylaxis; readability measures; syncope.