The Use of Large Language Models to Generate Education Materials about Uveitis

Ophthalmol Retina. 2024 Feb;8(2):195-201. doi: 10.1016/j.oret.2023.09.008. Epub 2023 Sep 15.

Abstract

Objective: To assess the ability of large language models to generate readable patient information on uveitis and to improve the readability of existing online health information.

Design: Evaluation of technology.

Subjects: Not applicable.

Methods: ChatGPT and Bard were each given the following prompts: (prompt A) "considering that the average American reads at a 6th grade level, using the Flesch-Kincaid Grade Level (FKGL) formula, can you write patient-targeted health information on uveitis of around 6th grade level?" and (prompt B) "can you write patient-targeted health information on uveitis that is easy to understand by an average American?" Additionally, ChatGPT and Bard were asked to rewrite the uveitis health information appearing on the first page of Google search results for the term "uveitis," using the prompt: "Considering that the average American reads at a 6th grade level, using the FKGL formula, can you rewrite the following text to 6th grade level: [insert text]." The readability of each response was analyzed and compared using the outcome measures described below.
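As a minimal illustrative sketch only: the study submitted these prompts through the ChatGPT and Bard interfaces, but the same prompts could be issued programmatically. The snippet below quotes the prompt texts from the Methods; the helper function, the choice of the OpenAI Python SDK, and the model name are assumptions for demonstration, not the authors' procedure.

```python
# Illustrative sketch only: prompt texts are quoted from the Methods; the
# helper function, SDK choice, and model name are assumptions.
from openai import OpenAI

PROMPT_A = (
    "considering that the average American reads at a 6th grade level, using "
    "the Flesch-Kincaid Grade Level (FKGL) formula, can you write "
    "patient-targeted health information on uveitis of around 6th grade level?"
)
PROMPT_B = (
    "can you write patient-targeted health information on uveitis that is easy "
    "to understand by an average American?"
)
REWRITE_TEMPLATE = (
    "Considering that the average American reads at a 6th grade level, using "
    "the FKGL formula, can you rewrite the following text to 6th grade level: {text}"
)


def ask_llm(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    """Send a single prompt and return the model's reply (hypothetical helper)."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(ask_llm(PROMPT_A))
```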

Main outcome measures: The FKGL, a widely validated readability assessment tool that assigns a United States school grade level to a given text, along with the total number of words, sentences, syllables, and complex words in each response. Complex words were defined as those with > 2 syllables.
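The FKGL combines average sentence length and average syllables per word: FKGL = 0.39 x (words/sentences) + 11.8 x (syllables/words) - 15.59. The sketch below computes this score and the other counts listed above; the vowel-group syllable heuristic is an approximation, and the abstract does not specify which readability calculator the authors used.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels (minimum 1)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability_stats(text: str) -> dict:
    """Compute the FKGL plus word, sentence, syllable, and complex-word counts.

    Complex words are those with more than 2 syllables, per the outcome
    measures; the syllable counter is approximate.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    complex_words = sum(1 for s in syllables if s > 2)
    fkgl = (0.39 * len(words) / len(sentences)
            + 11.8 * sum(syllables) / len(words)
            - 15.59)
    return {
        "fkgl": round(fkgl, 1),
        "words": len(words),
        "sentences": len(sentences),
        "syllables": sum(syllables),
        "complex_words": complex_words,
    }


print(readability_stats("Uveitis is swelling inside the eye. It can hurt and blur your sight."))
```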

Results: ChatGPT and Bard generated responses with lower FKGL scores (i.e., easier to understand) in response to prompt A than to prompt B, although the difference was statistically significant only for ChatGPT (P < 0.0001). The mean FKGL of ChatGPT responses (6.3 ± 1.2) was significantly lower (P < 0.0001) than that of Bard responses (10.5 ± 0.8). ChatGPT responses also contained fewer complex words than Bard responses (P < 0.0001). Existing online health information on uveitis had a mean grade level of 11.0 ± 1.4. When asked to rewrite this content, ChatGPT lowered the FKGL to 8.0 ± 1.0 (P < 0.0001), whereas Bard did not (mean FKGL of 11.1 ± 1.6).
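The abstract reports group means ± SD and P values but does not name the statistical test used. As an illustrative sketch only, a two-sample comparison of FKGL scores could be run as below; Welch's t-test and the example score lists are assumptions, not the authors' method or data.

```python
from scipy import stats

# Hypothetical FKGL scores for responses from each model (not the study data).
chatgpt_fkgl = [6.1, 5.8, 7.2, 6.5, 6.9, 5.5, 6.3, 7.0]
bard_fkgl = [10.2, 11.0, 10.8, 9.9, 10.4, 11.3, 10.1, 10.6]

# Welch's t-test (unequal variances); the abstract does not specify its test.
t_stat, p_value = stats.ttest_ind(chatgpt_fkgl, bard_fkgl, equal_var=False)
print(f"t = {t_stat:.2f}, P = {p_value:.4g}")
```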

Conclusions: ChatGPT can help clinicians produce patient-targeted health information on uveitis that is easier to understand than existing online content. It can also simplify the language of existing uveitis health information written for patients.

Financial disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

Keywords: Artificial intelligence; Bard; ChatGPT; Consumer health information; Readability; Uveitis.

MeSH terms

  • Comprehension
  • Humans
  • Language*
  • Reading
  • United States
  • Uveitis* / diagnosis