Tailoring glaucoma education using large language models: Addressing health disparities in patient comprehension

Medicine (Baltimore). 2025 Jan 10;104(2):e41059. doi: 10.1097/MD.0000000000041059.

Abstract

This study evaluates the efficacy of GPT-4, a Large Language Model, in simplifying medical literature for enhancing patient comprehension in glaucoma care. GPT-4 was used to transform published abstracts from 3 glaucoma journals (n = 62) and patient education materials (Patient Educational Model [PEMs], n = 9) to a 5th-grade reading level. GPT-4 was also prompted to generate de novo educational outputs at 6 different education levels (5th Grade, 8th Grade, High School, Associate's, Bachelor's and Doctorate). Readability of both transformed and de novo materials was quantified using Flesch Kincaid Grade Level (FKGL) and Flesch Reading Ease (FKRE) Score. Latent semantic analysis (LSA) using cosine similarity was applied to assess content consistency in transformed materials. The transformation of abstracts resulted in FKGL decreasing by an average of 3.21 points (30%, P < .001) and FKRE increasing by 28.6 points (66%, P < .001). For PEMs, FKGL decreased by 2.38 points (28%, P = .0272) and FKRE increased by 12.14 points (19%, P = .0459). LSA revealed high semantic consistency, with an average cosine similarity of 0.861 across all abstracts and 0.937 for PEMs, signifying topical themes were quantitatively shown to be consistent. This study shows that GPT-4 effectively simplifies medical information about glaucoma, making it more accessible while maintaining textual content. The improved readability scores for both transformed materials and GPT-4 generated content demonstrate its usefulness in patient education across different educational levels.

MeSH terms

  • Comprehension*
  • Glaucoma* / therapy
  • Health Literacy
  • Humans
  • Language
  • Patient Education as Topic* / methods