Simplifying radiologic reports with natural language processing: a novel approach using ChatGPT in enhancing patient understanding of MRI results

Arch Orthop Trauma Surg. 2024 Feb;144(2):611-618. doi: 10.1007/s00402-023-05113-4. Epub 2023 Nov 11.

Abstract

Purpose: The aim of this prospective cohort study was twofold: (1) to have medical professionals assess the factual accuracy, completeness of medical information, and potential harmfulness of incorrect conclusions in texts of varying complexity generated automatically with ChatGPT, and (2) to have patients without a medical background evaluate comprehensibility, information density, and the conclusions that could be drawn from these texts.

Methods: For knee MRI findings at each of three complexity levels (A: simple, B: moderate, C: complex), five different simplified versions were created using ChatGPT. Subsequently, a group of four medical professionals (two orthopedic surgeons and two radiologists) and a group of 20 consecutive patients evaluated the created reports. For this purpose, all participants received a set of simplified reports (simple, moderate, and complex) at intervals of 1 week each and evaluated them using a specific questionnaire. Each questionnaire consisted of the original report, the simplified report, and a series of statements to assess the quality of the simplified report. Participants rated their level of agreement with each statement on a five-point Likert scale.

Results: The medical specialists' evaluation showed that the reports produced were consistent in quality depending on their complexity. Factual correctness, reproduction of relevant information, and comprehensibility for patients were rated on average as "Agree". The question about possible harm resulted in an average of "Disagree". The patients' evaluation also revealed consistent quality of reports, depending on complexity. Simplicity of word choice and sentence structure was rated "Agree" on average, with significant differences between simple and complex findings (p = 0.0039) as well as between moderate and complex findings (p = 0.0222). Participants reported being significantly better at knowing what the text was about (p = 0.001) and at drawing the correct conclusions (p = 0.013829) the more simplified the report was. The question of whether the text informed them as well as a healthcare professional was answered as "Neutral" across all findings.
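
To illustrate how an average rating such as "Agree" can be read off a five-point Likert scale, the following is a minimal sketch rather than the study's actual analysis. The numeric coding (1 to 5), the label wording, the rounding rule, and the example ratings are all assumptions made for illustration; the abstract does not specify them.

```python
from statistics import mean

# Assumed numeric coding of the five-point Likert scale
# (the abstract does not state the coding that was used).
LIKERT_LABELS = {
    1: "Strongly disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly agree",
}


def average_label(ratings):
    """Return the mean rating and the Likert label nearest to it."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r not in LIKERT_LABELS for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    avg = mean(ratings)
    # Round half up (an assumed convention) instead of Python's
    # default round-half-to-even behaviour.
    nearest = min(5, int(avg + 0.5))
    return avg, LIKERT_LABELS[nearest]


# Hypothetical ratings from four raters, for illustration only.
example_ratings = [4, 4, 5, 3]
avg, label = average_label(example_ratings)
print(f"mean = {avg:.2f}, label = {label}")
```

Reporting only the nearest label hides the spread of the individual ratings, so a sketch like this would normally be paired with the raw distribution or a dispersion measure.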

Conclusion: By using ChatGPT, MRI reports can be simplified automatically with consistent quality so that the relevant information is understandable to patients. However, a report generated in this way does not replace a thorough discussion between specialist and patient.

Keywords: AI; ChatGPT; Knee; MRI.

MeSH terms

  • Health Personnel*
  • Humans
  • Knee Joint
  • Magnetic Resonance Imaging
  • Natural Language Processing*
  • Prospective Studies