Purpose: To evaluate the complexity of diagnostic radiology reports across major imaging modalities and the ability of ChatGPT (Early March 2023 Version, OpenAI, California, USA) to simplify these reports to the 8th grade reading level of the average U.S. adult.
Methods: We randomly sampled 100 radiograph (XR), 100 ultrasound (US), 100 CT, and 100 MRI reports from our institution's database, dated between 2022 and 2023 (N = 400). Each report was processed by ChatGPT using the prompt "Explain this radiology report to a patient in layman's terms in second person: <Report Text>". Report length (word count), Flesch reading ease score (FRES), and Flesch-Kincaid reading level (FKRL) were calculated for each original report and each ChatGPT output. T-tests were used to assess the significance of differences in means.
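For readers unfamiliar with the two readability metrics, the following is a minimal Python sketch of the standard published Flesch formulas. It is not the tool used in this study, which the abstract does not specify. The function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for illustration, so its scores will differ somewhat from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups (assumed heuristic, not exact)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent unless the word ends in 'le'.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return word count, FRES, and FKRL for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Flesch reading ease: higher scores mean easier text.
    fres = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Flesch-Kincaid level: approximate U.S. school grade.
    fkrl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59

    return {"words": len(words), "fres": fres, "fkrl": fkrl}


if __name__ == "__main__":
    original = "There is no evidence of acute intracranial hemorrhage."
    simplified = "Your brain scan shows no bleeding."
    for label, text in (("original", original), ("simplified", simplified)):
        print(label, readability(text))
```

Both formulas depend only on average sentence length and average syllables per word, so shorter sentences and shorter words raise FRES and lower FKRL.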
Results: For the original reports, mean length was 164 ± 117 words, mean FRES was 38.0 ± 11.8, and mean FKRL was 10.4 ± 1.9. FKRL was significantly higher for CT and MRI than for US and XR. Only 60/400 (15%) of original reports had an FKRL <8.5. For the simplified ChatGPT outputs, mean length was 103 ± 36 words, mean FRES was 83.5 ± 5.6, and mean FKRL was 5.8 ± 1.1. This reflects a mean decrease of 61 words (p < 0.01), an increase in FRES of 45.5 (p < 0.01), and a decrease in FKRL of 4.6 (p < 0.01). All simplified outputs had an FKRL <8.5.
Discussion: Our study demonstrates that ChatGPT can effectively simplify radiology reports to below the 8th grade reading level. We report significant improvements in FRES, FKRL, and word count, the last of which should be interpreted in modality-specific context.
Keywords: 21st Century Cures Act; Large language model; Natural language processing; Patient-centered reports.
Copyright © 2023 Elsevier Inc. All rights reserved.