Background: AtlasGPT is a generative pretrained transformer trained on neurosurgery literature. Its ability to tailor its responses to the user's training level is unique; however, whether those responses can actually be comprehended at each user's training level remains unknown. This study aimed to analyze the readability of responses provided by AtlasGPT.
Methods: Ten queries were presented to AtlasGPT under each of its 4 user profiles (surgeon, resident, medical student, and patient). Readability was assessed using multiple instruments in Readability Studio. Readability scores of the user-specific responses were compared using one-way analysis of variance and post hoc pairwise t-tests with Bonferroni correction. A P value <0.05 was considered significant.
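As a minimal sketch of this statistical comparison, the following Python code illustrates a one-way analysis of variance followed by Bonferroni-corrected pairwise t-tests; the score values and variable names are illustrative assumptions, not the study's actual data.

from itertools import combinations
from scipy.stats import f_oneway, ttest_ind

# Hypothetical readability grade-level scores for the 10 queries under each
# AtlasGPT user profile; values are placeholders for illustration only.
scores = {
    "surgeon":         [14.2, 15.1, 13.8, 14.9, 15.4, 14.0, 13.5, 15.0, 14.6, 14.3],
    "resident":        [13.9, 14.5, 14.1, 13.7, 14.8, 13.6, 14.2, 14.0, 13.8, 14.4],
    "medical student": [12.8, 13.2, 12.5, 13.0, 12.9, 13.4, 12.7, 13.1, 12.6, 13.3],
    "patient":         [9.1, 10.2, 8.8, 11.5, 9.7, 10.4, 9.3, 10.8, 9.9, 10.1],
}

# One-way analysis of variance across the four user profiles.
f_stat, p_anova = f_oneway(*scores.values())
print(f"ANOVA: F = {f_stat:.2f}, P = {p_anova:.4g}")

# Post hoc pairwise t-tests with Bonferroni correction (6 pairwise comparisons).
pairs = list(combinations(scores, 2))
alpha_corrected = 0.05 / len(pairs)
for a, b in pairs:
    t_stat, p = ttest_ind(scores[a], scores[b])
    flag = "significant" if p < alpha_corrected else "not significant"
    print(f"{a} vs {b}: P = {p:.4g} ({flag} at corrected alpha = {alpha_corrected:.4f})")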
Results: Across the readability instruments used, reading ease differed significantly between the patient profile and each of the other user profiles (P < 0.005). Readability scores for the medical student profile tended to show greater reading ease than those for the surgeon and resident profiles, but these differences were not significant. Mean grade levels for patient responses ranged from 8.8 to 11.51 across instruments. Only one output, scored with the New Dale-Chall assessment, was written at a fifth- to sixth-grade level.
Conclusions: AtlasGPT-generated content varies in readability according to the user profile selected; however, the readability of patient-directed content still exceeds the levels recommended by United States departmental agencies, necessitating a call to action.
Keywords: AtlasGPT; ChatGPT; Education; Health literacy; Neurosurgery; Readability.