Purpose: To investigate the accuracy of responses from Chat Generative Pre-Trained Transformer (ChatGPT) to frequently asked questions prior to rotator cuff repair surgery.
Methods: The 10 most frequently asked questions related to rotator cuff repair were compiled from 4 institution websites. The questions were then input into ChatGPT-3.5 in 1 session. The ChatGPT-3.5 responses were analyzed by 2 orthopaedic surgeons for reliability, quality, and readability using the Journal of the American Medical Association (JAMA) Benchmark criteria, the DISCERN score, and the Flesch-Kincaid Grade Level.
Results: The JAMA Benchmark criteria score was 0, indicating the absence of reliable source material citations. The mean Flesch-Kincaid Grade Level was 13.4 (range, 11.2-15.0). The mean DISCERN score was 43.4 (range, 36-51), indicating that the overall quality of the responses was fair. All responses advised that final decisions be made with the treating physician.
Conclusions: ChatGPT-3.5 provided substandard patient-related information to supplement recommendations from the treating surgeon regarding common questions about rotator cuff repair surgery. Additionally, the responses lacked reliable source material citations, and their readability was relatively advanced, with a complex language style.
Clinical relevance: The findings of this study suggest that ChatGPT-3.5 may not effectively supplement the patient-related information and recommendations provided by the treating surgeon prior to rotator cuff repair surgery.