Background: ChatGPT is rapidly becoming a source of medical knowledge for patients. This study aims to assess the completeness and accuracy of ChatGPT's answers to the most frequently asked patients' questions about shoulder pathology.
Methods: ChatGPT (version 3.5) was queried to produce the five most common shoulder pathologies: biceps tendonitis, rotator cuff tears, shoulder arthritis, shoulder dislocation and adhesive capsulitis. Subsequently, it generated the five most common patient questions regarding these pathologies and was queried to respond. Responses were evaluated by three shoulder and elbow fellowship-trained orthopedic surgeons with a mean of 9 years of independent practice, on Likert scales for accuracy (1-6) and completeness (rated 1-3).
Results: For all questions, responses were deemed acceptable, rated at least "nearly all correct," indicated by a score of 5 or greater for accuracy, and "adequately complete," indicated by a minimum of 2 for completeness. The mean scores for accuracy and completeness, respectively, were 5.5 and 2.6 for rotator cuff tears, 5.8 and 2.7 for shoulder arthritis, 5.5 and 2.3 for shoulder dislocations, 5.1 and 2.4 for adhesive capsulitis, 5.8 and 2.9 for biceps tendonitis.
Conclusion: ChatGPT provides both accurate and complete responses to the most common patients' questions about shoulder pathology. These findings suggest that Large Language Models might play a role as a patient resource; however, patients should always verify online information with their physician.
Level of evidence: Level V Expert Opinion.
Keywords: Artificial intelligence; ChatGPT; large language model; machine learning; shoulder.
© The Author(s) 2024.