Examining the role of ChatGPT in the management of distal radius fractures: insights into its accuracy and consistency

ANZ J Surg. 2024 Jul-Aug;94(7-8):1391-1396. doi: 10.1111/ans.19143. Epub 2024 Jul 5.

Abstract

Background: The optimal management of distal radius fractures remains a challenge for orthopaedic surgeons. The emergence of Artificial Intelligence (AI) and Large Language Models (LLMs), particularly ChatGPT, offers significant potential to improve healthcare and research. This study assesses the accuracy and consistency of ChatGPT's knowledge of distal radius fracture management, focusing on its capability to provide information to patients and to support the decision-making of orthopaedic clinicians.

Methods: We presented ChatGPT with seven questions on distal radius fracture management across two separate sessions, yielding 14 responses. The questions covered a range of topics, including common patient inquiries and orthopaedic clinical decision-making. We requested references for each response, and two orthopaedic registrars and two senior orthopaedic surgeons evaluated the responses for accuracy and consistency.

Results: All 14 responses contained a mix of correct and incorrect information. Of the 47 cited references, 13% were accurate, 28% appeared to be fabricated, 57% were incorrect, and 2% were correct but deemed inappropriate. Consistency was observed in 71% of the responses.

Conclusion: ChatGPT demonstrates significant limitations in accuracy and consistency when providing information on distal radius fractures. In its current format, it offers limited utility for patient education and clinical decision-making.

Keywords: Artificial intelligence; ChatGPT; Distal radius fractures; Generative pre-trained transformer; Large language models; Patient education.

MeSH terms

  • Artificial Intelligence
  • Clinical Decision-Making / methods
  • Humans
  • Radius Fractures* / therapy
  • Wrist Fractures