Diagnosis, treatment, and prevention of ankle sprains: Comparing free chatbot recommendations with clinical guidelines

Foot Ankle Surg. 2024 Dec 13:S1268-7731(24)00267-4. doi: 10.1016/j.fas.2024.12.003. Online ahead of print.

Abstract

Background: Free chatbots powered by large language models offer treatment recommendations for lateral ankle sprains (LAS), but these recommendations lack scientific validation.

Methods: The chatbots (Claude, Perplexity, and ChatGPT) were evaluated by comparing their responses to a questionnaire and their treatment algorithms against current clinical guidelines. Responses were graded on accuracy, conclusiveness, supplementary information, and incompleteness, and were evaluated individually and collectively, with a 60% pass threshold.

Results: In the collective analysis of the questionnaire, Perplexity scored significantly higher than Claude and ChatGPT (p < 0.001). In the individual analysis, Perplexity provided significantly more supplementary information than the other chatbots (p < 0.001). All chatbots met the pass threshold. In the algorithm evaluation, ChatGPT scored significantly higher than the others (p = 0.023), and Perplexity fell below the pass threshold.

Conclusions: Chatbots' recommendations generally aligned with current guidelines but sometimes missed crucial details. While they offer useful supplementary information, they cannot yet replace professional medical consultation or established guidelines.

Keywords: ChatGPT; Claude; Lateral ankle sprains; Perplexity; artificial intelligence (AI); chatbots; treatment recommendations.