Enhancing Orthopedic Knowledge Assessments: The Performance of Specialized Generative Language Model Optimization

Hong Zhou; Hong-Lin Wang; Yu-Yu Duan; Zi-Neng Yan; Rui Luo; Xiang-Xin Lv; Yi Xie; Jia-Yao Zhang; Jia-Ming Yang; Ming-di Xue; Ying Fang; Lin Lu; Peng-Ran Liu; Zhe-Wei Ye

doi:10.1007/s11596-024-2929-4

Curr Med Sci. 2024 Oct;44(5):1001-1005. doi: 10.1007/s11596-024-2929-4. Epub 2024 Oct 5.

Authors

Hong Zhou^#^{1,2}, Hong-Lin Wang^#^{1,2}, Yu-Yu Duan^#^{2,3}, Zi-Neng Yan^{1,2}, Rui Luo^{1,2}, Xiang-Xin Lv^{1,2}, Yi Xie^{1,2}, Jia-Yao Zhang^{1,2}, Jia-Ming Yang^{1,2}, Ming-di Xue^{1,2}, Ying Fang^{1,2}, Lin Lu^{4,5}, Peng-Ran Liu^{6,7}, Zhe-Wei Ye^{8,9}

Affiliations

¹ Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
² Laboratory of Intelligent Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
³ College of Chinese Medicine, Hubei University of Chinese Medicine, Wuhan, 433065, China.
⁴ Laboratory of Intelligent Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
⁵ Department of Orthopedics, Renmin Hospital of Wuhan University, Wuhan, 433060, China.
⁶ Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
⁷ Laboratory of Intelligent Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
⁸ Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
⁹ Laboratory of Intelligent Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.

^# Contributed equally.

PMID: 39368054
DOI: 10.1007/s11596-024-2929-4

Abstract

Objective: This study aimed to evaluate and compare the effectiveness of knowledge base-optimized and unoptimized large language models (LLMs) in the field of orthopedics to explore optimization strategies for the application of LLMs in specific fields.

Methods: A specialized knowledge base was constructed from clinical guidelines of the American Academy of Orthopaedic Surgeons (AAOS) and authoritative orthopedic publications. A total of 30 orthopedic questions covering anatomical knowledge, disease diagnosis, fracture classification, treatment options, and surgical techniques were input into both the knowledge base-optimized and unoptimized versions of GPT-4, ChatGLM, and Spark LLM, and the generated responses were recorded. The overall quality, accuracy, and comprehensiveness of these responses were evaluated by 3 experienced orthopedic surgeons.

Results: Compared with its unoptimized counterpart, the optimized version of GPT-4 showed improvements of 15.3% in overall quality, 12.5% in accuracy, and 12.8% in comprehensiveness; the optimized ChatGLM showed improvements of 24.8%, 16.1%, and 19.6%, respectively; and the optimized Spark LLM showed improvements of 6.5%, 14.5%, and 24.7%, respectively.
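The abstract does not state how the percentage improvements were derived. As a minimal sketch, assuming each figure is the relative change in the mean rater score between the unoptimized and optimized model, the calculation could look like the following. The function name `percent_improvement` and the example ratings are hypothetical illustrations and are not data from the study.

```python
from statistics import mean


def percent_improvement(baseline_scores, optimized_scores):
    """Relative change in mean score, in percent.

    Assumption: improvement is measured relative to the baseline mean.
    The study may have used a different definition.
    """
    base = mean(baseline_scores)
    if base == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (mean(optimized_scores) - base) / base * 100.0


# Hypothetical ratings from 3 raters for one model on one criterion
# (illustrative values only, not taken from the paper).
baseline = [4, 4, 4]
optimized = [5, 5, 5]
print(f"{percent_improvement(baseline, optimized):.1f}%")
```

Under this definition, the same absolute gain yields a larger percentage for a model with a lower baseline score, which is worth keeping in mind when comparing improvements across the 3 models.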

Conclusion: The optimization of knowledge bases significantly enhances the quality, accuracy, and comprehensiveness of the responses provided by the 3 models in the orthopedic field. Therefore, knowledge base optimization is an effective method for improving the performance of LLMs in specific fields.

Keywords: artificial intelligence; generative artificial intelligence; large language models; orthopedics.

MeSH terms

Humans
Knowledge Bases*
Language
Orthopedic Procedures
Orthopedic Surgeons / standards
Orthopedics* / standards