Research on a traditional Chinese medicine case-based question-answering system integrating large language models and knowledge graphs

Front Med (Lausanne). 2025 Jan 7:11:1512329. doi: 10.3389/fmed.2024.1512329. eCollection 2024.

Abstract

Introduction: Traditional Chinese Medicine (TCM) case records encapsulate vast clinical experience and theoretical insight, and hold significant research and practical value. However, traditional case studies face challenges such as large data volumes, complex information, and difficulty in efficient retrieval and analysis. This study aimed to address these issues by leveraging modern data techniques to improve access to, and analysis of, TCM case records.

Methods: A total of 679 case records covering 41 diseases were selected from the work of Wang Zhongqi, a renowned physician of Xin'an Medicine (a branch of TCM). The study involved four stages: pattern layer construction, knowledge extraction, knowledge integration, and data storage and visualization. A large language model (LLM) was employed to automatically extract key entities, including symptoms, pathogenesis, treatment principles, and prescriptions. These entities were then structured into a TCM case knowledge graph.
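As a minimal sketch of the step from extracted entities to graph-ready triples, the Python below turns one case's entities into (head, relation, tail) tuples. The relation names, the `case_to_triples` helper, the case identifier, and the placeholder entity values are all assumptions made for illustration; the abstract does not specify the schema or the extraction output format.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


# Hypothetical relation names, one per entity type named in the abstract.
RELATIONS = {
    "symptoms": "has_symptom",
    "pathogenesis": "has_pathogenesis",
    "treatment_principles": "has_treatment_principle",
    "prescriptions": "has_prescription",
}


def case_to_triples(case_id: str, entities: dict[str, list[str]]) -> list[Triple]:
    """Link a case node to each extracted entity with a typed relation."""
    triples = []
    for entity_type, relation in RELATIONS.items():
        for value in entities.get(entity_type, []):
            triples.append(Triple(case_id, relation, value))
    return triples


# Placeholder values only; these are not taken from the case records.
example_entities = {
    "symptoms": ["symptom A", "symptom B"],
    "pathogenesis": ["pathogenesis X"],
    "treatment_principles": ["principle Y"],
    "prescriptions": ["prescription Z"],
}
triples = case_to_triples("case-001", example_entities)
```

Keeping the case as the head of every triple is one possible design; a richer schema might also link symptoms to pathogenesis or treatment principles to prescriptions.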

Results: The LLM successfully identified and extracted relevant entities, which were then organized into relational triples. A TCM case query system based on natural language input was developed. The system's performance, evaluated using the RAGAS framework, achieved high scores: 0.9375 in faithfulness, 0.9686 in answer relevancy, and 0.9500 in context recall. In human evaluations, safety and usability were significantly higher than those of LLMs used without retrieval-augmented generation (RAG).
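To illustrate how relational triples might ground a natural-language query, the sketch below retrieves triples whose head or tail is mentioned in the question and formats them as context for an LLM prompt. The substring matching, the function names, and the prompt wording are assumptions for illustration only; the abstract does not describe the system's actual retrieval method.

```python
def retrieve_context(question, triples, limit=5):
    """Return up to `limit` triples whose head or tail appears in the question."""
    q = question.lower()
    hits = [t for t in triples if t[0].lower() in q or t[2].lower() in q]
    return hits[:limit]


def build_prompt(question, context):
    """Format retrieved triples as facts the model is asked to rely on."""
    facts = "\n".join(f"{h} --{r}--> {t}" for h, r, t in context)
    return f"Answer using only these facts:\n{facts}\n\nQuestion: {question}"


# Placeholder triples; the values are not taken from the case records.
kg = [
    ("case-001", "has_symptom", "symptom A"),
    ("case-001", "has_prescription", "prescription Z"),
    ("case-002", "has_symptom", "symptom C"),
]
question = "Which prescription was used in case-001?"
context = retrieve_context(question, kg)
prompt = build_prompt(question, context)
```

A production system would more plausibly query a graph database and use entity linking rather than substring matching; the point here is only that the answer is constrained to retrieved graph facts.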

Discussion: The results demonstrate that integrating LLMs with a knowledge graph significantly enhances the efficiency and accuracy of retrieving TCM case information. This approach could play a crucial role in modernizing TCM research and improving access to clinical insights. Future research may explore expanding the dataset and refining the query system for broader applications.

Keywords: interdisciplinary research; knowledge graph; large language model; question answering system; traditional Chinese medicine.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported by Key Projects of University Scientific Research Plan in Anhui Province (Grant no. 2024AH050917), the University Synergy Innovation Program of Anhui Province (Grant no. GXXT-2023-071), and the Open Fund of High-level Key Discipline of Basic Theory of TCM of the State Administration of Traditional Chinese Medicine, Anhui University of Chinese Medicine (ZYJCLLZD-07).