Using Large Language Models for Efficient Cancer Registry Coding in the Real Hospital Setting: A Feasibility Study

Chen-Kai Wang; Cheng-Rong Ke; Ming-Siang Huang; Inn-Wen Chong; Yi-Hsin Yang; Vincent S Tseng; Hong-Jie Dai

Pac Symp Biocomput. 2025:30:121-137.

Authors

Chen-Kai Wang¹,², Cheng-Rong Ke, Ming-Siang Huang, Inn-Wen Chong, Yi-Hsin Yang, Vincent S Tseng, Hong-Jie Dai

Affiliations

¹ Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, 300093, Taiwan, ROC.
² Advanced Technology Laboratory, Chunghwa Telecom Laboratories, Taoyuan, 326402, Taiwan, ROC.

PMID: 39670366

Abstract

The primary challenge in reporting cancer cases lies in the labor-intensive and time-consuming process of manually reviewing numerous reports. Current methods predominantly rely on rule-based approaches or custom supervised learning models, which predict diagnostic codes based on a single pathology report per patient. Although these methods show promising evaluation results, their biased outcomes in controlled settings may hinder adaptation to real-world reporting workflows. In this feasibility study, we focused on lung cancer as a test case and developed an agentic retrieval-augmented generation (RAG) system to evaluate the potential of publicly available large language models (LLMs) for cancer registry coding. Our findings demonstrate that: (1) directly applying publicly available LLMs without fine-tuning is feasible for cancer registry coding; and (2) prompt engineering can significantly enhance the capability of pre-trained LLMs in cancer registry coding. The off-the-shelf LLM, combined with our proposed system architecture and basic prompts, achieved a macro-averaged F-score of 0.637 when evaluated on testing data consisting of patients' medical reports spanning 1.5 years from their first visit. By employing chain-of-thought (CoT) reasoning and our proposed coding item grouping, the system outperformed the baseline by 0.187 in macro-averaged F-score. These findings demonstrate the great potential of leveraging LLMs with prompt engineering for cancer registry coding. Our system could offer cancer registrars a promising reference tool to enhance their daily workflow, improving efficiency and accuracy in cancer case reporting.
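
For readers unfamiliar with the metric reported above, the following is a minimal sketch of a macro-averaged F-score: an F1 value is computed per label and the values are averaged with equal weight. The abstract does not state whether the averaging runs over coding items or over code values, so the single-label setup here is an assumption, and the function names and example codes are hypothetical illustrations, not the authors' evaluation script.

```python
from collections import defaultdict


def f1_per_label(gold, pred):
    """Return {label: F1} for paired single-label gold/predicted values."""
    tp = defaultdict(int)  # true positives per label
    fp = defaultdict(int)  # false positives per label
    fn = defaultdict(int)  # false negatives per label
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed
    scores = {}
    for label in set(gold) | set(pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 (assumed averaging scheme)."""
    scores = f1_per_label(gold, pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical example values, not data from the study.
gold = ["C34.1", "C34.1", "C34.3", "C34.9"]
pred = ["C34.1", "C34.3", "C34.3", "C34.9"]
print(round(macro_f1(gold, pred), 3))
```

Because each label contributes equally, a rare code that is often miscoded lowers the macro average as much as a frequent one, which is why the metric is commonly chosen for imbalanced label sets.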

MeSH terms

Clinical Coding* / statistics & numerical data
Computational Biology*
Electronic Health Records / statistics & numerical data
Feasibility Studies*
Humans
Lung Neoplasms* / genetics
Natural Language Processing
Neoplasms / genetics
Registries* / statistics & numerical data
Workflow