Generating synthetic clinical text with local large language models to identify misdiagnosed limb fractures in radiology reports

Artif Intell Med. 2025 Jan;159:103027. doi: 10.1016/j.artmed.2024.103027. Epub 2024 Nov 20.

Abstract

Large language models (LLMs) demonstrate impressive capabilities in generating human-like content and hold significant potential to improve the performance and efficiency of healthcare. An important application of LLMs is generating synthetic clinical reports, which could alleviate the burden of annotating and collecting real-world data for training AI models. However, using commercial LLMs to handle sensitive clinical data raises privacy concerns and practical limitations. In this study, we examined the use of open-source LLMs as an alternative for generating synthetic radiology reports to supplement real-world annotated data. We found that locally hosted LLMs can match the performance of ChatGPT and GPT-4 in augmenting training data for the downstream report classification task of identifying misdiagnosed fractures. We also examined the predictive value of training downstream models on synthetic reports alone, where our best setting achieved more than 90% of the performance obtained with real-world data. Overall, our findings show that open-source, local LLMs can be a favourable option for creating synthetic clinical reports for downstream tasks.
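The abstract does not specify the implementation, but the two-stage pipeline it describes can be sketched as follows: prompt a locally hosted open-source LLM to generate synthetic reports for each label class, then pool them with real annotated reports to train a downstream classifier. The model name, prompts, toy data, and classifier below are illustrative assumptions, not the authors' actual configuration.

```python
# A minimal sketch, assuming a locally hosted open-weight LLM served via
# the Hugging Face transformers pipeline and a simple bag-of-words
# classifier. Everything named here is an assumption for illustration.
from transformers import pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Stage 1: synthetic report generation with a local open-source LLM.
# Any locally hosted instruction-tuned model can be substituted here.
generator = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",  # assumed model, not the paper's
    device_map="auto",
)

# Hypothetical prompts, one per label class (1 = missed fracture).
PROMPTS = {
    1: "Write a brief emergency department limb X-ray report describing "
       "a subtle fracture that was missed on the initial interpretation.",
    0: "Write a brief emergency department limb X-ray report with no "
       "fracture and a correct initial interpretation.",
}

synthetic_reports, synthetic_labels = [], []
for label, prompt in PROMPTS.items():
    for _ in range(50):  # per-class counts are arbitrary in this sketch
        out = generator(
            prompt,
            max_new_tokens=256,
            do_sample=True,
            temperature=0.9,
            return_full_text=False,  # keep only the generated report text
        )
        synthetic_reports.append(out[0]["generated_text"])
        synthetic_labels.append(label)

# Stage 2: augment real annotated reports with the synthetic ones and
# train a downstream classifier. The two reports below are toy
# placeholders for a real annotated corpus.
real_reports = [
    "XR left wrist: no acute bony injury identified.",
    "XR right ankle: subtle distal fibula fracture, not reported initially.",
]
real_labels = [0, 1]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(real_reports + synthetic_reports, real_labels + synthetic_labels)
```

The same augmented-training setup also covers the synthetic-only condition the abstract reports: simply fit the classifier on the synthetic reports and labels alone and compare against a model trained on the real corpus.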

Keywords: Emergency department; Large language models; Local LLMs; Natural language processing; Radiology report; Synthetic data.

MeSH terms

  • Artificial Intelligence
  • Diagnostic Errors
  • Electronic Health Records
  • Fractures, Bone* / diagnostic imaging
  • Humans
  • Natural Language Processing*