Thoracic Aorta Measurement Extraction from Computed Tomography Radiology Reports Using Instruction Tuned Large Language Models

medRxiv [Preprint]. 2024 Dec 26:2024.12.23.24319567. doi: 10.1101/2024.12.23.24319567.

Abstract

Chest computed tomography (CT) is essential for diagnosing and monitoring thoracic aortic dilations and aneurysms, conditions that place patients at risk of complications such as aortic dissection and rupture. However, aortic measurements in chest CT radiology reports are often embedded in free text, limiting their accessibility for clinical care, quality improvement, and research. In this study, we developed a multi-method pipeline to extract structured aortic measurements from radiology reports, and compared the performance of fine-tuned BERT-based models with that of instruction-tuned Llama large language models (LLMs). Applying the best-performing method to a large real-world database of chest CT radiology reports, we generated a comprehensive aortic measurement dataset that supports large-scale research on aortic disease.
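To make the extraction task concrete, the following is a minimal rule-based sketch in Python of turning free-text report sentences into structured measurements. It is not the pipeline described in the abstract, which compares fine-tuned BERT-based models with instruction-tuned Llama models; it is only an illustrative baseline. The segment names, the synonym patterns, the function name `extract_aortic_measurements`, and the output field `diameter_mm` are all assumptions introduced for this example.

```python
import re
from typing import Dict, List

# Hypothetical segment labels and synonym patterns (assumptions, not from the paper).
SEGMENTS = {
    "aortic_root": r"aortic root|sinus(?:es)? of valsalva",
    "ascending_aorta": r"ascending (?:thoracic )?aorta",
    "aortic_arch": r"aortic arch",
    "descending_aorta": r"descending (?:thoracic )?aorta",
}

# A number followed by a length unit, e.g. "4.2 cm" or "42 mm".
MEASUREMENT = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)


def extract_aortic_measurements(report: str) -> List[Dict[str, object]]:
    """Return one record per sentence that names a segment and a measurement."""
    results: List[Dict[str, object]] = []
    # Split on whitespace that follows a period or semicolon, so the
    # decimal point in "4.2 cm" does not end a sentence.
    for sentence in re.split(r"(?<=[.;])\s+", report):
        for segment, pattern in SEGMENTS.items():
            if not re.search(pattern, sentence, re.IGNORECASE):
                continue
            # Take the first measurement in the sentence (a simplification).
            match = MEASUREMENT.search(sentence)
            if match:
                value = float(match.group(1))
                unit = match.group(2).lower()
                # Normalize everything to millimetres.
                value_mm = value * 10 if unit == "cm" else value
                results.append(
                    {"segment": segment, "diameter_mm": round(value_mm, 1)}
                )
    return results


report = (
    "The ascending aorta measures 4.2 cm. "
    "The descending thoracic aorta measures 28 mm."
)
for record in extract_aortic_measurements(report):
    print(record)
```

A sketch like this breaks down quickly on real reports (several measurements in one sentence, comparisons with prior studies, negations, or measurements of other structures), which is the kind of variability that motivates learned extractors such as those compared in the study.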

Publication types

  • Preprint