Radiology reports are a rich resource for advancing deep learning applications for medical images, facilitating the generation of large-scale annotated image databases. Although the ambiguity and subtlety of natural language poses a significant challenge to information extraction from radiology reports. Thyroid Imaging Reporting and Data Systems (TI-RADS) has been proposed as a system to standardize ultrasound imaging reports for thyroid cancer screening and diagnosis, through the implementation of structured templates and a standardized thyroid nodule malignancy risk scoring system; however there remains significant variation in radiologist practice when it comes to diagnostic thyroid ultrasound interpretation and reporting. In this work, we propose a computerized approach using a contextual embedding and fusion strategy for the large-scale inference of TI-RADS final assessment categories from narrative ultrasound (US) reports. The proposed model has achieved high accuracy on an internal data set, and high performance scores on an external validation dataset.
©2021 AMIA - All rights reserved.