A Fusion NLP Model for the Inference of Standardized Thyroid Nodule Malignancy Scores from Radiology Report Text

AMIA Annu Symp Proc. 2022 Feb 21;2021:1079-1088. eCollection 2021.

Abstract

Radiology reports are a rich resource for advancing deep learning applications for medical images, facilitating the generation of large-scale annotated image databases. However, the ambiguity and subtlety of natural language pose a significant challenge to information extraction from radiology reports. Thyroid Imaging Reporting and Data Systems (TI-RADS) has been proposed as a system to standardize ultrasound imaging reports for thyroid cancer screening and diagnosis through the implementation of structured templates and a standardized thyroid nodule malignancy risk scoring system; however, there remains significant variation in radiologist practice in the interpretation and reporting of diagnostic thyroid ultrasound. In this work, we propose a computerized approach using a contextual embedding and fusion strategy for the large-scale inference of TI-RADS final assessment categories from narrative ultrasound (US) reports. The proposed model achieved high accuracy on an internal dataset and high performance scores on an external validation dataset.

MeSH terms

  • Data Systems
  • Humans
  • Radiology*
  • Retrospective Studies
  • Thyroid Neoplasms* / diagnostic imaging
  • Thyroid Neoplasms* / pathology
  • Thyroid Nodule* / diagnostic imaging
  • Thyroid Nodule* / pathology