Purpose: To validate the performance of a recently created risk stratification system (RSS) for thyroid nodules on ultrasound, the Artificial Intelligence Thyroid Imaging Reporting and Data System (AI TI-RADS).
Materials and methods: This retrospective evaluation included 378 thyroid nodules from 320 patients. All nodules had ultrasound images and had undergone fine needle aspiration (FNA). Of these, 147 nodules were Bethesda V or VI (suspicious or diagnostic for malignancy), and 231 were Bethesda II (benign). Three radiologists assigned features to each nodule from the ultrasound images according to the AI TI-RADS lexicon, which uses the same categories and features as the American College of Radiology (ACR) TI-RADS. FNA recommendations under AI TI-RADS and ACR TI-RADS were then compared, and sensitivity and specificity were calculated for each RSS.
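The sensitivity and specificity calculation described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: it assumes that a recommendation for FNA counts as a positive test and that Bethesda V or VI cytology counts as malignant. The function name, the reader lists, and the ten-nodule data set are hypothetical toy values and do not come from the study.

```python
from statistics import mean


def sensitivity_specificity(recommend_fna, malignant):
    """Return (sensitivity, specificity) for one reader.

    recommend_fna: list of bools, True if the RSS recommends FNA (assumed positive test).
    malignant: list of bools, True if the nodule is Bethesda V/VI (assumed reference standard).
    """
    pairs = list(zip(recommend_fna, malignant))
    tp = sum(1 for r, m in pairs if r and m)          # malignant, FNA recommended
    fn = sum(1 for r, m in pairs if not r and m)      # malignant, FNA not recommended
    tn = sum(1 for r, m in pairs if not r and not m)  # benign, FNA not recommended
    fp = sum(1 for r, m in pairs if r and not m)      # benign, FNA recommended
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 4 malignant and 6 benign nodules, two readers.
malignant = [True, True, True, True, False, False, False, False, False, False]
reader_a = [True, True, True, False, True, True, False, False, False, True]
reader_b = [True, True, False, False, True, False, False, False, False, True]

per_reader = [sensitivity_specificity(r, malignant) for r in (reader_a, reader_b)]
mean_sensitivity = mean(s for s, _ in per_reader)
mean_specificity = mean(c for _, c in per_reader)
```

Averaging the per-reader values, as in the last two lines, mirrors the "mean across readers" summary reported in the Results; how the study computed its p-values is not stated in the abstract and is not modeled here.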
Results: Across three readers, mean sensitivity of AI TI-RADS was lower than that of ACR TI-RADS (0.69 vs 0.72, p < 0.02), while mean specificity was higher (0.40 vs 0.37, p < 0.02). The total number of points assigned by all three readers decreased slightly with AI TI-RADS (5,998 vs 6,015 for ACR TI-RADS), and more point values of 0 were assigned to several features.
Conclusion: AI TI-RADS performed similarly to ACR TI-RADS while eliminating point assignments for many features, allowing for simplification of future TI-RADS versions.
Keywords: Artificial intelligence; FNA; TI-RADS; Thyroid nodules.
Copyright © 2024 Elsevier Inc. All rights reserved.