Objectives: To evaluate the interobserver agreement and diagnostic accuracy of the ovarian-adnexal reporting and data system for magnetic resonance imaging (O-RADS MRI) and its applicability to machine learning.
Material and methods: Dynamic contrast-enhanced pelvic MRI examinations of 471 lesions were retrospectively analyzed and assessed by three radiologists according to O-RADS MRI criteria. Radiomic data were extracted from T2-weighted and post-contrast fat-suppressed T1-weighted images. Using these data, artificial neural network (ANN), support vector machine, random forest, and naive Bayes models were constructed.
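As an illustration of one of the model families named above, the following is a minimal sketch of a Gaussian naive Bayes classifier in plain Python. It is not the authors' pipeline: the Gaussian likelihood, the two feature values per lesion, the toy labels, and the function names `fit_gaussian_nb` and `predict_gaussian_nb` are all assumptions made for this example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Fit per-class priors, feature means, and feature variances.

    X is a list of feature rows and y the matching class labels.
    var_floor (an assumed smoothing constant) keeps variances positive.
    """
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    n_total = len(X)
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / n_total), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one feature row."""
    def log_posterior(params):
        log_prior, means, variances = params
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
        return log_prior + log_likelihood

    return max(model, key=lambda label: log_posterior(model[label]))


# Hypothetical radiomic features (two per lesion) with toy O-RADS labels.
X_train = [
    [0.20, 1.1], [0.30, 0.9], [0.25, 1.0],   # labeled O-RADS 2
    [0.80, 2.1], [0.90, 1.9], [0.85, 2.0],   # labeled O-RADS 5
]
y_train = [2, 2, 2, 5, 5, 5]

model = fit_gaussian_nb(X_train, y_train)
pred_low = predict_gaussian_nb(model, [0.28, 1.05])
pred_high = predict_gaussian_nb(model, [0.88, 2.05])
print(pred_low, pred_high)
```

A real radiomics workflow would add feature selection, scaling, and cross-validation, none of which are specified in the text above.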
Results: Among all readers, the lowest agreement was found for the O-RADS 4 group (kappa: 0.669; 95% confidence interval [CI] 0.634-0.733), followed by the O-RADS 5 group (kappa: 0.709; 95% CI 0.678-0.754). O-RADS 4 predicted malignancy with an area under the curve (AUC) of 0.743 (95% CI 0.701-0.782), and O-RADS 5 with an AUC of 0.955 (95% CI 0.932-0.972) (p < 0.001). Among the machine learning models, the ANN performed best, distinguishing O-RADS groups with an AUC of 0.948, a precision of 0.861, and a recall of 0.824.
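To make the agreement figures concrete, the sketch below computes Fleiss' kappa for several readers, both overall and per category. The text does not state which kappa statistic was used, so the choice of Fleiss' kappa is an assumption, as are the function name `fleiss_kappa` and the toy ratings.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Return (overall kappa, {category: kappa}) for multi-reader ratings.

    ratings is a list with one entry per lesion; each entry lists the
    category assigned by every reader (same number of readers per lesion).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(r) for r in ratings]

    # Share of all assignments that fell into each category.
    p = {
        c: sum(cnt[c] for cnt in counts) / (n_subjects * n_raters)
        for c in categories
    }

    # Observed agreement per lesion, then averaged over lesions.
    per_subject = [
        (sum(cnt[c] ** 2 for c in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_observed = sum(per_subject) / n_subjects
    p_expected = sum(v * v for v in p.values())
    overall = (p_observed - p_expected) / (1 - p_expected)

    # Category-specific kappa: agreement on "this category vs. any other".
    per_category = {}
    for c in categories:
        disagreements = sum(cnt[c] * (n_raters - cnt[c]) for cnt in counts)
        denom = n_subjects * n_raters * (n_raters - 1) * p[c] * (1 - p[c])
        per_category[c] = 1 - disagreements / denom if denom > 0 else float("nan")
    return overall, per_category


# Toy O-RADS scores from three hypothetical readers for six lesions.
toy_ratings = [
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 5],
    [4, 5, 5],
    [5, 5, 5],
    [4, 4, 4],
]
overall, per_category = fleiss_kappa(toy_ratings, categories=[2, 3, 4, 5])
print(round(overall, 3), {c: round(k, 3) for c, k in per_category.items()})
```

In this toy data the readers split only between categories 4 and 5, so those two categories get lower category-specific kappa than categories 2 and 3.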
Conclusion: The interobserver agreement and diagnostic sensitivity of O-RADS MRI in assigning categories 4 and 5 were imperfect, indicating a need for structural improvement. Integrating artificial intelligence into MRI protocols may enhance their performance.
Advances in knowledge: Machine learning can classify O-RADS MRI categories with high accuracy. The AUC for malignancy prediction was approximately 74% for O-RADS 4 and 95% for O-RADS 5.
Keywords: Artificial intelligence; Interobserver agreement; Machine learning; O-RADS MRI; Radiomics.
© The Author(s) 2024. Published by Oxford University Press on behalf of the British Institute of Radiology. All rights reserved.