Purpose: To reveal the problems of magnetic resonance imaging (MRI) in diagnosing gastric-type mucin-positive lesions (GMPLs) and gastric-type mucin-negative lesions (GMNLs) of the cervix.
Methods: We selected 172 patients with suspected lobular endocervical glandular hyperplasia and divided their pelvic MR images into a training group (n = 132) and a validation group (n = 40). Six readers, forming three pairs, read the validation-group images twice to obtain the accuracy, area under the curve (AUC), and intraclass correlation coefficient (ICC). For every patient, the readers evaluated three images: a sagittal T2-weighted image (T2WI), an axial T2WI, and an axial T1-weighted image (T1WI). A pre-trained convolutional neural network (CNN) was used to differentiate between GMPLs and GMNLs, with four-fold cross-validation performed on the cases in the training group. The accuracy and AUC of the CNN were then obtained using the MR images in the validation group. For each case, three images (sagittal T2WI, axial T2WI, and axial T1WI) were entered into the CNN. The calculations were performed twice independently. The ICC (2,1) between the first and second CNN runs was evaluated, and these results were compared with those of the readers.
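The four-fold cross-validation on the 132 training cases can be illustrated with a minimal sketch. The abstract does not state how the folds were drawn, so the random shuffling, the seed, and the function name `four_fold_indices` are assumptions made for illustration only, not the authors' procedure.

```python
import numpy as np

def four_fold_indices(n_cases, n_folds=4, seed=0):
    """Return (fit_idx, held_out_idx) pairs for k-fold cross-validation.

    Hypothetical helper: cases are shuffled at random (an assumption; the
    source does not say whether folds were random or stratified).
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_cases)          # shuffled case indices
    folds = np.array_split(order, n_folds)    # near-equal folds
    splits = []
    for i in range(n_folds):
        held_out = folds[i]                   # fold used for checking the model
        fit = np.concatenate(
            [folds[j] for j in range(n_folds) if j != i]
        )                                     # remaining folds used for fitting
        splits.append((fit, held_out))
    return splits

# 132 training cases, as in the Methods
splits = four_fold_indices(132)
```

With 132 cases and four folds, each round fits on 99 cases and holds out 33, and every case is held out exactly once.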
Results: The highest accuracy among the readers was 77.50%. The highest ICC (1,1) between a pair of readers was 0.750. All ICC (2,1) values for the readers were <0.7, indicating poor agreement. The highest accuracy of the CNN was 82.50%. The AUC did not differ significantly between the CNN and the readers. The ICC (2,1) of the CNN was 0.965.
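For readers unfamiliar with the two ICC forms reported here, the following sketch computes ICC (1,1) and ICC (2,1) from a cases-by-raters matrix using the standard Shrout and Fleiss mean-square formulas. The function names and the example matrix are illustrative assumptions; the abstract does not state which software the authors used, and the example values are not study data.

```python
import numpy as np

def icc_1_1(x):
    """ICC (1,1): one-way random effects, single measurement."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape                               # n cases, k raters (or runs)
    grand = x.mean()
    row_means = x.mean(axis=1)
    msb = k * np.sum((row_means - grand) ** 2) / (n - 1)            # between cases
    msw = np.sum((x - row_means[:, None]) ** 2) / (n * (k - 1))     # within cases
    return (msb - msw) / (msb + (k - 1) * msw)

def icc_2_1(x):
    """ICC (2,1): two-way random effects, absolute agreement, single measurement."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between cases
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between raters
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Illustrative ratings only (6 cases x 4 raters), not data from this study
ratings = np.array([
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
])
print(round(icc_1_1(ratings), 2), round(icc_2_1(ratings), 2))
```

Applied to the study design, each row would be one validation case and each column one reading (a reader, or one of the two CNN runs).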
Conclusions: Inter-reader and intra-reader variation in accuracy limits the MRI-based differentiation between GMPLs and GMNLs. The CNN is nearly as accurate as the readers but improves the reproducibility of diagnosis.
Copyright: © 2024 Ohya et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.