Objectives: Transvaginal ultrasound is typically the initial diagnostic approach for detecting endometrial atypical hyperplasia/cancer in patients with postmenopausal bleeding. Although transvaginal ultrasound demonstrates notable sensitivity, its specificity remains limited. The objective of this study was to enhance the diagnostic accuracy of transvaginal ultrasound through the integration of artificial intelligence. Using transvaginal ultrasound images, we aimed to develop an artificial intelligence-based automated segmentation model and an artificial intelligence-based classifier model.
Methods: Patients with postmenopausal bleeding who underwent transvaginal ultrasound and endometrial sampling at Mayo Clinic between 2016 and 2021 were retrospectively included. Manual segmentation of images was performed by four physicians (readers). Patients were classified into cohort A (atypical hyperplasia/cancer) and cohort B (benign) based on the pathologic report of endometrial sampling. A fully automated segmentation model was developed, and its performance in correctly identifying the endometrium was compared with physician-made segmentation using similarity metrics. To develop the classifier model, radiomic features were calculated from the manually segmented regions of interest. These features were used to train a wide range of machine learning-based classifiers. The top performing machine learning classifier was evaluated using a threefold approach, and diagnostic accuracy was assessed through the F1 score and area under the receiver operating characteristic curve (AUC-ROC).
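For readers less familiar with the two evaluation metrics named above, the following is a minimal sketch of how an F1 score and an AUC-ROC can be computed from binary labels and classifier scores. It is not the study's pipeline. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, and the function names are hypothetical. AUC-ROC is computed here through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def auc_roc(y_true, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Synthetic example (NOT study data): 1 = cohort A, 0 = cohort B.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1, 0.05]
threshold = 0.5  # assumed cut-off, chosen only for illustration
y_pred = [1 if s >= threshold else 0 for s in scores]

f1_example = f1_score(y_true, y_pred)
auc_example = auc_roc(y_true, scores)
print(f"F1 = {f1_example:.3f}, AUC-ROC = {auc_example:.3f}")
```

Note that the F1 score depends on the chosen threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is one reason the two are often reported together.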
Results: A total of 302 patients were included. Agreement between automated and reader segmentation was 0.79±0.21 using the Dice coefficient. For the classification task, 92 radiomic features related to pixel texture/shape/intensity were found to be significantly different between cohorts A and B. The threefold evaluation of the top performing classifier model showed an AUC-ROC of 0.90 (range 0.88-0.92) on the validation set and 0.88 (range 0.86-0.91) on the hold-out test set. Sensitivity and specificity were 0.87 (range 0.77-0.94) and 0.86 (range 0.81-0.94), respectively.
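As a point of reference for the Dice coefficient reported above, the sketch below shows the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), applied to two small binary masks. The masks are toy values rather than study images, the function name is hypothetical, and treating two empty masks as perfect agreement is an assumed convention.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as nested lists."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(flat_a) + sum(flat_b)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks (NOT study data): 1 marks pixels labeled as endometrium.
reader_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
model_mask = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

dice_example = dice_coefficient(reader_mask, model_mask)
print(f"Dice = {dice_example:.2f}")  # 3 shared pixels, 4 + 4 labeled: 6/8 = 0.75
```

A value of 1 indicates identical masks and 0 indicates no overlap.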
Conclusions: We trained an artificial intelligence-based algorithm to differentiate endometrial atypical hyperplasia/cancer from benign conditions on transvaginal ultrasound images in a population of patients with postmenopausal bleeding.
Keywords: Endometrial Hyperplasia; Endometrial Neoplasms; Preoperative Care; Uterine Neoplasms.
© IGCS and ESGO 2024. No commercial re-use. See rights and permissions. Published by BMJ.