Objective: To evaluate, for the first time, the accuracy and reproducibility of answers about endometriosis given by the free version of ChatGPT.
Methods: Detailed internet searches were performed to identify frequently asked questions (FAQs) about endometriosis. Scientific questions were prepared in accordance with the European Society of Human Reproduction and Embryology (ESHRE) endometriosis guidelines. An experienced gynecologist scored each ChatGPT answer on a scale of 1-4. Repeatability was assessed by asking each question twice; an answer was considered reproducible if both responses to the same question fell into the same score category.
Results: A total of 91.4% (n = 71) of all FAQs were answered completely, accurately, and sufficiently. ChatGPT had the highest accuracy in the symptoms and diagnosis category (94.1%, 16/17 questions) and the lowest accuracy in the treatment category (81.3%, 13/16 questions). Furthermore, of the 40 questions based on the ESHRE endometriosis guidelines, 27 (67.5%) were classified as grade 1, seven (17.5%) as grade 2, and six (15.0%) as grade 3. The reproducibility rate of FAQs was highest in the prevention, symptoms and diagnosis, and complications categories (100% for each). The reproducibility rate was lowest for questions based on the ESHRE endometriosis guidelines (70.0%).
Conclusion: ChatGPT responded accurately and satisfactorily to more than 90% of the FAQs about endometriosis, but to only 67.5% of the questions based on the ESHRE endometriosis guidelines.
Keywords: ChatGPT; artificial intelligence; endometriosis.
© 2023 The Authors. International Journal of Gynecology & Obstetrics published by John Wiley & Sons Ltd on behalf of International Federation of Gynecology and Obstetrics.