Evaluating The Role of ChatGPT as a Study Aid in Medical Education in Surgery

J Surg Educ. 2024 May;81(5):753-757. doi: 10.1016/j.jsurg.2024.01.014. Epub 2024 Mar 30.

Abstract

Objective: Our aim was to assess how ChatGPT compares to Google search in assisting medical students during their surgery clerkships.

Design: We conducted a crossover study where participants were asked to complete 2 standardized assessments on different general surgery topics before and after they used either Google search or ChatGPT.

Setting: The study was conducted at the Perelman School of Medicine at the University of Pennsylvania (PSOM) in Philadelphia, Pennsylvania.

Participants: 19 third-year medical students participated in our study.

Results: Baseline (preintervention) performance on both quizzes did not differ between the Google search and ChatGPT groups (p = 0.728). Students performed better postintervention overall, and the improvement in test scores was statistically significant for both the Google group (p < 0.001) and the ChatGPT group (p = 0.01). The mean percent increase in test scores from pre- to postintervention was higher in the Google group (11%) than in the ChatGPT group (10%), but this difference was not statistically significant (p = 0.87). Similarly, there was no statistically significant difference in postintervention scores on either assessment between the 2 groups (p = 0.508). Postassessment surveys revealed that all students (100%) had heard of ChatGPT before, and 47% had previously used it for various purposes. On a scale of 1 to 10, with 1 being the lowest and 10 the highest, the feasibility of ChatGPT and its usefulness in finding answers were rated on average as 8.4 and 6.6, respectively. When asked to rate the likelihood of using ChatGPT in their surgery rotation, responses were distributed across 1 to 3 ("unlikely," 47%), 4 to 6 ("intermediate," 26%), and 7 to 10 ("likely," 26%).

Conclusion: Our results show that even though ChatGPT was comparable to Google search in finding answers to surgery questions, many students were reluctant to use ChatGPT for learning purposes during their surgery clerkship.

Keywords: ChatGPT; Google search; Surgical assessments; Surgical clinical rotation; Surgical education.

Publication types

  • Comparative Study

MeSH terms

  • Clinical Clerkship
  • Cross-Over Studies*
  • Education, Medical, Undergraduate / methods
  • Educational Measurement
  • Female
  • General Surgery* / education
  • Humans
  • Internet
  • Male
  • Search Engine
  • Students, Medical / statistics & numerical data