A comparison of drug information question responses by a drug information center and by ChatGPT

Am J Health Syst Pharm. 2024 Oct 25:zxae316. doi: 10.1093/ajhp/zxae316. Online ahead of print.

Abstract

Disclaimer: In an effort to expedite the publication of articles, AJHP is posting manuscripts online as soon as possible after acceptance. Accepted manuscripts have been peer-reviewed and copyedited, but are posted online before technical formatting and author proofing. These manuscripts are not the final version of record and will be replaced with the final article (formatted per AJHP style and proofed by the authors) at a later time.

Purpose: A study was conducted to assess the accuracy and ability of Chat Generative Pre-trained Transformer (ChatGPT) to systematically respond to drug information inquiries relative to responses of a drug information center (DIC).

Methods: Ten drug information questions answered by the DIC in 2022 or 2023 were selected for analysis. Three pharmacists created new ChatGPT accounts and submitted each question to ChatGPT at the same time. Each question was submitted twice to identify consistency in responses. Two days later, the same process was conducted by a fourth pharmacist. Phase 1 of data analysis consisted of a drug information pharmacist assessing all 84 ChatGPT responses for accuracy relative to the DIC responses. In phase 2, 10 ChatGPT responses were selected to be assessed by 3 blinded reviewers. Reviewers utilized an 8-question predetermined rubric to evaluate the ChatGPT and DIC responses.

Results: When comparing the ChatGPT responses (n = 84) to the DIC responses, ChatGPT had an overall accuracy rate of 50%. Accuracy across the different question types varied. In regards to the overall blinded score, ChatGPT responses scored higher than the responses by the DIC according to the rubric (overall scores of 67.5% and 55.0%, respectively). The DIC responses scored higher in the categories of references mentioned and references identified.

Conclusion: Responses generated by ChatGPT have been found to be better than those created by a DIC in clarity and readability; however, the accuracy of ChatGPT responses was lacking. ChatGPT responses to drug information questions would need to be carefully reviewed for accuracy and completeness.

Keywords: ChatGPT; artificial intelligence; drug information; evidence based medicine.