Generative artificial intelligence responses to patient messages in the electronic health record: early lessons learned

Sally L Baxter; Christopher A Longhurst; Marlene Millen; Amy M Sitapati; Ming Tai-Seale

doi:10.1093/jamiaopen/ooae028

JAMIA Open. 2024 Apr 10;7(2):ooae028. doi: 10.1093/jamiaopen/ooae028. eCollection 2024 Jul.

Authors

Sally L Baxter¹,², Christopher A Longhurst², Marlene Millen²,³, Amy M Sitapati²,³, Ming Tai-Seale²,⁴

Affiliations

¹ Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, CA 92093, United States.
² Health Department of Biomedical Informatics, University of California San Diego Health, La Jolla, CA 92093, United States.
³ Division of Internal Medicine, Department of Medicine, University of California San Diego, La Jolla, CA 92093, United States.
⁴ Department of Family Medicine, University of California San Diego, La Jolla, CA 92093, United States.

Abstract

Background: Electronic health record (EHR)-based patient messages can contribute to burnout. Messages with a negative tone are particularly challenging to address. In this perspective, we describe our initial evaluation of large language model (LLM)-generated responses to negative EHR patient messages and contend that using LLMs to generate initial drafts may be feasible, although refinement will be needed.

Methods: A retrospective sample (n = 50) of negative patient messages was extracted from a health system EHR, de-identified, and entered into an LLM (ChatGPT). Qualitative analyses were conducted to compare LLM responses to actual care team responses.

Results: Some LLM-generated draft responses differed from human responses in relational connection, informational content, and recommendations for next steps. Occasionally, the LLM draft responses could have escalated emotionally charged conversations.

Conclusion: Further work is needed to optimize the use of LLMs for responding to negative patient messages in the EHR.

Keywords: ChatGPT; burnout; electronic health records; health services; large language model.

Publication types

Review

Grants and funding

R01 MD014850/MD/NIMHD NIH HHS/United States