Using large language model to guide patients to create efficient and comprehensive clinical care message

Siru Liu; Aileen P Wright; Allison B Mccoy; Sean S Huang; Julian Z Genkins; Josh F Peterson; Yaa A Kumah-Crystal; William Martinez; Babatunde Carew; Dara Mize; Bryan Steitz; Adam Wright

doi:10.1093/jamia/ocae142

J Am Med Inform Assoc. 2024 Aug 1;31(8):1665-1670. doi: 10.1093/jamia/ocae142.

Authors

Siru Liu¹ ², Aileen P Wright¹ ³, Allison B Mccoy¹, Sean S Huang¹ ³, Julian Z Genkins¹ ³, Josh F Peterson¹ ³, Yaa A Kumah-Crystal¹ ⁴, William Martinez³, Babatunde Carew³, Dara Mize¹ ³, Bryan Steitz¹, Adam Wright¹ ³

Affiliations

¹ Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States.
² Department of Computer Science, Vanderbilt University, Nashville, TN 37212, United States.
³ Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States.
⁴ Department of Pediatric Endocrinology, Vanderbilt University Medical Center, Nashville, TN 37232, United States.

Abstract

Objective: This study investigates the feasibility of using large language models (LLMs) to engage with patients while they are drafting a question to their healthcare providers and to generate pertinent follow-up questions that the patient can answer before sending the message. The goal is to ensure that the healthcare provider receives all the information needed to answer the patient's question safely and accurately, eliminating back-and-forth messaging and the associated delays and frustrations.

Methods: We collected a dataset of patient messages sent between January 1, 2022 and March 7, 2023 at Vanderbilt University Medical Center. Two internal medicine physicians identified 7 common scenarios. We used 3 LLMs to generate follow-up questions: (1) Comprehensive LLM Artificial Intelligence Responder (CLAIR), a locally fine-tuned LLM; (2) GPT4 with a simple prompt; and (3) GPT4 with a complex prompt. Five physicians rated the generated questions, along with the actual follow-ups written by healthcare providers, on clarity, completeness, conciseness, and utility.

Results: For five scenarios, our CLAIR model had the best performance. The GPT4 model received higher scores for utility and completeness but lower scores for clarity and conciseness. CLAIR generated follow-up questions with clarity and conciseness similar to those of the actual follow-ups written by healthcare providers, with higher utility than both healthcare providers and GPT4, and with completeness lower than GPT4 but higher than healthcare providers.

Conclusion: LLMs can generate follow-up patient messages, designed to clarify a medical question, that compare favorably to those generated by healthcare providers.

Keywords: clinical decision support; large language model; message content; patient portal; patient-doctor communication; primary health care.

MeSH terms

Artificial Intelligence*
Feasibility Studies
Humans
Physician-Patient Relations
Text Messaging
