Evaluating Large Language Model-Supported Instructions for Medication Use: First Steps Toward a Comprehensive Model

Zilma Silveira Nogueira Reis; Adriana Silvina Pagano; Isaias Jose Ramos de Oliveira; Cristiane Dos Santos Dias; Eura Martins Lage; Erico Franco Mineiro; Glaucia Miranda Varella Pereira; Igor de Carvalho Gomes; Vinicius Araujo Basilio; Ricardo João Cruz-Correia; Davi Dos Reis de Jesus; Antônio Pereira de Souza Júnior; Leonardo Chaves Dutra da Rocha

doi:10.1016/j.mcpdig.2024.09.006

Mayo Clin Proc Digit Health. 2024 Dec;2(4):632-644. doi: 10.1016/j.mcpdig.2024.09.006.

Affiliations

¹ Health Informatics Center, Faculty of Medicine, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
² Arts Faculty, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
³ Department of Pediatrics, Faculty of Medicine, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
⁴ Department of Design Technology, School of Architecture, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
⁵ Department of Obstetrics and Gynecology, Faculty of Medical Sciences, Universidade Estadual de Campinas, Campinas, São Paulo, Brazil.
⁶ Departamento de Medicina da Comunidade Informação e Decisão em Saúde, National Secretary of Primary Care of the Brazilian Ministry of Health, Brasília, Brazil.
⁷ MEDCIDS, Porto University, Porto, Portugal.
⁸ Department of Computer Science, Universidade Federal de São João Del Rei, São João del Rei, Minas Gerais, Brazil.

Abstract

Objective: To assess the support of large language models (LLMs) in generating clearer and more personalized medication instructions to enhance e-prescription.

Patients and methods: We established patient-centered guidelines for adequate, acceptable, and personalized directions to enhance e-prescription. A dataset comprising 104 outpatient scenarios, covering an array of medications, administration routes, and patient conditions, was developed following the Brazilian national e-prescribing standard. Three prompts were submitted to a closed-source LLM. The first prompt was a generic command, the second was calibrated for content enhancement and personalization, and the third requested bias mitigation. The third prompt was also submitted to an open-source LLM. Outputs were assessed using automated metrics and human evaluation. We conducted the study between March 1, 2024, and September 10, 2024.
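The prompt-by-model design described above can be sketched as a small run grid. This is a minimal illustration only: the prompt wording, the `Scenario` fields, the model labels, and the `build_runs` helper are all hypothetical and are not taken from the study, and the sketch assumes each scenario is submitted once per prompt.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical prompt templates, one per tier named in the abstract.
# The wording is illustrative and is NOT the study's actual prompts.
PROMPT_TIERS = {
    "generic": "Write directions for use for this prescription: {rx}",
    "personalized": (
        "Write clear, complete directions for use for this prescription: {rx}. "
        "Tailor them to this patient: {patient}"
    ),
    "bias_mitigated": (
        "Write clear, complete directions for use for this prescription: {rx}. "
        "Tailor them to this patient: {patient}. "
        "Avoid gendered or otherwise biased language."
    ),
}

# Which prompt tiers go to which model, as stated in the abstract:
# all three to the closed-source LLM, only the third to the open-source LLM.
DESIGN = {
    "closed_source": ["generic", "personalized", "bias_mitigated"],
    "open_source": ["bias_mitigated"],
}


@dataclass(frozen=True)
class Scenario:
    """One outpatient scenario (hypothetical, simplified fields)."""
    scenario_id: int
    rx: str       # prescription item, e.g. drug, dose, route
    patient: str  # short description of the patient's condition


def build_runs(scenarios):
    """Return (model, tier, scenario_id, prompt_text) for every planned run."""
    runs = []
    for model, tiers in DESIGN.items():
        for tier, sc in product(tiers, scenarios):
            # str.format ignores unused keyword arguments, so the generic
            # template can safely receive the patient field as well.
            text = PROMPT_TIERS[tier].format(rx=sc.rx, patient=sc.patient)
            runs.append((model, tier, sc.scenario_id, text))
    return runs
```

Under the one-run-per-prompt assumption, 104 scenarios would produce 312 closed-source requests and 104 open-source requests, each of which would then go on to automated and human evaluation.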

Results: Adequacy scores for the closed-source LLM's output showed the third prompt outperforming the first and second ones. Full or partial acceptability was achieved in 94.3% of texts with the third prompt. Personalization was rated highly, especially with the second and third prompts. The 2 LLMs showed similar adequacy results. Lack of scientific evidence and factual errors were infrequent and unrelated to a particular prompt or LLM. The frequency of hallucinations differed between the 2 LLMs; hallucinations concerned prescriptions issued upon symptom manifestation and medications requiring dosage adjustment or involving intermittent use. Gender bias was found in the closed-source LLM's output for the first and second prompts, whereas output for the third prompt was bias-free. The open-source LLM's output was also bias-free.

Conclusion: This study demonstrates the potential of LLM-supported generation to produce prescription directions and improve communication between health professionals and patients within the e-prescribing system.