Objectives: Emergency medical triage is crucial for prioritizing patient care, yet its effectiveness can vary considerably with the experience and training of the personnel involved. This study evaluates whether integrating Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs), specifically OpenAI's GPT models, can standardize triage procedures and reduce variability in emergency care.
Methods: We created 100 simulated triage scenarios based on modified cases from the Japanese National Examination for Emergency Medical Technicians. The RAG-enhanced LLMs processed each scenario, taking patient vital signs, symptoms, and observations from emergency medical services (EMS) teams as inputs. The primary outcome was the accuracy of triage classification, compared between the RAG-enhanced LLMs, emergency medical technicians, and emergency physicians. Secondary outcomes were the rates of under-triage and over-triage.
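The outcome measures above can be illustrated with a minimal sketch. It assumes that triage categories are encoded as integers in which a larger value means higher urgency, that under-triage means assigning a lower urgency than the reference answer, and that over-triage means assigning a higher one. The function name `triage_rates` and the sample data are hypothetical and are not taken from the study.

```python
from typing import Sequence


def triage_rates(predicted: Sequence[int], reference: Sequence[int]) -> dict[str, float]:
    """Return accuracy, under-triage rate, and over-triage rate.

    Assumption (not from the study): acuity levels are integers and a
    larger value means a more urgent category.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one scenario is required")

    n = len(reference)
    pairs = list(zip(predicted, reference))
    correct = sum(p == r for p, r in pairs)  # assigned level matches the reference
    under = sum(p < r for p, r in pairs)     # assigned level is less urgent than the reference
    over = sum(p > r for p, r in pairs)      # assigned level is more urgent than the reference
    return {
        "accuracy": correct / n,
        "under_triage": under / n,
        "over_triage": over / n,
    }


# Hypothetical data for five scenarios (illustrative only).
reference = [3, 2, 1, 3, 0]
predicted = [3, 1, 1, 3, 2]
rates = triage_rates(predicted, reference)
```

Under these assumptions the three rates sum to 1, so reporting accuracy alongside the under-triage rate also fixes the over-triage rate.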
Results: Generative Pre-trained Transformer 3.5 (GPT-3.5) with RAG achieved a correct triage rate of 70%, significantly outperforming emergency medical technicians (EMTs), with correct rates of 35% and 38%, and emergency physicians, with correct rates of 50% and 47% (p < 0.05). This model also had a substantially lower under-triage rate of 8%, compared with 33% for GPT-3.5 without RAG and 39% for GPT-4 without RAG.
Conclusions: The integration of RAG with LLMs shows promise in improving the accuracy and consistency of medical assessments in emergency settings. Further validation in diverse medical settings with broader datasets is necessary to confirm the effectiveness and adaptability of these technologies in live environments.