Local large language models for privacy-preserving accelerated review of historic echocardiogram reports

Akhil Vaid; Son Q Duong; Joshua Lampert; Patricia Kovatch; Robert Freeman; Edgar Argulian; Lori Croft; Stamatios Lerakis; Martin Goldman; Rohan Khera; Girish N Nadkarni

doi:10.1093/jamia/ocae085

J Am Med Inform Assoc. 2024 Sep 1;31(9):2097-2102. doi: 10.1093/jamia/ocae085.

Authors

Akhil Vaid¹ ², Son Q Duong¹ ³, Joshua Lampert¹ ⁴, Patricia Kovatch⁵, Robert Freeman⁶, Edgar Argulian⁷ ⁸, Lori Croft⁷ ⁸, Stamatios Lerakis⁷ ⁸, Martin Goldman⁷ ⁸, Rohan Khera⁹, Girish N Nadkarni¹ ²

Affiliations

¹ The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States.
² The Division of Data Driven and Digital Medicine (D3M), Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States.
³ Division of Pediatric Cardiology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States.
⁴ Helmsley Electrophysiology Center, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States.
⁵ Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States.
⁶ Department of Population Health Science and Policy, Institute for Healthcare Delivery Science, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States.
⁷ The Zena and Michael A. Wiener Cardiovascular Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States.
⁸ Mount Sinai Heart, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States.
⁹ Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT 06510, United States.

PMID: 38687616
PMCID: PMC11339495 (available on 2025-04-30)
DOI: 10.1093/jamia/ocae085

Abstract

Objectives: The study developed a framework that leverages an open-source Large Language Model (LLM) to enable clinicians to ask plain-language questions about a patient's entire echocardiogram report history. This approach is intended to streamline the extraction of clinical insights from multiple echocardiogram reports, particularly in patients with complex cardiac diseases, thereby enhancing both patient care and research efficiency.

Materials and methods: Echocardiogram reports spanning more than 10 years were collected from patients with more than 10 echocardiograms on file at the Mount Sinai Health System. Each patient's reports were combined into a single document for analysis. This document was broken down into snippets, and the relevant snippets were retrieved using text similarity measures. The LLaMA-2 70B model was employed to analyze the text using a specially crafted prompt. The model's performance was evaluated against ground-truth answers created by faculty cardiologists.
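
The retrieval step described above can be illustrated with a minimal Python sketch. The abstract does not name the similarity measure, snippet size, or prompt wording, so everything here is an assumption for illustration: bag-of-words cosine similarity stands in for the "text similarity measures", and the function names (split_into_snippets, retrieve, build_prompt), the window size, and the prompt layout are hypothetical rather than taken from the study.

```python
import math
import re
from collections import Counter


def split_into_snippets(document, max_words=40):
    """Break one patient's combined report text into fixed-size word windows.

    The window size is an assumption; the abstract does not state how
    snippets were formed.
    """
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def bag_of_words(text):
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, snippets, top_k=2):
    """Return the top_k snippets most similar to the question."""
    query = bag_of_words(question)
    ranked = sorted(
        snippets,
        key=lambda snippet: cosine_similarity(query, bag_of_words(snippet)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, retrieved):
    """Assemble a hypothetical prompt that a local LLM could be given."""
    context = "\n---\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

In use, one would call split_into_snippets on a patient's combined document, pass the result with the clinician's question to retrieve, and send the output of build_prompt to a locally hosted model. A production system would more likely use dense embeddings than raw token counts; the sketch only shows the shape of the pipeline.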

Results: The study analyzed 432 reports from 37 patients for a total of 100 question-answer pairs. The LLM correctly answered 90% of questions, with accuracies of 83% for temporality, 93% for severity assessment, 84% for intervention identification, and 100% for diagnosis retrieval. Errors mainly stemmed from the LLM's inherent limitations, such as misinterpreting numbers or hallucinations.

Conclusion: The study demonstrates the feasibility and effectiveness of using a local, open-source LLM for querying and interpreting echocardiogram report data. This approach offers a significant improvement over traditional keyword-based searches, enabling more contextually relevant and semantically accurate responses. It shows promise for enhancing clinical decision-making and research by facilitating more efficient access to complex patient data.

Keywords: LLM; echocardiograms; generative AI; large language models; open-source; privacy.

MeSH terms

Confidentiality
Echocardiography*
Electronic Health Records*
Heart Diseases / diagnostic imaging
Humans
Information Storage and Retrieval / methods
Natural Language Processing*
