Comparative Analysis of Open-Source Language Models in Summarizing Medical Text Data

Chen, Yuhao; Wang, Zhimu; Wen, Bo; Zulkernine, Farhana

arXiv:2405.16295 (cs)

[Submitted on 25 May 2024 (v1), last revised 29 May 2024 (this version, v3)]

Abstract: Unstructured text in medical notes and dialogues contains rich information. Recent advancements in Large Language Models (LLMs) have demonstrated superior performance in question answering and summarization tasks on unstructured text data, outperforming traditional text analysis approaches. However, there is a lack of scientific studies in the literature that methodically evaluate and report on the performance of different LLMs, specifically for domain-specific data such as medical chart notes. We propose an evaluation approach to analyze the performance of open-source LLMs such as Llama2 and Mistral for medical summarization tasks, using GPT-4 as an assessor. Our innovative approach to quantitative evaluation of LLMs can enable quality control, support the selection of effective LLMs for specific tasks, and advance knowledge discovery in digital health.

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2405.16295 [cs.CL]
	(or arXiv:2405.16295v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2405.16295

Submission history

From: Yuhao Chen
[v1] Sat, 25 May 2024 16:16:22 UTC (758 KB)
[v2] Tue, 28 May 2024 02:22:20 UTC (754 KB)
[v3] Wed, 29 May 2024 20:40:32 UTC (1,569 KB)
