Multi-document Summarization: A Comparative Evaluation

Hewapathirana, Kushan; de Silva, Nisansa; Athuraliya, C. D.

doi:10.1109/ICIIS58898.2023.10253581

arXiv:2309.04951 (cs)

[Submitted on 10 Sep 2023 (v1), last revised 12 Sep 2023 (this version, v2)]

Authors: Kushan Hewapathirana (1 and 2), Nisansa de Silva (1), C.D. Athuraliya (2) ((1) Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka, (2) ConscientAI, Sri Lanka)

Abstract: This paper evaluates state-of-the-art models for Multi-document Summarization (MDS) on datasets from various domains and investigates the limitations of existing models in order to identify future research directions. To this end, we conducted an extensive literature review to identify state-of-the-art models and datasets. We analyzed the performance of the PRIMERA and PEGASUS models on the BigSurvey-MDS and MS$^2$ datasets, which pose unique challenges due to their varied domains. Our findings show that the general-purpose pre-trained model LED outperforms PRIMERA and PEGASUS on the MS$^2$ dataset. We used the ROUGE score as the performance metric for evaluating the identified models on the different datasets. Our study provides insights into the models' strengths and weaknesses, as well as their applicability in different domains. This work serves as a reference for future MDS research and contributes to the development of accurate and robust models that can be used both on demanding datasets containing academically and/or scientifically complex data and on generalized, relatively simple datasets.
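
Since the abstract names ROUGE as the evaluation metric, a minimal sketch of the n-gram overlap behind ROUGE-N may help readers unfamiliar with it. This is not the authors' evaluation code: the function names `ngrams` and `rouge_n` are illustrative, tokenization is assumed to be lowercased whitespace splitting, and standard ROUGE toolkits add options such as stemming and the ROUGE-L variant that are omitted here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: clipped n-gram overlap between two strings.

    Assumption: lowercased whitespace tokenization, no stemming.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n(generated, gold, n=1))
    print(rouge_n(generated, gold, n=2))
```

Recall measures how much of the reference summary is covered by the generated summary, precision measures how much of the generated summary appears in the reference, and the F1 value combines the two.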

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2309.04951 [cs.CL]
	(or arXiv:2309.04951v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2309.04951
Related DOI:	https://doi.org/10.1109/ICIIS58898.2023.10253581

Submission history

From: K M Hewapathirana
[v1] Sun, 10 Sep 2023 07:43:42 UTC (673 KB)
[v2] Tue, 12 Sep 2023 04:19:49 UTC (653 KB)
