EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training

Gu, Yuxian; Wen, Jiaxin; Sun, Hao; Song, Yi; Ke, Pei; Zheng, Chujie; Zhang, Zheng; Yao, Jianzhu; Liu, Lei; Zhu, Xiaoyan; Huang, Minlie

doi:10.1007/s11633-022-1387-3

Computer Science > Computation and Language

arXiv:2203.09313 (cs)

[Submitted on 17 Mar 2022 (v1), last revised 21 Oct 2023 (this version, v3)]

Title: EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training

Authors: Yuxian Gu, Jiaxin Wen, Hao Sun, Yi Song, Pei Ke, Chujie Zheng, Zheng Zhang, Jianzhu Yao, Lei Liu, Xiaoyan Zhu, Minlie Huang

Abstract: Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous work mainly focuses on showing and evaluating the conversational performance of released dialogue models, neglecting several key factors in building a powerful, human-like chatbot, especially in Chinese scenarios. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture design, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and code publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting some failure cases, and we pose some future research directions for large-scale Chinese open-domain dialogue systems.

Comments:	Machine Intelligence Research. 12 pages, 5 figures. The code and pre-trained models are publicly available.
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2203.09313 [cs.CL]
	(or arXiv:2203.09313v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2203.09313
Related DOI:	https://doi.org/10.1007/s11633-022-1387-3

Submission history

From: Yuxian Gu
[v1] Thu, 17 Mar 2022 13:33:17 UTC (587 KB)
[v2] Sat, 21 May 2022 12:08:03 UTC (700 KB)
[v3] Sat, 21 Oct 2023 15:36:48 UTC (3,987 KB)
