An Empirical Study of Mini-Batch Creation Strategies for Neural Machine Translation

Morishita, Makoto; Oda, Yusuke; Neubig, Graham; Yoshino, Koichiro; Sudoh, Katsuhito; Nakamura, Satoshi

arXiv:1706.05765 (cs)

[Submitted on 19 Jun 2017]

Abstract: Training of neural machine translation (NMT) models usually uses mini-batches for efficiency purposes. During the mini-batched training process, it is necessary to pad shorter sentences in a mini-batch to be equal in length to the longest sentence therein for efficient computation. Previous work has noted that sorting the corpus based on the sentence length before making mini-batches reduces the amount of padding and increases the processing speed. However, despite the fact that mini-batch creation is an essential step in NMT training, widely used NMT toolkits implement disparate strategies for doing so, which have not been empirically validated or compared. This work investigates mini-batch creation strategies with experiments over two different datasets. Our results suggest that the choice of a mini-batch creation strategy has a large effect on NMT training and some length-based sorting strategies do not always work well compared with simple shuffling.
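To make the padding argument concrete, the following is a minimal sketch, not the authors' implementation, that counts the pad tokens needed under two batching strategies: plain shuffling, and sorting by sentence length before cutting the corpus into batches. The function names, the toy corpus, the batch size of 32, and the choice to shuffle the order of the length-sorted batches are all assumptions made for illustration. The sketch measures padding only; it says nothing about translation quality, which the abstract reports can suffer under some length-based sorting strategies.

```python
import random


def make_batches(sentences, batch_size):
    """Cut a list of token lists into consecutive mini-batches."""
    return [sentences[i:i + batch_size]
            for i in range(0, len(sentences), batch_size)]


def count_padding(batches):
    """Count pad tokens needed to bring each sentence up to the
    length of the longest sentence in its own mini-batch."""
    total = 0
    for batch in batches:
        longest = max(len(s) for s in batch)
        total += sum(longest - len(s) for s in batch)
    return total


def shuffled_batches(sentences, batch_size, seed=0):
    """Hypothetical baseline: shuffle the corpus, then batch it."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    return make_batches(shuffled, batch_size)


def length_sorted_batches(sentences, batch_size, seed=0):
    """Hypothetical length-based strategy: sort by sentence length,
    batch, then shuffle the order of the batches (an assumption;
    toolkits differ on this step)."""
    batches = make_batches(sorted(sentences, key=len), batch_size)
    random.Random(seed).shuffle(batches)
    return batches


# Toy corpus: 1000 "sentences" of 1 to 50 placeholder tokens.
rng = random.Random(1)
corpus = [["tok"] * rng.randint(1, 50) for _ in range(1000)]

pad_shuffled = count_padding(shuffled_batches(corpus, 32))
pad_sorted = count_padding(length_sorted_batches(corpus, 32))
print("pad tokens, shuffled:     ", pad_shuffled)
print("pad tokens, length-sorted:", pad_sorted)
```

On a corpus like this the length-sorted batches need far fewer pad tokens, which is the speed benefit the abstract attributes to previous work. The trade-off is that each batch then contains sentences of nearly the same length, so the batches are less varied than under simple shuffling.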

Comments:	8 pages, accepted to the First Workshop on Neural Machine Translation
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1706.05765 [cs.CL]
	(or arXiv:1706.05765v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1706.05765

Submission history

From: Makoto Morishita
[v1] Mon, 19 Jun 2017 02:38:01 UTC (3,539 KB)
