Token-wise Curriculum Learning for Neural Machine Translation

Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Tuo Zhao

Abstract

Existing curriculum learning approaches to Neural Machine Translation (NMT) require sampling sufficient amounts of “easy” samples from training data at the early training stage. This is not always achievable for low-resource languages, where the amount of training data is limited. To address this limitation, we propose a novel token-wise curriculum learning approach that creates sufficient amounts of easy samples. Specifically, the model learns to predict a short sub-sequence from the beginning of each target sentence at the early stage of training. The sub-sequence is then gradually expanded as training progresses. This curriculum design is inspired by the cumulative effect of translation errors, which makes later tokens more challenging to predict than earlier ones. Extensive experiments show that our approach consistently outperforms baselines on five language pairs, especially for low-resource languages. Combining our approach with sentence-level methods further improves performance on high-resource languages.
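
The prefix-based curriculum described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear growth schedule, the `min_frac` starting fraction, and the function names `prefix_length`, `loss_mask`, and `masked_nll` are all assumptions made for illustration. The sketch only shows the general idea of restricting the training loss to a leading sub-sequence of each target sentence and widening that sub-sequence over training.

```python
import math


def prefix_length(target_len, step, total_steps, min_frac=0.1):
    """Number of leading target tokens that count toward the loss at `step`.

    Assumption: the prefix grows linearly from `min_frac` of the sentence
    to the full sentence. The abstract does not specify the schedule.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    frac = min_frac + (1.0 - min_frac) * progress
    # Always keep at least one token and never exceed the sentence length.
    return max(1, min(target_len, math.ceil(frac * target_len)))


def loss_mask(target_len, step, total_steps, min_frac=0.1):
    """1.0 for tokens inside the current prefix, 0.0 for the rest."""
    k = prefix_length(target_len, step, total_steps, min_frac)
    return [1.0] * k + [0.0] * (target_len - k)


def masked_nll(token_nll, mask):
    """Average per-token negative log-likelihood over unmasked tokens."""
    total = sum(loss * m for loss, m in zip(token_nll, mask))
    return total / sum(mask)


if __name__ == "__main__":
    # Hypothetical per-token losses for a 10-token target sentence.
    token_nll = [0.5, 0.7, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9, 2.1, 2.3]
    for step in (0, 50, 100):
        mask = loss_mask(len(token_nll), step, total_steps=100)
        print(step, int(sum(mask)), masked_nll(token_nll, mask))
```

Because every training sentence yields an "easy" prefix under this scheme, no sentences need to be filtered out early in training, which is the property the abstract highlights for low-resource settings.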

Anthology ID: 2021.findings-emnlp.310
Volume: Findings of the Association for Computational Linguistics: EMNLP 2021
Month: November
Year: 2021
Address: Punta Cana, Dominican Republic
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue: Findings
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 3658–3670
URL: https://aclanthology.org/2021.findings-emnlp.310
DOI: 10.18653/v1/2021.findings-emnlp.310
Cite (ACL): Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, and Tuo Zhao. 2021. Token-wise Curriculum Learning for Neural Machine Translation. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3658–3670, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Token-wise Curriculum Learning for Neural Machine Translation (Liang et al., Findings 2021)
PDF: https://aclanthology.org/2021.findings-emnlp.310.pdf
Video: https://aclanthology.org/2021.findings-emnlp.310.mp4
Data: WMT 2016
