Computer Science > Computation and Language

arXiv:2212.10551 (cs)

[Submitted on 20 Dec 2022 (v1), last revised 19 Jul 2023 (this version, v3)]

Title: Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation

Authors: Fei Yuan, Yinquan Lu, WenHao Zhu, Lingpeng Kong, Lei Li, Yu Qiao, Jingjing Xu


Abstract: Multilingual neural machine translation (MNMT) aims to build a unified model for many language directions. Existing monolithic models for MNMT encounter two challenges: parameter interference among languages and inefficient inference for large models. In this paper, we revisit the classic multi-way structures and develop a detachable model by assigning each language (or group of languages) to an individual branch that supports plug-and-play training and inference. To address the needs of learning representations for all languages in a unified space, we propose a novel efficient training recipe, upon which we build an effective detachable model, Lego-MT. For a fair comparison, we collect data from OPUS and build a translation benchmark covering 433 languages and 1.3B parallel data. Experiments show that Lego-MT with 1.2B parameters brings an average gain of 3.2 spBLEU. It even outperforms M2M-100 with 12B parameters. The proposed training recipe brings a 28.2× speedup over the conventional multi-way training method.

Comments:	ACL 2023 Findings
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2212.10551 [cs.CL]
	(or arXiv:2212.10551v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2212.10551

Submission history

From: Fei Yuan
[v1] Tue, 20 Dec 2022 18:54:08 UTC (1,371 KB)
[v2] Mon, 29 May 2023 03:39:44 UTC (14,610 KB)
[v3] Wed, 19 Jul 2023 05:52:32 UTC (14,609 KB)
