Pseudo-Label Training and Model Inertia in Neural Machine Translation

Hsu, Benjamin; Currey, Anna; Niu, Xing; Nădejde, Maria; Dinu, Georgiana

arXiv:2305.11808 (cs)

[Submitted on 19 May 2023]

Abstract: Like many other machine learning applications, neural machine translation (NMT) benefits from over-parameterized deep neural models. However, these models have been observed to be brittle: NMT model predictions are sensitive to small input changes and can show significant variation across re-training or incremental model updates. This work studies a frequently used method in NMT, pseudo-label training (PLT), which is common to the related techniques of forward-translation (or self-training) and sequence-level knowledge distillation. While the effect of PLT on quality is well-documented, we highlight a lesser-known effect: PLT can enhance a model's stability to model updates and input perturbations, a set of properties we call model inertia. We study inertia effects under different training settings and we identify distribution simplification as a mechanism behind the observed results.

Comments:	Accepted at ICLR 2023
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2305.11808 [cs.CL]
	(or arXiv:2305.11808v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.11808

Submission history

From: Benjamin Hsu
[v1] Fri, 19 May 2023 16:45:19 UTC (6,790 KB)
