Search | arXiv e-print repository

Improving Word Sense Disambiguation in Neural Machine Translation with Salient Document Context

Authors: Elijah Rippeth, Marine Carpuat, Kevin Duh, Matt Post

Abstract: Lexical ambiguity is a challenging and pervasive problem in machine translation (\mt). We introduce a simple and scalable approach to resolve translation ambiguity by incorporating a small amount of extra-sentential context in neural \mt. Our approach requires no sense annotation and no change to standard model architectures. Since actual document context is not available for the vast majority of… ▽ More Lexical ambiguity is a challenging and pervasive problem in machine translation (\mt). We introduce a simple and scalable approach to resolve translation ambiguity by incorporating a small amount of extra-sentential context in neural \mt. Our approach requires no sense annotation and no change to standard model architectures. Since actual document context is not available for the vast majority of \mt training data, we collect related sentences for each input to construct pseudo-documents. Salient words from pseudo-documents are then encoded as a prefix to each source sentence to condition the generation of the translation. To evaluate, we release \docmucow, a challenge set for translation disambiguation based on the English-German \mucow \cite{raganato-etal-2020-evaluation} augmented with document IDs. Extensive experiments show that our method translates ambiguous source words better than strong sentence-level baselines and comparable document-level baselines while reducing training costs. △ Less

Submitted 26 November, 2023; originally announced November 2023.

arXiv:2210.12727 [pdf, other]

Additive Interventions Yield Robust Multi-Domain Machine Translation Models

Authors: Elijah Rippeth, Matt Post

Abstract: Additive interventions are a recently-proposed mechanism for controlling target-side attributes in neural machine translation. In contrast to tag-based approaches which manipulate the raw source sequence, interventions work by directly modulating the encoder representation of all tokens in the sequence. We examine the role of additive interventions in a large-scale multi-domain machine translation… ▽ More Additive interventions are a recently-proposed mechanism for controlling target-side attributes in neural machine translation. In contrast to tag-based approaches which manipulate the raw source sequence, interventions work by directly modulating the encoder representation of all tokens in the sequence. We examine the role of additive interventions in a large-scale multi-domain machine translation setting and compare its performance in various inference scenarios. We find that while the performance difference is small between intervention-based systems and tag-based systems when the domain label matches the test domain, intervention-based systems are robust to label error, making them an attractive choice under label uncertainty. Further, we find that the superiority of single-domain fine-tuning comes under question when training data size is scaled, contradicting previous findings. △ Less

Submitted 23 October, 2022; originally announced October 2022.

Comments: 7 pages, 7 figures, WMT22 (Research Track)

arXiv:2205.06644 [pdf, other]

Controlling Translation Formality Using Pre-trained Multilingual Language Models

Authors: Elijah Rippeth, Sweta Agrawal, Marine Carpuat

Abstract: This paper describes the University of Maryland's submission to the Special Task on Formality Control for Spoken Language Translation at \iwslt, which evaluates translation from English into 6 languages with diverse grammatical formality markers. We investigate to what extent this problem can be addressed with a \textit{single multilingual model}, simultaneously controlling its output for target l… ▽ More This paper describes the University of Maryland's submission to the Special Task on Formality Control for Spoken Language Translation at \iwslt, which evaluates translation from English into 6 languages with diverse grammatical formality markers. We investigate to what extent this problem can be addressed with a \textit{single multilingual model}, simultaneously controlling its output for target language and formality. Results show that this strategy can approach the translation quality and formality control achieved by dedicated translation models. However, the nature of the underlying pre-trained language model and of the finetuning samples greatly impact results. △ Less

Submitted 13 May, 2022; originally announced May 2022.

Comments: 9 pages, 2 figures, IWSLT22 camera-ready (system paper @ ACL-IWSLT Shared Task on Formality Control for Spoken Language Translation)

Showing 1–3 of 3 results for author: Rippeth, E