Showing 1–9 of 9 results for author: Vilar, D

Searching in archive cs.
  1. arXiv:2406.02832

    cs.CL cs.LG

    Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms

    Authors: Firas Trabelsi, David Vilar, Mara Finkelstein, Markus Freitag

    Abstract: Minimum Bayes Risk (MBR) decoding is a powerful decoding strategy widely used for text generation tasks, but its quadratic computational complexity limits its practical application. This paper presents a novel approach for approximating MBR decoding using matrix completion techniques, focusing on the task of machine translation. We formulate MBR decoding as a matrix completion problem, where the u…

    Submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2311.05350

    cs.CL

    There's no Data Like Better Data: Using QE Metrics for MT Data Filtering

    Authors: Jan-Thorsten Peter, David Vilar, Daniel Deutsch, Mara Finkelstein, Juraj Juraska, Markus Freitag

    Abstract: Quality Estimation (QE), the evaluation of machine translation output without the need of explicit references, has seen big improvements in the last years with the use of neural metrics. In this paper we analyze the viability of using QE metrics for filtering out bad quality sentence pairs in the training data of neural machine translation systems (NMT). While most corpus filtering methods are foc…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: to be published at WMT23

  3. arXiv:2310.06707

    cs.CL cs.AI

    Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model

    Authors: Christian Tomani, David Vilar, Markus Freitag, Colin Cherry, Subhajit Naskar, Mara Finkelstein, Xavier Garcia, Daniel Cremers

    Abstract: Maximum-a-posteriori (MAP) decoding is the most widely used decoding strategy for neural machine translation (NMT) models. The underlying assumption is that model probability correlates well with human judgment, with better translations getting assigned a higher score by the model. However, research has shown that this assumption does not always hold, and generation quality can be improved by deco…

    Submitted 25 March, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  4. arXiv:2211.09102

    cs.CL

    Prompting PaLM for Translation: Assessing Strategies and Performance

    Authors: David Vilar, Markus Freitag, Colin Cherry, Jiaming Luo, Viresh Ratnakar, George Foster

    Abstract: Large language models (LLMs) that have been trained on multilingual but not parallel text exhibit a remarkable ability to translate between languages. We probe this ability in an in-depth study of the Pathways Language Model (PaLM), which has demonstrated the strongest machine translation (MT) performance among similarly-trained LLMs to date. We investigate various strategies for choosing translat…

    Submitted 25 June, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: ACL 2023

  5. arXiv:2112.03052

    cs.LG cs.CL cs.CV

    Scaling Up Influence Functions

    Authors: Andrea Schioppa, Polina Zablotskaia, David Vilar, Artem Sokolov

    Abstract: We address efficient calculation of influence functions for tracking predictions back to the training data. We propose and analyze a new approach to speeding up the inverse Hessian calculation based on Arnoldi iteration. With this improvement, we achieve, to the best of our knowledge, the first successful implementation of influence functions that scales to full-size (language and vision) Transfor…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: Published at AAAI-22

  6. arXiv:2110.06997

    cs.CL cs.AI

    Bandits Don't Follow Rules: Balancing Multi-Facet Machine Translation with Multi-Armed Bandits

    Authors: Julia Kreutzer, David Vilar, Artem Sokolov

    Abstract: Training data for machine translation (MT) is often sourced from a multitude of large corpora that are multi-faceted in nature, e.g. containing contents from multiple domains or different levels of quality or complexity. Naturally, these facets do not occur with equal frequency, nor are they equally important for the test scenario at hand. In this work, we propose to optimize this balance jointly…

    Submitted 13 October, 2021; originally announced October 2021.

    Comments: EMNLP Findings 2021

  7. arXiv:2008.04885

    cs.CL

    The Sockeye 2 Neural Machine Translation Toolkit at AMTA 2020

    Authors: Tobias Domhan, Michael Denkowski, David Vilar, Xing Niu, Felix Hieber, Kenneth Heafield

    Abstract: We present Sockeye 2, a modernized and streamlined version of the Sockeye neural machine translation (NMT) toolkit. New features include a simplified code base through the use of MXNet's Gluon API, a focus on state of the art model architectures, distributed mixed precision training, and efficient CPU decoding with 8-bit quantization. These improvements result in faster training and inference, hig…

    Submitted 11 August, 2020; originally announced August 2020.

  8. arXiv:1804.06609

    cs.CL

    Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation

    Authors: Matt Post, David Vilar

    Abstract: The end-to-end nature of neural machine translation (NMT) removes many ways of manually guiding the translation process that were available in older paradigms. Recent work, however, has introduced a new capability: lexically constrained or guided decoding, a modification to beam search that forces the inclusion of pre-specified words and phrases in the output. However, while theoretically sound, e…

    Submitted 9 November, 2018; v1 submitted 18 April, 2018; originally announced April 2018.

    Comments: 11 pages, 9 figures, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

  9. arXiv:1712.05690

    cs.CL cs.LG stat.ML

    Sockeye: A Toolkit for Neural Machine Translation

    Authors: Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton, Matt Post

    Abstract: We describe Sockeye (version 1.12), an open-source sequence-to-sequence toolkit for Neural Machine Translation (NMT). Sockeye is a production-ready framework for training and applying models as well as an experimental platform for researchers. Written in Python and built on MXNet, the toolkit offers scalable training and inference for the three most prominent encoder-decoder architectures: attenti…

    Submitted 1 June, 2018; v1 submitted 15 December, 2017; originally announced December 2017.