Showing 1–15 of 15 results for author: Hasler, E

  1. arXiv:2405.20089  [pdf, other]

    cs.CL

    The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities

    Authors: David Stap, Eva Hasler, Bill Byrne, Christof Monz, Ke Tran

    Abstract: Fine-tuning large language models (LLMs) for machine translation has shown improvements in overall translation quality. However, it is unclear what is the impact of fine-tuning on desirable LLM behaviors that are not present in neural machine translation models, such as steerability, inherent document-level translation abilities, and the ability to produce less literal translations. We perform an…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL 2024 (long, main)

  2. arXiv:2404.11288  [pdf, other]

    cs.CL

    A Preference-driven Paradigm for Enhanced Translation with Large Language Models

    Authors: Dawei Zhu, Sony Trenous, Xiaoyu Shen, Dietrich Klakow, Bill Byrne, Eva Hasler

    Abstract: Recent research has shown that large language models (LLMs) can achieve remarkable translation performance through supervised fine-tuning (SFT) using only a small amount of parallel data. However, SFT simply instructs the model to imitate the reference translations at the token level, making it vulnerable to the noise present in the references. Hence, the assistance from SFT often reaches a platea…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024 (long, main)

  3. arXiv:2312.00536  [pdf, other]

    cs.CL

    Trained MT Metrics Learn to Cope with Machine-translated References

    Authors: Jannis Vamvas, Tobias Domhan, Sony Trenous, Rico Sennrich, Eva Hasler

    Abstract: Neural metrics trained on human evaluations of MT tend to correlate well with human judgments, but their behavior is not fully understood. In this paper, we perform a controlled experiment and compare a baseline metric that has not been trained on human evaluations (Prism) to a trained version of the same metric (Prism+FT). Surprisingly, we find that Prism+FT becomes more robust to machine-transla…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: WMT 2023

  4. arXiv:2302.08920  [pdf, other]

    econ.GN

    A tale of two tails: 130 years of growth-at-risk

    Authors: Martin Gächter, Elias Hasler, Florian Huber

    Abstract: We extend the existing growth-at-risk (GaR) literature by examining a long time period of 130 years in a time-varying parameter regression model. We identify several important insights for policymakers. First, both the level as well as the determinants of GaR vary significantly over time. Second, the stability of upside risks to GDP growth reported in earlier research is specific to the period kno…

    Submitted 17 February, 2023; originally announced February 2023.

  5. arXiv:2210.13281  [pdf, other]

    cs.CL

    Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation

    Authors: Tsz Kin Lam, Eva Hasler, Felix Hieber

    Abstract: Customer feedback can be an important signal for improving commercial machine translation systems. One solution for fixing specific translation errors is to remove the related erroneous training instances followed by re-training of the machine translation system, which we refer to as instance-specific data filtering. Influence functions (IF) have been shown to be effective in finding such relevant…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted at WMT 2022

  6. arXiv:2210.04545  [pdf, other]

    cs.CL

    Automatic Evaluation and Analysis of Idioms in Neural Machine Translation

    Authors: Christos Baziotis, Prashant Mathur, Eva Hasler

    Abstract: A major open problem in neural machine translation (NMT) is the translation of idiomatic expressions, such as "under the weather". The meaning of these expressions is not composed by the meaning of their constituent words, and NMT models tend to translate them literally (i.e., word-by-word), which leads to confusing and nonsensical translations. Research on idioms in NMT is limited and obstructed…

    Submitted 10 October, 2022; originally announced October 2022.

  7. arXiv:2205.06618  [pdf, other]

    cs.CL cs.AI cs.LG

    The Devil is in the Details: On the Pitfalls of Vocabulary Selection in Neural Machine Translation

    Authors: Tobias Domhan, Eva Hasler, Ke Tran, Sony Trenous, Bill Byrne, Felix Hieber

    Abstract: Vocabulary selection, or lexical shortlisting, is a well-known technique to improve latency of Neural Machine Translation models by constraining the set of allowed output words during inference. The chosen set is typically determined by separately trained alignment model parameters, independent of the source-sentence context at inference time. While vocabulary selection appears competitive with re…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: NAACL 2022

  8. arXiv:1805.03750  [pdf, ps, other]

    cs.CL

    Neural Machine Translation Decoding with Terminology Constraints

    Authors: Eva Hasler, Adrià De Gispert, Gonzalo Iglesias, Bill Byrne

    Abstract: Despite the impressive quality improvements yielded by neural machine translation (NMT) systems, controlling their translation output to adhere to user-provided terminology constraints remains an open problem. We describe our approach to constrained neural decoding based on finite-state machines and multi-stack decoding which supports target-side constraints as well as constraints with correspondi…

    Submitted 9 May, 2018; originally announced May 2018.

    Comments: Proceedings of NAACL-HLT 2018

  9. arXiv:1804.11324  [pdf, other]

    cs.CL

    Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment

    Authors: Gonzalo Iglesias, William Tambellini, Adrià De Gispert, Eva Hasler, Bill Byrne

    Abstract: We describe a batched beam decoding algorithm for NMT with LMBR n-gram posteriors, showing that LMBR techniques still yield gains on top of the best recently reported results with Transformers. We also discuss acceleration strategies for deployment, and the effect of the beam size and batching on memory and speed.

    Submitted 30 April, 2018; originally announced April 2018.

    Comments: Proceedings of NAACL-HLT 2018

  10. arXiv:1708.01809  [pdf, other]

    cs.CL

    A Comparison of Neural Models for Word Ordering

    Authors: Eva Hasler, Felix Stahlberg, Marcus Tomalin, Adrià de Gispert, Bill Byrne

    Abstract: We compare several language models for the word-ordering task and propose a new bag-to-sequence neural model based on attention-based sequence-to-sequence models. We evaluate the model on a large German WMT data set where it significantly outperforms existing models. We also describe a novel search strategy for LM-based word ordering and report results on the English Penn Treebank. Our best model…

    Submitted 5 August, 2017; originally announced August 2017.

    Comments: Accepted for publication at INLG 2017

  11. arXiv:1707.06885  [pdf, other]

    cs.CL

    SGNMT -- A Flexible NMT Decoding Platform for Quick Prototyping of New Models and Search Strategies

    Authors: Felix Stahlberg, Eva Hasler, Danielle Saunders, Bill Byrne

    Abstract: This paper introduces SGNMT, our experimental platform for machine translation research. SGNMT provides a generic interface to neural and symbolic scoring modules (predictors) with left-to-right semantics, such as translation models like NMT, language models, translation lattices, $n$-best lists or other kinds of scores and constraints. Predictors can be combined with other predictors to form comple…

    Submitted 21 July, 2017; originally announced July 2017.

    Comments: Accepted as EMNLP 2017 demo paper

  12. arXiv:1612.03791  [pdf, other]

    cs.CL

    Neural Machine Translation by Minimising the Bayes-risk with Respect to Syntactic Translation Lattices

    Authors: Felix Stahlberg, Adrià de Gispert, Eva Hasler, Bill Byrne

    Abstract: We present a novel scheme to combine neural machine translation (NMT) with traditional statistical machine translation (SMT). Our approach borrows ideas from linearised lattice minimum Bayes-risk decoding for SMT. The NMT score is combined with the Bayes-risk of the translation according to the SMT lattice. This makes our approach much more flexible than $n$-best list or lattice rescoring as the neur…

    Submitted 13 February, 2017; v1 submitted 12 December, 2016; originally announced December 2016.

    Comments: EACL2017 short paper

  13. arXiv:1606.04963  [pdf, other]

    cs.CL

    The Edit Distance Transducer in Action: The University of Cambridge English-German System at WMT16

    Authors: Felix Stahlberg, Eva Hasler, Bill Byrne

    Abstract: This paper presents the University of Cambridge submission to WMT16. Motivated by the complementary nature of syntactical machine translation and neural machine translation (NMT), we exploit the synergies of Hiero and NMT in different combination schemes. Starting out with a simple neural lattice rescoring approach, we show that the Hiero lattices are often too narrow for NMT ensembles. Therefore,…

    Submitted 15 June, 2016; originally announced June 2016.

  14. Syntactically Guided Neural Machine Translation

    Authors: Felix Stahlberg, Eva Hasler, Aurelien Waite, Bill Byrne

    Abstract: We investigate the use of hierarchical phrase-based SMT lattices in end-to-end neural machine translation (NMT). Weight pushing transforms the Hiero scores for complete translation hypotheses, with the full translation grammar score and full n-gram language model score, into posteriors compatible with NMT predictive probabilities. With a slightly modified NMT beam-search decoder we find gains over…

    Submitted 19 May, 2016; v1 submitted 15 May, 2016; originally announced May 2016.

    Comments: ACL 2016

  15. arXiv:1510.04709  [pdf, ps, other]

    cs.CL cs.CV cs.LG cs.NE

    Multilingual Image Description with Neural Sequence Models

    Authors: Desmond Elliott, Stella Frank, Eva Hasler

    Abstract: In this paper we present an approach to multi-language image description bringing together insights from neural machine translation and neural image description. To create a description of an image for a given target language, our sequence generation models condition on feature vectors from the image, the description from the source language, and/or a multimodal vector computed over the image and…

    Submitted 18 November, 2015; v1 submitted 15 October, 2015; originally announced October 2015.

    Comments: Under review as a conference paper at ICLR 2016