Showing 1–11 of 11 results for author: Fomicheva, M

  1. arXiv:2306.13041

    cs.CL cs.CY cs.LG

    Towards Explainable Evaluation Metrics for Machine Translation

    Authors: Christoph Leiter, Piyawat Lertvittayakumjorn, Marina Fomicheva, Wei Zhao, Yang Gao, Steffen Eger

    Abstract: Unlike classical lexical overlap metrics such as BLEU, most current evaluation metrics for machine translation (for example, COMET or BERTScore) are based on black-box large language models. They often achieve strong correlations with human judgments, but recent research indicates that the lower-quality classical metrics remain dominant, one of the potential reasons being that their decision proce…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: Preprint. We published an earlier version of this paper (arXiv:2203.11131) under a different title. Both versions consider the conceptualization of explainable metrics and are overall similar. However, the new version puts a stronger emphasis on the survey of approaches for the explanation of MT metrics, including the latest LLM-based approaches.

  2. arXiv:2211.09878

    cs.CL

    Reducing Hallucinations in Neural Machine Translation with Feature Attribution

    Authors: Joël Tang, Marina Fomicheva, Lucia Specia

    Abstract: Neural conditional language generation models achieve the state-of-the-art in Neural Machine Translation (NMT) but are highly dependent on the quality of parallel training dataset. When trained on low-quality datasets, these models are prone to various error types, including hallucinations, i.e. outputs that are fluent, but unrelated to the source sentences. These errors are particularly dangerous…

    Submitted 14 June, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

  3. arXiv:2203.11131

    cs.CL cs.LG

    Towards Explainable Evaluation Metrics for Natural Language Generation

    Authors: Christoph Leiter, Piyawat Lertvittayakumjorn, Marina Fomicheva, Wei Zhao, Yang Gao, Steffen Eger

    Abstract: Unlike classical lexical overlap metrics such as BLEU, most current evaluation metrics (such as BERTScore or MoverScore) are based on black-box language models such as BERT or XLM-R. They often achieve strong correlations with human judgments, but recent research indicates that the lower-quality classical metrics remain dominant, one of the potential reasons being that their decision processes are…

    Submitted 21 March, 2022; originally announced March 2022.

  4. arXiv:2110.04392

    cs.CL

    The Eval4NLP Shared Task on Explainable Quality Estimation: Overview and Results

    Authors: Marina Fomicheva, Piyawat Lertvittayakumjorn, Wei Zhao, Steffen Eger, Yang Gao

    Abstract: In this paper, we introduce the Eval4NLP-2021 shared task on explainable quality estimation. Given a source-translation pair, this shared task requires not only to provide a sentence-level score indicating the overall quality of the translation, but also to explain this score by identifying the words that negatively impact translation quality. We present the data, annotation guidelines and evaluati…

    Submitted 8 October, 2021; originally announced October 2021.

  5. arXiv:2109.10859

    cs.CL cs.AI

    Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation

    Authors: Diptesh Kanojia, Marina Fomicheva, Tharindu Ranasinghe, Frédéric Blain, Constantin Orăsan, Lucia Specia

    Abstract: Current Machine Translation (MT) systems achieve very good results on a growing variety of language pairs and datasets. However, they are known to produce fluent translation outputs that can contain important meaning errors, thus undermining their reliability in practice. Quality Estimation (QE) is the task of automatically assessing the performance of MT systems at test time. Thus, in order to be…

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: Accepted to WMT 2021 Conference co-located with EMNLP 2021. 14 pages with a 4 page appendix

  6. arXiv:2108.12197

    cs.CL

    Translation Error Detection as Rationale Extraction

    Authors: Marina Fomicheva, Lucia Specia, Nikolaos Aletras

    Abstract: Recent Quality Estimation (QE) models based on multilingual pre-trained representations have achieved very competitive results when predicting the overall quality of translated sentences. Predicting translation errors, i.e. detecting specifically which words are incorrect, is a more challenging task, especially with limited amounts of training data. We hypothesize that, not unlike humans, successf…

    Submitted 27 August, 2021; originally announced August 2021.

  7. arXiv:2107.00411

    cs.CL

    Knowledge Distillation for Quality Estimation

    Authors: Amit Gajbhiye, Marina Fomicheva, Fernando Alva-Manchego, Frédéric Blain, Abiola Obamuyide, Nikolaos Aletras, Lucia Specia

    Abstract: Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations. Recent success in QE stems from the use of multilingual pre-trained representations, where very large models lead to impressive results. However, the inference time, d…

    Submitted 1 July, 2021; originally announced July 2021.

    Comments: ACL Findings 2021

  8. arXiv:2104.05688

    cs.CL cs.HC

    Backtranslation Feedback Improves User Confidence in MT, Not Quality

    Authors: Vilém Zouhar, Michal Novák, Matúš Žilinec, Ondřej Bojar, Mateo Obregón, Robin L. Hill, Frédéric Blain, Marina Fomicheva, Lucia Specia, Lisa Yankovskaya

    Abstract: Translating text into a language unknown to the text's author, dubbed outbound translation, is a modern need for which the user experience has significant room for improvement, beyond the basic machine translation facility. We demonstrate this by showing three ways in which user confidence in the outbound translation, as well as its overall final quality, can be affected: backward translation, qua…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: 9 pages (excluding references); to appear at NAACL-HLT 2021

  9. arXiv:2102.11403

    cs.CL

    Exploring Supervised and Unsupervised Rewards in Machine Translation

    Authors: Julia Ive, Zixu Wang, Marina Fomicheva, Lucia Specia

    Abstract: Reinforcement Learning (RL) is a powerful framework to address the discrepancy between loss functions used during training and the final evaluation metrics to be used at test time. When applied to neural Machine Translation (MT), it minimises the mismatch between the cross-entropy loss and non-differentiable evaluation metrics like BLEU. However, the suitability of these metrics as reward function…

    Submitted 22 February, 2021; originally announced February 2021.

    Comments: Long paper accepted to EACL 2021, Camera-ready version

  10. arXiv:2010.04480

    cs.CL

    MLQE-PE: A Multilingual Quality Estimation and Post-Editing Dataset

    Authors: Marina Fomicheva, Shuo Sun, Erick Fonseca, Chrysoula Zerva, Frédéric Blain, Vishrav Chaudhary, Francisco Guzmán, Nina Lopatina, Lucia Specia, André F. T. Martins

    Abstract: We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE). The dataset contains eleven language pairs, with human labels for up to 10,000 translations per language pair in the following formats: sentence-level direct assessments and post-editing effort, and word-level good/bad labels. It also contains the post-edited sentences, as well…

    Submitted 11 October, 2021; v1 submitted 9 October, 2020; originally announced October 2020.

  11. arXiv:2005.10608

    cs.CL

    Unsupervised Quality Estimation for Neural Machine Translation

    Authors: Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia

    Abstract: Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it is aimed to inform the user on the quality of the MT output at test time. Existing approaches require large amounts of expert annotated data, computation and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additi…

    Submitted 20 July, 2020; v1 submitted 21 May, 2020; originally announced May 2020.

    Comments: Accepted for publication in TACL. Authors' final version