
Showing 1–3 of 3 results for author: van Stigt, D

  1. arXiv:2406.19482  [pdf, other]

    cs.CL

    xTower: A Multilingual LLM for Explaining and Correcting Translation Errors

    Authors: Marcos Treviso, Nuno M. Guerreiro, Sweta Agrawal, Ricardo Rei, José Pombal, Tania Vaz, Helena Wu, Beatriz Silva, Daan van Stigt, André F. T. Martins

    Abstract: While machine translation (MT) systems are achieving increasingly strong performance on benchmarks, they often produce translations with errors and anomalies. Understanding these errors can potentially help improve the translation quality and user experience. This paper introduces xTower, an open large language model (LLM) built on top of TowerBase designed to provide free-text explanations for tr…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2310.10482  [pdf, other]

    cs.CL

    xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection

    Authors: Nuno M. Guerreiro, Ricardo Rei, Daan van Stigt, Luisa Coheur, Pierre Colombo, André F. T. Martins

    Abstract: Widely used learned metrics for machine translation evaluation, such as COMET and BLEURT, estimate the quality of a translation hypothesis by providing a single sentence-level score. As such, they offer little insight into translation errors (e.g., what are the errors and what is their severity). On the other hand, generative large language models (LLMs) are amplifying the adoption of more granula…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Work in progress

  3. arXiv:2309.11925  [pdf, other]

    cs.CL

    Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task

    Authors: Ricardo Rei, Nuno M. Guerreiro, José Pombal, Daan van Stigt, Marcos Treviso, Luisa Coheur, José G. C. de Souza, André F. T. Martins

    Abstract: We present the joint contribution of Unbabel and Instituto Superior Técnico to the WMT 2023 Shared Task on Quality Estimation (QE). Our team participated on all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2). For all tasks, we build on the COMETKIWI-22 model (Rei et al., 2022b). Our multilingual approaches are ranked first for all tasks,…

    Submitted 21 September, 2023; originally announced September 2023.