Showing 1–4 of 4 results for author: Etchegoyhen, T

  1. arXiv:2406.12364  [pdf, other]

    cs.CL

    Does Context Help Mitigate Gender Bias in Neural Machine Translation?

    Authors: Harritxu Gete, Thierry Etchegoyhen

    Abstract: Neural Machine Translation models tend to perpetuate gender bias present in their training data distribution. Context-aware models have been previously suggested as a means to mitigate this type of bias. In this work, we examine this claim by analysing in detail the translation of stereotypical professions in English to German, and translation with non-informative context in Basque to Spanish. Our…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.11464  [pdf, other]

    cs.CL

    Automating Easy Read Text Segmentation

    Authors: Jesús Calleja, Thierry Etchegoyhen, David Ponce

    Abstract: Easy Read text is one of the main forms of access to information for people with reading difficulties. One of the key characteristics of this type of text is the requirement to split sentences into smaller grammatical segments, to facilitate reading. Automated segmentation methods could foster the creation of Easy Read content, but their viability has yet to be addressed. In this work, we study no…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2402.06342  [pdf, ps, other]

    cs.CL

    Promoting Target Data in Context-aware Neural Machine Translation

    Authors: Harritxu Gete, Thierry Etchegoyhen

    Abstract: Standard context-aware neural machine translation (NMT) typically relies on parallel document-level data, exploiting both source and target contexts. Concatenation-based approaches in particular, still a strong baseline for document-level NMT, prepend source and/or target context sentences to the sentences to be translated, with model variants that exploit equal amounts of source and target data o…

    Submitted 9 February, 2024; originally announced February 2024.

  4. arXiv:2312.11075  [pdf, other]

    cs.CL

    Split and Rephrase with Large Language Models

    Authors: David Ponce, Thierry Etchegoyhen, Jesús Calleja Pérez, Harritxu Gete

    Abstract: The Split and Rephrase (SPRP) task, which consists in splitting complex sentences into a sequence of shorter grammatical sentences, while preserving the original meaning, can facilitate the processing of complex texts for humans and machines alike. It is also a valuable testbed to evaluate natural language processing models, as it requires modelling complex grammatical aspects. In this work, we ev…

    Submitted 3 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.