Showing 1–22 of 22 results for author: Guerreiro, N

  1. arXiv:2406.19482

    cs.CL

    xTower: A Multilingual LLM for Explaining and Correcting Translation Errors

    Authors: Marcos Treviso, Nuno M. Guerreiro, Sweta Agrawal, Ricardo Rei, José Pombal, Tania Vaz, Helena Wu, Beatriz Silva, Daan van Stigt, André F. T. Martins

    Abstract: While machine translation (MT) systems are achieving increasingly strong performance on benchmarks, they often produce translations with errors and anomalies. Understanding these errors can potentially help improve the translation quality and user experience. This paper introduces xTower, an open large language model (LLM) built on top of TowerBase designed to provide free-text explanations for tr…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2404.00104

    astro-ph.SR

    Accurate PRD modeling of the forward-scattering Hanle effect in the chromospheric CaI 4227 Å line

    Authors: Luca Belluzzi, Simone Riva, Gioele Janett, Nuno Guerreiro, Fabio Riva, Pietro Benedusi, Tanausú del Pino Alemán, Ernest Alsina Ballester, Javier Trujillo Bueno, Jiří Štěpán

    Abstract: Measurable linear scattering polarization signals have been predicted and detected at the solar disk center in the core of chromospheric lines. These forward-scattering polarization signals, which are of high interest for magnetic field diagnostics, have always been modeled either under the assumption of complete frequency redistribution (CRD), or taking partial frequency redistribution (PRD) effe…

    Submitted 29 March, 2024; originally announced April 2024.

  3. arXiv:2402.17733

    cs.CL

    Tower: An Open Multilingual Large Language Model for Translation-Related Tasks

    Authors: Duarte M. Alves, José Pombal, Nuno M. Guerreiro, Pedro H. Martins, João Alves, Amin Farajian, Ben Peters, Ricardo Rei, Patrick Fernandes, Sweta Agrawal, Pierre Colombo, José G. C. de Souza, André F. T. Martins

    Abstract: While general-purpose large language models (LLMs) demonstrate proficiency on multiple tasks within the domain of translation, approaches based on open LLMs are competitive only when specializing on a single task. In this paper, we propose a recipe for tailoring LLMs to multiple tasks present in translation workflows. We perform continued pretraining on a multilingual mixture of monolingual and pa…

    Submitted 27 February, 2024; originally announced February 2024.

  4. arXiv:2402.13331

    cs.CL

    Enhanced Hallucination Detection in Neural Machine Translation through Simple Detector Aggregation

    Authors: Anas Himmi, Guillaume Staerman, Marine Picot, Pierre Colombo, Nuno M. Guerreiro

    Abstract: Hallucinated translations pose significant threats and safety concerns when it comes to the practical deployment of machine translation systems. Previous research works have identified that detectors exhibit complementary performance: different detectors excel at detecting different types of hallucinations. In this paper, we propose to address the limitations of individual detectors by combining th…

    Submitted 20 February, 2024; originally announced February 2024.

  5. arXiv:2402.00786

    cs.CL cs.LG

    CroissantLLM: A Truly Bilingual French-English Language Model

    Authors: Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F. T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo

    Abstract: We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware. To that end, we pioneer the approach of training an intrinsically bilingual model with a 1:1 English-to-French pretraining data ratio, a cust…

    Submitted 29 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  6. arXiv:2310.13448

    cs.CL

    Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning

    Authors: Duarte M. Alves, Nuno M. Guerreiro, João Alves, José Pombal, Ricardo Rei, José G. C. de Souza, Pierre Colombo, André F. T. Martins

    Abstract: Large language models (LLMs) are a promising avenue for machine translation (MT). However, current LLM-based MT systems are brittle: their effectiveness highly depends on the choice of few-shot examples and they often require extra post-processing due to overgeneration. Alternatives such as finetuning on translation instructions are computationally expensive and may weaken in-context learning capa…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023 - Findings

  7. arXiv:2310.10482

    cs.CL

    xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection

    Authors: Nuno M. Guerreiro, Ricardo Rei, Daan van Stigt, Luisa Coheur, Pierre Colombo, André F. T. Martins

    Abstract: Widely used learned metrics for machine translation evaluation, such as COMET and BLEURT, estimate the quality of a translation hypothesis by providing a single sentence-level score. As such, they offer little insight into translation errors (e.g., what are the errors and what is their severity). On the other hand, generative large language models (LLMs) are amplifying the adoption of more granula…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Work in progress

  8. Assessment of the CRD approximation for the observer's frame RIII redistribution matrix

    Authors: Simone Riva, Nuno Guerreiro, Gioele Janett, Diego Rossinelli, Pietro Benedusi, Rolf Krause, Luca Belluzzi

    Abstract: Approximated forms of the RII and RIII redistribution matrices are frequently applied to simplify the numerical solution of the radiative transfer problem for polarized radiation, taking partial frequency redistribution (PRD) effects into account. A widely used approximation for RIII is to consider its expression under the assumption of complete frequency redistribution (CRD) in the observer frame…

    Submitted 12 November, 2023; v1 submitted 3 October, 2023; originally announced October 2023.

    Journal ref: A&A 679, A87 (2023)

  9. arXiv:2309.11925

    cs.CL

    Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task

    Authors: Ricardo Rei, Nuno M. Guerreiro, José Pombal, Daan van Stigt, Marcos Treviso, Luisa Coheur, José G. C. de Souza, André F. T. Martins

    Abstract: We present the joint contribution of Unbabel and Instituto Superior Técnico to the WMT 2023 Shared Task on Quality Estimation (QE). Our team participated on all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2). For all tasks, we build on the COMETKIWI-22 model (Rei et al., 2022b). Our multilingual approaches are ranked first for all tasks,…

    Submitted 21 September, 2023; originally announced September 2023.

  10. arXiv:2305.17075

    cs.CL

    CREST: A Joint Framework for Rationalization and Counterfactual Text Generation

    Authors: Marcos Treviso, Alexis Ross, Nuno M. Guerreiro, André F. T. Martins

    Abstract: Selective rationales and counterfactual examples have emerged as two effective, complementary classes of interpretability methods for analyzing and training NLP models. However, prior work has not explored how these methods can be integrated to combine their complementary advantages. We overcome this limitation by introducing CREST (ContRastive Edits with Sparse raTionalization), a joint framework…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023 (main)

  11. arXiv:2305.11806

    cs.CL

    The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics

    Authors: Ricardo Rei, Nuno M. Guerreiro, Marcos Treviso, Luisa Coheur, Alon Lavie, André F. T. Martins

    Abstract: Neural metrics for machine translation evaluation, such as COMET, exhibit significant improvements in their correlation with human judgments, as compared to traditional metrics based on lexical overlap, such as BLEU. Yet, neural metrics are, to a great extent, "black boxes" returning a single sentence-level score without transparency about the decision-making process. In this work, we develop and…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  12. arXiv:2304.08654

    cs.SE

    Unleashing the Power of Sound: Revisiting the Physics of Notations for Modelling with auditory symbols

    Authors: Nuno Guerreiro, Vasco Amaral, Miguel Goulão

    Abstract: Sound - the oft-neglected sense for Software Engineering - is a crucial component of our daily lives, playing a vital role in how we interact with the world around us. In this paper, we challenge the traditional boundaries of Software Engineering by proposing a new approach based on sound design for using sound in modelling tools that is on par with visual design. By drawing upon the seminal work…

    Submitted 17 April, 2023; originally announced April 2023.

  13. arXiv:2303.16104

    cs.CL

    Hallucinations in Large Multilingual Translation Models

    Authors: Nuno M. Guerreiro, Duarte Alves, Jonas Waldendorf, Barry Haddow, Alexandra Birch, Pierre Colombo, André F. T. Martins

    Abstract: Large-scale multilingual machine translation systems have demonstrated remarkable ability to translate directly between numerous languages, making them increasingly appealing for real-world applications. However, when deployed in the wild, these models may generate hallucinated translations which have the potential to severely undermine user trust and raise safety concerns. Existing research on ha…

    Submitted 28 March, 2023; originally announced March 2023.

  14. arXiv:2212.09631

    cs.CL cs.LG

    Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation

    Authors: Nuno M. Guerreiro, Pierre Colombo, Pablo Piantanida, André F. T. Martins

    Abstract: Neural machine translation (NMT) has become the de-facto standard in real-world machine translation applications. However, NMT models can unpredictably produce severely pathological translations, known as hallucinations, that seriously undermine user trust. It becomes thus crucial to implement effective preventive strategies to guarantee their proper functioning. In this paper, we address the prob…

    Submitted 19 May, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: Accepted at ACL 2023

  15. arXiv:2209.06243

    cs.CL cs.LG

    CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task

    Authors: Ricardo Rei, Marcos Treviso, Nuno M. Guerreiro, Chrysoula Zerva, Ana C. Farinha, Christine Maroti, José G. C. de Souza, Taisiya Glushkova, Duarte M. Alves, Alon Lavie, Luisa Coheur, André F. T. Martins

    Abstract: We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE). Our team participated on all three subtasks: (i) Sentence and Word-level Quality Prediction; (ii) Explainable QE; and (iii) Critical Error Detection. For all tasks we build on top of the COMET framework, connecting it with the predictor-estimator architecture of OpenKiwi, and equipping it w…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: WMT 2022 Quality Estimation shared task

  16. arXiv:2208.05309

    cs.CL cs.LG

    Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation

    Authors: Nuno M. Guerreiro, Elena Voita, André F. T. Martins

    Abstract: Although the problem of hallucinations in neural machine translation (NMT) has received some attention, research on this highly pathological phenomenon lacks solid ground. Previous work has been limited in several ways: it often resorts to artificial settings where the problem is amplified, it disregards some (common) types of hallucinations, and it does not validate adequacy of detection heuristi…

    Submitted 5 March, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: Accepted at EACL23 (main)

  17. Hanle rotation signatures in Sr I 4607 Å

    Authors: Franziska Zeuner, Luca Belluzzi, Nuno Guerreiro, Renzo Ramelli, Michele Bianda

    Abstract: Observations of scattering polarization and the Hanle effect in various spectral lines are increasingly used to complement traditional solar magnetic field determination techniques. One of the strongest scattering polarization signals in the photosphere is measured in the Sr I line at 4607.3 Å when observed close to the solar limb. Here, we present the first observational evidence of Hanle rotatio…

    Submitted 6 May, 2022; v1 submitted 17 February, 2022; originally announced February 2022.

    Comments: 7 pages, 4 figures, accepted for publication in A&A

    Journal ref: A&A 662, A46 (2022)

  18. Modeling the scattering polarization of the solar Ca i 4227 Å line with angle-dependent partial frequency redistribution

    Authors: Gioele Janett, Ernest Alsina Ballester, Nuno Guerreiro, Simone Riva, Luca Belluzzi, Tanausú del Pino Alemán, Javier Trujillo Bueno

    Abstract: Context. The correct modeling of the scattering polarization signals observed in several strong resonance lines requires taking partial frequency redistribution (PRD) phenomena into account. Aims. This work aims at assessing the impact and the range of validity of the angle-averaged (AA) approximation with respect to the general angle-dependent (AD) treatment of PRD effects in the modeling of scatte…

    Submitted 22 October, 2021; originally announced October 2021.

  19. arXiv:2109.04552

    cs.CL cs.LG

    SPECTRA: Sparse Structured Text Rationalization

    Authors: Nuno Miguel Guerreiro, André F. T. Martins

    Abstract: Selective rationalization aims to produce decisions along with rationales (e.g., text highlights or word alignments between two sentences). Commonly, rationales are modeled as stochastic binary masks, requiring sampling-based gradient estimators, which complicates training and requires careful hyperparameter tuning. Sparse attention mechanisms are a deterministic alternative, but they lack a way t…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP 2021 (main conference)

  20. Lower solar atmosphere and magnetism at ultra-high spatial resolution

    Authors: Remo Collet, Serena Criscuoli, Ilaria Ermolli, Damian Fabbian, Nuno Guerreiro, Margit Haberreiter, Courtney Peck, Tiago M. D. Pereira, Matthias Rempel, Sami K. Solanki, Sven Wedemeyer-Boehm

    Abstract: We present the scientific case for a future space-based telescope aimed at very high spatial and temporal resolution imaging of the solar photosphere and chromosphere. Previous missions (e.g., HINODE, SUNRISE) have demonstrated the power of observing the solar photosphere and chromosphere at high spatial resolution without contamination from Earth's atmosphere. We argue here that increased spatial…

    Submitted 7 December, 2016; originally announced December 2016.

  21. Numerical Simulations of Coronal Heating through Footpoint Braiding

    Authors: Viggo Hansteen, Nuno Guerreiro, Bart De Pontieu, Mats Carlsson

    Abstract: Advanced 3D radiative MHD simulations now reproduce many properties of the outer solar atmosphere. When including a domain from the convection zone into the corona, a hot chromosphere and corona are self-consistently maintained. Here we study two realistic models, with different simulated area, magnetic field strength and topology, and numerical resolution. These are compared in order to character…

    Submitted 28 August, 2015; originally announced August 2015.

    Comments: 20 pages, accepted by ApJ

  22. arXiv:0807.4373

    astro-ph gr-qc hep-ph

    Dark matter from cosmic defects on galactic scales?

    Authors: N. Guerreiro, P. P. Avelino, J. P. M. de Carvalho, C. J. A. P. Martins

    Abstract: We discuss the possible dynamical role of extended cosmic defects on galactic scales, specifically focusing on the possibility that they may provide the dark matter suggested by the classical problem of galactic rotation curves. We emphasize that the more standard defects (such as Goto-Nambu strings) are unsuitable for this task, but show that more general models (such as transonic wiggly string…

    Submitted 12 September, 2008; v1 submitted 28 July, 2008; originally announced July 2008.

    Comments: Submitted to Phys. Rev. D (Brief Reports). v2: Reference added and some typos corrected, matches published version

    Journal ref: Phys.Rev.D78:067302,2008