Showing 1–20 of 20 results for author: Berard, A

Searching in archive cs.
  1. arXiv:2407.01126  [pdf, other]

    cs.CL cs.AI

    Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation

    Authors: Nadezhda Chirkova, Vassilina Nikoulina, Jean-Luc Meunier, Alexandre Bérard

    Abstract: We focus on multi-domain Neural Machine Translation, with the goal of developing efficient models which can handle data from various domains seen during training and are robust to domains unseen during training. We hypothesize that Sparse Mixture-of-Experts (SMoE) models are a good fit for this task, as they enable efficient model scaling, which helps to accommodate a variety of multi-domain data,…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.20052  [pdf, other]

    cs.CL

    Understanding and Mitigating Language Confusion in LLMs

    Authors: Kelly Marchisio, Wei-Yin Ko, Alexandre Bérard, Théo Dehaze, Sebastian Ruder

    Abstract: We investigate a surprising limitation of LLMs: their inability to consistently generate text in a user's desired language. We create the Language Confusion Benchmark (LCB) to evaluate such failures, covering 15 typologically diverse languages with existing and newly-created English and multilingual prompts. We evaluate a range of LLMs on monolingual and cross-lingual generation reflecting practic…

    Submitted 28 June, 2024; originally announced June 2024.

  3. arXiv:2306.07763  [pdf, other]

    cs.CL

    NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track

    Authors: Edward Gow-Smith, Alexandre Berard, Marcely Zanon Boito, Ioan Calapodescu

    Abstract: This paper presents NAVER LABS Europe's systems for Tamasheq-French and Quechua-Spanish speech translation in the IWSLT 2023 Low-Resource track. Our work attempts to maximize translation quality in low-resource settings using multilingual parameter-efficient solutions that leverage strong pre-trained models. Our primary submission for Tamasheq outperforms the previous state of the art by 7.5 BLEU…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: IWSLT 2023: Tamasheq-French and Quechua-Spanish challenge winner

  4. arXiv:2212.09811  [pdf, other]

    cs.CL cs.AI cs.LG

    Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model

    Authors: Yeskendir Koishekenov, Alexandre Berard, Vassilina Nikoulina

    Abstract: The recently released NLLB-200 is a set of multilingual Neural Machine Translation models that cover 202 languages. The largest model is based on a Mixture of Experts architecture and achieves SoTA results across many language pairs. It contains 54.5B parameters and requires at least four 32GB GPUs just for inference. In this work, we propose a pruning method that enables the removal of up to 80%…

    Submitted 7 July, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

  5. arXiv:2210.11621  [pdf, other]

    cs.CL cs.AI cs.LG

    SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages

    Authors: Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier

    Abstract: In recent years, multilingual machine translation models have achieved promising performance on low-resource language pairs by sharing information between similar languages, thus enabling zero-shot translation. To overcome the "curse of multilinguality", these models often opt for scaling up the number of parameters, which makes their use in resource-constrained environments challenging. We introd…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022

    Journal ref: https://aclanthology.org/2022.emnlp-main.571

  6. arXiv:2205.10828  [pdf, other]

    cs.CL cs.AI cs.LG

    What Do Compressed Multilingual Machine Translation Models Forget?

    Authors: Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier

    Abstract: Recently, very large pre-trained models achieve state-of-the-art results in various natural language processing (NLP) tasks, but their size makes it more challenging to apply them in resource-constrained environments. Compression techniques allow to drastically reduce the size of the models and therefore their inference time with negligible impact on top-tier metrics. However, the general performa…

    Submitted 27 June, 2023; v1 submitted 22 May, 2022; originally announced May 2022.

    Comments: Accepted to Findings of EMNLP 2022, presented at WMT 2022

    Journal ref: https://aclanthology.org/2022.findings-emnlp.317/

  7. arXiv:2110.10478  [pdf, other]

    cs.CL

    Continual Learning in Multilingual NMT via Language-Specific Embeddings

    Authors: Alexandre Berard

    Abstract: This paper proposes a technique for adding a new source or target language to an existing multilingual NMT model without re-training it on the initial set of languages. It consists in replacing the shared vocabulary with a small language-specific vocabulary and fine-tuning the new embeddings on the new language's parallel data. Some additional language-specific components may be trained to improve…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: Accepted as a research paper to WMT 2021

  8. arXiv:2110.10472  [pdf, other]

    cs.CL

    Multilingual Unsupervised Neural Machine Translation with Denoising Adapters

    Authors: Ahmet Üstün, Alexandre Bérard, Laurent Besacier, Matthias Gallé

    Abstract: We consider the problem of multilingual unsupervised machine translation, translating to and from languages that only have monolingual data by using auxiliary parallel language pairs. For this problem the standard procedure so far to leverage the monolingual data is back-translation, which is computationally costly and hard to tune. In this paper we propose instead to use denoising adapters, ada…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: Accepted as a long paper to EMNLP 2021

  9. arXiv:2110.09574  [pdf, other]

    cs.CL

    Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters

    Authors: Asa Cooper Stickland, Alexandre Bérard, Vassilina Nikoulina

    Abstract: Adapter layers are lightweight, learnable units inserted between transformer layers. Recent work explores using such layers for neural machine translation (NMT), to adapt pre-trained models to new domains or language pairs, training only a small set of parameters for each new setting (language pair or domain). In this work we study the compositionality of language and domain adapters in the contex…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: Accepted at The Sixth Conference in Machine Translation (WMT21)

  10. arXiv:2109.06679  [pdf, other]

    cs.CL

    Efficient Inference for Multilingual Neural Machine Translation

    Authors: Alexandre Berard, Dain Lee, Stéphane Clinchant, Kweonwoo Jung, Vassilina Nikoulina

    Abstract: Multilingual NMT has become an attractive solution for MT deployment in production. But to match bilingual quality, it comes at the cost of larger and slower models. In this work, we consider several ways to make multilingual NMT faster at inference without degrading its quality. We experiment with several "light decoder" architectures in two 20-language multi-parallel settings: small-scale on TED…

    Submitted 8 November, 2021; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 long paper

  11. arXiv:2008.02878  [pdf, ps, other]

    cs.CL cs.LG

    A Multilingual Neural Machine Translation Model for Biomedical Data

    Authors: Alexandre Bérard, Zae Myung Kim, Vassilina Nikoulina, Eunjeong L. Park, Matthias Gallé

    Abstract: We release a multilingual neural machine translation model, which can be used to translate text in the biomedical domain. The model can translate from 5 languages (French, German, Italian, Korean and Spanish) into English. It is trained with large amounts of generic and biomedical data, using domain tags. Our benchmarks show that it performs near state-of-the-art both on news (generic domain) and…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: https://github.com/naver/covid19-nmt

  12. arXiv:1910.14589  [pdf, other]

    cs.CL

    Machine Translation of Restaurant Reviews: New Corpus for Domain Adaptation and Robustness

    Authors: Alexandre Bérard, Ioan Calapodescu, Marc Dymetman, Claude Roux, Jean-Luc Meunier, Vassilina Nikoulina

    Abstract: We share a French-English parallel corpus of Foursquare restaurant reviews (https://europe.naverlabs.com/research/natural-language-processing/machine-translation-of-restaurant-reviews), and define a new task to encourage research on Neural Machine Translation robustness and domain adaptation, in a real-world scenario where better-quality MT would be greatly beneficial. We discuss the challenges of…

    Submitted 31 October, 2019; originally announced October 2019.

    Comments: WNGT 2019 Paper

  13. arXiv:1910.14539  [pdf, other]

    cs.CL

    Naver Labs Europe's Systems for the Document-Level Generation and Translation Task at WNGT 2019

    Authors: Fahimeh Saleh, Alexandre Bérard, Ioan Calapodescu, Laurent Besacier

    Abstract: Recently, neural models led to significant improvements in both machine translation (MT) and natural language generation tasks (NLG). However, generation of long descriptive summaries conditioned on structured data remains an open challenge. Likewise, MT that goes beyond sentence-level context is still an open issue (e.g., document-level MT or MT with metadata). To address these challenges, we pro…

    Submitted 31 October, 2019; originally announced October 2019.

    Comments: WNGT 2019 - System Description Paper

  14. arXiv:1907.06488  [pdf, other]

    cs.CL

    Naver Labs Europe's Systems for the WMT19 Machine Translation Robustness Task

    Authors: Alexandre Bérard, Ioan Calapodescu, Claude Roux

    Abstract: This paper describes the systems that we submitted to the WMT19 Machine Translation robustness task. This task aims to improve MT's robustness to noise found on social media, like informal language, spelling mistakes and other orthographic variations. The organizers provide parallel data extracted from a social media website in two language pairs: French-English and Japanese-English (in both trans…

    Submitted 15 July, 2019; originally announced July 2019.

    Comments: WMT 2019 - Shared Task Paper

  15. arXiv:1806.06734  [pdf, other]

    cs.CL cs.AI

    Unsupervised Word Segmentation from Speech with Attention

    Authors: Pierre Godard, Marcely Zanon-Boito, Lucas Ondel, Alexandre Berard, François Yvon, Aline Villavicencio, Laurent Besacier

    Abstract: We present a first attempt to perform attentional word segmentation directly from the speech signal, with the final goal to automatically identify lexical units in a low-resource, unwritten language (UL). Our methodology assumes a pairing between recordings in the UL with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-ph…

    Submitted 18 June, 2018; originally announced June 2018.

    Comments: Interspeech 2018

  16. arXiv:1802.04200  [pdf, other]

    cs.CL

    End-to-End Automatic Speech Translation of Audiobooks

    Authors: Alexandre Bérard, Laurent Besacier, Ali Can Kocabiyikoglu, Olivier Pietquin

    Abstract: We investigate end-to-end speech-to-text translation on a corpus of audiobooks specifically augmented for this task. Previous works investigated the extreme case where source language transcription is not available during learning nor decoding, but we also study a midway case where source language transcription is available at training time only. In this case, a single model is trained to decode s…

    Submitted 12 February, 2018; originally announced February 2018.

    Comments: Accepted to ICASSP 2018 (poster presentation)

  17. arXiv:1709.05631  [pdf, other]

    cs.CL

    Unwritten Languages Demand Attention Too! Word Discovery with Encoder-Decoder Models

    Authors: Marcely Zanon Boito, Alexandre Berard, Aline Villavicencio, Laurent Besacier

    Abstract: Word discovery is the task of extracting words from unsegmented text. In this paper we examine to what extent neural networks can be applied to this task in a realistic unwritten language scenario, where only small corpora and limited annotations are available. We investigate two scenarios: one with no supervision and another with limited supervision with access to the most frequent words. Obtaine…

    Submitted 19 September, 2017; v1 submitted 17 September, 2017; originally announced September 2017.

    Comments: Accepted to IEEE ASRU 2017

  18. arXiv:1707.05118  [pdf, other]

    cs.CL

    LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task

    Authors: Alexandre Berard, Olivier Pietquin, Laurent Besacier

    Abstract: This paper presents the LIG-CRIStAL submission to the shared Automatic Post-Editing task of WMT 2017. We propose two neural post-editing models: a monosource model with a task-specific attention mechanism, which performs particularly well in a low-resource scenario; and a chained architecture which makes use of the source sentence to provide extra context. This latter architecture manages to slig…

    Submitted 17 July, 2017; originally announced July 2017.

    Comments: keywords: neural post-edition, attention models

  19. arXiv:1612.01744  [pdf, other]

    cs.CL

    Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation

    Authors: Alexandre Berard, Olivier Pietquin, Christophe Servan, Laurent Besacier

    Abstract: This paper proposes a first attempt to build an end-to-end speech-to-text translation system, which does not use source language transcription during learning or decoding. We propose a model for direct speech-to-text translation, which gives promising results on a small French-English synthetic corpus. Relaxing the need for source language transcription would drastically change the data collection…

    Submitted 6 December, 2016; originally announced December 2016.

    Comments: accepted to NIPS workshop on End-to-end Learning for Speech and Audio Processing

  20. arXiv:1610.01291  [pdf, ps, other]

    cs.CL

    Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources?

    Authors: Christophe Servan, Alexandre Berard, Zied Elloumi, Hervé Blanchon, Laurent Besacier

    Abstract: This paper presents an approach combining lexico-semantic resources and distributed representations of words applied to the evaluation in machine translation (MT). This study is made through the enrichment of a well-known MT evaluation metric: METEOR. This metric enables an approximate match (synonymy or morphological similarity) between an automatic and a reference translation. Our experiments ar…

    Submitted 5 October, 2016; originally announced October 2016.

    Comments: accepted to COLING 2016 conference