Showing 1–5 of 5 results for author: Idrissi, B Y

  1. arXiv:2404.19737

    cs.CL

    Better & Faster Large Language Models via Multi-token Prediction

    Authors: Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozière, David Lopez-Paz, Gabriel Synnaeve

    Abstract: Large language models such as GPT and Llama are trained with a next-token prediction loss. In this work, we suggest that training language models to predict multiple future tokens at once results in higher sample efficiency. More specifically, at each position in the training corpus, we ask the model to predict the following n tokens using n independent output heads, operating on top of a shared m…

    Submitted 30 April, 2024; originally announced April 2024.

  2. arXiv:2211.01866

    cs.CV cs.LG

    ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations

    Authors: Badr Youbi Idrissi, Diane Bouchacourt, Randall Balestriero, Ivan Evtimov, Caner Hazirbas, Nicolas Ballas, Pascal Vincent, Michal Drozdzal, David Lopez-Paz, Mark Ibrahim

    Abstract: Deep learning vision systems are widely deployed across applications where reliability is critical. However, even today's best models can fail to recognize an object when its pose, lighting, or background varies. While existing benchmarks surface examples challenging for models, they do not explain why such mistakes arise. To address this need, we introduce ImageNet-X, a set of sixteen human annot…

    Submitted 3 November, 2022; originally announced November 2022.

  3. arXiv:2110.14503

    cs.LG cs.AI cs.CR

    Simple data balancing achieves competitive worst-group-accuracy

    Authors: Badr Youbi Idrissi, Martin Arjovsky, Mohammad Pezeshki, David Lopez-Paz

    Abstract: We study the problem of learning classifiers that perform well across (known or unknown) groups of data. After observing that common worst-group-accuracy datasets suffer from substantial imbalances, we set out to compare state-of-the-art methods to simple balancing of classes and groups by either subsampling or reweighting data. Our results show that these data balancing baselines achieve state-of…

    Submitted 18 February, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: Accepted at CLeaR (Causal Learning and Reasoning) 2022

  4. arXiv:2109.00417

    cs.CL cs.AI

    Masked Adversarial Generation for Neural Machine Translation

    Authors: Badr Youbi Idrissi, Stéphane Clinchant

    Abstract: Attacking Neural Machine Translation models is an inherently combinatorial task on discrete sequences, solved with approximate heuristics. Most methods use the gradient to attack the model on each sample independently. Instead of mechanically applying the gradient, could we learn to produce meaningful adversarial attacks? In contrast to existing approaches, we learn to attack a model by training…

    Submitted 1 September, 2021; originally announced September 2021.

    Comments: 5 pages

  5. VirAAL: Virtual Adversarial Active Learning For NLU

    Authors: Gregory Senay, Badr Youbi Idrissi, Marine Haziza

    Abstract: This paper presents VirAAL, an Active Learning framework based on Adversarial Training. VirAAL aims to reduce the effort of annotation in Natural Language Understanding (NLU). VirAAL is based on Virtual Adversarial Training (VAT), a semi-supervised approach that regularizes the model through Local Distributional Smoothness. With that, adversarial perturbations are added to the inputs making the po…

    Submitted 16 November, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: To appear in Proc. IEEE SLT 2021, January 19-22, 2021, Shenzhen, China