Showing 1–6 of 6 results for author: Bitton, J

  1. arXiv:2308.12950  [pdf, other]

    cs.CL

    Code Llama: Open Foundation Models for Code

    Authors: Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Romain Sauvestre, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton Ferrer, Aaron Grattafiori, Wenhan Xiong, Alexandre Défossez, Jade Copet, Faisal Azhar, Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom, et al. (1 additional author not shown)

    Abstract: We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks. We provide multiple flavors to cover a wide range of applications: foundation models (Code Llama), Python specializations (Code Llama…

    Submitted 31 January, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

  2. arXiv:2206.04137  [pdf, other]

    cs.CL

    Adversarial Text Normalization

    Authors: Joanna Bitton, Maya Pavlova, Ivan Evtimov

    Abstract: Text-based adversarial attacks are becoming more commonplace and accessible to general internet users. As these attacks proliferate, the need to address the gap in model robustness becomes imminent. While retraining on adversarial data may increase performance, there remains an additional class of character-level attacks on which these models falter. Additionally, the process to retrain a model is…

    Submitted 8 June, 2022; originally announced June 2022.

  3. arXiv:2201.06494  [pdf, other]

    cs.AI cs.CV

    AugLy: Data Augmentations for Robustness

    Authors: Zoe Papakipos, Joanna Bitton

    Abstract: We introduce AugLy, a data augmentation library with a focus on adversarial robustness. AugLy provides a wide array of augmentations for multiple modalities (audio, image, text, & video). These augmentations were inspired by those that real users perform on social media platforms, some of which were not already supported by existing data augmentation libraries. AugLy can be used for any purpose wh…

    Submitted 17 January, 2022; originally announced January 2022.

  4. arXiv:2104.02821  [pdf, other]

    cs.CV cs.AI cs.LG

    Towards Measuring Fairness in AI: the Casual Conversations Dataset

    Authors: Caner Hazirbas, Joanna Bitton, Brian Dolhansky, Jacqueline Pan, Albert Gordo, Cristian Canton Ferrer

    Abstract: This paper introduces a novel dataset to help researchers evaluate their computer vision and audio models for accuracy across a diverse set of age, genders, apparent skin tones and ambient lighting conditions. Our dataset is composed of 3,011 subjects and contains over 45,000 videos, with an average of 15 videos per person. The videos were recorded in multiple U.S. states with a diverse set of adu…

    Submitted 3 November, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

  5. arXiv:2011.09957  [pdf, other]

    cs.CV

    Adversarial Threats to DeepFake Detection: A Practical Perspective

    Authors: Paarth Neekhara, Brian Dolhansky, Joanna Bitton, Cristian Canton Ferrer

    Abstract: Facially manipulated images and videos or DeepFakes can be used maliciously to fuel misinformation or defame individuals. Therefore, detecting DeepFakes is crucial to increase the credibility of social media platforms and other media sharing web sites. State-of-the-art DeepFake detection techniques rely on neural network based classification models which are known to be vulnerable to adversarial e…

    Submitted 19 November, 2020; originally announced November 2020.

  6. arXiv:2006.07397  [pdf, other]

    cs.CV cs.LG

    The DeepFake Detection Challenge (DFDC) Dataset

    Authors: Brian Dolhansky, Joanna Bitton, Ben Pflaum, Jikuo Lu, Russ Howes, Menglin Wang, Cristian Canton Ferrer

    Abstract: Deepfakes are a recent off-the-shelf manipulation technique that allows anyone to swap two identities in a single video. In addition to Deepfakes, a variety of GAN-based face swapping methods have also been published with accompanying code. To counter this emerging threat, we have constructed an extremely large face swap video dataset to enable the training of detection models, and organized the a…

    Submitted 27 October, 2020; v1 submitted 12 June, 2020; originally announced June 2020.