Skip to main content

Showing 1–3 of 3 results for author: Darcet, T

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.15613  [pdf, other

    cs.LG cs.AI cs.CV

    Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach

    Authors: Huy V. Vo, Vasil Khalidov, Timothée Darcet, Théo Moutakanni, Nikita Smetanin, Marc Szafraniec, Hugo Touvron, Camille Couprie, Maxime Oquab, Armand Joulin, Hervé Jégou, Patrick Labatut, Piotr Bojanowski

    Abstract: Self-supervised features are the cornerstone of modern machine learning systems. They are typically pre-trained on data collections whose construction and curation typically require extensive human effort. This manual process has some limitations similar to those encountered in supervised learning, e.g., the crowd-sourced selection of data is costly and time-consuming, preventing scaling the datas… ▽ More

    Submitted 28 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  2. arXiv:2309.16588  [pdf, other

    cs.CV

    Vision Transformers Need Registers

    Authors: Timothée Darcet, Maxime Oquab, Julien Mairal, Piotr Bojanowski

    Abstract: Transformers have recently emerged as a powerful tool for learning visual representations. In this paper, we identify and characterize artifacts in feature maps of both supervised and self-supervised ViT networks. The artifacts correspond to high-norm tokens appearing during inference primarily in low-informative background areas of images, that are repurposed for internal computations. We propose… ▽ More

    Submitted 12 April, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

  3. arXiv:2304.07193  [pdf, other

    cs.CV

    DINOv2: Learning Robust Visual Features without Supervision

    Authors: Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin , et al. (1 additional authors not shown)

    Abstract: The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision. These models could greatly simplify the use of images in any system by producing all-purpose visual features, i.e., features that work across image distributions and tasks without finetuning. This work shows that existing pr… ▽ More

    Submitted 2 February, 2024; v1 submitted 14 April, 2023; originally announced April 2023.