Showing 1–45 of 45 results for author: Douze, M

Searching in archive cs.
  1. arXiv:2403.10746  [pdf, other]

    cs.CV cs.DB

    Vector search with small radiuses

    Authors: Gergely Szilvasy, Pierre-Emmanuel Mazaré, Matthijs Douze

    Abstract: In recent years, the dominant accuracy metric for vector search is the recall of a result list of fixed size (top-k retrieval), considering as ground truth the exact vector retrieval results. Although convenient to compute, this metric is distantly related to the end-to-end accuracy of a full system that integrates vector search. In this paper we focus on the common case where a hard decision need…

    Submitted 15 March, 2024; originally announced March 2024.

  2. arXiv:2402.14904  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    Watermarking Makes Language Models Radioactive

    Authors: Tom Sander, Pierre Fernandez, Alain Durmus, Matthijs Douze, Teddy Furon

    Abstract: This paper investigates the radioactivity of LLM-generated texts, i.e. whether it is possible to detect that such input was used as training data. Conventional methods like membership inference can carry out this detection with some level of accuracy. We show that watermarked training data leaves traces easier to detect and much more reliable than membership inference. We link the contamination le…

    Submitted 22 February, 2024; originally announced February 2024.

  3. arXiv:2401.14732  [pdf, other]

    cs.LG

    Residual Quantization with Implicit Neural Codebooks

    Authors: Iris A. M. Huijben, Matthijs Douze, Matthew Muckley, Ruud J. G. van Sloun, Jakob Verbeek

    Abstract: Vector quantization is a fundamental operation for data compression and vector search. To obtain high accuracy, multi-codebook methods represent each vector using codewords across several codebooks. Residual quantization (RQ) is one such method, which iteratively quantizes the error of the previous step. While the error distribution is dependent on previously-selected codewords, this dependency is…

    Submitted 21 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: To appear at ICML 2024

  4. arXiv:2401.08281  [pdf, other]

    cs.LG cs.CV cs.SE

    The Faiss library

    Authors: Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, Hervé Jégou

    Abstract: Vector databases manage large collections of embedding vectors. As AI applications are growing rapidly, so are the number of embeddings that need to be stored and indexed. The Faiss library is dedicated to vector similarity search, a core functionality of vector databases. Faiss is a toolkit of indexing methods and related primitives used to search, cluster, compress and transform vectors. This pa…

    Submitted 16 January, 2024; originally announced January 2024.

  5. arXiv:2310.11446  [pdf, other]

    cs.CR cs.AI cs.CL

    Functional Invariants to Watermark Large Transformers

    Authors: Pierre Fernandez, Guillaume Couairon, Teddy Furon, Matthijs Douze

    Abstract: The rapid growth of transformer-based models increases the concerns about their integrity and ownership insurance. Watermarking addresses this issue by embedding a unique identifier into the model, while preserving its performance. However, most existing approaches require to optimize the weights to imprint the watermark signal, which is not suitable at scale due to the computational cost. This pa…

    Submitted 18 January, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Published at ICASSP 2024. Webpage at https://pierrefdz.github.io/publications/invariancewm/

  6. arXiv:2308.02752  [pdf, other]

    cs.CV

    DeDrift: Robust Similarity Search under Content Drift

    Authors: Dmitry Baranchuk, Matthijs Douze, Yash Upadhyay, I. Zeki Yalniz

    Abstract: The statistical distribution of content uploaded and searched on media sharing sites changes over time due to seasonal, sociological and technical factors. We investigate the impact of this "content drift" for large-scale similarity search tools, based on nearest neighbor search in embedding space. Unless a costly index reconstruction is performed frequently, content drift degrades the search accu…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: ICCV2023

  7. arXiv:2306.09489  [pdf, other]

    cs.CV cs.AI cs.MM

    The 2023 Video Similarity Dataset and Challenge

    Authors: Ed Pizzi, Giorgos Kordopatis-Zilos, Hiral Patel, Gheorghe Postelnicu, Sugosh Nagavara Ravindra, Akshay Gupta, Symeon Papadopoulos, Giorgos Tolias, Matthijs Douze

    Abstract: This work introduces a dataset, benchmark, and challenge for the problem of video copy detection and localization. The problem comprises two distinct but related tasks: determining whether a query video shares content with a reference video ("detection"), and additionally temporally localizing the shared content within each video ("localization"). The benchmark is designed to evaluate methods on t…

    Submitted 15 June, 2023; originally announced June 2023.

  8. arXiv:2303.15435  [pdf, other]

    cs.CV cs.AI

    The Stable Signature: Rooting Watermarks in Latent Diffusion Models

    Authors: Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, Teddy Furon

    Abstract: Generative image modeling enables a wide range of applications but raises ethical concerns about responsible deployment. This paper introduces an active strategy combining image watermarking and Latent Diffusion Models. The goal is for all generated images to conceal an invisible watermark allowing for future detection and/or identification. The method quickly fine-tunes the latent decoder of the…

    Submitted 26 July, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: Published at ICCV 2023. Code at https://github.com/facebookresearch/stable_signature - webpage at https://pierrefdz.github.io/publications/stablesignature

  9. arXiv:2210.10620  [pdf, other]

    cs.IR cs.AI cs.CV cs.LG

    Active Image Indexing

    Authors: Pierre Fernandez, Matthijs Douze, Hervé Jégou, Teddy Furon

    Abstract: Image copy detection and retrieval from large databases leverage two components. First, a neural network maps an image to a vector representation, that is relatively robust to various transformations of the image. Second, an efficient but approximate similarity search algorithm trades scalability (size and speed) against quality of the search, thereby introducing a source of error. This paper impr…

    Submitted 5 October, 2022; originally announced October 2022.

  10. arXiv:2205.03763  [pdf, other]

    cs.LG cs.DB cs.DS cs.PF

    Results of the NeurIPS'21 Challenge on Billion-Scale Approximate Nearest Neighbor Search

    Authors: Harsha Vardhan Simhadri, George Williams, Martin Aumüller, Matthijs Douze, Artem Babenko, Dmitry Baranchuk, Qi Chen, Lucas Hosseini, Ravishankar Krishnaswamy, Gopal Srinivasa, Suhas Jayaram Subramanya, Jingdong Wang

    Abstract: Despite the broad range of algorithms for Approximate Nearest Neighbor Search, most empirical evaluations of algorithms have focused on smaller datasets, typically of 1 million points. However, deploying recent advances in embedding based techniques for search, recommendation and ranking at scale requires ANNS indices at billion, trillion or larger scale. Barring a few recent pape…

    Submitted 7 May, 2022; originally announced May 2022.

  11. arXiv:2202.10261  [pdf, other]

    cs.CV cs.CR cs.LG

    A Self-Supervised Descriptor for Image Copy Detection

    Authors: Ed Pizzi, Sreya Dutta Roy, Sugosh Nagavara Ravindra, Priya Goyal, Matthijs Douze

    Abstract: Image copy detection is an important task for content moderation. We introduce SSCD, a model that builds on a recent self-supervised contrastive training objective. We adapt this method to the copy detection task by changing the architecture and training objective, including a pooling operator from the instance matching literature, and adapting contrastive learning to augmentations that combine im…

    Submitted 25 March, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

  12. arXiv:2202.04007  [pdf, other]

    cs.CV cs.LG

    Results and findings of the 2021 Image Similarity Challenge

    Authors: Zoë Papakipos, Giorgos Tolias, Tomas Jenicek, Ed Pizzi, Shuhei Yokoo, Wenhao Wang, Yifan Sun, Weipu Zhang, Yi Yang, Sanjay Addicam, Sergio Manuel Papadakis, Cristian Canton Ferrer, Ondrej Chum, Matthijs Douze

    Abstract: The 2021 Image Similarity Challenge introduced a dataset to serve as a new benchmark to evaluate recent image copy detection methods. There were 200 participants to the competition. This paper presents a quantitative and qualitative analysis of the top submissions. It appears that the most difficult image transformations involve either severe image crops or hiding into unrelated images, combined w…

    Submitted 8 February, 2022; originally announced February 2022.

  13. arXiv:2112.09581  [pdf, other]

    cs.CV cs.LG

    Watermarking Images in Self-Supervised Latent Spaces

    Authors: Pierre Fernandez, Alexandre Sablayrolles, Teddy Furon, Hervé Jégou, Matthijs Douze

    Abstract: We revisit watermarking techniques based on pre-trained deep networks, in the light of self-supervised approaches. We present a way to embed both marks and binary messages into their latent spaces, leveraging data augmentation at marking time. Our method can operate at any resolution and creates watermarks robust to a broad range of transformations (rotations, crops, JPEG, contrast, etc). It signi…

    Submitted 23 March, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

  14. arXiv:2112.09568  [pdf, other]

    cs.CV cs.LG

    Nearest neighbor search with compact codes: A decoder perspective

    Authors: Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou

    Abstract: Modern approaches for fast retrieval of similar vectors on billion-scaled datasets rely on compressed-domain approaches such as binary sketches or product quantization. These methods minimize a certain loss, typically the mean squared error or other objective functions tailored to the retrieval problem. In this paper, we re-interpret popular methods such as binary hashing or product quantizers as…

    Submitted 21 February, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

  15. arXiv:2112.03162  [pdf, other]

    cs.CV cs.CL

    Embedding Arithmetic of Multimodal Queries for Image Retrieval

    Authors: Guillaume Couairon, Matthieu Cord, Matthijs Douze, Holger Schwenk

    Abstract: Latent text representations exhibit geometric regularities, such as the famous analogy: queen is to king what woman is to man. Such structured semantic relations were not demonstrated on image representations. Recent works aiming at bridging this semantic gap embed images and text into a multimodal space, enabling the transfer of text-defined transformations to the image modality. We introduce the…

    Submitted 20 October, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: accepted at O-DRUM (CVPR workshop 2022)

  16. arXiv:2106.09681  [pdf, other]

    cs.CV cs.LG

    XCiT: Cross-Covariance Image Transformers

    Authors: Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou

    Abstract: Following their success in natural language processing, transformers have recently shown much promise for computer vision. The self-attention operation underlying transformers yields global interactions between all tokens, i.e. words or image patches, and enables flexible modelling of image data beyond the local interactions of convolutions. This flexibility, however, comes with a quadratic comple…

    Submitted 18 June, 2021; v1 submitted 17 June, 2021; originally announced June 2021.

  17. arXiv:2106.09672  [pdf, other]

    cs.CV

    The 2021 Image Similarity Dataset and Challenge

    Authors: Matthijs Douze, Giorgos Tolias, Ed Pizzi, Zoë Papakipos, Lowik Chanussot, Filip Radenovic, Tomas Jenicek, Maxim Maximov, Laura Leal-Taixé, Ismail Elezi, Ondřej Chum, Cristian Canton Ferrer

    Abstract: This paper introduces a new benchmark for large-scale image similarity detection. This benchmark is used for the Image Similarity Challenge at NeurIPS'21 (ISC2021). The goal is to determine whether a query image is a modified copy of any image in a reference corpus of size 1 million. The benchmark features a variety of image transformations such as automated transformations, hand-crafted image edi…

    Submitted 21 February, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

  18. arXiv:2104.01136  [pdf, other]

    cs.CV

    LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

    Authors: Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze

    Abstract: We design a family of image classification architectures that optimize the trade-off between accuracy and efficiency in a high-speed regime. Our work exploits recent findings in attention-based architectures, which are competitive on highly parallel processing hardware. We revisit principles from the extensive literature on convolutional neural networks to apply them to transformers, in particular…

    Submitted 6 May, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

  19. arXiv:2012.12877  [pdf, other]

    cs.CV

    Training data-efficient image transformers & distillation through attention

    Authors: Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou

    Abstract: Recently, neural networks purely based on attention were shown to address image understanding tasks such as image classification. However, these visual transformers are pre-trained with hundreds of millions of images using an expensive infrastructure, thereby limiting their adoption. In this work, we produce a competitive convolution-free transformer by training on Imagenet only. We train them o…

    Submitted 15 January, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

  20. arXiv:2011.12982  [pdf, other]

    cs.CV

    Grafit: Learning fine-grained image representations with coarse labels

    Authors: Hugo Touvron, Alexandre Sablayrolles, Matthijs Douze, Matthieu Cord, Hervé Jégou

    Abstract: This paper tackles the problem of learning a finer representation than the one provided by training labels. This enables fine-grained category retrieval of images in a collection annotated with coarse labels only. Our network is learned with a nearest-neighbor classifier objective, and an instance loss inspired by self-supervised learning. By jointly leveraging the coarse labels and the underlyi…

    Submitted 25 November, 2020; originally announced November 2020.

  21. arXiv:2008.05763  [pdf, other]

    cs.CV eess.IV

    Powers of layers for image-to-image translation

    Authors: Hugo Touvron, Matthijs Douze, Matthieu Cord, Hervé Jégou

    Abstract: We propose a simple architecture to address unpaired image-to-image translation tasks: style or class transfer, denoising, deblurring, deblocking, etc. We start from an image autoencoder architecture with fixed weights. For each task we learn a residual block operating in the latent space, which is iteratively called until the target domain is reached. A specific training schedule is required to a…

    Submitted 13 August, 2020; originally announced August 2020.

  22. arXiv:2007.04755  [pdf, other]

    cs.CV

    Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation

    Authors: Yongqin Xian, Bruno Korbar, Matthijs Douze, Lorenzo Torresani, Bernt Schiele, Zeynep Akata

    Abstract: Few-shot learning aims to recognize novel classes from a few examples. Although significant progress has been made in the image domain, few-shot video classification is relatively unexplored. We argue that previous methods underestimate the importance of video feature learning and propose to learn spatiotemporal features using a 3D CNN. Proposing a two-stage approach that learns video features on…

    Submitted 13 October, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: Accepted by TPAMI in October, 2021

  23. arXiv:2007.00991  [pdf, other]

    eess.AS cs.CL cs.SD

    Data Augmenting Contrastive Learning of Speech Representations in the Time Domain

    Authors: Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux

    Abstract: Contrastive Predictive Coding (CPC), based on predicting future segments of speech based on past segments, is emerging as a powerful algorithm for representation learning of speech signal. However, it still under-performs other methods on unsupervised evaluation benchmarks. Here, we introduce WavAugment, a time-domain data augmentation library and find that applying augmentation in the past is gene…

    Submitted 2 July, 2020; originally announced July 2020.

  24. arXiv:2003.08237  [pdf, other]

    cs.CV cs.LG

    Fixing the train-test resolution discrepancy: FixEfficientNet

    Authors: Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou

    Abstract: This paper provides an extensive analysis of the performance of the EfficientNet image classifiers with several recent training procedures, in particular one that corrects the discrepancy between train and test images. The resulting network, called FixEfficientNet, significantly outperforms the initial architecture with the same number of parameters. For instance, our FixEfficientNet-B0 trained…

    Submitted 18 November, 2020; v1 submitted 18 March, 2020; originally announced March 2020.

  25. arXiv:2002.00937  [pdf, other]

    stat.ML cs.CR cs.CV cs.LG

    Radioactive data: tracing through training

    Authors: Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou

    Abstract: We want to detect whether a particular image dataset has been used to train a model. We propose a new technique, radioactive data, that makes imperceptible changes to this dataset such that any model trained on it will bear an identifiable mark. The mark is robust to strong variations such as different architectures or optimization methods. Given a trained model, our technique detects the u…

    Submitted 3 February, 2020; originally announced February 2020.

  26. arXiv:1910.07323  [pdf, ps, other]

    cs.CL cs.AI cs.LG eess.AS

    Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition

    Authors: Adrien Dufraux, Emmanuel Vincent, Awni Hannun, Armelle Brun, Matthijs Douze

    Abstract: The transcriptions used to train an Automatic Speech Recognition (ASR) system may contain errors. Usually, either a quality control stage discards transcriptions with too many errors, or the noisy transcriptions are used as is. We introduce Lead2Gold, a method to train an ASR system that exploits the full potential of noisy transcriptions. Based on a noise model of transcription errors, Lead2Gold…

    Submitted 16 October, 2019; originally announced October 2019.

    Comments: 8 pages, 4 tables, Accepted for publication in ASRU 2019

    ACM Class: I.2.6; I.2.7

  27. arXiv:1908.11229  [pdf, other]

    stat.ML cs.CR cs.LG

    White-box vs Black-box: Bayes Optimal Strategies for Membership Inference

    Authors: Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou

    Abstract: Membership inference determines, given a sample and trained parameters of a machine learning model, whether the sample was part of the training set. In this paper, we derive the optimal strategy for membership inference with a few assumptions on the distribution of the parameters. We show that optimal attacks only depend on the loss function, and thus black-box attacks are as good as white-box att…

    Submitted 29 August, 2019; originally announced August 2019.

  28. arXiv:1906.06423  [pdf, other]

    cs.CV cs.LG

    Fixing the train-test resolution discrepancy

    Authors: Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou

    Abstract: Data-augmentation is key to the training of neural networks for image classification. This paper first shows that existing augmentations induce a significant discrepancy between the typical size of the objects seen by the classifier at train and test time. We experimentally validate that, for a target test resolution, using a lower train resolution offers better classification at test time. We t…

    Submitted 20 January, 2022; v1 submitted 14 June, 2019; originally announced June 2019.

  29. arXiv:1902.05509  [pdf, other]

    cs.CV

    MultiGrain: a unified image embedding for classes and instances

    Authors: Maxim Berman, Hervé Jégou, Andrea Vedaldi, Iasonas Kokkinos, Matthijs Douze

    Abstract: MultiGrain is a network architecture producing compact vector representations that are suited both for image classification and particular object retrieval. It builds on a standard classification trunk. The top of the network produces an embedding containing coarse and fine-grained information, so that images can be recognized based on the object class, particular object, or if they are distorted…

    Submitted 3 April, 2019; v1 submitted 14 February, 2019; originally announced February 2019.

  30. arXiv:1809.06396  [pdf, other]

    cs.CV

    Déjà Vu: an empirical evaluation of the memorization properties of ConvNets

    Authors: Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou

    Abstract: Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and drop-out are employed to mitigate overfitting. This paper considers the related question of "membership inference", where the goal is to determine if an image was used during training. We consider it under three complementary angles. We show how to detect which dataset was used…

    Submitted 17 September, 2018; originally announced September 2018.

  31. arXiv:1807.05520  [pdf, other]

    cs.CV

    Deep Clustering for Unsupervised Learning of Visual Features

    Authors: Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze

    Abstract: Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. De…

    Submitted 18 March, 2019; v1 submitted 15 July, 2018; originally announced July 2018.

    Comments: Accepted at ECCV 2018

  32. arXiv:1806.03198  [pdf, other]

    stat.ML cs.LG

    Spreading vectors for similarity search

    Authors: Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou

    Abstract: Discretizing multi-dimensional data distributions is a fundamental step of modern indexing methods. State-of-the-art techniques learn parameters of quantizers on training data for optimal performance, thus adapting quantizers to the data. In this work, we propose to reverse this paradigm and adapt the data to the quantizer: we train a neural net whose last layer forms a fixed parameter-free quanti…

    Submitted 30 August, 2019; v1 submitted 8 June, 2018; originally announced June 2018.

    Comments: Published at ICLR 2019

  33. arXiv:1804.09996  [pdf, other]

    cs.CV cs.DB cs.DS cs.IR

    Link and code: Fast indexing with graphs and compact regression codes

    Authors: Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou

    Abstract: Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, taking aside the memory requirements. In this paper, we revisit these approaches by considering, additionally, the memory constraint required to index billions of images on a single server. This leads us to propose a method based both on graph traversal and compact representations. We en…

    Submitted 27 April, 2018; v1 submitted 26 April, 2018; originally announced April 2018.

  34. arXiv:1708.02898  [pdf, other]

    cs.CV cs.MM

    An evaluation of large-scale methods for image instance and class discovery

    Authors: Matthijs Douze, Hervé Jégou, Jeff Johnson

    Abstract: This paper aims at discovering meaningful subsets of related images from large image collections without annotations. We search groups of images related at different levels of semantic, i.e., either instances or visual classes. While k-means is usually considered as the gold standard for this task, we evaluate and show the interest of diffusion methods that have been neglected by the state of the…

    Submitted 9 August, 2017; originally announced August 2017.

    Comments: Published at ACM Multimedia workshop

  35. arXiv:1706.02332  [pdf, other]

    cs.CV cs.LG stat.ML

    Low-shot learning with large-scale diffusion

    Authors: Matthijs Douze, Arthur Szlam, Bharath Hariharan, Hervé Jégou

    Abstract: This paper considers the problem of inferring image labels from images when only a few annotated examples are available at training time. This setup is often referred to as low-shot learning, where a standard approach is to re-train the last few layers of a convolutional neural network learned on separate classes for which training examples are abundant. We consider a semi-supervised setting based…

    Submitted 15 June, 2018; v1 submitted 7 June, 2017; originally announced June 2017.

  36. arXiv:1706.01558  [pdf, other]

    cs.GR

    QuickCSG: Fast Arbitrary Boolean Combinations of N Solids

    Authors: Matthijs Douze, Jean-Sébastien Franco, Bruno Raffin

    Abstract: QuickCSG computes the result for general N-polyhedron boolean expressions without an intermediate tree of solids. We propose a vertex-centric view of the problem, which simplifies the identification of final geometric contributions, and facilitates its spatial decomposition. The problem is then cast in a single KD-tree exploration, geared toward the result by early pruning of any region of space n…

    Submitted 5 June, 2017; originally announced June 2017.

    Report number: INRIA RR-8687

  37. arXiv:1704.04154  [pdf, other]

    cs.CL

    Learning Joint Multilingual Sentence Representations with Neural Machine Translation

    Authors: Holger Schwenk, Matthijs Douze

    Abstract: In this paper, we use the framework of neural machine translation to learn joint sentence representations across six very different languages. Our aim is that a representation which is independent of the language, is likely to capture the underlying semantics. We define a new cross-lingual similarity measure, compare up to 1.4M sentence representations and study the characteristics of close senten…

    Submitted 8 August, 2017; v1 submitted 13 April, 2017; originally announced April 2017.

    Comments: 11 pages, 2 figures, published at ACL workshop RepL4NLP

    MSC Class: 68T50

  38. arXiv:1702.08734  [pdf, other]

    cs.CV cs.DB cs.DS cs.IR

    Billion-scale similarity search with GPUs

    Authors: Jeff Johnson, Matthijs Douze, Hervé Jégou

    Abstract: Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. This paper tackles the problem of better utilizing GPUs for this task. While GPUs excel at data-parallel tasks, prior approaches are bottlenecked by algorithms that expose less para…

    Submitted 28 February, 2017; originally announced February 2017.

  39. arXiv:1612.03651  [pdf, other]

    cs.CL cs.LG

    FastText.zip: Compressing text classification models

    Authors: Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou, Tomas Mikolov

    Abstract: We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory. After considering different solutions inspired by the hashing literature, we propose a method built upon product quantization to store word embeddings. While the original technique leads to a loss in accuracy, we adapt this method to circumvent quantizati…

    Submitted 12 December, 2016; originally announced December 2016.

    Comments: Submitted to ICLR 2017

  40. arXiv:1609.06753  [pdf, other]

    cs.CV

    How should we evaluate supervised hashing?

    Authors: Alexandre Sablayrolles, Matthijs Douze, Hervé Jégou, Nicolas Usunier

    Abstract: Hashing produces compact representations for documents, to perform tasks like classification or retrieval based on these short codes. When hashing is supervised, the codes are trained using labels on the training data. This paper first shows that the evaluation protocols used in the literature for supervised hashing are not satisfactory: we show that a trivial solution that encodes the output of a…

    Submitted 10 August, 2017; v1 submitted 21 September, 2016; originally announced September 2016.

  41. arXiv:1609.01882  [pdf, other]

    cs.CV cs.DB cs.IT cs.LG

    Polysemous codes

    Authors: Matthijs Douze, Hervé Jégou, Florent Perronnin

    Abstract: This paper considers the problem of approximate nearest neighbor search in the compressed domain. We introduce polysemous codes, which offer both the distance estimation quality of product quantization and the efficient comparison of binary codes with Hamming distance. Their design is inspired by algorithms introduced in the 90's to construct channel-optimized vector quantizers. At search time, th…

    Submitted 10 October, 2016; v1 submitted 7 September, 2016; originally announced September 2016.

    Comments: The final (author) version of our ECCV'16 paper

  42. arXiv:1603.00438  [pdf, other]

    cs.CV

    Convolutional Patch Representations for Image Retrieval: an Unsupervised Approach

    Authors: Mattis Paulin, Julien Mairal, Matthijs Douze, Zaid Harchaoui, Florent Perronnin, Cordelia Schmid

    Abstract: Convolutional neural networks (CNNs) have recently received a lot of attention due to their ability to model local stationary structures in natural images in a multi-scale fashion, when learning all model parameters with supervision. While excellent performance was achieved for image classification when large amounts of labeled visual data are available, their success for un-supervised tasks such…

    Submitted 1 March, 2016; originally announced March 2016.

  43. arXiv:1508.03755  [pdf, other]

    cs.CV

    Beat-Event Detection in Action Movie Franchises

    Authors: Danila Potapov, Matthijs Douze, Jerome Revaud, Zaid Harchaoui, Cordelia Schmid

    Abstract: While important advances were recently made towards temporally localizing and recognizing specific human actions or activities in videos, efficient detection and classification of long video chunks belonging to semantically defined categories such as "pursuit" or "romance" remains challenging. We introduce a new dataset, Action Movie Franchises, consisting of a collection of Hollywood action movie…

    Submitted 15 August, 2015; originally announced August 2015.

  44. arXiv:1506.02588  [pdf, other]

    cs.CV

    Circulant temporal encoding for video retrieval and temporal alignment

    Authors: Matthijs Douze, Jérôme Revaud, Jakob Verbeek, Hervé Jégou, Cordelia Schmid

    Abstract: We address the problem of specific video event retrieval. Given a query video of a specific event, e.g., a concert of Madonna, the goal is to retrieve other videos of the same event that temporally overlap with the query. Our approach encodes the frame descriptors of a video to jointly represent their appearance and temporal order. It exploits the properties of circulant matrices to efficiently co…

    Submitted 30 November, 2015; v1 submitted 8 June, 2015; originally announced June 2015.

  45. arXiv:1102.3828  [pdf, ps, other]

    cs.IR cs.CV

    Searching in one billion vectors: re-rank with source coding

    Authors: Hervé Jégou, Romain Tavenard, Matthijs Douze, Laurent Amsaleg

    Abstract: Recent indexing techniques inspired by source coding have been shown successful to index billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculation on the short-list of hypotheses, the est…

    Submitted 18 February, 2011; originally announced February 2011.

    Comments: International Conference on Acoustics, Speech and Signal Processing, Prague : Czech Republic (2011)