Showing 1–5 of 5 results for author: Yalniz, Z

  1. arXiv:2308.02752

    cs.CV

    DeDrift: Robust Similarity Search under Content Drift

    Authors: Dmitry Baranchuk, Matthijs Douze, Yash Upadhyay, I. Zeki Yalniz

    Abstract: The statistical distribution of content uploaded and searched on media sharing sites changes over time due to seasonal, sociological and technical factors. We investigate the impact of this "content drift" for large-scale similarity search tools, based on nearest neighbor search in embedding space. Unless a costly index reconstruction is performed frequently, content drift degrades the search accu…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  2. arXiv:2007.00077

    cs.LG cs.AI cs.CV stat.ML

    Similarity Search for Efficient Active Learning and Search of Rare Concepts

    Authors: Cody Coleman, Edward Chou, Julian Katz-Samuels, Sean Culatana, Peter Bailis, Alexander C. Berg, Robert Nowak, Roshan Sumbaly, Matei Zaharia, I. Zeki Yalniz

    Abstract: Many active learning and search approaches are intractable for large-scale industrial settings with billions of unlabeled examples. Existing approaches search globally for the optimal examples to label, scaling linearly or even quadratically with the unlabeled data. In this paper, we improve the computational efficiency of active learning and search methods by restricting the candidate pool for la…

    Submitted 22 July, 2021; v1 submitted 30 June, 2020; originally announced July 2020.

  3. arXiv:2003.06729

    cs.CV cs.LG stat.ML

    NoiseRank: Unsupervised Label Noise Reduction with Dependence Models

    Authors: Karishma Sharma, Pinar Donmez, Enming Luo, Yan Liu, I. Zeki Yalniz

    Abstract: Label noise is increasingly prevalent in datasets acquired from noisy channels. Existing approaches that detect and remove label noise generally rely on some form of supervision, which is not scalable and error-prone. In this paper, we propose NoiseRank, for unsupervised label noise reduction using Markov Random Fields (MRF). We construct a dependence model to estimate the posterior probability of…

    Submitted 14 March, 2020; originally announced March 2020.

  4. arXiv:1905.00546

    cs.CV

    Billion-scale semi-supervised learning for image classification

    Authors: I. Zeki Yalniz, Hervé Jégou, Kan Chen, Manohar Paluri, Dhruv Mahajan

    Abstract: This paper presents a study of semi-supervised learning with large convolutional networks. We propose a pipeline, based on a teacher/student paradigm, that leverages a large collection of unlabelled images (up to 1 billion). Our main goal is to improve the performance for a given target architecture, like ResNet-50 or ResNext. We provide an extensive analysis of the success factors of our approach…

    Submitted 1 May, 2019; originally announced May 2019.

  5. arXiv:1903.01612

    cs.CV cs.LG

    Defense Against Adversarial Images using Web-Scale Nearest-Neighbor Search

    Authors: Abhimanyu Dubey, Laurens van der Maaten, Zeki Yalniz, Yixuan Li, Dhruv Mahajan

    Abstract: A plethora of recent work has shown that convolutional networks are not robust to adversarial images: images that are created by perturbing a sample from the data distribution as to maximize the loss on the perturbed example. In this work, we hypothesize that adversarial perturbations move the image away from the image manifold in the sense that there exists no physical process that could have pro…

    Submitted 4 May, 2019; v1 submitted 4 March, 2019; originally announced March 2019.

    Comments: CVPR 2019 Oral presentation; camera-ready with supplement (14 pages). v1 updated from error in Table 2, row 10