Showing 1–18 of 18 results for author: Hazirbas, C

Searching in archive cs.
  1. arXiv:2402.07329  [pdf, other]

    cs.CV

    The Bias of Harmful Label Associations in Vision-Language Models

    Authors: Caner Hazirbas, Alicia Sun, Yonathan Efroni, Mark Ibrahim

    Abstract: Despite the remarkable performance of foundation vision-language models, the shared representation space for text and vision can also encode harmful label associations detrimental to fairness. While prior work has uncovered bias in vision-language models' (VLMs) classification performance across geography, work has been limited along the important axis of harmful label associations due to a lack o…

    Submitted 15 April, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  2. arXiv:2309.15251  [pdf, other]

    cs.CV cs.AI

    VPA: Fully Test-Time Visual Prompt Adaptation

    Authors: Jiachen Sun, Mark Ibrahim, Melissa Hall, Ivan Evtimov, Z. Morley Mao, Cristian Canton Ferrer, Caner Hazirbas

    Abstract: Textual prompt tuning has demonstrated significant performance improvements in adapting natural language processing models to a variety of downstream tasks by treating hand-engineered prompts as trainable parameters. Inspired by the success of textual prompting, several studies have investigated the efficacy of visual prompt tuning. In this work, we present Visual Prompt Adaptation (VPA), the firs…

    Submitted 26 September, 2023; originally announced September 2023.

  3. arXiv:2306.11710  [pdf, other]

    cs.CV

    Data-Driven but Privacy-Conscious: Pedestrian Dataset De-identification via Full-Body Person Synthesis

    Authors: Maxim Maximov, Tim Meinhardt, Ismail Elezi, Zoe Papakipos, Caner Hazirbas, Cristian Canton Ferrer, Laura Leal-Taixé

    Abstract: The advent of data-driven technology solutions is accompanied by an increasing concern with data privacy. This is of particular importance for human-centered image recognition tasks, such as pedestrian detection, re-identification, and tracking. To highlight the importance of privacy issues and motivate future research, we motivate and introduce the Pedestrian Dataset De-Identification (PDI) task.…

    Submitted 22 June, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

  4. arXiv:2304.05391  [pdf, other]

    cs.CV

    Pinpointing Why Object Recognition Performance Degrades Across Income Levels and Geographies

    Authors: Laura Gustafson, Megan Richards, Melissa Hall, Caner Hazirbas, Diane Bouchacourt, Mark Ibrahim

    Abstract: Despite impressive advances in object-recognition, deep learning systems' performance degrades significantly across geographies and lower income levels raising pressing concerns of inequity. Addressing such performance gaps remains a challenge, as little is understood about why performance degrades across incomes or geographies. We take a step in this direction by annotating images from Dollar Str…

    Submitted 11 April, 2023; originally announced April 2023.

  5. arXiv:2303.04838  [pdf, other]

    cs.CV cs.AI cs.CL cs.CY

    The Casual Conversations v2 Dataset

    Authors: Bilal Porgali, Vítor Albiero, Jordan Ryda, Cristian Canton Ferrer, Caner Hazirbas

    Abstract: This paper introduces a new large consent-driven dataset aimed at assisting in the evaluation of algorithmic bias and robustness of computer vision and audio speech models in regards to 11 attributes that are self-provided or labeled by trained annotators. The dataset includes 26,467 videos of 5,567 unique paid participants, with an average of almost 5 videos per person, recorded in Brazil, India,…

    Submitted 8 March, 2023; originally announced March 2023.

  6. arXiv:2212.04825  [pdf, other]

    cs.CV

    A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others

    Authors: Zhiheng Li, Ivan Evtimov, Albert Gordo, Caner Hazirbas, Tal Hassner, Cristian Canton Ferrer, Chenliang Xu, Mark Ibrahim

    Abstract: Machine learning models have been found to learn shortcuts -- unintended decision rules that are unable to generalize -- undermining models' reliability. Previous works address this problem under the tenuous assumption that only a single shortcut exists in the training data. Real-world images are rife with multiple visual cues from background to texture. Key to advancing the reliability of vision…

    Submitted 21 March, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

    Comments: CVPR 2023. Code is available at https://github.com/facebookresearch/Whac-A-Mole

  7. arXiv:2211.05809  [pdf, other]

    cs.CV cs.AI cs.CL cs.CY

    Casual Conversations v2: Designing a large consent-driven dataset to measure algorithmic bias and robustness

    Authors: Caner Hazirbas, Yejin Bang, Tiezheng Yu, Parisa Assar, Bilal Porgali, Vítor Albiero, Stefan Hermanek, Jacqueline Pan, Emily McReynolds, Miranda Bogen, Pascale Fung, Cristian Canton Ferrer

    Abstract: Developing robust and fair AI systems require datasets with comprehensive set of labels that can help ensure the validity and legitimacy of relevant measurements. Recent efforts, therefore, focus on collecting person-related datasets that have carefully selected labels, including sensitive characteristics, and consent forms in place to use those attributes for model testing and development. Respon…

    Submitted 10 November, 2022; originally announced November 2022.

  8. arXiv:2211.01866  [pdf, other]

    cs.CV cs.LG

    ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations

    Authors: Badr Youbi Idrissi, Diane Bouchacourt, Randall Balestriero, Ivan Evtimov, Caner Hazirbas, Nicolas Ballas, Pascal Vincent, Michal Drozdzal, David Lopez-Paz, Mark Ibrahim

    Abstract: Deep learning vision systems are widely deployed across applications where reliability is critical. However, even today's best models can fail to recognize an object when its pose, lighting, or background varies. While existing benchmarks surface examples challenging for models, they do not explain why such mistakes arise. To address this need, we introduce ImageNet-X, a set of sixteen human annot…

    Submitted 3 November, 2022; originally announced November 2022.

  9. arXiv:2203.17260  [pdf, other]

    cs.CV cs.LG

    Generating High Fidelity Data from Low-density Regions using Diffusion Models

    Authors: Vikash Sehwag, Caner Hazirbas, Albert Gordo, Firat Ozgenel, Cristian Canton Ferrer

    Abstract: Our work focuses on addressing sample deficiency from low-density regions of data manifold in common image datasets. We leverage diffusion process based generative models to synthesize novel images from low-density regions. We observe that uniform sampling from diffusion models predominantly samples from high-density regions of the data manifold. Therefore, we modify the sampling process to guide…

    Submitted 26 June, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: CVPR 2022 (fixed some discrepancies in notation - v2)

  10. arXiv:2202.07603  [pdf, other]

    cs.CV cs.AI cs.CY

    Fairness Indicators for Systematic Assessments of Visual Feature Extractors

    Authors: Priya Goyal, Adriana Romero Soriano, Caner Hazirbas, Levent Sagun, Nicolas Usunier

    Abstract: Does everyone equally benefit from computer vision systems? Answers to this question become more and more important as computer vision systems are deployed at large scale, and can spark major concerns when they exhibit vast performance discrepancies between people from various demographic and social backgrounds. Systematic diagnosis of fairness, harms, and biases of computer vision systems is an i…

    Submitted 15 February, 2022; originally announced February 2022.

  11. arXiv:2111.09983  [pdf, other]

    eess.AS cs.SD

    Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions

    Authors: Chunxi Liu, Michael Picheny, Leda Sarı, Pooja Chitkara, Alex Xiao, Xiaohui Zhang, Mark Chou, Andres Alvarado, Caner Hazirbas, Yatharth Saraf

    Abstract: It is well known that many machine learning systems demonstrate bias towards specific groups of individuals. This problem has been studied extensively in the Facial Recognition area, but much less so in Automatic Speech Recognition (ASR). This paper presents initial Speech Recognition results on "Casual Conversations" -- a publicly released 846 hour corpus designed to help researchers evaluate the…

    Submitted 18 November, 2021; originally announced November 2021.

    Comments: Submitted to ICASSP 2022. Our dataset will be publicly available at (https://ai.facebook.com/datasets/casual-conversations-downloads) for general use. We also would like to note that considering the limitations of our dataset, we limit the use of it for only evaluation purposes (see license agreement)

  12. arXiv:2106.09222  [pdf, other]

    stat.ML cs.CR cs.CV cs.LG

    Localized Uncertainty Attacks

    Authors: Ousmane Amadou Dia, Theofanis Karaletsos, Caner Hazirbas, Cristian Canton Ferrer, Ilknur Kaynar Kabul, Erik Meijer

    Abstract: The susceptibility of deep learning models to adversarial perturbations has stirred renewed attention in adversarial examples resulting in a number of attacks. However, most of these attacks fail to encompass a large spectrum of adversarial perturbations that are imperceptible to humans. In this paper, we present localized uncertainty attacks, a novel class of threat models against deterministic a…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: CVPR 2021 Workshop on Adversarial Machine Learning in Computer Vision

  13. arXiv:2104.02821  [pdf, other]

    cs.CV cs.AI cs.LG

    Towards Measuring Fairness in AI: the Casual Conversations Dataset

    Authors: Caner Hazirbas, Joanna Bitton, Brian Dolhansky, Jacqueline Pan, Albert Gordo, Cristian Canton Ferrer

    Abstract: This paper introduces a novel dataset to help researchers evaluate their computer vision and audio models for accuracy across a diverse set of age, genders, apparent skin tones and ambient lighting conditions. Our dataset is composed of 3,011 subjects and contains over 45,000 videos, with an average of 15 videos per person. The videos were recorded in multiple U.S. states with a diverse set of adu…

    Submitted 3 November, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

  14. What Makes Good Synthetic Training Data for Learning Disparity and Optical Flow Estimation?

    Authors: Nikolaus Mayer, Eddy Ilg, Philipp Fischer, Caner Hazirbas, Daniel Cremers, Alexey Dosovitskiy, Thomas Brox

    Abstract: The finding that very large networks can be trained efficiently and reliably has led to a paradigm shift in computer vision from engineered solutions to learning formulations. As a result, the research challenge shifts from devising algorithms to creating suitable and abundant training data for supervised learning. How to efficiently create such training data? The dominant data acquisition method…

    Submitted 22 March, 2018; v1 submitted 19 January, 2018; originally announced January 2018.

    Comments: added references (UCL dataset); added IJCV copyright information

  15. Learning Proximal Operators: Using Denoising Networks for Regularizing Inverse Imaging Problems

    Authors: Tim Meinhardt, Michael Moeller, Caner Hazirbas, Daniel Cremers

    Abstract: While variational methods have been among the most powerful tools for solving linear inverse problems in imaging, deep (convolutional) neural networks have recently taken the lead in many challenging benchmarks. A remaining drawback of deep learning approaches is their requirement for an expensive retraining whenever the specific problem, the noise level, noise type, or desired measure of fidelity…

    Submitted 30 August, 2017; v1 submitted 11 April, 2017; originally announced April 2017.

  16. arXiv:1704.01085  [pdf, other]

    cs.CV

    Deep Depth From Focus

    Authors: Caner Hazirbas, Sebastian Georg Soyer, Maximilian Christian Staab, Laura Leal-Taixé, Daniel Cremers

    Abstract: Depth from focus (DFF) is one of the classical ill-posed inverse problems in computer vision. Most approaches recover the depth at each pixel based on the focal setting which exhibits maximal sharpness. Yet, it is not obvious how to reliably estimate the sharpness level, particularly in low-textured areas. In this paper, we propose `Deep Depth From Focus (DDFF)' as the first end-to-end learning ap…

    Submitted 28 October, 2018; v1 submitted 4 April, 2017; originally announced April 2017.

    Comments: accepted to Asian Conference on Computer Vision (ACCV) 2018

  17. arXiv:1611.07890  [pdf, other]

    cs.CV

    Image-based localization using LSTMs for structured feature correlation

    Authors: Florian Walch, Caner Hazirbas, Laura Leal-Taixé, Torsten Sattler, Sebastian Hilsenbeck, Daniel Cremers

    Abstract: In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes. CNNs allow us to learn suitable feature representations for localization that are robust against motion blur and illumination changes. We make use of LSTM units on the CNN output, which play the role of a structured dimensionality reduction on the feature vector, leading to drastic improve…

    Submitted 20 August, 2017; v1 submitted 23 November, 2016; originally announced November 2016.

  18. arXiv:1504.06852  [pdf, other]

    cs.CV cs.LG

    FlowNet: Learning Optical Flow with Convolutional Networks

    Authors: Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox

    Abstract: Convolutional neural networks (CNNs) have recently been very successful in a variety of computer vision tasks, especially on those linked to recognition. Optical flow estimation has not been among the tasks where CNNs were successful. In this paper we construct appropriate CNNs which are capable of solving the optical flow estimation problem as a supervised learning task. We propose and compare tw…

    Submitted 4 May, 2015; v1 submitted 26 April, 2015; originally announced April 2015.

    Comments: Added supplementary material

    ACM Class: I.2.6; I.4.8