Search | arXiv e-print repository

Spurious Correlations and Where to Find Them

Authors: Gautam Sreekumar, Vishnu Naresh Boddeti

Abstract: Spurious correlations occur when a model learns unreliable features from the data and are a well-known drawback of data-driven learning. Although there are several algorithms proposed to mitigate it, we are yet to jointly derive the indicators of spurious correlations. As a result, the solutions built upon standalone hypotheses fail to beat simple ERM baselines. We collect some of the commonly stu… ▽ More Spurious correlations occur when a model learns unreliable features from the data and are a well-known drawback of data-driven learning. Although there are several algorithms proposed to mitigate it, we are yet to jointly derive the indicators of spurious correlations. As a result, the solutions built upon standalone hypotheses fail to beat simple ERM baselines. We collect some of the commonly studied hypotheses behind the occurrence of spurious correlations and investigate their influence on standard ERM baselines using synthetic datasets generated from causal graphs. Subsequently, we observe patterns connecting these hypotheses and model design choices. △ Less

Submitted 21 August, 2023; originally announced August 2023.

Comments: 2nd Workshop on SCIS, ICML 2023

arXiv:1910.07423 [pdf, other]

On the Global Optima of Kernelized Adversarial Representation Learning

Authors: Bashir Sadeghi, Runyi Yu, Vishnu Naresh Boddeti

Abstract: Adversarial representation learning is a promising paradigm for obtaining data representations that are invariant to certain sensitive attributes while retaining the information necessary for predicting target attributes. Existing approaches solve this problem through iterative adversarial minimax optimization and lack theoretical guarantees. In this paper, we first study the "linear" form of this… ▽ More Adversarial representation learning is a promising paradigm for obtaining data representations that are invariant to certain sensitive attributes while retaining the information necessary for predicting target attributes. Existing approaches solve this problem through iterative adversarial minimax optimization and lack theoretical guarantees. In this paper, we first study the "linear" form of this problem i.e., the setting where all the players are linear functions. We show that the resulting optimization problem is both non-convex and non-differentiable. We obtain an exact closed-form expression for its global optima through spectral learning and provide performance guarantees in terms of analytical bounds on the achievable utility and invariance. We then extend this solution and analysis to non-linear functions through kernel representation. Numerical experiments on UCI, Extended Yale B and CIFAR-100 datasets indicate that, (a) practically, our solution is ideal for "imparting" provable invariance to any biased pre-trained data representation, and (b) empirically, the trade-off between utility and invariance provided by our solution is comparable to iterative minimax optimization of existing deep neural network based approaches. Code is available at https://github.com/human-analysis/Kernel-ARL △ Less

Submitted 25 December, 2019; v1 submitted 16 October, 2019; originally announced October 2019.

Comments: Accepted for publication at ICCV 2019. This version includes additional theoretical and experimental analysis. Minor update to the GMM experiment

arXiv:1904.05514 [pdf, other]

Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach

Authors: Proteek Chandan Roy, Vishnu Naresh Boddeti

Abstract: Image recognition systems have demonstrated tremendous progress over the past few decades thanks, in part, to our ability of learning compact and robust representations of images. As we witness the wide spread adoption of these systems, it is imperative to consider the problem of unintended leakage of information from an image representation, which might compromise the privacy of the data owner. T… ▽ More Image recognition systems have demonstrated tremendous progress over the past few decades thanks, in part, to our ability of learning compact and robust representations of images. As we witness the wide spread adoption of these systems, it is imperative to consider the problem of unintended leakage of information from an image representation, which might compromise the privacy of the data owner. This paper investigates the problem of learning an image representation that minimizes such leakage of user information. We formulate the problem as an adversarial non-zero sum game of finding a good embedding function with two competing goals: to retain as much task dependent discriminative image information as possible, while simultaneously minimizing the amount of information, as measured by entropy, about other sensitive attributes of the user. We analyze the stability and convergence dynamics of the proposed formulation using tools from non-linear systems theory and compare to that of the corresponding adversarial zero-sum game formulation that optimizes likelihood as a measure of information content. Numerical experiments on UCI, Extended Yale B, CIFAR-10 and CIFAR-100 datasets indicate that our proposed approach is able to learn image representations that exhibit high task performance while mitigating leakage of predefined sensitive information. △ Less

Submitted 10 April, 2019; originally announced April 2019.

Comments: Accepted for oral presentation at CVPR 2019

arXiv:1803.09672 [pdf, other]

On the Intrinsic Dimensionality of Image Representations

Authors: Sixue Gong, Vishnu Naresh Boddeti, Anil K. Jain

Abstract: This paper addresses the following questions pertaining to the intrinsic dimensionality of any given image representation: (i) estimate its intrinsic dimensionality, (ii) develop a deep neural network based non-linear map**, dubbed DeepMDS, that transforms the ambient representation to the minimal intrinsic space, and (iii) validate the veracity of the map** through image matching in the intri… ▽ More This paper addresses the following questions pertaining to the intrinsic dimensionality of any given image representation: (i) estimate its intrinsic dimensionality, (ii) develop a deep neural network based non-linear map**, dubbed DeepMDS, that transforms the ambient representation to the minimal intrinsic space, and (iii) validate the veracity of the map** through image matching in the intrinsic space. Experiments on benchmark image datasets (LFW, IJB-C and ImageNet-100) reveal that the intrinsic dimensionality of deep neural network representations is significantly lower than the dimensionality of the ambient features. For instance, SphereFace's 512-dim face representation and ResNet's 512-dim image representation have an intrinsic dimensionality of 16 and 19 respectively. Further, the DeepMDS map** is able to obtain a representation of significantly lower dimensionality while maintaining discriminative ability to a large extent, 59.75% TAR @ 0.1% FAR in 16-dim vs 71.26% TAR in 512-dim on IJB-C and a Top-1 accuracy of 77.0% at 19-dim vs 83.4% at 512-dim on ImageNet-100. △ Less

Submitted 10 April, 2019; v1 submitted 26 March, 2018; originally announced March 2018.

Comments: Accepted for publication at CVPR 2019

arXiv:1710.02277 [pdf, other]

Efficient K-Shot Learning with Regularized Deep Networks

Authors: Donghyun Yoo, Haoqi Fan, Vishnu Naresh Boddeti, Kris M. Kitani

Abstract: Feature representations from pre-trained deep neural networks have been known to exhibit excellent generalization and utility across a variety of related tasks. Fine-tuning is by far the simplest and most widely used approach that seeks to exploit and adapt these feature representations to novel tasks with limited data. Despite the effectiveness of fine-tuning, itis often sub-optimal and requires… ▽ More Feature representations from pre-trained deep neural networks have been known to exhibit excellent generalization and utility across a variety of related tasks. Fine-tuning is by far the simplest and most widely used approach that seeks to exploit and adapt these feature representations to novel tasks with limited data. Despite the effectiveness of fine-tuning, itis often sub-optimal and requires very careful optimization to prevent severe over-fitting to small datasets. The problem of sub-optimality and over-fitting, is due in part to the large number of parameters used in a typical deep convolutional neural network. To address these problems, we propose a simple yet effective regularization method for fine-tuning pre-trained deep networks for the task of k-shot learning. To prevent overfitting, our key strategy is to cluster the model parameters while ensuring intra-cluster similarity and inter-cluster diversity of the parameters, effectively regularizing the dimensionality of the parameter search space. In particular, we identify groups of neurons within each layer of a deep network that shares similar activation patterns. When the network is to be fine-tuned for a classification task using only k examples, we propagate a single gradient to all of the neuron parameters that belong to the same group. The grou** of neurons is non-trivial as neuron activations depend on the distribution of the input data. To efficiently search for optimal grou**s conditioned on the input data, we propose a reinforcement learning search strategy using recurrent networks to learn the optimal group assignments for each network layer. Experimental results show that our method can be easily applied to several popular convolutional neural networks and improve upon other state-of-the-art fine-tuning based k-shot learning strategies by more than10% △ Less

Submitted 6 October, 2017; originally announced October 2017.

arXiv:1709.10433 [pdf, other]

On the Capacity of Face Representation

Authors: Sixue Gong, Vishnu Naresh Boddeti, Anil K. Jain

Abstract: In this paper we address the following question, given a face representation, how many identities can it resolve? In other words, what is the capacity of the face representation? A scientific basis for estimating the capacity of a given face representation will not only benefit the evaluation and comparison of different representation methods, but will also establish an upper bound on the scalabil… ▽ More In this paper we address the following question, given a face representation, how many identities can it resolve? In other words, what is the capacity of the face representation? A scientific basis for estimating the capacity of a given face representation will not only benefit the evaluation and comparison of different representation methods, but will also establish an upper bound on the scalability of an automatic face recognition system. We cast the face capacity problem in terms of packing bounds on a low-dimensional manifold embedded within a deep representation space. By explicitly accounting for the manifold structure of the representation as well two different sources of representational noise: epistemic (model) uncertainty and aleatoric (data) variability, our approach is able to estimate the capacity of a given face representation. To demonstrate the efficacy of our approach, we estimate the capacity of two deep neural network based face representations, namely 128-dimensional FaceNet and 512-dimensional SphereFace. Numerical experiments on unconstrained faces (IJB-C) provides a capacity upper bound of $2.7\times10^4$ for FaceNet and $8.4\times10^4$ for SphereFace representation at a false acceptance rate (FAR) of 1%. As expected, capacity reduces drastically at lower FARs. The capacity at FAR of 0.1% and 0.001% is $2.2\times10^3$ and $1.6\times10^{1}$, respectively for FaceNet and $3.6\times10^3$ and $6.0\times10^0$, respectively for SphereFace. △ Less

Submitted 11 April, 2019; v1 submitted 29 September, 2017; originally announced September 2017.

arXiv:1411.2316 [pdf, other]

doi 10.1109/TPAMI.2014.2375215

Zero-Aliasing Correlation Filters for Object Recognition

Authors: Joseph A. Fernandez, Vishnu Naresh Boddeti, Andres Rodriguez, B. V. K. Vijaya Kumar

Abstract: Correlation filters (CFs) are a class of classifiers that are attractive for object localization and tracking applications. Traditionally, CFs have been designed in the frequency domain using the discrete Fourier transform (DFT), where correlation is efficiently implemented. However, existing CF designs do not account for the fact that the multiplication of two DFTs in the frequency domain corresp… ▽ More Correlation filters (CFs) are a class of classifiers that are attractive for object localization and tracking applications. Traditionally, CFs have been designed in the frequency domain using the discrete Fourier transform (DFT), where correlation is efficiently implemented. However, existing CF designs do not account for the fact that the multiplication of two DFTs in the frequency domain corresponds to a circular correlation in the time/spatial domain. Because this was previously unaccounted for, prior CF designs are not truly optimal, as their optimization criteria do not accurately quantify their optimization intention. In this paper, we introduce new zero-aliasing constraints that completely eliminate this aliasing problem by ensuring that the optimization criterion for a given CF corresponds to a linear correlation rather than a circular correlation. This means that previous CF designs can be significantly improved by this reformulation. We demonstrate the benefits of this new CF design approach with several important CFs. We present experimental results on diverse data sets and present solutions to the computational challenges associated with computing these CFs. Code for the CFs described in this paper and their respective zero-aliasing versions is available at http://vishnu.boddeti.net/projects/correlation-filters.html △ Less

Submitted 19 November, 2014; v1 submitted 9 November, 2014; originally announced November 2014.

Comments: 14 pages, to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)

Showing 1–7 of 7 results for author: Boddeti, V N