-
Spurious Correlations and Where to Find Them
Authors:
Gautam Sreekumar,
Vishnu Naresh Boddeti
Abstract:
Spurious correlations occur when a model learns unreliable features from the data and are a well-known drawback of data-driven learning. Although there are several algorithms proposed to mitigate it, we are yet to jointly derive the indicators of spurious correlations. As a result, the solutions built upon standalone hypotheses fail to beat simple ERM baselines. We collect some of the commonly stu…
▽ More
Spurious correlations occur when a model learns unreliable features from the data and are a well-known drawback of data-driven learning. Although there are several algorithms proposed to mitigate it, we are yet to jointly derive the indicators of spurious correlations. As a result, the solutions built upon standalone hypotheses fail to beat simple ERM baselines. We collect some of the commonly studied hypotheses behind the occurrence of spurious correlations and investigate their influence on standard ERM baselines using synthetic datasets generated from causal graphs. Subsequently, we observe patterns connecting these hypotheses and model design choices.
△ Less
Submitted 21 August, 2023;
originally announced August 2023.
-
On the Global Optima of Kernelized Adversarial Representation Learning
Authors:
Bashir Sadeghi,
Runyi Yu,
Vishnu Naresh Boddeti
Abstract:
Adversarial representation learning is a promising paradigm for obtaining data representations that are invariant to certain sensitive attributes while retaining the information necessary for predicting target attributes. Existing approaches solve this problem through iterative adversarial minimax optimization and lack theoretical guarantees. In this paper, we first study the "linear" form of this…
▽ More
Adversarial representation learning is a promising paradigm for obtaining data representations that are invariant to certain sensitive attributes while retaining the information necessary for predicting target attributes. Existing approaches solve this problem through iterative adversarial minimax optimization and lack theoretical guarantees. In this paper, we first study the "linear" form of this problem i.e., the setting where all the players are linear functions. We show that the resulting optimization problem is both non-convex and non-differentiable. We obtain an exact closed-form expression for its global optima through spectral learning and provide performance guarantees in terms of analytical bounds on the achievable utility and invariance. We then extend this solution and analysis to non-linear functions through kernel representation. Numerical experiments on UCI, Extended Yale B and CIFAR-100 datasets indicate that, (a) practically, our solution is ideal for "imparting" provable invariance to any biased pre-trained data representation, and (b) empirically, the trade-off between utility and invariance provided by our solution is comparable to iterative minimax optimization of existing deep neural network based approaches. Code is available at https://github.com/human-analysis/Kernel-ARL
△ Less
Submitted 25 December, 2019; v1 submitted 16 October, 2019;
originally announced October 2019.
-
Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach
Authors:
Proteek Chandan Roy,
Vishnu Naresh Boddeti
Abstract:
Image recognition systems have demonstrated tremendous progress over the past few decades thanks, in part, to our ability of learning compact and robust representations of images. As we witness the wide spread adoption of these systems, it is imperative to consider the problem of unintended leakage of information from an image representation, which might compromise the privacy of the data owner. T…
▽ More
Image recognition systems have demonstrated tremendous progress over the past few decades thanks, in part, to our ability of learning compact and robust representations of images. As we witness the wide spread adoption of these systems, it is imperative to consider the problem of unintended leakage of information from an image representation, which might compromise the privacy of the data owner. This paper investigates the problem of learning an image representation that minimizes such leakage of user information. We formulate the problem as an adversarial non-zero sum game of finding a good embedding function with two competing goals: to retain as much task dependent discriminative image information as possible, while simultaneously minimizing the amount of information, as measured by entropy, about other sensitive attributes of the user. We analyze the stability and convergence dynamics of the proposed formulation using tools from non-linear systems theory and compare to that of the corresponding adversarial zero-sum game formulation that optimizes likelihood as a measure of information content. Numerical experiments on UCI, Extended Yale B, CIFAR-10 and CIFAR-100 datasets indicate that our proposed approach is able to learn image representations that exhibit high task performance while mitigating leakage of predefined sensitive information.
△ Less
Submitted 10 April, 2019;
originally announced April 2019.
-
On the Intrinsic Dimensionality of Image Representations
Authors:
Sixue Gong,
Vishnu Naresh Boddeti,
Anil K. Jain
Abstract:
This paper addresses the following questions pertaining to the intrinsic dimensionality of any given image representation: (i) estimate its intrinsic dimensionality, (ii) develop a deep neural network based non-linear map**, dubbed DeepMDS, that transforms the ambient representation to the minimal intrinsic space, and (iii) validate the veracity of the map** through image matching in the intri…
▽ More
This paper addresses the following questions pertaining to the intrinsic dimensionality of any given image representation: (i) estimate its intrinsic dimensionality, (ii) develop a deep neural network based non-linear map**, dubbed DeepMDS, that transforms the ambient representation to the minimal intrinsic space, and (iii) validate the veracity of the map** through image matching in the intrinsic space. Experiments on benchmark image datasets (LFW, IJB-C and ImageNet-100) reveal that the intrinsic dimensionality of deep neural network representations is significantly lower than the dimensionality of the ambient features. For instance, SphereFace's 512-dim face representation and ResNet's 512-dim image representation have an intrinsic dimensionality of 16 and 19 respectively. Further, the DeepMDS map** is able to obtain a representation of significantly lower dimensionality while maintaining discriminative ability to a large extent, 59.75% TAR @ 0.1% FAR in 16-dim vs 71.26% TAR in 512-dim on IJB-C and a Top-1 accuracy of 77.0% at 19-dim vs 83.4% at 512-dim on ImageNet-100.
△ Less
Submitted 10 April, 2019; v1 submitted 26 March, 2018;
originally announced March 2018.
-
Efficient K-Shot Learning with Regularized Deep Networks
Authors:
Donghyun Yoo,
Haoqi Fan,
Vishnu Naresh Boddeti,
Kris M. Kitani
Abstract:
Feature representations from pre-trained deep neural networks have been known to exhibit excellent generalization and utility across a variety of related tasks. Fine-tuning is by far the simplest and most widely used approach that seeks to exploit and adapt these feature representations to novel tasks with limited data. Despite the effectiveness of fine-tuning, itis often sub-optimal and requires…
▽ More
Feature representations from pre-trained deep neural networks have been known to exhibit excellent generalization and utility across a variety of related tasks. Fine-tuning is by far the simplest and most widely used approach that seeks to exploit and adapt these feature representations to novel tasks with limited data. Despite the effectiveness of fine-tuning, itis often sub-optimal and requires very careful optimization to prevent severe over-fitting to small datasets. The problem of sub-optimality and over-fitting, is due in part to the large number of parameters used in a typical deep convolutional neural network. To address these problems, we propose a simple yet effective regularization method for fine-tuning pre-trained deep networks for the task of k-shot learning. To prevent overfitting, our key strategy is to cluster the model parameters while ensuring intra-cluster similarity and inter-cluster diversity of the parameters, effectively regularizing the dimensionality of the parameter search space. In particular, we identify groups of neurons within each layer of a deep network that shares similar activation patterns. When the network is to be fine-tuned for a classification task using only k examples, we propagate a single gradient to all of the neuron parameters that belong to the same group. The grou** of neurons is non-trivial as neuron activations depend on the distribution of the input data. To efficiently search for optimal grou**s conditioned on the input data, we propose a reinforcement learning search strategy using recurrent networks to learn the optimal group assignments for each network layer. Experimental results show that our method can be easily applied to several popular convolutional neural networks and improve upon other state-of-the-art fine-tuning based k-shot learning strategies by more than10%
△ Less
Submitted 6 October, 2017;
originally announced October 2017.
-
On the Capacity of Face Representation
Authors:
Sixue Gong,
Vishnu Naresh Boddeti,
Anil K. Jain
Abstract:
In this paper we address the following question, given a face representation, how many identities can it resolve? In other words, what is the capacity of the face representation? A scientific basis for estimating the capacity of a given face representation will not only benefit the evaluation and comparison of different representation methods, but will also establish an upper bound on the scalabil…
▽ More
In this paper we address the following question, given a face representation, how many identities can it resolve? In other words, what is the capacity of the face representation? A scientific basis for estimating the capacity of a given face representation will not only benefit the evaluation and comparison of different representation methods, but will also establish an upper bound on the scalability of an automatic face recognition system. We cast the face capacity problem in terms of packing bounds on a low-dimensional manifold embedded within a deep representation space. By explicitly accounting for the manifold structure of the representation as well two different sources of representational noise: epistemic (model) uncertainty and aleatoric (data) variability, our approach is able to estimate the capacity of a given face representation. To demonstrate the efficacy of our approach, we estimate the capacity of two deep neural network based face representations, namely 128-dimensional FaceNet and 512-dimensional SphereFace. Numerical experiments on unconstrained faces (IJB-C) provides a capacity upper bound of $2.7\times10^4$ for FaceNet and $8.4\times10^4$ for SphereFace representation at a false acceptance rate (FAR) of 1%. As expected, capacity reduces drastically at lower FARs. The capacity at FAR of 0.1% and 0.001% is $2.2\times10^3$ and $1.6\times10^{1}$, respectively for FaceNet and $3.6\times10^3$ and $6.0\times10^0$, respectively for SphereFace.
△ Less
Submitted 11 April, 2019; v1 submitted 29 September, 2017;
originally announced September 2017.
-
Zero-Aliasing Correlation Filters for Object Recognition
Authors:
Joseph A. Fernandez,
Vishnu Naresh Boddeti,
Andres Rodriguez,
B. V. K. Vijaya Kumar
Abstract:
Correlation filters (CFs) are a class of classifiers that are attractive for object localization and tracking applications. Traditionally, CFs have been designed in the frequency domain using the discrete Fourier transform (DFT), where correlation is efficiently implemented. However, existing CF designs do not account for the fact that the multiplication of two DFTs in the frequency domain corresp…
▽ More
Correlation filters (CFs) are a class of classifiers that are attractive for object localization and tracking applications. Traditionally, CFs have been designed in the frequency domain using the discrete Fourier transform (DFT), where correlation is efficiently implemented. However, existing CF designs do not account for the fact that the multiplication of two DFTs in the frequency domain corresponds to a circular correlation in the time/spatial domain. Because this was previously unaccounted for, prior CF designs are not truly optimal, as their optimization criteria do not accurately quantify their optimization intention. In this paper, we introduce new zero-aliasing constraints that completely eliminate this aliasing problem by ensuring that the optimization criterion for a given CF corresponds to a linear correlation rather than a circular correlation. This means that previous CF designs can be significantly improved by this reformulation. We demonstrate the benefits of this new CF design approach with several important CFs. We present experimental results on diverse data sets and present solutions to the computational challenges associated with computing these CFs. Code for the CFs described in this paper and their respective zero-aliasing versions is available at http://vishnu.boddeti.net/projects/correlation-filters.html
△ Less
Submitted 19 November, 2014; v1 submitted 9 November, 2014;
originally announced November 2014.