-
FACTS: First Amplify Correlations and Then Slice to Discover Bias
Authors:
Sriram Yenamandra,
Pratik Ramesh,
Viraj Prabhu,
Judy Hoffman
Abstract:
Computer vision datasets frequently contain spurious correlations between task-relevant labels and (easy to learn) latent task-irrelevant attributes (e.g. context). Models trained on such datasets learn "shortcuts" and underperform on bias-conflicting slices of data where the correlation does not hold. In this work, we study the problem of identifying such slices to inform downstream bias mitigation strategies. We propose First Amplify Correlations and Then Slice to Discover Bias (FACTS), wherein we first amplify correlations to fit a simple bias-aligned hypothesis via strongly regularized empirical risk minimization. Next, we perform correlation-aware slicing via mixture modeling in bias-aligned feature space to discover underperforming data slices that capture distinct correlations. Despite its simplicity, our method considerably improves over prior work (by as much as 35% precision@10) in correlation bias identification across a range of diverse evaluation settings. Our code is available at: https://github.com/yvsriram/FACTS.
Submitted 29 September, 2023;
originally announced September 2023.
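The FACTS abstract reports its gain in precision@10. As a minimal sketch of how such a metric is commonly computed (the function name, arguments, and sample identifiers are illustrative assumptions, not taken from the FACTS code base), one counts how many of the top-k ranked items are actually relevant:

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k ranked items that are relevant.

    ranked_ids: items ordered from most to least likely to belong to a slice
    (hypothetical input; the paper's own ranking procedure is not shown here).
    relevant_ids: ground-truth members of that slice.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = list(ranked_ids)[:k]
    relevant = set(relevant_ids)
    hits = sum(1 for item in top_k if item in relevant)
    # Dividing by k (not by len(top_k)) penalizes rankings shorter than k.
    return hits / k


if __name__ == "__main__":
    ranked = ["img7", "img2", "img9", "img4"]
    truth = {"img2", "img4", "img5"}
    print(precision_at_k(ranked, truth, k=4))
```

Here two of the four top-ranked items are in the ground-truth set, so the call evaluates to 2/4.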
-
ZipIt! Merging Models from Different Tasks without Training
Authors:
George Stoica,
Daniel Bolya,
Jakob Bjorner,
Pratik Ramesh,
Taylor Hearn,
Judy Hoffman
Abstract:
Typical deep visual recognition models are capable of performing the one task they were trained on. In this paper, we tackle the extremely difficult problem of combining distinct models with different initializations, each solving a separate task, into one multi-task model without any additional training. Prior work in model merging permutes one model to the space of the other then averages them together. While this works for models trained on the same task, we find that this fails to account for the differences in models trained on disjoint tasks. Thus, we introduce "ZipIt!", a general method for merging two arbitrary models of the same architecture that incorporates two simple strategies. First, in order to account for features that aren't shared between models, we expand the model merging problem to allow for merging features within each model by defining a general "zip" operation. Second, we add support for partially zipping the models up until a specified layer, naturally creating a multi-head model. We find that these two changes combined account for 20-60% improvement over prior work, making it more feasible to merge models trained on disjoint tasks without retraining.
Submitted 12 March, 2024; v1 submitted 4 May, 2023;
originally announced May 2023.
-
GATSBI: Generative Adversarial Training for Simulation-Based Inference
Authors:
Poornima Ramesh,
Jan-Matthis Lueckmann,
Jan Boelts,
Álvaro Tejero-Cantero,
David S. Greenberg,
Pedro J. Gonçalves,
Jakob H. Macke
Abstract:
Simulation-based inference (SBI) refers to statistical inference on stochastic models for which we can generate samples, but not compute likelihoods. Like SBI algorithms, generative adversarial networks (GANs) do not require explicit likelihoods. We study the relationship between SBI and GANs, and introduce GATSBI, an adversarial approach to SBI. GATSBI reformulates the variational objective in an adversarial setting to learn implicit posterior distributions. Inference with GATSBI is amortised across observations, works in high-dimensional posterior spaces and supports implicit priors. We evaluate GATSBI on two SBI benchmark problems and on two high-dimensional simulators. On a model for wave propagation on the surface of a shallow water body, we show that GATSBI can return well-calibrated posterior estimates even in high dimensions. On a model of camera optics, it infers a high-dimensional posterior given an implicit prior, and performs better than a state-of-the-art SBI approach. We also show how GATSBI can be extended to perform sequential posterior estimation to focus on individual observations. Overall, GATSBI opens up opportunities for leveraging advances in GANs to perform Bayesian inference on high-dimensional simulation-based models.
Submitted 12 March, 2022;
originally announced March 2022.
-
One to rule them all: Towards Joint Indic Language Hate Speech Detection
Authors:
Mehar Bhatia,
Tenzin Singhay Bhotia,
Akshat Agarwal,
Prakash Ramesh,
Shubham Gupta,
Kumar Shridhar,
Felix Laumann,
Ayushman Dash
Abstract:
This paper is a contribution to the Hate Speech and Offensive Content Identification in Indo-European Languages (HASOC) 2021 shared task. Social media today is a hotbed of toxic and hateful conversations in various languages. Recent news reports have shown that current models struggle to automatically identify hate posted in minority languages. Therefore, efficiently curbing hate speech is a critical challenge and problem of interest. We present a multilingual architecture using state-of-the-art transformer language models to jointly learn hate and offensive speech detection across three languages, namely English, Hindi, and Marathi. On the provided testing corpora, we achieve Macro F1 scores of 0.7996, 0.7748, 0.8651 for sub-task 1A and 0.6268, 0.5603 during the fine-grained classification of sub-task 1B. These results show the efficacy of exploiting a multilingual training scheme.
Submitted 28 September, 2021;
originally announced September 2021.
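The HASOC results above are reported as Macro F1. As a rough sketch of that metric (the label names and sample predictions below are made up for illustration and do not come from the shared-task data), the per-class F1 scores are computed separately and then averaged with equal weight:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical binary labels, not the official HASOC label set.
    truth = ["hate", "hate", "none", "none"]
    preds = ["hate", "none", "none", "none"]
    print(round(macro_f1(truth, preds), 4))
```

Because every class contributes equally to the average, a model that ignores a rare class is penalized more heavily than it would be under plain accuracy.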
-
Long-Term Memory Networks for Question Answering
Authors:
Fenglong Ma,
Radha Chitta,
Saurabh Kataria,
Jing Zhou,
Palghat Ramesh,
Tong Sun,
Jing Gao
Abstract:
Question answering is an important and difficult task in the natural language processing domain, because many basic natural language processing tasks can be cast into a question answering task. Several deep neural network architectures have been developed recently, which employ memory and inference components to memorize and reason over text information, and generate answers to questions. However, a major drawback of many such models is that they are capable of only generating single-word answers. In addition, they require a large amount of training data to generate accurate answers. In this paper, we introduce the Long-Term Memory Network (LTMN), which incorporates both an external memory module and a Long Short-Term Memory (LSTM) module to comprehend the input data and generate multi-word answers. The LTMN model can be trained end-to-end using back-propagation and requires minimal supervision. We test our model on two synthetic data sets (based on Facebook's bAbI data set) and the real-world Stanford question answering data set, and show that it can achieve state-of-the-art performance.
Submitted 6 July, 2017;
originally announced July 2017.
-
Deep Multimodal Representation Learning from Temporal Data
Authors:
Xitong Yang,
Palghat Ramesh,
Radha Chitta,
Sriganesh Madhvanath,
Edgar A. Bernal,
Jiebo Luo
Abstract:
In recent years, Deep Learning has been successfully applied to multimodal learning problems, with the aim of learning useful joint representations in data fusion applications. When the available modalities consist of time series data such as video, audio and sensor signals, it becomes imperative to consider their temporal structure during the fusion process. In this paper, we propose the Correlational Recurrent Neural Network (CorrRNN), a novel temporal fusion model for fusing multiple input modalities that are inherently temporal in nature. Key features of our proposed model include: (i) simultaneous learning of the joint representation and temporal dependencies between modalities, (ii) use of multiple loss terms in the objective function, including a maximum correlation loss term to enhance learning of cross-modal information, and (iii) the use of an attention model to dynamically adjust the contribution of different input modalities to the joint representation. We validate our model via experimentation on two different tasks: video- and sensor-based activity classification, and audio-visual speech recognition. We empirically analyze the contributions of different components of the proposed CorrRNN model, and demonstrate its robustness, effectiveness and state-of-the-art performance on multiple datasets.
Submitted 11 April, 2017;
originally announced April 2017.
-
Invisibility System Using Image Processing and Optical Camouflage Technology
Authors:
Vasireddy Srikanth,
Pillem Ramesh
Abstract:
Invisible persons appear only in fiction, but invisibility has been shown to be possible in the real world. This paper describes the creation of invisibility with the help of technologies such as optical camouflage, image-based rendering, and retro-reflective projection. The object that needs to be made transparent or invisible is painted or covered with retro-reflective material. A projector then projects the background image onto it, making the masking object virtually transparent. Capturing the background image requires a video camera, which sits behind the person wearing the cloak. The video from the camera must be in a digital format so that it can be sent to a computer for image processing using image-based rendering techniques. There are some useful applications for this simple but astonishing technology.
Submitted 8 February, 2014;
originally announced April 2014.