Search | arXiv e-print repository

Continual Learning with Dependency Preserving Hypernetworks

Authors: Dupati Srikar Chandra, Sakshi Varshney, P. K. Srijith, Sunil Gupta

Abstract: Humans learn continually throughout their lifespan by accumulating diverse knowledge and fine-tuning it for future tasks. When presented with a similar goal, neural networks suffer from catastrophic forgetting if data distributions across sequential tasks are not stationary over the course of learning. An effective approach to address such continual learning (CL) problems is to use hypernetworks w… ▽ More Humans learn continually throughout their lifespan by accumulating diverse knowledge and fine-tuning it for future tasks. When presented with a similar goal, neural networks suffer from catastrophic forgetting if data distributions across sequential tasks are not stationary over the course of learning. An effective approach to address such continual learning (CL) problems is to use hypernetworks which generate task dependent weights for a target network. However, the continual learning performance of existing hypernetwork based approaches are affected by the assumption of independence of the weights across the layers in order to maintain parameter efficiency. To address this limitation, we propose a novel approach that uses a dependency preserving hypernetwork to generate weights for the target network while also maintaining the parameter efficiency. We propose to use recurrent neural network (RNN) based hypernetwork that can generate layer weights efficiently while allowing for dependencies across them. In addition, we propose novel regularisation and network growth techniques for the RNN based hypernetwork to further improve the continual learning performance. To demonstrate the effectiveness of the proposed methods, we conducted experiments on several image classification continual learning tasks and settings. We found that the proposed methods based on the RNN hypernetworks outperformed the baselines in all these CL settings and tasks. △ Less

Submitted 16 September, 2022; originally announced September 2022.

Comments: Paper got accepted in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

arXiv:2207.13771 [pdf, other]

CompText: Visualizing, Comparing & Understanding Text Corpus

Authors: Suvi Varshney, Divjeet Singh Jas

Abstract: A common practice in Natural Language Processing (NLP) is to visualize the text corpus without reading through the entire literature, still gras** the central idea and key points described. For a long time, researchers focused on extracting topics from the text and visualizing them based on their relative significance in the corpus. However, recently, researchers started coming up with more comp… ▽ More A common practice in Natural Language Processing (NLP) is to visualize the text corpus without reading through the entire literature, still gras** the central idea and key points described. For a long time, researchers focused on extracting topics from the text and visualizing them based on their relative significance in the corpus. However, recently, researchers started coming up with more complex systems that not only expose the topics of the corpus but also word closely related to the topic to give users a holistic view. These detailed visualizations spawned research on comparing text corpora based on their visualization. Topics are often compared to idealize the difference between corpora. However, to capture greater semantics from different corpora, researchers have started to compare texts based on the sentiment of the topics related to the text. Comparing the words carrying the most weightage, we can get an idea about the important topics for corpus. There are multiple existing texts comparing methods present that compare topics rather than sentiments but we feel that focusing on sentiment-carrying words would better compare the two corpora. Since only sentiments can explain the real feeling of the text and not just the topic, topics without sentiments are just nouns. We aim to differentiate the corpus with a focus on sentiment, as opposed to comparing all the words appearing in the two corpora. The rationale behind this is, that the two corpora do not many have identical words for side-by-side comparison, so comparing the sentiment words gives us an idea of how the corpora are appealing to the emotions of the reader. We can argue that the entropy or the unexpectedness and divergence of topics should also be of importance and help us to identify key pivot points and the importance of certain topics in the corpus alongside relative sentiment. △ Less

Submitted 27 July, 2022; originally announced July 2022.

arXiv:2108.06670 [pdf, other]

Deep Geospatial Interpolation Networks

Authors: Sumit Kumar Varshney, Jeetu Kumar, Aditya Tiwari, Rishabh Singh, Venkata M. V. Gunturi, Narayanan C. Krishnan

Abstract: Interpolation in Spatio-temporal data has applications in various domains such as climate, transportation, and mining. Spatio-Temporal interpolation is highly challenging due to the complex spatial and temporal relationships. However, traditional techniques such as Kriging suffer from high running time and poor performance on data that exhibit high variance across space and time dimensions. To thi… ▽ More Interpolation in Spatio-temporal data has applications in various domains such as climate, transportation, and mining. Spatio-Temporal interpolation is highly challenging due to the complex spatial and temporal relationships. However, traditional techniques such as Kriging suffer from high running time and poor performance on data that exhibit high variance across space and time dimensions. To this end, we propose a novel deep neural network called as Deep Geospatial Interpolation Network(DGIN), which incorporates both spatial and temporal relationships and has significantly lower training time. DGIN consists of three major components: Spatial Encoder to capture the spatial dependencies, Sequential module to incorporate the temporal dynamics, and an Attention block to learn the importance of the temporal neighborhood around the gap. We evaluate DGIN on the MODIS reflectance dataset from two different regions. Our experimental results indicate that DGIN has two advantages: (a) it outperforms alternative approaches (has lower MSE with p-value < 0.01) and, (b) it has significantly low execution time than Kriging. △ Less

Submitted 15 August, 2021; originally announced August 2021.

arXiv:2103.04032 [pdf, other]

CAM-GAN: Continual Adaptation Modules for Generative Adversarial Networks

Authors: Sakshi Varshney, Vinay Kumar Verma, Srijith P K, Lawrence Carin, Piyush Rai

Abstract: We present a continual learning approach for generative adversarial networks (GANs), by designing and leveraging parameter-efficient feature map transformations. Our approach is based on learning a set of global and task-specific parameters. The global parameters are fixed across tasks whereas the task-specific parameters act as local adapters for each task, and help in efficiently obtaining task-… ▽ More We present a continual learning approach for generative adversarial networks (GANs), by designing and leveraging parameter-efficient feature map transformations. Our approach is based on learning a set of global and task-specific parameters. The global parameters are fixed across tasks whereas the task-specific parameters act as local adapters for each task, and help in efficiently obtaining task-specific feature maps. Moreover, we propose an element-wise addition of residual bias in the transformed feature space, which further helps stabilize GAN training in such settings. Our approach also leverages task similarity information based on the Fisher information matrix. Leveraging this knowledge from previous tasks significantly improves the model performance. In addition, the similarity measure also helps reduce the parameter growth in continual adaptation and helps to learn a compact model. In contrast to the recent approaches for continually-learned GANs, the proposed approach provides a memory-efficient way to perform effective continual data generation. Through extensive experiments on challenging and diverse datasets, we show that the feature-map-transformation approach outperforms state-of-the-art methods for continually-learned GANs, with substantially fewer parameters. The proposed method generates high-quality samples that can also improve the generative-replay-based continual learning for discriminative tasks. △ Less

Submitted 30 July, 2021; v1 submitted 6 March, 2021; originally announced March 2021.

Comments: Under Submission

arXiv:1908.00706 [pdf, other]

AdvGAN++ : Harnessing latent layers for adversary generation

Authors: Puneet Mangla, Surgan Jandial, Sakshi Varshney, Vineeth N Balasubramanian

Abstract: Adversarial examples are fabricated examples, indistinguishable from the original image that mislead neural networks and drastically lower their performance. Recently proposed AdvGAN, a GAN based approach, takes input image as a prior for generating adversaries to target a model. In this work, we show how latent features can serve as better priors than input images for adversary generation by prop… ▽ More Adversarial examples are fabricated examples, indistinguishable from the original image that mislead neural networks and drastically lower their performance. Recently proposed AdvGAN, a GAN based approach, takes input image as a prior for generating adversaries to target a model. In this work, we show how latent features can serve as better priors than input images for adversary generation by proposing AdvGAN++, a version of AdvGAN that achieves higher attack rates than AdvGAN and at the same time generates perceptually realistic images on MNIST and CIFAR-10 datasets. △ Less

Submitted 23 December, 2019; v1 submitted 2 August, 2019; originally announced August 2019.

Comments: Accepted at Neural Architects Workshop, ICCV 2019

arXiv:1809.03147 [pdf, other]

Is Leakage Power a Linear Function of Temperature?

Authors: Hameedah Sultan, Shashank Varshney, Smruti R Sarangi

Abstract: In this work, we present a study of the leakage power modeling techniques commonly used in the architecture community. We further provide an analysis of the error in leakage power estimation using the various modeling techniques. We strongly believe that this study will help researchers determine an appropriate leakage model to use in their work, based on the desired modeling accuracy and speed. In this work, we present a study of the leakage power modeling techniques commonly used in the architecture community. We further provide an analysis of the error in leakage power estimation using the various modeling techniques. We strongly believe that this study will help researchers determine an appropriate leakage model to use in their work, based on the desired modeling accuracy and speed. △ Less

Submitted 10 September, 2018; originally announced September 2018.

arXiv:1401.3510 [pdf]

doi 10.5121/ijnlc.2013.2604

Improving Performance Of English-Hindi Cross Language Information Retrieval Using Transliteration Of Query Terms

Authors: Saurabh Varshney, Jyoti Bajpai

Abstract: The main issue in Cross Language Information Retrieval (CLIR) is the poor performance of retrieval in terms of average precision when compared to monolingual retrieval performance. The main reasons behind poor performance of CLIR are mismatching of query terms, lexical ambiguity and un-translated query terms. The existing problems of CLIR are needed to be addressed in order to increase the perform… ▽ More The main issue in Cross Language Information Retrieval (CLIR) is the poor performance of retrieval in terms of average precision when compared to monolingual retrieval performance. The main reasons behind poor performance of CLIR are mismatching of query terms, lexical ambiguity and un-translated query terms. The existing problems of CLIR are needed to be addressed in order to increase the performance of the CLIR system. In this paper, we are putting our effort to solve the given problem by proposed an algorithm for improving the performance of English-Hindi CLIR system. We used all possible combination of Hindi translated query using transliteration of English query terms and choosing the best query among them for retrieval of documents. The experiment is performed on FIRE 2010 (Forum of Information Retrieval Evaluation) datasets. The experimental result show that the proposed approach gives better performance of English-Hindi CLIR system and also helps in overcoming existing problems and outperforms the existing English-Hindi CLIR system in terms of average precision. △ Less

Submitted 15 January, 2014; originally announced January 2014.

Comments: International Journal on Natural Language Computing (IJNLC) Vol. 2, No.6, December 2013 http://airccse.org/journal/ijnlc/index.html

Showing 1–7 of 7 results for author: Varshney, S