-
Sparsistent filtering of comovement networks from high-dimensional data
Authors:
Arnab Chakrabarti,
Anindya S. Chakrabarti
Abstract:
Network filtering is an important form of dimension reduction to isolate the core constituents of large and interconnected complex systems. We introduce a new technique to filter large dimensional networks arising out of dynamical behavior of the constituent nodes, exploiting their spectral properties. As opposed to the well known network filters that rely on preserving key topological properties…
▽ More
Network filtering is an important form of dimension reduction to isolate the core constituents of large and interconnected complex systems. We introduce a new technique to filter large dimensional networks arising out of dynamical behavior of the constituent nodes, exploiting their spectral properties. As opposed to the well known network filters that rely on preserving key topological properties of the realized network, our method treats the spectrum as the fundamental object and preserves spectral properties. Applying asymptotic theory for high dimensional data for the filter, we show that it can be tuned to interpolate between zero filtering to maximal filtering that induces sparsity and consistency while having the least spectral distance from a linear shrinkage estimator. We apply our proposed filter to covariance networks constructed from financial data, to extract the key subnetwork embedded in the full sample network.
△ Less
Submitted 22 January, 2021;
originally announced January 2021.
-
A fast learning algorithm for One-Class Slab Support Vector Machines
Authors:
Bagesh Kumar,
Ayush Sinha,
Sourin Chakrabarti,
Prof. O. P. Vyas
Abstract:
One Class Slab Support Vector Machines (OCSSVM) have turned out to be better in terms of accuracy in certain classes of classification problems than the traditional SVMs and One Class SVMs or even other One class classifiers. This paper proposes fast training method for One Class Slab SVMs using an updated Sequential Minimal Optimization (SMO) which divides the multi variable optimization problem…
▽ More
One Class Slab Support Vector Machines (OCSSVM) have turned out to be better in terms of accuracy in certain classes of classification problems than the traditional SVMs and One Class SVMs or even other One class classifiers. This paper proposes fast training method for One Class Slab SVMs using an updated Sequential Minimal Optimization (SMO) which divides the multi variable optimization problem to smaller sub problems of size two that can then be solved analytically. The results indicate that this training method scales better to large sets of training data than other Quadratic Programming (QP) solvers.
△ Less
Submitted 3 May, 2021; v1 submitted 6 November, 2020;
originally announced November 2020.
-
Advances in Bayesian Probabilistic Modeling for Industrial Applications
Authors:
Sayan Ghosh,
Piyush Pandita,
Steven Atkinson,
Waad Subber,
Yiming Zhang,
Natarajan Chennimalai Kumar,
Suryarghya Chakrabarti,
Li** Wang
Abstract:
Industrial applications frequently pose a notorious challenge for state-of-the-art methods in the contexts of optimization, designing experiments and modeling unknown physical response. This problem is aggravated by limited availability of clean data, uncertainty in available physics-based models and additional logistic and computational expense associated with experiments. In such a scenario, Bay…
▽ More
Industrial applications frequently pose a notorious challenge for state-of-the-art methods in the contexts of optimization, designing experiments and modeling unknown physical response. This problem is aggravated by limited availability of clean data, uncertainty in available physics-based models and additional logistic and computational expense associated with experiments. In such a scenario, Bayesian methods have played an impactful role in alleviating the aforementioned obstacles by quantifying uncertainty of different types under limited resources. These methods, usually deployed as a framework, allows decision makers to make informed choices under uncertainty while being able to incorporate information on the the fly, usually in the form of data, from multiple sources while being consistent with the physical intuition about the problem. This is a major advantage that Bayesian methods bring to fruition especially in the industrial context. This paper is a compendium of the Bayesian modeling methodology that is being consistently developed at GE Research. The methodology, called GE's Bayesian Hybrid Modeling (GEBHM), is a probabilistic modeling method, based on the Kennedy and O'Hagan framework, that has been continuously scaled-up and industrialized over several years. In this work, we explain the various advancements in GEBHM's methods and demonstrate their impact on several challenging industrial problems.
△ Less
Submitted 26 March, 2020;
originally announced March 2020.
-
Differentially Private Link Prediction With Protected Connections
Authors:
Abir De,
Soumen Chakrabarti
Abstract:
Link prediction (LP) algorithms propose to each node a ranked list of nodes that are currently non-neighbors, as the most likely candidates for future linkage. Owing to increasing concerns about privacy, users (nodes) may prefer to keep some of their connections protected or private. Motivated by this observation, our goal is to design a differentially private LP algorithm, which trades off betwee…
▽ More
Link prediction (LP) algorithms propose to each node a ranked list of nodes that are currently non-neighbors, as the most likely candidates for future linkage. Owing to increasing concerns about privacy, users (nodes) may prefer to keep some of their connections protected or private. Motivated by this observation, our goal is to design a differentially private LP algorithm, which trades off between privacy of the protected node-pairs and the link prediction accuracy. More specifically, we first propose a form of differential privacy on graphs, which models the privacy loss only of those node-pairs which are marked as protected. Next, we develop DPLP , a learning to rank algorithm, which applies a monotone transform to base scores from a non-private LP system, and then adds noise. DPLP is trained with a privacy induced ranking loss, which optimizes the ranking utility for a given maximum allowed level of privacy leakage of the protected node-pairs. Under a recently-introduced latent node embedding model, we present a formal trade-off between privacy and LP utility. Extensive experiments with several real-life graphs and several LP heuristics show that DPLP can trade off between privacy and predictive performance more effectively than several alternatives.
△ Less
Submitted 14 December, 2020; v1 submitted 20 July, 2019;
originally announced August 2019.
-
Topic Sensitive Attention on Generic Corpora Corrects Sense Bias in Pretrained Embeddings
Authors:
Vihari Piratla,
Sunita Sarawagi,
Soumen Chakrabarti
Abstract:
Given a small corpus $\mathcal D_T$ pertaining to a limited set of focused topics, our goal is to train embeddings that accurately capture the sense of words in the topic in spite of the limited size of $\mathcal D_T$. These embeddings may be used in various tasks involving $\mathcal D_T$. A popular strategy in limited data settings is to adapt pre-trained embeddings $\mathcal E$ trained on a larg…
▽ More
Given a small corpus $\mathcal D_T$ pertaining to a limited set of focused topics, our goal is to train embeddings that accurately capture the sense of words in the topic in spite of the limited size of $\mathcal D_T$. These embeddings may be used in various tasks involving $\mathcal D_T$. A popular strategy in limited data settings is to adapt pre-trained embeddings $\mathcal E$ trained on a large corpus. To correct for sense drift, fine-tuning, regularization, projection, and pivoting have been proposed recently. Among these, regularization informed by a word's corpus frequency performed well, but we improve upon it using a new regularizer based on the stability of its cooccurrence with other words. However, a thorough comparison across ten topics, spanning three tasks, with standardized settings of hyper-parameters, reveals that even the best embedding adaptation strategies provide small gains beyond well-tuned baselines, which many earlier comparisons ignored. In a bold departure from adapting pretrained embeddings, we propose using $\mathcal D_T$ to probe, attend to, and borrow fragments from any large, topic-rich source corpus (such as Wikipedia), which need not be the corpus used to pretrain embeddings. This step is made scalable and practical by suitable indexing. We reach the surprising conclusion that even limited corpus augmentation is more useful than adapting embeddings, which suggests that non-dominant sense information may be irrevocably obliterated from pretrained embeddings and cannot be salvaged by adaptation.
△ Less
Submitted 24 July, 2019; v1 submitted 5 June, 2019;
originally announced June 2019.
-
Graphon Estimation from Partially Observed Network Data
Authors:
Soumendu Sundar Mukherjee,
Sayak Chakrabarti
Abstract:
We consider estimating the edge-probability matrix of a network generated from a graphon model when the full network is not observed---only some overlap** subgraphs are. We extend the neighbourhood smoothing (NBS) algorithm of Zhang et al. (2017) to this missing-data set-up and show experimentally that, for a wide range of graphons, the extended NBS algorithm achieves significantly smaller error…
▽ More
We consider estimating the edge-probability matrix of a network generated from a graphon model when the full network is not observed---only some overlap** subgraphs are. We extend the neighbourhood smoothing (NBS) algorithm of Zhang et al. (2017) to this missing-data set-up and show experimentally that, for a wide range of graphons, the extended NBS algorithm achieves significantly smaller error rates than standard graphon estimation algorithms such as vanilla neighbourhood smoothing (NBS), universal singular value thresholding (USVT), blockmodel approximation, matrix completion, etc. We also show that the extended NBS algorithm is much more robust to missing data.
△ Less
Submitted 27 June, 2019; v1 submitted 2 June, 2019;
originally announced June 2019.
-
Multi-task Learning for Target-dependent Sentiment Classification
Authors:
Divam Gupta,
Kushagra Singh,
Soumen Chakrabarti,
Tanmoy Chakraborty
Abstract:
Detecting and aggregating sentiments toward people, organizations, and events expressed in unstructured social media have become critical text mining operations. Early systems detected sentiments over whole passages, whereas more recently, target-specific sentiments have been of greater interest. In this paper, we present MTTDSC, a multi-task target-dependent sentiment classification system that i…
▽ More
Detecting and aggregating sentiments toward people, organizations, and events expressed in unstructured social media have become critical text mining operations. Early systems detected sentiments over whole passages, whereas more recently, target-specific sentiments have been of greater interest. In this paper, we present MTTDSC, a multi-task target-dependent sentiment classification system that is informed by feature representation learnt for the related auxiliary task of passage-level sentiment classification. The auxiliary task uses a gated recurrent unit (GRU) and pools GRU states, followed by an auxiliary fully-connected layer that outputs passage-level predictions. In the main task, these GRUs contribute auxiliary per-token representations over and above word embeddings. The main task has its own, separate GRUs. The auxiliary and main GRUs send their states to a different fully connected layer, trained for the main task. Extensive experiments using two auxiliary datasets and three benchmark datasets (of which one is new, introduced by us) for the main task demonstrate that MTTDSC outperforms state-of-the-art baselines. Using word-level sensitivity analysis, we present anecdotal evidence that prior systems can make incorrect target-specific predictions because they miss sentiments expressed by words independent of target.
△ Less
Submitted 7 February, 2019;
originally announced February 2019.
-
GIRNet: Interleaved Multi-Task Recurrent State Sequence Models
Authors:
Divam Gupta,
Tanmoy Chakraborty,
Soumen Chakrabarti
Abstract:
In several natural language tasks, labeled sequences are available in separate domains (say, languages), but the goal is to label sequences with mixed domain (such as code-switched text). Or, we may have available models for labeling whole passages (say, with sentiments), which we would like to exploit toward better position-specific label inference (say, target-dependent sentiment annotation). A…
▽ More
In several natural language tasks, labeled sequences are available in separate domains (say, languages), but the goal is to label sequences with mixed domain (such as code-switched text). Or, we may have available models for labeling whole passages (say, with sentiments), which we would like to exploit toward better position-specific label inference (say, target-dependent sentiment annotation). A key characteristic shared across such tasks is that different positions in a primary instance can benefit from different `experts' trained from auxiliary data, but labeled primary instances are scarce, and labeling the best expert for each position entails unacceptable cognitive burden. We propose GITNet, a unified position-sensitive multi-task recurrent neural network (RNN) architecture for such applications. Auxiliary and primary tasks need not share training instances. Auxiliary RNNs are trained over auxiliary instances. A primary instance is also submitted to each auxiliary RNN, but their state sequences are gated and merged into a novel composite state sequence tailored to the primary inference task. Our approach is in sharp contrast to recent multi-task networks like the cross-stitch and sluice network, which do not control state transfer at such fine granularity. We demonstrate the superiority of GIRNet using three applications: sentiment classification of code-switched passages, part-of-speech tagging of code-switched text, and target position-sensitive annotation of sentiment in monolingual passages. In all cases, we establish new state-of-the-art performance beyond recent competitive baselines.
△ Less
Submitted 25 December, 2018; v1 submitted 28 November, 2018;
originally announced November 2018.
-
Generalizing Across Domains via Cross-Gradient Training
Authors:
Shiv Shankar,
Vihari Piratla,
Soumen Chakrabarti,
Siddhartha Chaudhuri,
Preethi Jyothi,
Sunita Sarawagi
Abstract:
We present CROSSGRAD, a method to use multi-domain training data to learn a classifier that generalizes to new domains. CROSSGRAD does not need an adaptation phase via labeled or unlabeled data, or domain features in the new domain. Most existing domain adaptation methods attempt to erase domain signals using techniques like domain adversarial training. In contrast, CROSSGRAD is free to use domain…
▽ More
We present CROSSGRAD, a method to use multi-domain training data to learn a classifier that generalizes to new domains. CROSSGRAD does not need an adaptation phase via labeled or unlabeled data, or domain features in the new domain. Most existing domain adaptation methods attempt to erase domain signals using techniques like domain adversarial training. In contrast, CROSSGRAD is free to use domain signals for predicting labels, if it can prevent overfitting on training domains. We conceptualize the task in a Bayesian setting, in which a sampling step is implemented as data augmentation, based on domain-guided perturbations of input instances. CROSSGRAD parallelly trains a label and a domain classifier on examples perturbed by loss gradients of each other's objectives. This enables us to directly perturb inputs, without separating and re-mixing domain signals while making various distributional assumptions. Empirical evaluation on three different applications where this setting is natural establishes that (1) domain-guided perturbation provides consistently better generalization to unseen domains, compared to generic instance perturbation methods, and that (2) data augmentation is a more stable and accurate method than domain adversarial training.
△ Less
Submitted 1 May, 2018; v1 submitted 28 April, 2018;
originally announced April 2018.