-
Large Deviation Analysis of Score-based Hypothesis Testing
Authors:
Enmao Diao,
Taposh Banerjee,
Vahid Tarokh
Abstract:
Score-based statistical models play an important role in modern machine learning, statistics, and signal processing. For hypothesis testing, a score-based hypothesis test is proposed in \cite{wu2022score}. We analyze the performance of this score-based hypothesis testing procedure and derive upper bounds on the probabilities of its Type I and II errors. We prove that the exponents of our error bounds are asymptotically (in the number of samples) tight for the case of simple null and alternative hypotheses. We calculate these error exponents explicitly in specific cases and provide numerical studies for various other scenarios of interest.
Submitted 3 February, 2024; v1 submitted 27 January, 2024;
originally announced January 2024.
-
Robust Quickest Change Detection in Non-Stationary Processes
Authors:
Yingze Hou,
Yousef Oleyaeimotlagh,
Rahul Mishra,
Hoda Bidkhori,
Taposh Banerjee
Abstract:
Optimal algorithms are developed for robust detection of changes in non-stationary processes. These are processes in which the distribution of the data after change varies with time. The decision-maker does not have access to precise information on the post-change distribution. It is shown that if the post-change non-stationary family has a distribution that is least favorable in a well-defined sense, then the algorithms designed using the least favorable distributions are robust and optimal. Non-stationary processes are encountered in public health monitoring and space and military applications. The robust algorithms are applied to real and simulated data to show their effectiveness.
Submitted 14 October, 2023;
originally announced October 2023.
-
Harnessing The Collective Wisdom: Fusion Learning Using Decision Sequences From Diverse Sources
Authors:
Trambak Banerjee,
Bowen Gang,
Jianliang He
Abstract:
Learning from the collective wisdom of crowds enhances the transparency of scientific findings by incorporating diverse perspectives into the decision-making process. Synthesizing such collective wisdom is related to the statistical notion of fusion learning from multiple data sources or studies. However, fusing inferences from diverse sources is challenging since cross-source heterogeneity and potential data-sharing complicate statistical inference. Moreover, studies may rely on disparate designs, employ widely different modeling techniques for inferences, and prevailing data privacy norms may forbid sharing even summary statistics across the studies for an overall analysis. In this paper, we propose an Integrative Ranking and Thresholding (IRT) framework for fusion learning in multiple testing. IRT operates under the setting where from each study a triplet is available: the vector of binary accept-reject decisions on the tested hypotheses, the study-specific False Discovery Rate (FDR) level, and the hypotheses tested by the study. Under this setting, IRT constructs an aggregated, nonparametric, and discriminatory measure of evidence against each null hypothesis, which facilitates ranking the hypotheses in the order of their likelihood of being rejected. We show that IRT guarantees an overall FDR control under arbitrary dependence between the evidence measures as long as the studies control their respective FDR at the desired levels. Furthermore, IRT synthesizes inferences from diverse studies irrespective of the underlying multiple testing algorithms employed by them. While the proofs of our theoretical statements are elementary, IRT is extremely flexible, and a comprehensive numerical study demonstrates that it is a powerful framework for pooling inferences.
Submitted 21 August, 2023;
originally announced August 2023.
-
Empirical Bayes Estimation with Side Information: A Nonparametric Integrative Tweedie Approach
Authors:
Jiajun Luo,
Trambak Banerjee,
Gourab Mukherjee,
Wenguang Sun
Abstract:
We investigate the problem of compound estimation of normal means while accounting for the presence of side information. Leveraging the empirical Bayes framework, we develop a nonparametric integrative Tweedie (NIT) approach that incorporates structural knowledge encoded in multivariate auxiliary data to enhance the precision of compound estimation. Our approach employs convex optimization tools to estimate the gradient of the log-density directly, enabling the incorporation of structural constraints. We conduct theoretical analyses of the asymptotic risk of NIT and establish the rate at which NIT converges to the oracle estimator. As the dimension of the auxiliary data increases, we accurately quantify the improvements in estimation risk and the associated deterioration in convergence rate. The numerical performance of NIT is illustrated through the analysis of both simulated and real data, demonstrating its superiority over existing methods.
Submitted 10 August, 2023;
originally announced August 2023.
-
Large-Scale Multiple Testing of Composite Null Hypotheses Under Heteroskedasticity
Authors:
Bowen Gang,
Trambak Banerjee
Abstract:
Heteroskedasticity poses several methodological challenges in designing valid and powerful procedures for simultaneous testing of composite null hypotheses. In particular, the conventional practice of standardizing or re-scaling heteroskedastic test statistics in this setting may severely affect the power of the underlying multiple testing procedure. Additionally, when the inferential parameter of interest is correlated with the variance of the test statistic, methods that ignore this dependence may fail to control the type I error at the desired level. We propose a new Heteroskedasticity Adjusted Multiple Testing (HAMT) procedure that avoids data reduction by standardization, and directly incorporates the side information from the variances into the testing procedure. Our approach relies on an improved nonparametric empirical Bayes deconvolution estimator that offers a practical strategy for capturing the dependence between the inferential parameter of interest and the variance of the test statistic. We develop theory to show that HAMT is asymptotically valid and optimal for FDR control. Simulation results demonstrate that HAMT outperforms existing procedures with substantial power gain across many settings at the same FDR level. The method is illustrated on an application involving the detection of engaged users on a mobile game app.
Submitted 12 June, 2023;
originally announced June 2023.
-
Robust Quickest Change Detection for Unnormalized Models
Authors:
Suya Wu,
Enmao Diao,
Taposh Banerjee,
Jie Ding,
Vahid Tarokh
Abstract:
Detecting an abrupt and persistent change in the underlying distribution of online data streams is an important problem in many applications. This paper proposes a new robust score-based algorithm called RSCUSUM, which can be applied to unnormalized models and addresses the issue of unknown post-change distributions. RSCUSUM replaces the Kullback-Leibler divergence with the Fisher divergence between pre- and post-change distributions for computational efficiency in unnormalized statistical models and introduces a notion of the "least favorable" distribution for robust change detection. The algorithm and its theoretical analysis are demonstrated through simulation studies.
Submitted 8 June, 2023;
originally announced June 2023.
-
Bootstrapped Edge Count Tests for Nonparametric Two-Sample Inference Under Heterogeneity
Authors:
Trambak Banerjee,
Bhaswar B. Bhattacharya,
Gourab Mukherjee
Abstract:
Nonparametric two-sample testing is a classical problem in inferential statistics. While modern two-sample tests, such as the edge count test and its variants, can handle multivariate and non-Euclidean data, contemporary gargantuan datasets often exhibit heterogeneity due to the presence of latent subpopulations. Direct application of these tests, without regulating for such heterogeneity, may lead to incorrect statistical decisions. We develop a new nonparametric testing procedure that accurately detects differences between the two samples in the presence of unknown heterogeneity in the data generation process. Our framework handles this latent heterogeneity through a composite null that entertains the possibility that the two samples arise from a mixture distribution with identical component distributions but with possibly different mixing weights. In this regime, we study the asymptotic behavior of the weighted edge count test statistic and show that it can be effectively re-calibrated to detect arbitrary deviations from the composite null. For practical implementation, we propose a Bootstrapped Weighted Edge Count test, which involves a bootstrap-based calibration procedure that can be easily implemented across a wide range of heterogeneous regimes. A comprehensive simulation study and an application to detecting aberrant user behaviors in online games demonstrate the excellent non-asymptotic performance of the proposed test.
Submitted 26 April, 2023;
originally announced April 2023.
-
Quickest Change Detection in Statistically Periodic Processes with Unknown Post-Change Distribution
Authors:
Yousef Oleyaeimotlagh,
Taposh Banerjee,
Ahmad Taha,
Eugene John
Abstract:
Algorithms are developed for the quickest detection of a change in statistically periodic processes. These are processes in which the statistical properties are nonstationary but repeat after a fixed time interval. It is assumed that the pre-change law is known to the decision maker but the post-change law is unknown. In this framework, three families of problems are studied: robust quickest change detection, joint quickest change detection and classification, and multislot quickest change detection. In the multislot problem, the exact slot within a period where a change may occur is unknown. Algorithms are proposed for each problem, and either exact optimality or asymptotic optimality in the low false alarm regime is proved for each of them. The developed algorithms are then used for anomaly detection in traffic data and arrhythmia detection and identification in electrocardiogram (ECG) data. The effectiveness of the algorithms is also demonstrated on simulated data.
Submitted 5 March, 2023;
originally announced March 2023.
-
Quickest Change Detection for Unnormalized Statistical Models
Authors:
Suya Wu,
Enmao Diao,
Taposh Banerjee,
Jie Ding,
Vahid Tarokh
Abstract:
Classical quickest change detection algorithms require modeling pre-change and post-change distributions. Such an approach may not be feasible for various machine learning models because of the complexity of computing the explicit distributions. Additionally, these methods may suffer from a lack of robustness to model mismatch and noise. This paper develops a new variant of the classical Cumulative Sum (CUSUM) algorithm for quickest change detection. This variant is based on Fisher divergence and the Hyvärinen score and is called the Score-based CUSUM (SCUSUM) algorithm. The SCUSUM algorithm enables change detection for unnormalized statistical models, i.e., models for which the probability density function contains an unknown normalization constant. The asymptotic optimality of the proposed algorithm is investigated by deriving expressions for average detection delay and the mean running time to a false alarm. Numerical results are provided to demonstrate the performance of the proposed algorithm.
Submitted 1 February, 2023;
originally announced February 2023.
-
Do financial regulators act in the public's interest? A Bayesian latent class estimation framework for assessing regulatory responses to banking crises
Authors:
Padma Sharma,
Trambak Banerjee
Abstract:
When banks fail amidst financial crises, the public criticizes regulators for bailing out or liquidating specific banks, especially the ones that gain attention due to their size or dominance. A comprehensive assessment of regulators, however, requires examining all their decisions, and not just specific ones, against the regulator's dual objective of preserving financial stability while discouraging moral hazard. In this article, we develop a Bayesian latent class estimation framework to assess regulators on these competing objectives and evaluate their decisions against resolution rules recommended by theoretical studies of bank behavior designed to contain moral hazard incentives. The proposed estimation framework addresses the unobserved heterogeneity underlying regulators' decisions in resolving failed banks and provides a disciplined statistical approach for inferring whether they acted in the public interest. Our results reveal that during the crises of the 1980s, the U.S. banking regulator's resolution decisions were consistent with recommended decision rules, while the U.S. savings and loans (S&L) regulator, which ultimately faced insolvency in 1989 at a cost of $132 billion to the taxpayer, had deviated from such recommendations. Timely interventions based on this evaluation could have redressed the S&L regulator's decision structure and prevented losses to taxpayers.
Submitted 8 August, 2022;
originally announced August 2022.
-
Socioeconomic disparities and COVID-19: the causal connections
Authors:
Tannista Banerjee,
Ayan Paul,
Vishak Srikanth,
Inga Strümke
Abstract:
The analysis of causation is a challenging task that can be approached in various ways. With the increasing use of machine learning based models in computational socioeconomics, explaining these models while taking causal connections into account is a necessity. In this work, we advocate the use of an explanatory framework from cooperative game theory augmented with $do$ calculus, namely causal Shapley values. Using causal Shapley values, we analyze socioeconomic disparities that have a causal link to the spread of COVID-19 in the USA. We study several phases of the disease spread to show how the causal connections change over time. We perform a causal analysis using random effects models and discuss the correspondence between the two methods to verify our results. We show the distinct advantages that non-linear machine learning models have over linear models when performing a multivariate analysis, especially since the machine learning models can map out non-linear correlations in the data. In addition, the causal Shapley values allow for including the causal structure in the variable importance computed for the machine learning model.
Submitted 18 January, 2022;
originally announced January 2022.
-
A Nearest-Neighbor Based Nonparametric Test for Viral Remodeling in Heterogeneous Single-Cell Proteomic Data
Authors:
Trambak Banerjee,
Bhaswar B. Bhattacharya,
Gourab Mukherjee
Abstract:
An important problem in contemporary immunology studies based on single-cell protein expression data is to determine whether cellular expressions are remodeled post infection by a pathogen. One natural approach for detecting such changes is to use non-parametric two-sample statistical tests. However, in single-cell studies, direct application of these tests is often inadequate because single-cell level expression data from uninfected populations often contains attributes of several latent sub-populations with highly heterogeneous characteristics. As a result, viruses often infect these different sub-populations at different rates, in which case the traditional nonparametric two-sample tests for checking similarity in distributions are no longer conservative. We propose a new nonparametric method for Testing Remodeling Under Heterogeneity (TRUH) that can accurately detect changes in the infected samples compared to possibly heterogeneous uninfected samples. Our testing framework is based on composite nulls and is designed to allow the null model to encompass the possibility that the infected samples, though unaltered by the virus, might be dominantly arising from under-represented sub-populations in the baseline data. The TRUH statistic, which uses nearest neighbor projections of the infected samples into the baseline uninfected population, is calibrated using a novel bootstrap algorithm. We demonstrate the non-asymptotic performance of the test via simulation experiments and derive the large sample limit of the test statistic, which provides theoretical support towards consistent asymptotic calibration of the test. We use the TRUH statistic for studying remodeling in tonsillar T cells under different types of HIV infection and find that, unlike traditional tests, TRUH-based statistical inference conforms to the biologically validated immunological theories on HIV infection.
Submitted 24 June, 2020; v1 submitted 5 March, 2020;
originally announced March 2020.
-
Nonparametric Empirical Bayes Estimation on Heterogeneous Data
Authors:
Trambak Banerjee,
Luella J. Fu,
Gareth M. James,
Gourab Mukherjee,
Wenguang Sun
Abstract:
The simultaneous estimation of many parameters based on data collected from corresponding studies is a key research problem that has received renewed attention in the high-dimensional setting. Many practical situations involve heterogeneous data where heterogeneity is captured by a nuisance parameter. Effectively pooling information across samples while correctly accounting for heterogeneity presents a significant challenge in large-scale estimation problems. We address this issue by introducing the "Nonparametric Empirical Bayes Structural Tweedie" (NEST) estimator, which efficiently estimates the unknown effect sizes and properly adjusts for heterogeneity via a generalized version of Tweedie's formula. For the normal means problem, NEST simultaneously handles the two main selection biases introduced by heterogeneity: one, the selection bias in the mean, which cannot be effectively corrected without also correcting for, two, selection bias in the variance. We develop theory to show that NEST is asymptotically as good as the optimal Bayes rule that uniquely minimizes a weighted squared error loss. In our simulation studies, NEST outperforms competing methods, with substantial efficiency gains in many settings. The proposed method is demonstrated on estimating the batting averages of baseball players and Sharpe ratios of mutual fund returns. Extensions to other members of the two-parameter exponential family are discussed.
Submitted 14 August, 2023; v1 submitted 28 February, 2020;
originally announced February 2020.
-
A Generalizable Method for Automated Quality Control of Functional Neuroimaging Datasets
Authors:
Matthew Kollada,
Qingzhu Gao,
Monika S Mellem,
Tathagata Banerjee,
William J Martin
Abstract:
Over the last twenty-five years, advances in the collection and analysis of fMRI data have enabled new insights into the brain basis of human health and disease. Individual behavioral variation can now be visualized at a neural level as patterns of connectivity among brain regions. Functional brain imaging is enhancing our understanding of clinical psychiatric disorders by revealing ties between regional and network abnormalities and psychiatric symptoms. Initial success in this arena has recently motivated collection of larger datasets, which are needed to leverage fMRI to generate brain-based biomarkers to support development of precision medicines. Despite methodological advances and enhanced computational power, evaluating the quality of fMRI scans remains a critical step in the analytical framework. Before analysis can be performed, expert reviewers visually inspect raw scans and preprocessed derivatives to determine viability of the data. This Quality Control (QC) process is labor intensive, and the inability to automate at large scale has proven to be a limiting factor in clinical neuroscience fMRI research. We present a novel method for automating the QC of fMRI scans. We train machine learning classifiers using features derived from brain MR images to predict the "quality" of those images, based on the ground truth of an expert's opinion. We emphasize the importance of these classifiers' ability to generalize their predictions across data from different studies. To address this, we propose a novel approach entitled "FMRI preprocessing Log mining for Automated, Generalizable Quality Control" (FLAG-QC), in which features derived from mining runtime logs are used to train the classifier. We show that classifiers trained on FLAG-QC features perform much better (AUC=0.79) than those trained on previously proposed feature sets (AUC=0.56) when testing their ability to generalize across studies.
Submitted 20 December, 2019;
originally announced December 2019.
-
Scrambled Translation Problem: A Problem of Denoising UNMT
Authors:
Tamali Banerjee,
Rudra Murthy V,
Pushpak Bhattacharyya
Abstract:
In this paper, we identify an interesting kind of error in the output of Unsupervised Neural Machine Translation (UNMT) systems like \textit{Undreamt}. We refer to this error type as the \textit{Scrambled Translation problem}. We observe that UNMT models which use \textit{word shuffle} noise (as in the case of Undreamt) can generate correct words, but fail to stitch them together to form phrases. As a result, words of the translated sentence look \textit{scrambled}, resulting in decreased BLEU. We hypothesise that the reason behind the \textit{scrambled translation problem} is the 'shuffling noise' which is introduced in every input sentence as a denoising strategy. To test our hypothesis, we experiment by retraining UNMT models with a simple \textit{retraining} strategy. We stop the training of the Denoising UNMT model after a pre-decided number of iterations and resume the training for the remaining iterations, whose number is also pre-decided, using the original sentence as input without adding any noise. Our proposed solution achieves significant performance improvement over UNMT models that train conventionally. We demonstrate these performance gains on four language pairs, \textit{viz.}, English-French, English-German, English-Spanish, Hindi-Punjabi. Our qualitative and quantitative analysis shows that the retraining strategy helps achieve better alignment as observed by attention heatmap and better phrasal translation, leading to statistically significant improvement in BLEU scores.
Submitted 17 June, 2021; v1 submitted 30 October, 2019;
originally announced November 2019.
-
A General Framework for Empirical Bayes Estimation in Discrete Linear Exponential Family
Authors:
Trambak Banerjee,
Qiang Liu,
Gourab Mukherjee,
Wenguang Sun
Abstract:
We develop a Nonparametric Empirical Bayes (NEB) framework for compound estimation in the discrete linear exponential family, which includes a wide class of discrete distributions frequently arising from modern big data applications. We propose to directly estimate the Bayes shrinkage factor in the generalized Robbins' formula via solving a scalable convex program, which is carefully developed based on an RKHS representation of Stein's discrepancy measure. The new NEB estimation framework is flexible for incorporating various structural constraints into the data-driven rule, and provides a unified approach to compound estimation with both regular and scaled squared error losses. We develop theory to show that the class of NEB estimators enjoys strong asymptotic properties. Comprehensive simulation studies as well as analyses of real data examples are carried out to demonstrate the superiority of the NEB estimator over competing methods.
Submitted 20 October, 2019;
originally announced October 2019.
-
Toward Sensor-based Sleep Monitoring with Electrodermal Activity Measures
Authors:
William Romine,
Tanvi Banerjee,
Garrett Goodman
Abstract:
We use self-report and electrodermal activity (EDA) wearable sensor data from 77 nights of sleep from six participants to test the efficacy of EDA data for sleep monitoring. We used factor analysis to find latent factors in the EDA data, and causal model search to find the most probable graphical model accounting for self-reported sleep efficiency (SE), sleep quality (SQ), and the latent EDA factors. Structural equation modeling was used to confirm fit of the extracted graph. Based on the generated graph, logistic regression and naive Bayes models were used to test the efficacy of the EDA data in predicting SE and SQ. Six EDA features extracted from the total signal over a night's sleep could be explained by two latent factors, EDA Magnitude and EDA Storms. EDA Magnitude performed as a strong predictor for SE to aid detection of substantial changes in time asleep. The performance of EDA Magnitude and SE in classifying SQ showed promise for wearable sleep monitoring applications. However, our data suggest that obtaining a more accurate sensor-based measure of SE will be necessary before smaller changes in SQ can be detected from EDA sensor data alone.
Submitted 31 January, 2019;
originally announced January 2019.
-
Adaptive Sparse Estimation with Side Information
Authors:
Trambak Banerjee,
Gourab Mukherjee,
Wenguang Sun
Abstract:
The article considers the problem of estimating a high-dimensional sparse parameter in the presence of side information that encodes the sparsity structure. We develop a general framework that involves first using an auxiliary sequence to capture the side information, and then incorporating the auxiliary sequence in inference to reduce the estimation risk. The proposed method, which carries out adaptive SURE-thresholding using side information (ASUS), is shown to have robust performance and enjoy optimality properties. We develop new theories to characterize regimes in which ASUS far outperforms competitive shrinkage estimators, and establish precise conditions under which ASUS is asymptotically optimal. Simulation studies are conducted to show that ASUS substantially improves the performance of existing methods in many settings. The methodology is applied for analysis of data from single cell virology studies and microarray time course experiments.
Submitted 17 October, 2019; v1 submitted 28 November, 2018;
originally announced November 2018.
-
Sequential Detection of Regime Changes in Neural Data
Authors:
Taposh Banerjee,
Stephen Allsop,
Kay M. Tye,
Demba Ba,
Vahid Tarokh
Abstract:
The problem of detecting changes in firing patterns in neural data is studied. The problem is formulated as a quickest change detection problem. Important algorithms from the literature are reviewed. A new algorithmic technique is discussed to detect deviations from learned baseline behavior. The algorithms studied can be applied to both spike and local field potential data. The algorithms are applied to mice spike data to verify the presence of behavioral learning.
Submitted 2 September, 2018;
originally announced September 2018.
-
Cyclostationary Statistical Models and Algorithms for Anomaly Detection Using Multi-Modal Data
Authors:
Taposh Banerjee,
Gene Whipps,
Prudhvi Gurram,
Vahid Tarokh
Abstract:
A framework is proposed to detect anomalies in multi-modal data. A deep neural network-based object detector is employed to extract counts of objects and sub-events from the data. A cyclostationary model is proposed to model regular patterns of behavior in the count sequences. The anomaly detection problem is formulated as a problem of detecting deviations from learned cyclostationary behavior. Sequential algorithms are proposed to detect anomalies using the proposed model. The proposed algorithms are shown to be asymptotically efficient in a well-defined sense. The developed algorithms are applied to multi-modal data consisting of CCTV imagery and social media posts to detect a 5K run in New York City.
Submitted 2 July, 2018;
originally announced July 2018.
-
Sequential Event Detection Using Multimodal Data in Nonstationary Environments
Authors:
Taposh Banerjee,
Gene Whipps,
Prudhvi Gurram,
Vahid Tarokh
Abstract:
The problem of sequential detection of anomalies in multimodal data is considered. The objective is to observe physical sensor data from CCTV cameras, and social media data from Twitter and Instagram to detect anomalous behaviors or events. Data from each modality is transformed to discrete time count data by using an artificial neural network to obtain counts of objects in CCTV images and by counting the number of tweets or Instagram posts in a geographical area. The anomaly detection problem is then formulated as a problem of quickest detection of changes in count statistics. The quickest detection problem is then solved using the framework of partially observable Markov decision processes (POMDP), and structural results on the optimal policy are obtained. The resulting optimal policy is then applied to real multimodal data collected from New York City around a 5K race to detect the race. The count data both before and after the change is found to be nonstationary in nature. The proposed mathematical approach to this problem provides a framework for event detection in such nonstationary environments and across multiple data modalities.
Submitted 23 March, 2018;
originally announced March 2018.
-
Early hospital mortality prediction using vital signals
Authors:
Reza Sadeghi,
Tanvi Banerjee,
William Romine
Abstract:
Early hospital mortality prediction is critical as intensivists strive to make efficient medical decisions about the severely ill patients staying in intensive care units. As a result, various methods have been developed to address this problem based on clinical records. However, some of the laboratory test results are time-consuming and need to be processed. In this paper, we propose a novel method to predict mortality using features extracted from the heart signals of patients within the first hour of ICU admission. In order to predict the risk, quantitative features have been computed based on the heart rate signals of ICU patients. Each signal is described in terms of 12 statistical and signal-based features. The extracted features are fed into eight classifiers: decision tree, linear discriminant, logistic regression, support vector machine (SVM), random forest, boosted trees, Gaussian SVM, and K-nearest neighbors (K-NN). To derive insight into the performance of the proposed method, several experiments have been conducted using the well-known clinical dataset named Medical Information Mart for Intensive Care III (MIMIC-III). The experimental results demonstrate the capability of the proposed method in terms of precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC). The decision tree classifier satisfies both accuracy and interpretability better than the other classifiers, producing an F1-score and AUC equal to 0.91 and 0.93, respectively. This indicates that heart rate signals can be used for predicting mortality in patients in the ICU, achieving a comparable performance with existing predictions that rely on high dimensional features from clinical records which need to be processed and may contain missing information.
Submitted 9 February, 2019; v1 submitted 17 March, 2018;
originally announced March 2018.
-
Wavelet Shrinkage and Thresholding based Robust Classification for Brain Computer Interface
Authors:
Taposh Banerjee,
John Choi,
Bijan Pesaran,
Demba Ba,
Vahid Tarokh
Abstract:
A macaque monkey is trained to perform two different kinds of tasks, memory aided and visually aided. In each task, the monkey saccades to eight possible target locations. A classifier is proposed for direction decoding and task decoding based on local field potentials (LFP) collected from the prefrontal cortex. The LFP time-series data is modeled in a nonparametric regression framework, as a function corrupted by Gaussian noise. It is shown that if the function belongs to Besov bodies, then the proposed wavelet shrinkage and thresholding based classifier is robust and consistent. The classifier is then applied to the LFP data to achieve high decoding performance. The proposed classifier is also quite general and can be applied to the classification of other types of time-series data as well, not necessarily brain data.
Submitted 27 November, 2017; v1 submitted 27 October, 2017;
originally announced October 2017.
-
Classification of Local Field Potentials using Gaussian Sequence Model
Authors:
Taposh Banerjee,
John Choi,
Bijan Pesaran,
Demba Ba,
Vahid Tarokh
Abstract:
A problem of classification of local field potentials (LFPs), recorded from the prefrontal cortex of a macaque monkey, is considered. An adult macaque monkey is trained to perform a memory-based saccade. The objective is to decode the eye movement goals from the LFP collected during a memory period. The LFP classification problem is modeled as that of classification of smooth functions embedded in Gaussian noise. It is then argued that using minimax function estimators as features would lead to consistent LFP classifiers. The theory of Gaussian sequence models allows us to represent minimax estimators as finite dimensional objects. The LFP classifier resulting from this mathematical endeavor is a spectrum based technique, where Fourier series coefficients of the LFP data, followed by appropriate shrinkage and thresholding, are used as features in a linear discriminant classifier. The classifier is then applied to the LFP data to achieve high decoding accuracy. The function classification approach taken in the paper also provides a systematic justification for using Fourier series, with shrinkage and thresholding, as features for the problem, as opposed to using the power spectrum. It also suggests that phase information is crucial to the decision making.
Submitted 27 November, 2017; v1 submitted 4 October, 2017;
originally announced October 2017.
-
Kiefer Wolfowitz Algorithm is Asymptotically Optimal for a Class of Non-Stationary Bandit Problems
Authors:
Rahul Singh,
Taposh Banerjee
Abstract:
We consider the problem of designing an allocation rule or an "online learning algorithm" for a class of bandit problems in which the set of control actions available at each time $s$ is a convex, compact subset of $\mathbb{R}^d$. Upon choosing an action $x$ at time $s$, the algorithm obtains a noisy value of the unknown and time-varying function $f_s$ evaluated at $x$. The "regret" of an algorithm is the gap between its expected reward, and the reward earned by a strategy which has the knowledge of the function $f_s$ at each time $s$ and hence chooses the action $x_s$ that maximizes $f_s$.
For this non-stationary bandit problem set-up, we consider two variants of the Kiefer Wolfowitz (KW) algorithm: i) KW with fixed step-size $β$, and ii) KW with sliding window of length $L$. We show that if the number of times that the function $f_s$ varies during time $T$ is $o(T)$, and if the learning rates of the proposed algorithms are chosen "optimally", then the regret of the proposed algorithms is $o(T)$, and hence the algorithms are asymptotically efficient.
Submitted 8 March, 2017; v1 submitted 26 February, 2017;
originally announced February 2017.
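For readers unfamiliar with the first variant named in the abstract above, here is a minimal one-dimensional sketch of a Kiefer Wolfowitz style update with a fixed step size. It is an illustration under stated assumptions, not the authors' algorithm: the name `kw_fixed_step`, the perturbation width `c`, the callable `noisy_f`, and the projection onto an interval `[lo, hi]` are all hypothetical choices made for this example, and the sliding-window variant is not shown.

```python
def kw_fixed_step(noisy_f, x0, beta, c, steps, lo, hi):
    """One-dimensional Kiefer Wolfowitz style ascent with a fixed step size.

    noisy_f : callable returning a (possibly noisy) evaluation of the reward
    x0      : initial action
    beta    : fixed step size
    c       : half-width of the finite-difference perturbation (assumed here)
    lo, hi  : the action set is taken to be the interval [lo, hi]
    """
    x = x0
    for _ in range(steps):
        # Two-sided finite-difference estimate of the gradient at x.
        # Note: the probes x + c and x - c may fall slightly outside [lo, hi].
        grad = (noisy_f(x + c) - noisy_f(x - c)) / (2.0 * c)
        # Gradient ascent step followed by projection back onto [lo, hi].
        x = min(hi, max(lo, x + beta * grad))
    return x


if __name__ == "__main__":
    # Noise-free concave reward with its maximum at x = 1 (illustrative only).
    reward = lambda x: -(x - 1.0) ** 2
    print(kw_fixed_step(reward, x0=0.0, beta=0.1, c=0.05, steps=100, lo=-5.0, hi=5.0))
```

A fixed step size keeps the iterate responsive when the reward function changes over time, which is the non-stationary setting the abstract describes; a decaying step size would eventually stop tracking such changes.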
-
Feature Screening in Large Scale Cluster Analysis
Authors:
Trambak Banerjee,
Gourab Mukherjee,
Peter Radchenko
Abstract:
We propose a novel methodology for feature screening in clustering massive datasets, in which both the number of features and the number of observations can potentially be very large. Taking advantage of a fusion penalization based convex clustering criterion, we propose a very fast screening procedure that efficiently discards non-informative features by first computing a clustering score corresponding to the clustering tree constructed for each feature, and then thresholding the resulting values. We provide theoretical support for our approach by establishing uniform non-asymptotic bounds on the clustering scores of the "noise" features. These bounds imply perfect screening of non-informative features with high probability and are derived via careful analysis of the empirical processes corresponding to the clustering trees that are constructed for each of the features by the associated clustering procedure. Through extensive simulation experiments we compare the performance of our proposed method with other screening approaches, popularly used in cluster analysis, and obtain encouraging results. We demonstrate empirically that our method is applicable to cluster analysis of big datasets arising in single-cell gene expression studies.
Submitted 4 October, 2017; v1 submitted 11 January, 2017;
originally announced January 2017.
-
Quickest Change Detection Approach to Optimal Control in Markov Decision Processes with Model Changes
Authors:
Taposh Banerjee,
Miao Liu,
Jonathan P. How
Abstract:
Optimal control in non-stationary Markov decision processes (MDP) is a challenging problem. The aim in such a control problem is to maximize the long-term discounted reward when the transition dynamics or the reward function can change over time. When prior knowledge of the change statistics is available, the standard Bayesian approach to this problem is to reformulate it as a partially observable MDP (POMDP) and solve it using approximate POMDP solvers, which are typically computationally demanding. In this paper, the problem is analyzed through the viewpoint of quickest change detection (QCD), a set of tools for detecting a change in the distribution of a sequence of random variables. Current methods applying QCD to such problems only passively detect changes by following prescribed policies, without optimizing the choice of actions for long-term performance. We demonstrate that ignoring the reward-detection trade-off can cause a significant loss in long-term rewards, and propose a two-threshold switching strategy to solve the issue. A non-Bayesian problem formulation is also proposed for scenarios where a Bayesian formulation cannot be defined. The performance of the proposed two-threshold strategy is examined through numerical analysis on a non-stationary MDP task, and the strategy outperforms the state-of-the-art QCD methods in both Bayesian and non-Bayesian settings.
Submitted 1 March, 2017; v1 submitted 21 September, 2016;
originally announced September 2016.
-
Non-parametric Quickest Change Detection for Large Scale Random Matrices
Authors:
Taposh Banerjee,
Hamed Firouzi,
Alfred O. Hero III
Abstract:
The problem of quickest detection of a change in the distribution of an $n\times p$ random matrix based on a sequence of observations having a single unknown change point is considered. The forms of the pre- and post-change distributions of the rows of the matrices are assumed to belong to the family of elliptically contoured densities with sparse dispersion matrices but are otherwise unknown. We propose a non-parametric stopping rule that is based on a novel summary statistic related to k-nearest neighbor correlation between columns of each observed random matrix. In the large scale regime of $p\rightarrow \infty$ and $n$ fixed we show that, among all functions of the proposed summary statistic, the proposed stopping rule is asymptotically optimal under a minimax quickest change detection (QCD) model.
Submitted 19 June, 2015;
originally announced June 2015.
-
Quickest Change Detection
Authors:
Venugopal V. Veeravalli,
Taposh Banerjee
Abstract:
The problem of detecting changes in the statistical properties of a stochastic system and time series arises in various branches of science and engineering. It has a wide spectrum of important applications ranging from machine monitoring to biomedical signal processing. In all of these applications the observations being monitored undergo a change in distribution in response to a change or anomaly in the environment, and the goal is to detect the change as quickly as possible, subject to false alarm constraints. In this chapter, two formulations of the quickest change detection problem, Bayesian and minimax, are introduced, and optimal or asymptotically optimal solutions to these formulations are discussed. Then some generalizations and extensions of the quickest change detection problem are described. The chapter is concluded with a discussion of applications and open issues.
Submitted 19 October, 2012;
originally announced October 2012.
-
Generalized Analysis of a Distributed Energy Efficient Algorithm for Change Detection
Authors:
Taposh Banerjee,
Vinod Sharma
Abstract:
An energy efficient distributed Change Detection scheme based on Page's CUSUM algorithm was presented in \cite{icassp}. In this paper we consider a nonparametric version of this algorithm. In the algorithm in \cite{icassp}, each sensor runs CUSUM and transmits only when the CUSUM is above some threshold. The transmissions from the sensors are fused at the physical layer. The channel is modeled as a Multiple Access Channel (MAC) corrupted with noise. The fusion center performs another CUSUM to detect the change. In this paper, we generalize the algorithm to also include nonparametric CUSUM and provide a unified analysis.
Submitted 10 August, 2009;
originally announced August 2009.
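As a rough illustration of the per-sensor building block mentioned in the abstract above, the following sketch implements the textbook recursion of Page's CUSUM and reports the first time the statistic reaches a threshold. It is not the distributed scheme of the paper: the function names, the Gaussian mean-shift model, and the parameter values are assumptions made for this example, and the physical-layer fusion and the second CUSUM at the fusion center are not modeled.

```python
def cusum_alarm_time(samples, llr, threshold):
    """Return the 1-based index at which the CUSUM statistic first reaches
    `threshold`, or None if it never does.

    llr : callable giving the log-likelihood ratio log(f1(x) / f0(x))
          of a single observation x (post-change vs. pre-change).
    """
    w = 0.0
    for n, x in enumerate(samples, start=1):
        # Page's recursion: accumulate evidence, reset to zero when negative.
        w = max(0.0, w + llr(x))
        if w >= threshold:
            return n
    return None


def gaussian_llr(x, mu0=0.0, mu1=1.0, sigma=1.0):
    """Log-likelihood ratio for N(mu1, sigma^2) against N(mu0, sigma^2).
    The mean-shift model is an assumption made for illustration."""
    return (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma ** 2


if __name__ == "__main__":
    # Observations sit at 0 before the change and at 2 afterwards.
    data = [0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 2.0]
    print(cusum_alarm_time(data, gaussian_llr, threshold=3.0))
```

In the scheme the abstract describes, a sensor would transmit only while its statistic is above a threshold; the sketch stops at the first crossing to keep the example short.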
-
Optimal factorial designs for cDNA microarray experiments
Authors:
Tathagata Banerjee,
Rahul Mukerjee
Abstract:
We consider cDNA microarray experiments when the cell populations have a factorial structure, and investigate the problem of their optimal design under a baseline parametrization where the objects of interest differ from those under the more common orthogonal parametrization. First, analytical results are given for the $2\times 2$ factorial. Since practical applications often involve a more complex factorial structure, we next explore general factorials and obtain a collection of optimal designs in the saturated, that is, most economic, case. This, in turn, is seen to yield an approach for finding optimal or efficient designs in the practically more important nearly saturated cases. Thereafter, the findings are extended to the more intricate situation where the underlying model incorporates dye-coloring effects, and the role of dye-swapping is critically examined.
Submitted 27 March, 2008;
originally announced March 2008.
-
A Conversation with Shoutir Kishore Chatterjee
Authors:
Tathagata Banerjee,
Rahul Mukerjee
Abstract:
Shoutir Kishore Chatterjee was born in Ranchi, a small hill station in India, on November 6, 1934. He received his B.Sc. in statistics from the Presidency College, Calcutta, in 1954, and M.Sc. and Ph.D. degrees in statistics from the University of Calcutta in 1956 and 1962, respectively. He was appointed a lecturer in the Department of Statistics, University of Calcutta, in 1960 and was a member of its faculty until his retirement as a professor in 1997. Indeed, from the 1970s he steered the teaching and research activities of the department for the next three decades. Professor Chatterjee was the National Lecturer in Statistics (1985--1986) of the University Grants Commission, India, the President of the Section of Statistics of the Indian Science Congress (1989) and an Emeritus Scientist (1997--2000) of the Council of Scientific and Industrial Research, India. Professor Chatterjee, affectionately known as SKC to his students and admirers, is a truly exceptional person who embodies the spirit of eternal India. He firmly believes that ``fulfillment in man's life does not come from amassing a lot of money, after the threshold of what is required for achieving a decent living is crossed. It does not come even from peer recognition for intellectual achievements. Of course, one has to work and toil a lot before one realizes these facts.''
Submitted 25 October, 2007;
originally announced October 2007.