Showing 1–50 of 76 results for author: Bhattacharya, S

Searching in archive stat.
  1. arXiv:2406.13944  [pdf, other]

    math.ST cs.LG stat.ME stat.ML

    Generalization error of min-norm interpolators in transfer learning

    Authors: Yanke Song, Sohom Bhattacharya, Pragya Sur

    Abstract: This paper establishes the generalization error of pooled min-$\ell_2$-norm interpolation in transfer learning where data from diverse distributions are available. Min-norm interpolators emerge naturally as implicit regularized limits of modern machine learning algorithms. Previous work characterized their out-of-distribution risk when samples from the test distribution are unavailable during trai…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 53 pages, 2 figures

  2. arXiv:2308.14988  [pdf, other]

    math.ST stat.ME stat.ML

    Inferences on Mixing Probabilities and Ranking in Mixed-Membership Models

    Authors: Sohom Bhattacharya, Jianqing Fan, Jikai Hou

    Abstract: Network data is prevalent in numerous big data applications including economics and health networks where it is of prime importance to understand the latent structure of network. In this paper, we model the network using the Degree-Corrected Mixed Membership (DCMM) model. In DCMM model, for each node $i$, there exists a membership vector…

    Submitted 28 August, 2023; originally announced August 2023.

  3. arXiv:2308.09104  [pdf, other]

    stat.ML cs.LG stat.ME

    A comprehensive study of spike and slab shrinkage priors for structurally sparse Bayesian neural networks

    Authors: Sanket Jantre, Shrijita Bhattacharya, Tapabrata Maiti

    Abstract: Network complexity and computational efficiency have become increasingly significant aspects of deep learning. Sparse deep learning addresses these challenges by recovering a sparse representation of the underlying target function by reducing heavily over-parameterized deep neural networks. Specifically, deep neural architectures compressed via structured sparsity (e.g. node sparsity) provide low…

    Submitted 17 August, 2023; originally announced August 2023.

  4. arXiv:2302.05851  [pdf, other]

    math.ST stat.ME stat.ML

    Deep Neural Networks for Nonparametric Interaction Models with Diverging Dimension

    Authors: Sohom Bhattacharya, Jianqing Fan, Debarghya Mukherjee

    Abstract: Deep neural networks have achieved tremendous success due to their representation power and adaptation to low-dimensional structures. Their potential for estimating structured regression functions has been recently established in the literature. However, most of the studies require the input dimension to be fixed and consequently ignore the effect of dimension on the rate of convergence and hamper…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: 46 pages, 2 figures

  5. arXiv:2211.13478  [pdf, other]

    stat.ME

    A New Spatio-Temporal Model Exploiting Hamiltonian Equations

    Authors: Satyaki Mazumder, Sayantan Banerjee, Sourabh Bhattacharya

    Abstract: The solutions of Hamiltonian equations are known to describe the underlying phase space of the mechanical system. Hamiltonian Monte Carlo is the sole use of the properties of solutions to the Hamiltonian equations in Bayesian statistics. In this article, we propose a novel spatio-temporal model using a strategic modification of the Hamiltonian equations, incorporating appropriate stochasticity via…

    Submitted 23 November, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: An updated version, demonstrating superiority of our ideas over existing ones

  6. arXiv:2206.09233  [pdf, other]

    stat.CO math.ST

    IID Sampling from Posterior Dirichlet Process Mixtures

    Authors: Sourabh Bhattacharya

    Abstract: The influence of Dirichlet process mixture is ubiquitous in the Bayesian nonparametrics literature. But sampling from its posterior distribution remains a challenge, despite the advent of various Markov chain Monte Carlo methods. The primary challenge is the infinite-dimensional setup, and even if the infinite-dimensional random measure is integrated out, high-dimensionality and discreteness still…

    Submitted 18 June, 2022; originally announced June 2022.

  7. arXiv:2206.01446  [pdf, other]

    stat.ME

    Modified Bivariate Weibull Distribution Allowing Instantaneous and Early Failures

    Authors: Sumangal Bhattacharya, Ishapathik Das, Muralidharan Kunnummal

    Abstract: In reliability and life data analysis, the Weibull distribution is widely used to accommodate more data characteristics by changing the values of the parameters. We frequently observe many zeros or close to zero data points in reliability and life testing experiments. We call this phenomenon a nearly instantaneous failure. Many researchers modified the commonly used univariate parametric models su…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: 27 pages, 6 figures, 7 tables

  8. arXiv:2206.00794  [pdf, other]

    stat.ML cs.LG math.ST

    Sequential Bayesian Neural Subnetwork Ensembles

    Authors: Sanket Jantre, Sandeep Madireddy, Shrijita Bhattacharya, Tapabrata Maiti, Prasanna Balaprakash

    Abstract: Deep neural network ensembles that appeal to model diversity have been used successfully to improve predictive performance and model robustness in several applications. Whereas, it has recently been shown that sparse subnetworks of dense models can match the performance of their dense counterparts and increase their robustness while effectively decreasing the model complexity. However, most ensemb…

    Submitted 1 June, 2022; originally announced June 2022.

  9. arXiv:2112.07939  [pdf, other]

    stat.CO stat.ME

    IID Sampling from Doubly Intractable Distributions

    Authors: Sourabh Bhattacharya

    Abstract: Intractable posterior distributions of parameters with intractable normalizing constants depending upon the parameters are known as doubly intractable posterior distributions. The terminology itself indicates that obtaining Bayesian inference from such posteriors is doubly difficult compared to traditional intractable posteriors where the normalizing constants are tractable and admit traditional M…

    Submitted 15 December, 2021; originally announced December 2021.

  10. arXiv:2111.12664  [pdf, other]

    cs.CV stat.ML

    MIO : Mutual Information Optimization using Self-Supervised Binary Contrastive Learning

    Authors: Siladittya Manna, Umapada Pal, Saumik Bhattacharya

    Abstract: Self-supervised contrastive learning frameworks have progressed rapidly over the last few years. In this paper, we propose a novel mutual information optimization-based loss function for contrastive learning. We model our pre-training task as a binary classification problem to induce an implicit contrastive effect and predict whether a pair is positive or negative. We further improve the naïve los…

    Submitted 9 March, 2023; v1 submitted 24 November, 2021; originally announced November 2021.

  11. arXiv:2109.12633  [pdf, other]

    stat.ME stat.CO

    IID Sampling from Intractable Multimodal and Variable-Dimensional Distributions

    Authors: Sourabh Bhattacharya

    Abstract: Bhattacharya (2021b) has introduced a novel methodology for generating iid realizations from any target distribution on the Euclidean space, irrespective of dimensionality. In this article, our purpose is two-fold. We first extend the method for obtaining iid realizations from general multimodal distributions, and illustrate with a mixture of two 50-dimensional normal distributions. Then we extend…

    Submitted 15 December, 2021; v1 submitted 26 September, 2021; originally announced September 2021.

    Comments: An updated version after fixing some typos in the paper and code

  12. arXiv:2109.01548  [pdf, other]

    stat.ME math.ST

    Variational Bayes algorithm and posterior consistency of Ising model parameter estimation

    Authors: Minwoo Kim, Shrijita Bhattacharya, Tapabrata Maiti

    Abstract: Ising models originated in statistical physics and are widely used in modeling spatial data and computer vision problems. However, statistical inference of this model remains challenging due to intractable nature of the normalizing constant in the likelihood. Here, we use a pseudo-likelihood instead to study the Bayesian estimation of two-parameter, inverse temperature, and magnetization, Ising mo…

    Submitted 3 September, 2021; originally announced September 2021.

    Comments: 26 pages

  13. arXiv:2108.11000  [pdf, other]

    stat.ML cs.LG

    Layer Adaptive Node Selection in Bayesian Neural Networks: Statistical Guarantees and Implementation Details

    Authors: Sanket Jantre, Shrijita Bhattacharya, Tapabrata Maiti

    Abstract: Sparse deep neural networks have proven to be efficient for predictive model building in large-scale studies. Although several works have studied theoretical and numerical properties of sparse neural architectures, they have primarily focused on the edge selection. Sparsity through edge selection might be intuitively appealing; however, it does not necessarily reduce the structural complexity of a…

    Submitted 8 July, 2022; v1 submitted 24 August, 2021; originally announced August 2021.

  14. arXiv:2107.05956  [pdf, other]

    stat.CO stat.ME

    IID Sampling from Intractable Distributions

    Authors: Sourabh Bhattacharya

    Abstract: We propose a novel methodology for drawing iid realizations from any target distribution on the Euclidean space with arbitrary dimension. No assumption of compact support is necessary for the validity of our theory and method. Our idea is to construct an appropriate infinite sequence of concentric closed ellipsoids, represent the target distribution as an infinite mixture on the central ellipsoid…

    Submitted 15 December, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

    Comments: An updated version with some typos in the paper and code fixed. Now the iid and TMCMC results are in close agreement for the Challenger and the Salmonella examples

  15. arXiv:2107.01480  [pdf]

    stat.ME

    Assessing contribution of treatment phases through tipping point analyses via counterfactual elicitation using rank preserving structural failure time models

    Authors: Sudipta Bhattacharya, Jyotirmoy Dey

    Abstract: This article provides a novel approach to assess the importance of specific treatment phases within a treatment regimen through tipping point analyses (TPA) of a time-to-event endpoint using rank-preserving-structural-failure-time (RPSFT) modelling. In oncology clinical research, an experimental treatment is often added to the standard of care therapy in multiple treatment phases to improve patien…

    Submitted 3 July, 2021; originally announced July 2021.

    Comments: 38 pages, 6 figures, 3 tables. arXiv admin note: text overlap with arXiv:2011.09070

  16. arXiv:2106.12652  [pdf, ps, other]

    stat.CO stat.ME

    Black Box Variational Bayesian Model Averaging

    Authors: Vojtech Kejzlar, Shrijita Bhattacharya, Mookyong Son, Tapabrata Maiti

    Abstract: For many decades now, Bayesian Model Averaging (BMA) has been a popular framework to systematically account for model uncertainty that arises in situations when multiple competing models are available to describe the same or similar physical process. The implementation of this framework, however, comes with a multitude of practical challenges including posterior approximation via Markov Chain Mont…

    Submitted 28 March, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

  17. arXiv:2106.02290  [pdf, other]

    math.ST cs.IT math.PR stat.ME

    Matrix completion with data-dependent missingness probabilities

    Authors: Sohom Bhattacharya, Sourav Chatterjee

    Abstract: The problem of completing a large matrix with lots of missing entries has received widespread attention in the last couple of decades. Two popular approaches to the matrix completion problem are based on singular value thresholding and nuclear norm minimization. Most of the past works on this subject assume that there is a single number $p$ such that each entry of the matrix is available independe…

    Submitted 22 April, 2022; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: 28 pages, 9 figures. To appear in IEEE Trans. Inf. Theory

  18. arXiv:2105.08451  [pdf, other]

    stat.ME stat.CO

    Bayesian Levy-Dynamic Spatio-Temporal Process: Towards Big Data Analysis

    Authors: Sourabh Bhattacharya

    Abstract: In this era of big data, all scientific disciplines are evolving fast to cope up with the enormity of the available information. So is statistics, the queen of science. Big data are particularly relevant to spatio-temporal statistics, thanks to much-improved technology in satellite based remote sensing and Geographical Information Systems. However, none of the existing approaches seem to meet the…

    Submitted 18 May, 2021; originally announced May 2021.

    Comments: Feedback welcome

  19. arXiv:2012.09746  [pdf]

    stat.ME

    Non-parametric estimation of Expectation and Variance of event count and of incidence rate in a recurrent process -- where intensity of event-occurrence changes with the occurrence of each higher order event

    Authors: Sudipta Bhattacharya

    Abstract: In this paper, a novel non-parametric method for estimation of expectation and maximum value of the variance function is proposed for recurrent events where intensity of event occurrence changes with the occurrence of each higher order event. These kinds of recurrent events are often observed in clinical trials for cardio-vascular events and also in many social experiments involving drug addiction…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: 21 pages, 2 figures

  20. arXiv:2011.09592  [pdf, other]

    stat.ML cs.LG math.ST

    Variational Bayes Neural Network: Posterior Consistency, Classification Accuracy and Computational Challenges

    Authors: Shrijita Bhattacharya, Zihuan Liu, Tapabrata Maiti

    Abstract: Bayesian neural network models (BNN) have re-surged in recent years due to the advancement of scalable computations and its utility in solving complex prediction problems in a wide variety of applications. Despite the popularity and usefulness of BNN, the conventional Markov Chain Monte Carlo based implementation suffers from high computational cost, limiting the use of this powerful technique in…

    Submitted 18 November, 2020; originally announced November 2020.

  21. arXiv:2011.09070  [pdf]

    stat.ME stat.AP

    Assessing contribution of treatment phases through tipping point analyses using rank preserving structural failure time models

    Authors: Sudipta Bhattacharya, Jyotirmoy Dey

    Abstract: In clinical trials, an experimental treatment is sometimes added on to a standard of care or control therapy in multiple treatment phases (e.g., concomitant and maintenance phases) to improve patient outcomes. When the new regimen provides meaningful benefit over the control therapy in such cases, it proves difficult to separately assess the contribution of each phase to the overall effect observe…

    Submitted 17 November, 2020; originally announced November 2020.

    Comments: 33 pages, 6 figures and 4 tables

  22. arXiv:2010.13591  [pdf, other]

    math.OC stat.ME

    Function Optimization with Posterior Gaussian Derivative Process

    Authors: Sucharita Roy, Sourabh Bhattacharya

    Abstract: In this article, we propose and develop a novel Bayesian algorithm for optimization of functions whose first and second partial derivatives are known. The basic premise is the Gaussian process representation of the function which induces a first derivative process that is also Gaussian. The Bayesian posterior solutions of the derivative process set equal to zero, given data consisting of suitable…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: Comments welcome

  23. Quantile Regression Neural Networks: A Bayesian Approach

    Authors: Sanket R. Jantre, Shrijita Bhattacharya, Tapabrata Maiti

    Abstract: This article introduces a Bayesian neural network estimation method for quantile regression assuming an asymmetric Laplace distribution (ALD) for the response variable. It is shown that the posterior distribution for feedforward neural network quantile regression is asymptotically consistent under a misspecified ALD model. This consistency proof embeds the problem from density estimation domain an…

    Submitted 28 September, 2020; originally announced September 2020.

    Journal ref: J Stat Theory Pract 15 (3), 1-34, 2021

  24. arXiv:2009.06229  [pdf, other]

    math.PR stat.AP

    Bayesian Appraisal of Random Series Convergence with Application to Climate Change

    Authors: Sucharita Roy, Sourabh Bhattacharya

    Abstract: Roy and Bhattacharya (2020) provided Bayesian characterization of infinite series, and their most important application, namely, to the Dirichlet series characterizing the (in)famous Riemann Hypothesis, revealed insights that are not in support of the most celebrated conjecture for over 150 years. In contrast with deterministic series considered by Roy and Bhattacharya (2020), in this article we…

    Submitted 14 September, 2020; originally announced September 2020.

    Comments: Comments welcome

  25. arXiv:2008.11175  [pdf, other]

    stat.AP stat.ME

    How Ominous is the Future Global Warming Premonition?

    Authors: Debashis Chatterjee, Sourabh Bhattacharya

    Abstract: Global warming, the phenomenon of increasing global average temperature in the recent decades, is receiving wide attention due to its very significant adverse effects on climate. Whether global warming will continue even in the future, is a question that is most important to investigate. In this regard, the so-called general circulation models (GCMs) have attempted to project the future climate, a…

    Submitted 25 August, 2020; originally announced August 2020.

    Comments: Comments welcome

  26. A Fast and Calibrated Computer Model Emulator: An Empirical Bayes Approach

    Authors: Vojtech Kejzlar, Mookyong Son, Shrijita Bhattacharya, Tapabrata Maiti

    Abstract: Mathematical models implemented on a computer have become the driving force behind the acceleration of the cycle of scientific processes. This is because computer models are typically much faster and economical to run than physical experiments. In this work, we develop an empirical Bayes approach to predictions of physical quantities using a computer model, where we assume that the computer model…

    Submitted 2 July, 2021; v1 submitted 11 August, 2020; originally announced August 2020.

    Journal ref: Stat Comput 31, 49 (2021)

  27. arXiv:2008.02897  [pdf, other]

    cs.LG stat.ML

    Iterative Compression of End-to-End ASR Model using AutoML

    Authors: Abhinav Mehrotra, Łukasz Dudziak, Jinsu Yeo, Young-yoon Lee, Ravichander Vipperla, Mohamed S. Abdelfattah, Sourav Bhattacharya, Samin Ishtiaq, Alberto Gil C. P. Ramos, SangJeong Lee, Daehyun Kim, Nicholas D. Lane

    Abstract: Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interests in developing automatic model compression techniques. Past research has shown that AutoML-based Low Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selectio…

    Submitted 6 August, 2020; originally announced August 2020.

    Journal ref: INTERSPEECH 2020

  28. arXiv:2007.07847  [pdf, other]

    math.ST stat.ME

    A Bayesian Multiple Testing Paradigm for Model Selection in Inverse Regression Problems

    Authors: Debashis Chatterjee, Sourabh Bhattacharya

    Abstract: In this article, we propose a novel Bayesian multiple testing formulation for model and variable selection in inverse setups, judiciously embedding the idea of inverse reference distributions proposed by Bhattacharya (2013) in a mixture framework consisting of the competing models. We develop the theory and methods in the general context encompassing parametric and nonparametric competing models,…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: Comments welcome

  29. arXiv:2006.15786  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Statistical Foundation of Variational Bayes Neural Networks

    Authors: Shrijita Bhattacharya, Tapabrata Maiti

    Abstract: Despite the popularity of Bayesian neural networks in recent years, its use is somewhat limited in complex and big data situations due to the computational cost associated with full posterior evaluations. Variational Bayes (VB) provides a useful alternative to circumvent the computational cost and time complexity associated with the generation of samples from the true posterior using Markov Chain…

    Submitted 28 June, 2020; originally announced June 2020.

  30. arXiv:2006.07405  [pdf, other]

    cs.LG cs.DC stat.ML

    O(1) Communication for Distributed SGD through Two-Level Gradient Averaging

    Authors: Subhadeep Bhattacharya, Weikuan Yu, Fahim Tahmid Chowdhury

    Abstract: Large neural network models present a hefty communication challenge to distributed Stochastic Gradient Descent (SGD), with a communication complexity of O(n) per worker for a model of n parameters. Many sparsification and quantization techniques have been proposed to compress the gradients, some reducing the communication complexity to O(k), where k << n. In this paper, we introduce a strategy cal…

    Submitted 15 June, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

  31. arXiv:2006.06020  [pdf, ps, other]

    math.ST stat.ME

    Convergence of Pseudo-Bayes Factors in Forward and Inverse Regression Problems

    Authors: Debashis Chatterjee, Sourabh Bhattacharya

    Abstract: In the Bayesian literature on model comparison, Bayes factors play the leading role. In the classical statistical literature, model selection criteria are often devised using cross-validation ideas. Amalgamating the ideas of Bayes factor and cross-validation, Geisser and Eddy (1979) created the pseudo-Bayes factor. The usage of cross-validation inculcates several theoretical advantages, computationa…

    Submitted 10 June, 2020; originally announced June 2020.

    Comments: Comments welcome

  32. arXiv:2005.07468  [pdf, ps, other]

    stat.AP

    Hierarchical Bayesian state-space modeling of age- and sex-structured wildlife population dynamics

    Authors: Sabyasachi Mukhopadhyay, Hans-Peter Piepho, Sourabh Bhattacharya, Holly T. Dublin, Joseph O. Ogutu

    Abstract: Biodiversity is declining at alarming rates worldwide, including for large wild mammals. It is therefore imperative to develop effective population conservation and recovery strategies. Population dynamics models can provide insights into processes driving declines of particular populations of a species and their relative importance. We develop an integrated Bayesian state-space population dynamic…

    Submitted 19 December, 2021; v1 submitted 15 May, 2020; originally announced May 2020.

  33. arXiv:1912.02595  [pdf, other]

    stat.ME math.ST

    Outlier detection and a tail-adjusted boxplot based on extreme value theory

    Authors: Shrijita Bhattacharya, Jan Beirlant

    Abstract: Whether an extreme observation is an outlier or not, depends strongly on the corresponding tail behaviour of the underlying distribution. We develop an automatic, data-driven method to identify extreme tail behaviour that deviates from the intermediate and central characteristics. This allows for detecting extreme outliers or sets of extreme data that show less spread than the bulk of the data. To…

    Submitted 5 December, 2019; originally announced December 2019.

  34. arXiv:1911.02623  [pdf, ps, other]

    cs.LG stat.ML

    Map Enhanced Route Travel Time Prediction using Deep Neural Networks

    Authors: Soumi Das, Rajath Nandan Kalava, Kolli Kiran Kumar, Akhil Kandregula, Kalpam Suhaas, Sourangshu Bhattacharya, Niloy Ganguly

    Abstract: Travel time estimation is a fundamental problem in transportation science with extensive literature. The study of these techniques has intensified due to availability of many publicly available large trip datasets. Recently developed deep learning based models have improved the generality and performance and have focused on estimating times for individual sub-trajectories and aggregating them to p…

    Submitted 6 November, 2019; originally announced November 2019.

  35. arXiv:1911.00915  [pdf, other]

    stat.CO math.ST

    Estimating accuracy of the MCMC variance estimator: a central limit theorem for batch means estimators

    Authors: Saptarshi Chakraborty, Suman K. Bhattacharya, Kshitij Khare

    Abstract: The batch means estimator of the MCMC variance is a simple and effective measure of accuracy for MCMC based ergodic averages. Under various regularity conditions, the estimator has been shown to be consistent for the true variance. However, the estimator can be unstable in practice as it depends directly on the raw MCMC output. A measure of accuracy of the batch means estimator itself, ideally in…

    Submitted 3 November, 2019; originally announced November 2019.

    Comments: 28 pages, 2 figures

    MSC Class: 60J22 (Primary); 62F15 (secondary)

  36. arXiv:1810.10495  [pdf, ps, other]

    math.ST stat.ME

    Posterior Convergence of Gaussian and General Stochastic Process Regression Under Possible Misspecifications

    Authors: Debashis Chatterjee, Sourabh Bhattacharya

    Abstract: In this article, we investigate posterior convergence in nonparametric regression models where the unknown regression function is modeled by some appropriate stochastic process. In this regard, we consider two setups. The first setup is based on Gaussian processes, where the covariates are either random or non-random and the noise may be either normally or double-exponentially distributed. In the…

    Submitted 1 May, 2020; v1 submitted 24 October, 2018; originally announced October 2018.

    Comments: An updated version

  37. arXiv:1810.09909  [pdf, other]

    math.ST stat.ME

    Bayes Factor Asymptotics for Variable Selection in the Gaussian Process Framework

    Authors: Minerva Mukhopadhyay, Sourabh Bhattacharya

    Abstract: Although variable selection is one of the most popular areas of modern statistical research, much of its development has taken place in the classical paradigm compared to the Bayesian counterpart. Somewhat surprisingly, both the paradigms have focussed almost completely on linear models, in spite of the vast scope offered by the model liberation movement brought about by modern advancements in stu…

    Submitted 26 May, 2021; v1 submitted 23 October, 2018; originally announced October 2018.

    Comments: A very significantly updated version, with extensive treatment of the "large p, large n" paradigm, even when p>>n. Substantial methodological development added with TTMCMC based Bayes factor oriented variable selection, along with ample simulation experiments and a real data analysis in the bona fide "large p, small n" premise

  38. arXiv:1808.07704  [pdf, other]

    stat.ME

    Data-adaptive trimming of the Hill estimator and detection of outliers in the extremes of heavy-tailed data

    Authors: Shrijita Bhattacharya, Michael Kallitsis, Stilian Stoev

    Abstract: We introduce a trimmed version of the Hill estimator for the index of a heavy-tailed distribution, which is robust to perturbations in the extreme order statistics. In the ideal Pareto setting, the estimator is essentially finite-sample efficient among all unbiased estimators with a given strict upper break-down point. For general heavy-tailed models, we establish the asymptotic normality of the e…

    Submitted 23 August, 2018; originally announced August 2018.

  39. arXiv:1711.03758  [pdf, other]

    stat.AP stat.ME

    A Novel Bayesian Multiple Testing Approach to Deregulated miRNA Discovery Harnessing Positional Clustering

    Authors: Noirrit Kiran Chandra, Richa Singh, Sourabh Bhattacharya

    Abstract: MicroRNAs (miRNAs) are small non-coding RNAs that function as regulators of gene expression. In recent years, there has been a tremendous and growing interest among researchers to investigate the role of miRNAs in normal cellular as well as in disease processes. Thus to investigate the role of miRNAs in oral cancer, we analyse the expression levels of miRNAs to identify miRNAs with statistically s…

    Submitted 11 April, 2018; v1 submitted 10 November, 2017; originally announced November 2017.

    Comments: An updated version

  40. arXiv:1711.02068  [pdf, other]

    cs.HC stat.ML

    From Multimodal to Unimodal Webpages for Developing Countries

    Authors: Vidyapu Sandeep, V Vijaya Saradhi, Samit Bhattacharya

    Abstract: The multimodal web elements such as text and images are associated with inherent memory costs to store and transfer over the Internet. With the limited network connectivity in developing countries, webpage rendering gets delayed in the presence of high-memory demanding elements such as images (relative to text). To overcome this limitation, we propose a Canonical Correlation Analysis (CCA) based c…

    Submitted 6 November, 2017; originally announced November 2017.

    Comments: Presented at NIPS 2017 Workshop on Machine Learning for the Developing World

  41. arXiv:1709.08073  [pdf, other]

    stat.ML cs.AI cs.LG q-bio.QM

    Cross-modal Recurrent Models for Weight Objective Prediction from Multimodal Time-series Data

    Authors: Petar Veličković, Laurynas Karazija, Nicholas D. Lane, Sourav Bhattacharya, Edgar Liberis, Pietro Liò, Angela Chieh, Otmane Bellahsen, Matthieu Vegreville

    Abstract: We analyse multimodal time-series data corresponding to weight, sleep and steps measurements. We focus on predicting whether a user will successfully achieve his/her weight objective. For this, we design several deep long short-term memory (LSTM) architectures, including a novel cross-modal LSTM (X-LSTM), and demonstrate their superiority over baseline approaches. The X-LSTM improves parameter eff…

    Submitted 29 November, 2017; v1 submitted 23 September, 2017; originally announced September 2017.

    Comments: To appear in NIPS ML4H 2017 and NIPS TSW 2017

  42. arXiv:1707.06852  [pdf, other]

    stat.ME math.ST

    A Statistical Perspective on Inverse and Inverse Regression Problems

    Authors: Debashis Chatterjee, Sourabh Bhattacharya

    Abstract: Inverse problems, where in broad sense the task is to learn from the noisy response about some unknown function, usually represented as the argument of some known functional form, has received wide attention in the general scientific disciplines. However, in mainstream statistics such inverse problem paradigm does not seem to be as popular. In this article we provide a brief overview of such pro…

    Submitted 21 July, 2017; originally announced July 2017.

    Comments: To appear in RASHI

  43. arXiv:1706.03842  [pdf, other]

    cs.RO stat.AP

    Approximate Structure Construction Using Large Statistical Swarms

    Authors: Subhrajit Bhattacharya

    Abstract: In this paper we describe a novel local algorithm for large statistical swarms using "harmonic attractor dynamics", by means of which a swarm can construct harmonics of the environment. This in turn allows the swarm to approximately reconstruct desired structures in the environment. The robots navigate in a discrete environment, completely free of localization, being able to communicate with other…

    Submitted 12 June, 2017; originally announced June 2017.

    Comments: 9 pages, 7 figures

  44. arXiv:1705.03088  [pdf, other]

    stat.ME

    Trimming the Hill estimator: robustness, optimality and adaptivity

    Authors: Shrijita Bhattacharya, Michael Kallitsis, Stilian Stoev

    Abstract: We introduce a trimmed version of the Hill estimator for the index of a heavy-tailed distribution, which is robust to perturbations in the extreme order statistics. In the ideal Pareto setting, the estimator is essentially finite-sample efficient among all unbiased estimators with a given strict upper break-down point. For general heavy-tailed models, we establish the asymptotic normality of the e…

    Submitted 14 November, 2017; v1 submitted 8 May, 2017; originally announced May 2017.

  45. arXiv:1704.07349  [pdf, other]

    stat.AP

    A Non-Gaussian, Nonparametric Structure for Gene-Gene and Gene-Environment Interactions in Case-Control Studies Based on Hierarchies of Dirichlet Processes

    Authors: Durba Bhattacharya, Sourabh Bhattacharya

    Abstract: It is becoming increasingly clear that complex interactions among genes and environmental factors play crucial roles in triggering complex diseases. Thus, understanding such interactions is vital, which is possible only through statistical models that adequately account for such intricate, albeit unknown, dependence structures. Bhattacharya & Bhattacharya (2016b) attempt such modeling, relating fi…

    Submitted 1 May, 2020; v1 submitted 24 April, 2017; originally announced April 2017.

    Comments: An updated version

  46. arXiv:1703.04956  [pdf, ps, other]

    math.ST stat.ME

    A Short Note on Almost Sure Convergence of Bayes Factors in the General Set-Up

    Authors: Debashis Chatterjee, Trisha Maitra, Sourabh Bhattacharya

    Abstract: Although there is a significant literature on the asymptotic theory of Bayes factor, the set-ups considered are usually specialized and often involves independent and identically distributed data. Even in such specialized cases, mostly weak consistency results are available. In this article, for the first time ever, we derive the almost sure convergence theory of Bayes factor in the general set-up…

    Submitted 17 April, 2018; v1 submitted 15 March, 2017; originally announced March 2017.

    Comments: To appear in The American Statistician

  47. arXiv:1610.08367  [pdf, other]

    stat.ME

    Nonparametric Dynamic State Space Modeling of Observed Circular Time Series with Circular Latent States: A Bayesian Perspective

    Authors: Satyaki Mazumder, Sourabh Bhattacharya

    Abstract: Circular time series has received relatively little attention in statistics and modeling complex circular time series using the state space approach is non-existent in the literature. In this article we introduce a flexible Bayesian nonparametric approach to state space modeling of observed circular time series where even the latent states are circular random variables. Crucially, we assume that t…

    Submitted 15 March, 2017; v1 submitted 26 October, 2016; originally announced October 2016.

    Comments: This significantly updated version will appear in Journal of Statistical Theory and Practice

  48. arXiv:1610.01712  [pdf, other]

    cs.LG stat.ML

    A Methodology for Customizing Clinical Tests for Esophageal Cancer based on Patient Preferences

    Authors: Asis Roy, Sourangshu Bhattacharya, Kalyan Guin

    Abstract: Tests for Esophageal cancer can be expensive, uncomfortable and can have side effects. For many patients, we can predict non-existence of disease with 100% certainty, just using demographics, lifestyle, and medical history information. Our objective is to devise a general methodology for customizing tests using user preferences so that expensive or uncomfortable tests can be avoided. We propose to…

    Submitted 5 October, 2016; originally announced October 2016.

  49. arXiv:1602.07280  [pdf, other]

    stat.AP cs.LG

    A Statistical Model for Stroke Outcome Prediction and Treatment Planning

    Authors: Abhishek Sengupta, Vaibhav Rajan, Sakyajit Bhattacharya, G R K Sarma

    Abstract: Stroke is a major cause of mortality and long-term disability in the world. Predictive outcome models in stroke are valuable for personalized treatment, rehabilitation planning and in controlled clinical trials. In this paper we design a new model to predict outcome in the short-term, the putative therapeutic window for several treatments. Our regression-based model has a parametric form that is…

    Submitted 22 February, 2016; originally announced February 2016.

  50. arXiv:1601.03519  [pdf, other]

    stat.AP

    Effects of Gene-Environment and Gene-Gene Interactions in Case-Control Studies: A Novel Bayesian Semiparametric Approach

    Authors: Durba Bhattacharya, Sourabh Bhattacharya

    Abstract: Cognizance of gene-environment interactions may help prevent or detain the onset of complex diseases like cardiovascular disease, cancer, type2 diabetes, autism or asthma by adjustments to lifestyle. In this regard, we extend the Bayesian semiparametric gene-gene interaction model of Bhattacharya & Bhattacharya (2015) to include the possibility of influencing gene-gene interactions by environmenta…

    Submitted 21 July, 2017; v1 submitted 14 January, 2016; originally announced January 2016.

    Comments: The latest version. arXiv admin note: text overlap with arXiv:1411.7571