-
A placement-value based approach to concave ROC analysis
Authors:
Soutik Ghosal,
Zhen Chen
Abstract:
The receiver operating characteristic (ROC) curve is an important graphical tool for evaluating a test in a wide range of disciplines. While useful, an ROC curve can cross the chance line, either by having an S-shape or a hook at the extreme specificity. These non-concave ROC curves are sub-optimal according to decision theory, as there are points superior to those corresponding to the portions below the chance line, with either the same sensitivity or the same specificity. We extend the literature by proposing a novel placement value-based approach to ensure concave curvature of the ROC curve, and use the Bayesian paradigm for estimation under both a parametric and a semiparametric framework. We conduct extensive simulation studies to assess the performance of the proposed methodology under various scenarios, and apply it to a pancreatic cancer dataset.
Submitted 30 June, 2024;
originally announced July 2024.
-
Coverage of Credible Sets for Regression under Variable Selection
Authors:
Samhita Pal,
Subhashis Ghosal
Abstract:
We study the asymptotic frequentist coverage of credible sets based on a novel Bayesian approach for a multiple linear regression model under variable selection. We initially ignore the issue of variable selection, which allows us to put a conjugate normal prior on the coefficient vector. The variable selection step is incorporated directly in the posterior through a sparsity-inducing map and uses the induced prior for making an inference instead of the natural conjugate posterior. The sparsity-inducing map minimizes the sum of the squared l2-distance weighted by the data matrix and a suitably scaled l1-penalty term. We obtain the limiting coverage of various credible regions and demonstrate that a modified credible interval for a component has the exact asymptotic frequentist coverage if the corresponding predictor is asymptotically uncorrelated with other predictors. Through extensive simulation, we provide a guideline for choosing the penalty parameter as a function of the credibility level appropriate for the corresponding coverage. We also show finite-sample numerical results that support the conclusions from the asymptotic theory. We also provide the credInt package that implements the method in R to obtain the credible intervals along with the posterior samples.
Submitted 19 June, 2024;
originally announced June 2024.
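
The sparsity-inducing map described in this abstract can be illustrated with a minimal sketch. It assumes the objective takes the form ||X(b - u)||^2 + lam * ||u||_1 for a coefficient draw b; the exact scaling of the penalty, the function names `soft_threshold` and `sparse_map`, and the coordinate-descent solver are illustrative assumptions, not the credInt implementation.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sparse_map(X, b, lam, n_iter=200):
    """Approximately minimize ||X(b - u)||^2 + lam * ||u||_1 over u
    by cyclic coordinate descent (hypothetical helper for illustration)."""
    n, p = len(X), len(b)
    # Fitted values of the unpenalized draw: y = X b
    y = [sum(X[i][j] * b[j] for j in range(p)) for i in range(n)]
    # Squared norm of each column of X
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    u = [0.0] * p
    resid = y[:]  # residual y - X u, starting from u = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the residual that excludes u[j]
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * u[j]
            new_uj = soft_threshold(rho, lam / 2.0) / col_sq[j]
            delta = new_uj - u[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                u[j] = new_uj
    return u


# With an identity design the map reduces to soft-thresholding at lam / 2.
X_demo = [[1.0, 0.0], [0.0, 1.0]]
u_demo = sparse_map(X_demo, [3.0, 0.2], lam=1.0)
```

Applying such a map to each draw from the conjugate normal posterior would give draws from an induced posterior that can contain exact zeros, which is the role the abstract assigns to the variable selection step.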
-
Impact of methodological assumptions and covariates on the cutoff estimation in ROC analysis
Authors:
Soutik Ghosal
Abstract:
The Receiver Operating Characteristic (ROC) curve stands as a cornerstone in assessing the efficacy of biomarkers for disease diagnosis. Beyond merely evaluating performance, it provides an optimal cutoff for biomarker values, crucial for disease categorization. While diverse methodologies exist for threshold estimation, less attention has been paid to integrating covariate impact into this process. Covariates can strongly impact diagnostic summaries, leading to variations across different covariate levels. Therefore, a tailored covariate-based framework is imperative for outlining covariate-specific optimal cutoffs. Moreover, recent investigations into cutoff estimators have overlooked the influence of ROC curve estimation methodologies. This study endeavors to bridge this gap. Extensive simulation studies are conducted to scrutinize the performance of ROC curve estimation models in estimating different cutoffs in varying scenarios, encompassing diverse data-generating mechanisms and covariate effects. Additionally, leveraging the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the research assesses the performance of different biomarkers in diagnosing Alzheimer's disease and determines the suitable optimal cutoffs.
Submitted 20 April, 2024;
originally announced April 2024.
-
Bayesian Inference for High-dimensional Time Series by Latent Process Modeling
Authors:
Arkaprava Roy,
Anindya Roy,
Subhashis Ghosal
Abstract:
Time series data arising in many applications nowadays are high-dimensional. A large number of parameters describe features of these time series. We propose a novel approach to modeling a high-dimensional time series through several independent univariate time series, which are then orthogonally rotated and sparsely linearly transformed. With this approach, any specified intrinsic relations among component time series given by a graphical structure can be maintained at all time snapshots. We call the resulting process an Orthogonally-rotated Univariate Time series (OUT). Key structural properties of time series such as stationarity and causality can be easily accommodated in the OUT model. For Bayesian inference, we put suitable prior distributions on the spectral densities of the independent latent time series, the orthogonal rotation matrix, and the common precision matrix of the component time series at every time point. A likelihood is constructed using the Whittle approximation for univariate latent time series. An efficient Markov Chain Monte Carlo (MCMC) algorithm is developed for posterior computation. We study the convergence of the pseudo-posterior distribution based on the Whittle likelihood for the model's parameters upon developing a new general posterior convergence theorem for pseudo-posteriors. We find that the posterior contraction rate for independent observations essentially prevails in the OUT model under very mild conditions on the temporal dependence described in terms of the smoothness of the corresponding spectral densities. Through a simulation study, we compare the accuracy of estimating the parameters and identifying the graphical structure with other approaches. We apply the proposed methodology to analyze a dataset on different industrial components of the US gross domestic product between 2010 and 2019 and predict future observations.
Submitted 7 March, 2024;
originally announced March 2024.
-
Bayesian Inference for Multivariate Monotone Densities
Authors:
Kang Wang,
Subhashis Ghosal
Abstract:
We consider a nonparametric Bayesian approach to estimation and testing for a multivariate monotone density. Instead of following the conventional Bayesian route of putting a prior distribution complying with the monotonicity restriction, we put a prior on the step heights through binning and a Dirichlet distribution. An arbitrary piece-wise constant probability density is converted to a monotone one by a projection map, taking its $\mathbb{L}_1$-projection onto the space of monotone functions, which is subsequently normalized to integrate to one. We construct consistent Bayesian tests to test multivariate monotonicity of a probability density based on the $\mathbb{L}_1$-distance to the class of monotone functions. The test is shown to have a size going to zero and high power against alternatives sufficiently separated from the null hypothesis. To obtain a Bayesian credible interval for the value of the density function at an interior point with guaranteed asymptotic frequentist coverage, we consider a posterior quantile interval of an induced map transforming the function value to its value optimized over certain blocks. The limiting coverage is explicitly calculated and is seen to be higher than the credibility level used in the construction. By exploring the asymptotic relationship between the coverage and the credibility, we show that a desired asymptotic coverage can be obtained exactly by starting with an appropriate credibility level.
Submitted 8 June, 2023;
originally announced June 2023.
-
Bayesian Inference for $k$-Monotone Densities with Applications to Multiple Testing
Authors:
Kang Wang,
Subhashis Ghosal
Abstract:
Shape restriction, like monotonicity or convexity, imposed on a function of interest, such as a regression or density function, allows for its estimation without smoothness assumptions. The concept of $k$-monotonicity encompasses a family of shape restrictions, including decreasing and convex decreasing as special cases corresponding to $k=1$ and $k=2$. We consider Bayesian approaches to estimate a $k$-monotone density. By utilizing a kernel mixture representation and putting a Dirichlet process or a finite mixture prior on the mixing distribution, we show that the posterior contraction rate in the Hellinger distance is $(n/\log n)^{- k/(2k + 1)}$ for a $k$-monotone density, which is minimax optimal up to a polylogarithmic factor. When the true $k$-monotone density is a finite $J_0$-component mixture of the kernel, the contraction rate improves to the nearly parametric rate $\sqrt{(J_0 \log n)/n}$. Moreover, by putting a prior on $k$, we show that the same rates hold even when the best value of $k$ is unknown. A specific application in modeling the density of $p$-values in a large-scale multiple testing problem is considered. Simulation studies are conducted to evaluate the performance of the proposed method.
Submitted 8 June, 2023;
originally announced June 2023.
-
Optimal Bayesian Smoothing of Functional Observations over a Large Graph
Authors:
Arkaprava Roy,
Shubhashis Ghosal
Abstract:
In modern contexts, some types of data are observed in high-resolution, essentially continuously in time. Such data units are best described as taking values in a space of functions. Subject units carrying the observations may have intrinsic relations among themselves, and are best described by the nodes of a large graph. It is often sensible to think that the underlying signals in these functional observations vary smoothly over the graph, in that neighboring nodes have similar underlying signals. This qualitative information allows borrowing of strength over neighboring nodes and consequently leads to more accurate inference. In this paper, we consider a model with Gaussian functional observations and adopt a Bayesian approach to smoothing over the nodes of the graph. We characterize the minimax rate of estimation in terms of the regularity of the signals and their variation across nodes quantified in terms of the graph Laplacian. We show that an appropriate prior constructed from the graph Laplacian can attain the minimax bound, while using a mixture prior, the minimax rate up to a logarithmic factor can be attained simultaneously for all possible values of functional and graphical smoothness. We also show that in the fixed smoothness setting, an optimal sized credible region has arbitrarily high frequentist coverage. A simulation experiment demonstrates that the method performs better than potential competing methods like the random forest. The method is also applied to a dataset on daily temperatures measured at several weather stations in the US state of North Carolina.
Submitted 19 July, 2021; v1 submitted 20 April, 2021;
originally announced April 2021.
-
Interpretable and synergistic deep learning for visual explanation and statistical estimations of segmentation of disease features from medical images
Authors:
Sambuddha Ghosal,
Pratik Shah
Abstract:
Deep learning (DL) models for disease classification or segmentation from medical images are increasingly trained using transfer learning (TL) from unrelated natural world images. However, shortcomings and utility of TL for specialized tasks in the medical imaging domain remain unknown and are based on assumptions that increasing training data will improve performance. We report detailed comparisons and rigorous statistical analysis of widely used DL architectures for binary segmentation after TL with ImageNet initialization (TII-models) against supervised learning with only medical images (LMI-models) of macroscopic optical skin cancer, microscopic prostate core biopsy and Computed Tomography (CT) DICOM images. Through visual inspection of TII and LMI model outputs and their Grad-CAM counterparts, our results identify several counterintuitive scenarios where automated segmentation of one tumor by both models or the use of individual segmentation output masks in various combinations from individual models leads to a 10% increase in performance. We also report sophisticated ensemble DL strategies for achieving clinical grade medical image segmentation and model explanations under low data regimes. For example, estimating performance, explanations and replicability of LMI and TII models described by us can be used for situations in which sparsity promotes better learning. A free GitHub repository of TII and LMI models, code and more than 10,000 medical images and their Grad-CAM output from this study can be used as starting points for advanced computational medicine and DL research for biomedical discovery and applications.
Submitted 11 November, 2020;
originally announced November 2020.
-
Bayesian Multivariate Quantile Regression Using Dependent Dirichlet Process Prior
Authors:
Indrabati Bhattacharya,
Subhashis Ghosal
Abstract:
In this article, we consider a non-parametric Bayesian approach to multivariate quantile regression. The collection of related conditional distributions of a response vector Y given a univariate covariate X is modeled using a Dependent Dirichlet Process (DDP) prior. The DDP is used to introduce dependence across x. As the realizations from a Dirichlet process prior are almost surely discrete, we need to convolve it with a kernel. To model the error distribution as flexibly as possible, we use a countable mixture of multidimensional normal distributions as our kernel. For posterior computations, we use a truncated stick-breaking representation of the DDP. This approximation enables us to deal with only a finite number of parameters. We use a Block Gibbs sampler for estimating the model parameters. We illustrate our method with simulation studies and real data applications. Finally, we provide a theoretical justification for the proposed method through posterior consistency. Our proposed procedure is new even when the response is univariate.
Submitted 1 July, 2020;
originally announced July 2020.
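
The truncated stick-breaking representation mentioned in this abstract can be sketched as follows. This is a minimal illustration of the mixture weights for a single Dirichlet process with concentration parameter `alpha`, truncated at `n_atoms` components; it does not model the covariate dependence of the DDP, and the function name, the parameter values, and the use of Python's standard `random` module are assumptions for illustration.

```python
import random


def truncated_stick_breaking(alpha, n_atoms, rng):
    """Return n_atoms mixture weights from a stick-breaking construction
    truncated at n_atoms; the last break takes the whole remaining stick,
    so the weights sum to one."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(n_atoms):
        # V_k ~ Beta(1, alpha); the final V is fixed at 1 to close the stick
        v = 1.0 if k == n_atoms - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


rng = random.Random(0)  # fixed seed so the sketch is reproducible
w = truncated_stick_breaking(alpha=2.0, n_atoms=20, rng=rng)
```

Because only `n_atoms` weights and their associated kernel parameters are kept, a sampler built on this representation works with finitely many parameters, which is the point made in the abstract.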
-
Bayesian Semi-supervised learning under nonparanormality
Authors:
Rui Zhu,
Subhashis Ghosal
Abstract:
Semi-supervised learning is a classification method which makes use of both labeled data and unlabeled data for training. In this paper, we propose a semi-supervised learning algorithm using a Bayesian semi-supervised model. We make a general assumption that the observations will follow two multivariate normal distributions depending on their true labels after the same unknown transformation. We use B-splines to put a prior on the transformation function for each component. To use unlabeled data in a semi-supervised setting, we assume the labels are missing at random. The posterior distributions can then be described using our assumptions, which we compute by the Gibbs sampling technique. The proposed method is then compared with several other available methods through an extensive simulation study. Finally, we apply the proposed method in real data contexts for diagnosing breast cancer and classifying radar returns. We conclude that the proposed method has better prediction accuracy in a wide variety of cases.
Submitted 11 January, 2020;
originally announced January 2020.
-
Deep Generative Models Strike Back! Improving Understanding and Evaluation in Light of Unmet Expectations for OoD Data
Authors:
John Just,
Sambuddha Ghosal
Abstract:
Advances in deep generative and density models have shown impressive capacity to model complex probability density functions in lower-dimensional space. However, applying such models to high-dimensional image data to model the PDF has shown poor generalization, with out-of-distribution (OoD) data being assigned equal or higher likelihood than in-sample data. Methods to deal with this have been proposed that deviate from a fully unsupervised approach, requiring large ensembles or additional knowledge about the data that is not commonly available in the real world. In this work, the previously offered reasoning behind these issues is challenged empirically, and it is shown that datasets such as MNIST fashion/digits and CIFAR10/SVHN are trivially separable and have no overlap on their respective data manifolds that explains the higher OoD likelihood. Models like masked autoregressive flows and block neural autoregressive flows are shown to not suffer from OoD likelihood issues to the extent of GLOW, PixelCNN++, and real NVP. A new avenue is also explored, which involves a change of basis to a new space of the same dimension with an orthonormal unitary basis of eigenvectors before modeling. In the test datasets and models, this aids in pushing down the relative likelihood of the contrastive OoD dataset and improves discrimination results. The significance of the density of the original space is maintained, while invertibility remains tractable. Finally, the previous generation of generative models, in the form of probabilistic principal component analysis, is revisited for the same datasets and shown to work well for discriminating anomalies based on likelihood in a fully unsupervised fashion compared with PixelCNN++, GLOW, and real NVP, with less complexity and faster training. Also, dimensionality reduction using PCA is shown to improve anomaly detection in generative models.
Submitted 12 November, 2019;
originally announced November 2019.
-
Encoding Invariances in Deep Generative Models
Authors:
Viraj Shah,
Ameya Joshi,
Sambuddha Ghosal,
Balaji Pokuri,
Soumik Sarkar,
Baskar Ganapathysubramanian,
Chinmay Hegde
Abstract:
Reliable training of generative adversarial networks (GANs) typically requires massive datasets in order to model complicated distributions. However, in several applications, training samples obey invariances that are \textit{a priori} known; for example, in complex physics simulations, the training data obey universal laws encoded as well-defined mathematical equations. In this paper, we propose a new generative modeling approach, InvNet, that can efficiently model data spaces with known invariances. We devise an adversarial training algorithm to encode them into the data distribution. We validate our framework in three experimental settings: generating images with fixed motifs; solving nonlinear partial differential equations (PDEs); and reconstructing two-phase microstructures with desired statistical properties. We complement our experiments with several theoretical results.
Submitted 4 June, 2019;
originally announced June 2019.
-
Regression-Based Bayesian Estimation and Structure Learning for Nonparanormal Graphical Models
Authors:
Jami J. Mulgrave,
Subhashis Ghosal
Abstract:
A nonparanormal graphical model is a semiparametric generalization of a Gaussian graphical model for continuous variables in which it is assumed that the variables follow a Gaussian graphical model only after some unknown smooth monotone transformations. We consider a Bayesian approach to inference in a nonparanormal graphical model in which we put priors on the unknown transformations through a random series based on B-splines. We use a regression formulation to construct the likelihood through the Cholesky decomposition on the underlying precision matrix of the transformed variables and put shrinkage priors on the regression coefficients. We apply a plug-in variational Bayesian algorithm for learning the sparse precision matrix and compare the performance to a posterior Gibbs sampling scheme in a simulation study. We finally apply the proposed methods to a real data set.
Submitted 20 February, 2021; v1 submitted 8 December, 2018;
originally announced December 2018.
-
Bayesian Analysis of Nonparanormal Graphical Models Using Rank-Likelihood
Authors:
Jami J. Mulgrave,
Subhashis Ghosal
Abstract:
Gaussian graphical models, where it is assumed that the variables of interest jointly follow a multivariate normal distribution with a sparse precision matrix, have been used to study intrinsic dependence among variables, but the normality assumption may be restrictive in many settings. A nonparanormal graphical model is a semiparametric generalization of a Gaussian graphical model for continuous variables where it is assumed that the variables follow a Gaussian graphical model only after some unknown smooth monotone transformation. We consider a Bayesian approach for the nonparanormal graphical model using a rank-likelihood which remains invariant under monotone transformations, thereby avoiding the need to put a prior on the transformation functions. On the underlying precision matrix of the transformed variables, we consider a horseshoe prior on its Cholesky decomposition and use an efficient posterior Gibbs sampling scheme. We present a posterior consistency result for the precision matrix based on the rank-based likelihood. We study the numerical performance of the proposed method through a simulation study and apply it to a real dataset.
Submitted 16 April, 2019; v1 submitted 6 December, 2018;
originally announced December 2018.
-
Interpretable deep learning for guided structure-property explorations in photovoltaics
Authors:
Balaji Sesha Sarath Pokuri,
Sambuddha Ghosal,
Apurva Kokate,
Baskar Ganapathysubramanian,
Soumik Sarkar
Abstract:
The performance of an organic photovoltaic device is intricately connected to its active layer morphology. This connection between the active layer and device performance is very expensive to evaluate, either experimentally or computationally. Hence, designing morphologies to achieve higher performances is non-trivial and often intractable. To solve this, we first introduce a deep convolutional neural network (CNN) architecture that can serve as a fast and robust surrogate for the complex structure-property map. Several tests were performed to gain trust in this trained model. Then, we utilize this fast framework to perform robust microstructural design to enhance device performance.
Submitted 11 December, 2018; v1 submitted 14 November, 2018;
originally announced November 2018.
-
Bayesian Change Point Detection for Functional Data
Authors:
Xiuqi Li,
Subhashis Ghosal
Abstract:
We propose a Bayesian method to detect change points for functional data. We extract the features of a sequence of functional data by the discrete wavelet transform (DWT), and treat each sequence of features independently. We believe there is potentially a change in each feature at possibly different time points. The functional data evolve through such changes throughout the sequences of observations. The change point for this sequence of functional data is the cumulative effect of changes in all features. We assign the features priors that incorporate the characteristics of the wavelet coefficients. Then we compute the posterior distribution of the change point for each sequence of features, and define a matrix where each entry is a measure of similarity between two functional observations in this sequence. We compute the ratio of the mean similarity between groups and within groups for all possible partitions, and the change point is where the ratio reaches the minimum. We demonstrate this method using a dataset on climate change.
Submitted 3 August, 2018;
originally announced August 2018.
-
Bayesian Classification of Multiclass Functional Data
Authors:
Xiuqi Li,
Subhashis Ghosal
Abstract:
We propose a Bayesian approach to estimating parameters in multiclass functional models. Unordered multinomial probit, ordered multinomial probit and multinomial logistic models are considered. We use finite random series priors based on a suitable basis such as B-splines in these three multinomial models, and classify the functional data using the Bayes rule. We average over models based on the marginal likelihood estimated from Markov Chain Monte Carlo (MCMC) output. Posterior contraction rates for the three multinomial models are computed. We also consider Bayesian linear and quadratic discriminant analyses on the multivariate data obtained by applying a functional principal component technique on the original functional data. A simulation study is conducted to compare these methods on different types of data. We also apply these methods to a phoneme dataset.
Submitted 2 August, 2018;
originally announced August 2018.
-
Bayesian Inference in Nonparanormal Graphical Models
Authors:
Jami J. Mulgrave,
Subhashis Ghosal
Abstract:
Gaussian graphical models have been used to study intrinsic dependence among several variables, but the Gaussianity assumption may be restrictive in many applications. A nonparanormal graphical model is a semiparametric generalization for continuous variables where it is assumed that the variables follow a Gaussian graphical model only after some unknown smooth monotone transformations on each of them. We consider a Bayesian approach in the nonparanormal graphical model by putting priors on the unknown transformations through a random series based on B-splines where the coefficients are ordered to induce monotonicity. A truncated normal prior leads to partial conjugacy in the model and is useful for posterior simulation using Gibbs sampling. On the underlying precision matrix of the transformed variables, we consider a spike-and-slab prior and use an efficient posterior Gibbs sampling scheme. We use the Bayesian Information Criterion to choose the hyperparameters for the spike-and-slab prior. We present a posterior consistency result on the underlying transformation and the precision matrix. We study the numerical performance of the proposed method through an extensive simulation study and finally apply the proposed method on a real data set.
Submitted 11 April, 2019; v1 submitted 12 June, 2018;
originally announced June 2018.
-
Bayesian ROC surface estimation under verification bias
Authors:
Rui Zhu,
Subhashis Ghosal
Abstract:
The Receiver Operating Characteristic (ROC) surface is a generalization of the ROC curve and is widely used for assessing the accuracy of diagnostic tests on three categories. A complication called verification bias, meaning that not all subjects have their true disease status verified, often occurs in real applications of ROC analysis. This is a common problem since the gold standard test, which is used to generate the true disease status, can be invasive and expensive. In this paper, we propose a Bayesian approach for estimating the ROC surface based on continuous data under a semi-parametric trinormality assumption. Our proposed method, often adopted in ROC analysis, can also be extended to situations in the presence of verification bias. We compute the posterior distribution of the parameters under the trinormality assumption by using a rank-based likelihood. Consistency of the posterior under mild conditions is also established. We compare our method with the existing methods for estimating the ROC surface and conclude that our method performs well in terms of accuracy.
Submitted 18 March, 2018;
originally announced March 2018.
-
Bayesian method for causal inference in spatially-correlated multivariate time series
Authors:
Bo Ning,
Subhashis Ghosal,
Jewell Thomas
Abstract:
Measuring the causal impact of an advertising campaign on sales is an essential task for advertising companies. Challenges arise when companies run advertising campaigns in multiple stores which are spatially correlated, and when the sales data have a low signal-to-noise ratio which makes the advertising effects hard to detect. This paper proposes a solution to address both of these challenges. A novel Bayesian method is proposed to detect weaker impacts and a multivariate structural time series model is used to capture the spatial correlation between stores through placing a $\mathcal{G}$-Wishart prior on the precision matrix. The new method is to compare two posterior distributions of a latent variable: one obtained by using the observed data from the test stores and the other obtained by using the data from their counterfactual potential outcomes. The counterfactual potential outcomes are estimated from the data of synthetic controls, each of which is a linear combination of sales figures at many control stores over the causal period. Control stores are selected using a revised Expectation-Maximization variable selection (EMVS) method. A two-stage algorithm is proposed to estimate the parameters of the model. To prevent the prediction intervals from being explosive, a stationarity constraint is imposed on the local linear trend of the model through a recently proposed prior. The benefit of using this prior is discussed in this paper. A detailed simulation study shows the effectiveness of using our proposed method to detect weaker causal impact. The new method is applied to measure the causal effect of an advertising campaign for a consumer product sold at stores of a large national retail chain.
Submitted 12 March, 2018; v1 submitted 18 January, 2018;
originally announced January 2018.
-
High-dimensional single-index Bayesian modeling of brain atrophy
Authors:
Arkaprava Roy,
Subhashis Ghosal,
Kingshuk Roy Choudhury
Abstract:
We propose a model of brain atrophy as a function of high-dimensional genetic information and low-dimensional covariates such as gender, age, APOE gene, and disease status. A nonparametric single-index Bayesian model of high dimension is proposed to model the relationship, with a B-spline series prior on the unknown functions and a Dirichlet process scale mixture of centered normal prior on the distributions of the random effects. The posterior rate of contraction without the random effect is established for a fixed number of regions and time points with increasing sample size. We implement an efficient computation algorithm through a Hamiltonian Monte Carlo (HMC) algorithm. The performance of the proposed Bayesian method is compared with the corresponding least squares estimator in the linear model with horseshoe prior, LASSO and SCAD penalization on the high-dimensional covariates. The proposed Bayesian method is applied to a dataset on volumes of brain regions recorded over multiple visits of 748 individuals using 620,901 SNPs and 6 other covariates for each individual, to identify factors associated with brain atrophy.
Submitted 11 February, 2019; v1 submitted 18 December, 2017;
originally announced December 2017.
-
Interpretable Deep Learning applied to Plant Stress Phenotyping
Authors:
Sambuddha Ghosal,
David Blystone,
Asheesh K. Singh,
Baskar Ganapathysubramanian,
Arti Singh,
Soumik Sarkar
Abstract:
Explainable deep learning models that can be applied to practical real-world scenarios and, in turn, can consistently, rapidly and accurately identify specific and minute traits in applicable fields of biological sciences are scarce. Here we consider one such real-world example, viz., accurate identification, classification and quantification of biotic and abiotic stresses in crop research and production. Up until now, this has been predominantly done manually by visual inspection and requires specialized training. However, such techniques are hindered by subjectivity resulting from inter- and intra-rater cognitive variability. Here, we demonstrate the ability of a machine learning framework to identify and classify a diverse set of foliar stresses in the soybean plant with remarkable accuracy. We also present an explanation mechanism using gradient-weighted class activation mapping that isolates the visual symptoms used by the model to make predictions. This unsupervised identification of unique visual symptoms for each stress provides a quantitative measure of stress severity, allowing for identification, classification and quantification in one framework. The learnt model appears to be agnostic to species and makes good predictions for other (non-soybean) species, demonstrating an ability for transfer learning.
Submitted 28 October, 2017; v1 submitted 24 October, 2017;
originally announced October 2017.
-
Bayesian Modeling of the Structural Connectome for Studying Alzheimer Disease
Authors:
Arkaprava Roy,
Subhashis Ghosal,
Jeffrey Prescott,
Kingshuk Roy Choudhury
Abstract:
We study possible relations between Alzheimer's disease progression and the structure of the connectome, the white matter connecting different regions of the brain. Regression models in covariates including age, gender and disease status for the extent of white matter connecting each pair of regions of the brain are proposed. Subject inhomogeneity is also incorporated in the model through random effects with an unknown distribution. As there are a large number of pairs of regions, we also adopt a dimension reduction technique through graphon (Lovasz and Szegedy (2006)) functions, which reduces functions of pairs of regions to functions of regions. The connecting graphon functions are considered unknown, but their assumed smoothness allows putting priors of low complexity on them. We pursue a nonparametric Bayesian approach by assigning a Dirichlet process scale mixture of zero mean normal prior on the distributions of the random effects and finite random series of tensor products of B-splines priors on the underlying graphon functions. Markov chain Monte Carlo techniques for drawing samples from the posterior distributions are developed. The proposed Bayesian method overwhelmingly outperforms similar ANCOVA models in the simulation setup. The proposed Bayesian approach is applied to a dataset of 100 subjects and 83 brain regions, and key regions implicated in the changing connectome are identified.
Submitted 31 March, 2019; v1 submitted 12 October, 2017;
originally announced October 2017.
-
Multivariate Gaussian Network Structure Learning
Authors:
Xingqi Du,
Subhashis Ghosal
Abstract:
We consider a graphical model where a multivariate normal vector is associated with each node of the underlying graph and estimate the graphical structure. We minimize a loss function obtained by regressing the vector at each node on those at the remaining ones under a group penalty. We show that the proposed estimator can be computed by a fast convex optimization algorithm. We show that as the sample size increases, the regression coefficients and the graphical structure are correctly estimated with probability tending to one. By extensive simulations, we show the superiority of the proposed method over comparable procedures. We apply the technique to two real datasets. The first one is to identify gene and protein networks showing up in cancer cell lines, and the second one is to reveal the connections among different industries in the US.
Submitted 16 September, 2017;
originally announced September 2017.
-
Bayesian Non-parametric Simultaneous Quantile Regression for Complete and Grid Data
Authors:
Priyam Das,
Subhashis Ghosal
Abstract:
In this paper, we consider Bayesian methods for non-parametric quantile regression with multiple continuous predictors taking values in the unit interval. In the first method, the quantile function is assumed to be smooth over the explanatory variable and is expanded in a tensor product of B-spline basis functions. In the second method, the distribution function is assumed to be smooth over the explanatory variable and is expanded in a tensor product of B-spline basis functions. Unlike other existing methods of non-parametric quantile regression, the proposed methods estimate the whole quantile function instead of estimating on a grid of quantiles. Priors on the B-spline coefficients are put in such a way that the monotonicity of the estimated quantile levels is maintained, unlike in local polynomial quantile regression methods. The proposed methods have also been modified for quantile grid data, where only the percentile range of each response observation is known. Simulation studies are provided for both complete and quantile grid data. The proposed method has been used to estimate the quantiles of US household income data and North Atlantic hurricane intensity data.
Submitted 30 November, 2016;
originally announced December 2016.
-
Analyzing Ozone Concentration by Bayesian Spatio-temporal Quantile Regression
Authors:
Priyam Das,
Subhashis Ghosal
Abstract:
Ground-level ozone is one of the six common air pollutants on which the EPA has set national air quality standards. In order to capture the spatio-temporal trend of 1-hour and 8-hour average ozone concentration in the US, we develop a method for spatio-temporal simultaneous quantile regression. Unlike existing procedures, in the proposed method, smoothing across the sites is incorporated within the modeling assumptions, thus allowing borrowing of information across locations, an essential step when the number of samples in each location is low. The quantile function is assumed to be linear in time and smooth over space, and at any given site is given by a convex combination of two monotone increasing functions $ξ_1$ and $ξ_2$ not depending on time. A B-spline basis expansion with increasing coefficients varying smoothly over space is used to put a prior, and a Bayesian analysis is performed. We analyze the average daily 1-hour maximum and 8-hour maximum ozone concentration level data for the US and California during 2006-2015 using the proposed method. It is noted that in the last ten years, there is an overall decreasing trend in both 1-hour maximum and 8-hour maximum ozone concentration levels over most parts of the US. In California, an overall decreasing trend of the 1-hour maximum ozone level is observed, while no particular overall trend has been observed in the case of the 8-hour maximum ozone level.
Submitted 5 December, 2016; v1 submitted 15 September, 2016;
originally announced September 2016.
-
Bayesian mode and maximum estimation and accelerated rates of contraction
Authors:
William Weimin Yoo,
Subhashis Ghosal
Abstract:
We study the problem of estimating the mode and maximum of an unknown regression function in the presence of noise. We adopt the Bayesian approach by using tensor-product B-splines and endowing the coefficients with Gaussian priors. In the usual fixed-in-advance sampling plan, we establish posterior contraction rates for the mode and maximum and show that they coincide with the minimax rates for this problem. To quantify estimation uncertainty, we construct credible sets for these two quantities that have high coverage probabilities with optimal sizes. If one is allowed to collect data sequentially, we further propose a Bayesian two-stage estimation procedure, where a second-stage posterior is built based on samples collected within a credible set constructed from a first-stage posterior. Under appropriate conditions on the radius of this credible set, we can accelerate optimal contraction rates from the fixed-in-advance setting to the minimax sequential rates. A simulation experiment shows that our Bayesian two-stage procedure outperforms the single-stage procedure and also slightly improves upon a non-Bayesian two-stage procedure.
Submitted 15 March, 2018; v1 submitted 12 August, 2016;
originally announced August 2016.
-
Bayesian Detection of Image Boundaries
Authors:
Meng Li,
Subhashis Ghosal
Abstract:
Detecting the boundary of an image based on noisy observations is a fundamental problem of image processing and image segmentation. For a $d$-dimensional image ($d = 2, 3, \ldots$), the boundary can often be described by a closed smooth $(d - 1)$-dimensional manifold. In this paper, we propose a nonparametric Bayesian approach based on priors indexed by $\mathbb{S}^{d - 1}$, the unit sphere in $\mathbb{R}^d$. We derive optimal posterior contraction rates using Gaussian processes or finite random series priors using basis functions such as trigonometric polynomials for 2-dimensional images and spherical harmonics for 3-dimensional images. For 2-dimensional images, we show that a rescaled squared exponential Gaussian process on $\mathbb{S}^1$ achieves four goals: it guarantees the geometric restriction, is (nearly) minimax rate optimal and adaptive to the smoothness level, is convenient for joint inference, and is computationally efficient. We conduct an extensive study of its reproducing kernel Hilbert space, which may be of interest in its own right and can also be used in other contexts. Simulations confirm excellent performance of the proposed method and indicate its robustness under model misspecification, at least under the simulated settings.
Submitted 24 May, 2016; v1 submitted 24 August, 2015;
originally announced August 2015.
-
Adaptive Bayesian density regression for high-dimensional data
Authors:
Weining Shen,
Subhashis Ghosal
Abstract:
Density regression provides a flexible strategy for modeling the distribution of a response variable $Y$ given predictors $\mathbf{X}=(X_1,\ldots,X_p)$ by treating the conditional density of $Y$ given $\mathbf{X}$ as a completely unknown function and allowing its shape to change with the value of $\mathbf{X}$. The number of predictors $p$ may be very large, possibly much larger than the number of observations $n$, but the conditional density is assumed to depend only on a much smaller number of predictors, which are unknown. In addition to estimation, the goal is also to select the important predictors which actually affect the true conditional density. We consider a nonparametric Bayesian approach to density regression by constructing a random series prior based on tensor products of spline functions. The proposed prior also incorporates the issue of variable selection. We show that the posterior distribution of the conditional density contracts adaptively at the truth nearly at the optimal oracle rate, determined by the unknown sparsity and smoothness levels, even in the ultra high-dimensional settings where $p$ increases exponentially with $n$. The result is also extended to the anisotropic case where the degree of smoothness can vary in different directions, and both random and deterministic predictors are considered. We also propose a technique to calculate posterior moments of the conditional density function without requiring Markov chain Monte Carlo methods.
Submitted 6 January, 2016; v1 submitted 11 March, 2014;
originally announced March 2014.
-
Adaptive Bayesian procedures using random series priors
Authors:
Weining Shen,
Subhashis Ghosal
Abstract:
We consider a prior for nonparametric Bayesian estimation which uses finite random series with a random number of terms. The prior is constructed through distributions on the number of basis functions and the associated coefficients. We derive a general result on adaptive posterior convergence rates for all smoothness levels of the function in the true model by constructing an appropriate "sieve" and applying the general theory of posterior convergence rates. We apply this general result to several statistical problems such as signal processing, density estimation, various nonparametric regressions, classification, spectral density estimation, and functional regression. The prior can be viewed as an alternative to the commonly used Gaussian process prior, but properties of the posterior distribution can be analyzed by relatively simpler techniques, and in many cases the prior allows a simpler approach to computation without using Markov chain Monte Carlo (MCMC) methods. A simulation study is conducted to show that the accuracies of the Bayesian estimators based on the random series prior and the Gaussian process prior are comparable. We apply the method to two interesting data sets on functional regression.
Submitted 7 February, 2015; v1 submitted 3 March, 2014;
originally announced March 2014.
-
Bayesian estimation of a sparse precision matrix
Authors:
Sayantan Banerjee,
Subhashis Ghosal
Abstract:
We consider the problem of estimating a sparse precision matrix of a multivariate Gaussian distribution, including the case where the dimension $p$ is large. Gaussian graphical models provide an important tool in describing conditional independence through the presence or absence of edges in the underlying graph. A popular non-Bayesian method of estimating a graphical structure is given by the graphical lasso. In this paper, we consider a Bayesian approach to the problem. We use priors which put a mixture of a point mass at zero and a certain absolutely continuous distribution on the off-diagonal elements of the precision matrix. Hence the resulting posterior distribution can be used for graphical structure learning. The posterior convergence rate of the precision matrix is obtained. The posterior distribution on the model space is extremely cumbersome to compute. We propose a fast computational method for approximating the posterior probabilities of various graphs using the Laplace approximation approach by expanding the posterior density around the posterior mode, which is the graphical lasso by our choice of the prior distribution. We also provide estimates of the accuracy of the approximation.
Submitted 6 April, 2014; v1 submitted 6 September, 2013;
originally announced September 2013.
-
J. K. Ghosh's contribution to statistics: A brief outline
Authors:
Bertrand Clarke,
Subhashis Ghosal
Abstract:
Professor Jayanta Kumar Ghosh has contributed massively to various areas of Statistics over the last five decades. Here, we survey some of his most important contributions. In roughly chronological order, we discuss his major results in the areas of sequential analysis, foundations, asymptotics, and Bayesian inference. It is seen that he progressed from thinking about data points, to thinking about data summarization, to the limiting cases of data summarization as they relate to parameter estimation, and then to more general aspects of modeling including prior and model selection.
Submitted 20 May, 2008;
originally announced May 2008.