-
A Bayesian Survival Tree Partition Model Using Latent Gaussian Processes
Authors:
Richard D. Payne,
Nilabja Guha,
Bani K. Mallick
Abstract:
Survival models are used to analyze time-to-event data in a variety of disciplines. Proportional hazard models provide interpretable parameter estimates, but proportional hazards assumptions are not always appropriate. Non-parametric models are more flexible but often lack a clear inferential framework. We propose a Bayesian tree partition model which is both flexible and inferential. Inference is…
▽ More
Survival models are used to analyze time-to-event data in a variety of disciplines. Proportional hazard models provide interpretable parameter estimates, but proportional hazards assumptions are not always appropriate. Non-parametric models are more flexible but often lack a clear inferential framework. We propose a Bayesian tree partition model which is both flexible and inferential. Inference is obtained through the posterior tree structure and flexibility is preserved by modeling the the hazard function in each partition using a latent exponentiated Gaussian process. An efficient reversible jump Markov chain Monte Carlo algorithm is accomplished by marginalizing the parameters in each partition element via a Laplace approximation. Consistency properties for the estimator are established. The method can be used to help determine subgroups as well as prognostic and/or predictive biomarkers in time-to-event data. The method is applied to a liver survival dataset and is compared with some existing methods on simulated data.
△ Less
Submitted 7 July, 2022;
originally announced July 2022.
-
A Conditional Density Estimation Partition Model Using Logistic Gaussian Processes
Authors:
Richard D. Payne,
Nilabja Guha,
Yu Ding,
Bani K. Mallick
Abstract:
Conditional density estimation (density regression) estimates the distribution of a response variable y conditional on covariates x. Utilizing a partition model framework, a conditional density estimation method is proposed using logistic Gaussian processes. The partition is created using a Voronoi tessellation and is learned from the data using a reversible jump Markov chain Monte Carlo algorithm…
▽ More
Conditional density estimation (density regression) estimates the distribution of a response variable y conditional on covariates x. Utilizing a partition model framework, a conditional density estimation method is proposed using logistic Gaussian processes. The partition is created using a Voronoi tessellation and is learned from the data using a reversible jump Markov chain Monte Carlo algorithm. The Markov chain Monte Carlo algorithm is made possible through a Laplace approximation on the latent variables of the logistic Gaussian process model. This approximation marginalizes the parameters in each partition element, allowing an efficient search of the posterior distribution of the tessellation. The method has desirable consistency properties. In simulation and applications, the model successfully estimates the partition structure and conditional distribution of y.
△ Less
Submitted 20 March, 2017;
originally announced March 2017.
-
Two-Stage Metropolis-Hastings for Tall Data
Authors:
Richard D. Payne,
Bani K. Mallick
Abstract:
This paper discusses the challenges presented by tall data problems associated with Bayesian classification (specifically binary classification) and the existing methods to handle them. Current methods include parallelizing the likelihood, subsampling, and consensus Monte Carlo. A new method based on the two-stage Metropolis-Hastings algorithm is also proposed. The purpose of this algorithm is to…
▽ More
This paper discusses the challenges presented by tall data problems associated with Bayesian classification (specifically binary classification) and the existing methods to handle them. Current methods include parallelizing the likelihood, subsampling, and consensus Monte Carlo. A new method based on the two-stage Metropolis-Hastings algorithm is also proposed. The purpose of this algorithm is to reduce the exact likelihood computational cost in the tall data situation. In the first stage, a new proposal is tested by the approximate likelihood based model. The full likelihood based posterior computation will be conducted only if the proposal passes the first stage screening. Furthermore, this method can be adopted into the consensus Monte Carlo framework. The two-stage method is applied to logistic regression, hierarchical logistic regression, and Bayesian multivariate adaptive regression splines.
△ Less
Submitted 20 March, 2017; v1 submitted 20 November, 2014;
originally announced November 2014.