Search | arXiv e-print repository

Loss-based prior for tree topologies in BART models

Authors: F. Serafini, F. Leisen, C. Villa, K. Wilson

Abstract: We present a novel prior for tree topology within Bayesian Additive Regression Trees (BART) models. This approach quantifies the hypothetical loss in information and the loss due to complexity associated with choosing the wrong tree structure. The resulting prior distribution is compellingly geared toward sparsity, a critical feature considering BART models' tendency to overfit. Our method incorporates prior knowledge into the distribution via two parameters that govern the tree's depth and balance between its left and right branches. Additionally, we propose a default calibration for these parameters, offering an objective version of the prior. We demonstrate our method's efficacy on both simulated and real datasets.

Submitted 30 March, 2024; originally announced April 2024.

arXiv:2403.16828 [pdf, other]

Asymptotics of predictive distributions driven by sample means and variances

Authors: Samuele Garelli, Fabrizio Leisen, Luca Pratelli, Pietro Rigo

Abstract: Let $α_n(\cdot)=P\bigl(X_{n+1}\in\cdot\mid X_1,\ldots,X_n\bigr)$ be the predictive distributions of a sequence $(X_1,X_2,\ldots)$ of $p$-variate random variables. Suppose $$α_n=\mathcal{N}_p(M_n,Q_n)$$ where $M_n=\frac{1}{n}\sum_{i=1}^nX_i$ and $Q_n=\frac{1}{n}\sum_{i=1}^n(X_i-M_n)(X_i-M_n)^t$. Then, there is a random probability measure $α$ on $\mathbb{R}^p$ such that $α_n\rightarrowα$ weakly a.s. If $p\in\{1,2\}$, one also obtains $\lVertα_n-α\rVert\overset{a.s.}\longrightarrow 0$ where $\lVert\cdot\rVert$ is total variation distance. Moreover, the convergence rate of $\lVertα_n-α\rVert$ is arbitrarily close to $n^{-1/2}$. These results (apart from the one regarding the convergence rate) still apply even if $α_n=\mathcal{L}_p(M_n,Q_n)$, where $\mathcal{L}_p$ belongs to a class of distributions much larger than the normal. Finally, the asymptotic behavior of copula-based predictive distributions (introduced in [13]) is investigated and a numerical experiment is performed.
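As a rough illustration of the predictive rule stated in this abstract, the sketch below simulates the univariate case ($p=1$), drawing each new observation from a normal distribution whose mean and variance are the running sample mean $M_n$ and the running $1/n$ sample variance $Q_n$. The function name `simulate_predictive_sequence`, the fixed seed, and the choice to start from a few user-supplied initial observations are assumptions made for this example; they are not taken from the paper.

```python
import math
import random
import statistics


def simulate_predictive_sequence(x_init, n_steps, seed=0):
    """Extend x_init by n_steps draws, each from N(M_n, Q_n).

    M_n is the sample mean and Q_n the population (1/n) variance of the
    observations generated so far. Starting values are an assumption of
    this sketch: with a single point, Q_1 = 0 and the rule is degenerate.
    """
    rng = random.Random(seed)
    xs = [float(x) for x in x_init]
    for _ in range(n_steps):
        m_n = statistics.fmean(xs)        # M_n = (1/n) * sum of X_i
        q_n = statistics.pvariance(xs)    # Q_n = (1/n) * sum of (X_i - M_n)^2
        xs.append(rng.gauss(m_n, math.sqrt(q_n)))
    return xs


if __name__ == "__main__":
    path = simulate_predictive_sequence([0.0, 1.0, -1.0], n_steps=1000)
    print("final sample mean:", statistics.fmean(path))
    print("final sample variance:", statistics.pvariance(path))
```

Running the script with different seeds gives different limiting means and variances, which is in line with the abstract's statement that the limit $α$ is a random probability measure.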

Submitted 27 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

arXiv:2212.09398 [pdf, other]

Generating knockoffs via conditional independence

Authors: Emanuela Dreassi, Fabrizio Leisen, Luca Pratelli, Pietro Rigo

Abstract: Let $X$ be a $p$-variate random vector and $\widetilde{X}$ a knockoff copy of $X$ (in the sense of \cite{CFJL18}). A new approach for constructing $\widetilde{X}$ (henceforth, NA) has been introduced in \cite{JSPI}. NA has essentially three advantages: (i) To build $\widetilde{X}$ is straightforward; (ii) The joint distribution of $(X,\widetilde{X})$ can be written in closed form; (iii) $\widetilde{X}$ is often optimal under various criteria. However, for NA to apply, $X_1,\ldots, X_p$ should be conditionally independent given some random element $Z$. Our first result is that any probability measure $μ$ on $\mathbb{R}^p$ can be approximated by a probability measure $μ_0$ of the form $$μ_0\bigl(A_1\times\ldots\times A_p\bigr)=E\Bigl\{\prod_{i=1}^p P(X_i\in A_i\mid Z)\Bigr\}.$$ The approximation is in total variation distance when $μ$ is absolutely continuous, and an explicit formula for $μ_0$ is provided. If $X\simμ_0$, then $X_1,\ldots,X_p$ are conditionally independent. Hence, with a negligible error, one can assume $X\simμ_0$ and build $\widetilde{X}$ through NA. Our second result is a characterization of the knockoffs $\widetilde{X}$ obtained via NA. It is shown that $\widetilde{X}$ is of this type if and only if the pair $(X,\widetilde{X})$ can be extended to an infinite sequence so as to satisfy certain invariance conditions. The basic tool for proving this fact is de Finetti's theorem for partially exchangeable sequences. In addition to the quoted results, an explicit formula for the conditional distribution of $\widetilde{X}$ given $X$ is obtained in a few cases. In one such case, it is assumed $X_i\in\{0,1\}$ for all $i$.

Submitted 13 December, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: 26 pages

MSC Class: 62E10; 62H05; 60E05; 62J02

arXiv:2208.06785 [pdf, ps, other]

A probabilistic view on predictive constructions for Bayesian learning

Authors: Patrizia Berti, Emanuela Dreassi, Fabrizio Leisen, Pietro Rigo, Luca Pratelli

Abstract: Given a sequence $X=(X_1,X_2,\ldots)$ of random observations, a Bayesian forecaster aims to predict $X_{n+1}$ based on $(X_1,\ldots,X_n)$ for each $n\ge 0$. To this end, in principle, she only needs to select a collection $σ=(σ_0,σ_1,\ldots)$, called "strategy" in what follows, where $σ_0(\cdot)=P(X_1\in\cdot)$ is the marginal distribution of $X_1$ and $σ_n(\cdot)=P(X_{n+1}\in\cdot\mid X_1,\ldots,X_n)$ the $n$-th predictive distribution. Because of the Ionescu-Tulcea theorem, $σ$ can be assigned directly, without passing through the usual prior/posterior scheme. One main advantage is that no prior probability is to be selected. In a nutshell, this is the predictive approach to Bayesian learning. A concise review of the latter is provided in this paper. We try to put such an approach in the right framework, to clear up a few misunderstandings, and to provide a unifying view. Some recent results are discussed as well. In addition, some new strategies are introduced and the corresponding distribution of the data sequence $X$ is determined. The strategies concern generalized Pólya urns, random change points, covariates and stationary sequences.

Submitted 27 January, 2023; v1 submitted 14 August, 2022; originally announced August 2022.

arXiv:2106.00114 [pdf, ps, other]

Kernel based Dirichlet sequences

Authors: Patrizia Berti, Emanuela Dreassi, Fabrizio Leisen, Luca Pratelli, Pietro Rigo

Abstract: Let $X=(X_1,X_2,\ldots)$ be a sequence of random variables with values in a standard space $(S,\mathcal{B})$. Suppose \begin{gather*} X_1\simν\quad\text{and}\quad P\bigl(X_{n+1}\in\cdot\mid X_1,\ldots,X_n\bigr)=\frac{θν(\cdot)+\sum_{i=1}^nK(X_i)(\cdot)}{n+θ}\quad\quad\text{a.s.} \end{gather*} where $θ>0$ is a constant, $ν$ a probability measure on $\mathcal{B}$, and $K$ a random probability measure on $\mathcal{B}$. Then, $X$ is exchangeable whenever $K$ is a regular conditional distribution for $ν$ given any sub-$σ$-field of $\mathcal{B}$. Under this assumption, $X$ enjoys all the main properties of classical Dirichlet sequences, including Sethuraman's representation, conjugacy property, and convergence in total variation of predictive distributions. If $μ$ is the weak limit of the empirical measures, conditions for $μ$ to be a.s. discrete, or a.s. non-atomic, or $μ\llν$ a.s., are provided. Two CLTs are proved as well. The first deals with stable convergence while the second concerns total variation distance.

Submitted 2 April, 2022; v1 submitted 31 May, 2021; originally announced June 2021.

arXiv:2104.11643 [pdf, ps, other]

Bayesian predictive inference without a prior

Authors: Patrizia Berti, Emanuela Dreassi, Fabrizio Leisen, Pietro Rigo, Luca Pratelli

Abstract: Let $(X_n:n\ge 1)$ be a sequence of random observations. Let $σ_n(\cdot)=P\bigl(X_{n+1}\in\cdot\mid X_1,\ldots,X_n\bigr)$ be the $n$-th predictive distribution and $σ_0(\cdot)=P(X_1\in\cdot)$ the marginal distribution of $X_1$. In a Bayesian framework, to make predictions on $(X_n)$, one only needs the collection $σ=(σ_n:n\ge 0)$. Because of the Ionescu-Tulcea theorem, $σ$ can be assigned directly, without passing through the usual prior/posterior scheme. One main advantage is that no prior probability has to be selected. In this paper, $σ$ is subjected to two requirements: (i) The resulting sequence $(X_n)$ is conditionally identically distributed, in the sense of Berti, Pratelli and Rigo (2004); (ii) Each $σ_{n+1}$ is a simple recursive update of $σ_n$. Various new $σ$ satisfying (i)-(ii) are introduced and investigated. For such $σ$, the asymptotics of $σ_n$, as $n\rightarrow\infty$, is determined. In some cases, the probability distribution of $(X_n)$ is also evaluated.

Submitted 26 April, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

arXiv:2104.07752 [pdf, ps, other]

New perspectives on knockoffs construction

Authors: Patrizia Berti, Emanuela Dreassi, Fabrizio Leisen, Luca Pratelli, Pietro Rigo

Abstract: Let $Λ$ be the collection of all probability distributions for $(X,\widetilde{X})$, where $X$ is a fixed random vector and $\widetilde{X}$ ranges over all possible knockoff copies of $X$ (in the sense of \cite{CFJL18}). Three topics are developed in this paper: (i) A new characterization of $Λ$ is proved; (ii) A certain subclass of $Λ$, defined in terms of copulas, is introduced; (iii) The (meaningful) special case where the components of $X$ are conditionally independent is treated in depth. In real problems, after observing $X=x$, each of points (i)-(ii)-(iii) may be useful to generate a value $\widetilde{x}$ for $\widetilde{X}$ conditionally on $X=x$.

Submitted 29 July, 2022; v1 submitted 15 April, 2021; originally announced April 2021.

arXiv:2007.11700 [pdf, other]

A Copula-based Fully Bayesian Nonparametric Evaluation of Cardiovascular Risk Markers in the Mexico City Diabetes Study

Authors: Claudia Wehrhahn, Ruth Fuentes-García, Ramsés H. Mena, Fabrizio Leisen, Maria Elena González-Villalpando, Clicerio González-Villalpando

Abstract: Cardiovascular disease is the leading cause of death worldwide, and several studies have been carried out to understand and explore cardiovascular risk markers in normoglycemic and diabetic populations. In this work, we explore the association structure between hyperglycemic markers and cardiovascular risk markers controlled by triglycerides, body mass index, age and gender, for the normoglycemic population in The Mexico City Diabetes Study. Understanding the association structure could contribute to the assessment of additional cardiovascular risk markers in this low income urban population with a high prevalence of classic cardiovascular risk biomarkers. The association structure is measured by conditional Kendall's tau, defined through conditional copula functions. The latter are in turn modeled under a fully Bayesian nonparametric approach, which allows the complete shape of the copula function to vary for different values of the controlled covariates.

Submitted 20 August, 2021; v1 submitted 22 July, 2020; originally announced July 2020.

arXiv:2007.05336 [pdf, ps, other]

Completely Random Measures and Lévy Bases in Free probability

Authors: Francesca Collet, Fabrizio Leisen, Steen Thorbjørnsen

Abstract: This paper develops a theory for completely random measures in the framework of free probability. A general existence result for free completely random measures is established, and in analogy to the classical work of Kingman it is proved that such random measures can be decomposed into the sum of a purely atomic part and a (freely) infinitely divisible part. The latter part (termed a free Lévy basis) is studied in detail in terms of the free Lévy-Khintchine representation and a theory parallel to the classical work of Rajput and Rosinski is developed. Finally a Lévy-Itô type decomposition for general free Lévy bases is established.

Submitted 10 July, 2020; originally announced July 2020.

arXiv:1909.12112 [pdf, other]

Compound vectors of subordinators and their associated positive Lévy copulas

Authors: Alan Riva Palacio, Fabrizio Leisen

Abstract: Lévy copulas are an important tool which can be used to build dependent Lévy processes. In a classical setting, they have been used to model financial applications. In a Bayesian framework they have been employed to introduce dependent nonparametric priors which allow one to model heterogeneous data. This paper focuses on introducing a new class of Lévy copulas based on a class of subordinators that recently appeared in the literature, called \textit{Compound Random Measures}. The well-known Clayton Lévy copula is a special case of this new class. Furthermore, we provide some novel results about the underlying vector of subordinators such as a series representation and relevant moments. The article concludes with an application to a Danish fire dataset.

Submitted 31 August, 2020; v1 submitted 26 September, 2019; originally announced September 2019.

arXiv:1909.02989 [pdf, other]

A Pólya-Gamma Sampler for a Generalized Logistic Regression

Authors: Luciana Dalla Valle, Fabrizio Leisen, Luca Rossini, Weixuan Zhu

Abstract: In this paper we introduce a novel Bayesian data augmentation approach for estimating the parameters of the generalised logistic regression model. We propose a Pólya-Gamma sampler algorithm that allows us to sample from the exact posterior distribution, rather than relying on approximations. A simulation study illustrates the flexibility and accuracy of the proposed approach to capture heavy and light tails in binary response data of different dimensions. The methodology is applied to two different real datasets, where we demonstrate that the Pólya-Gamma sampler provides more precise estimates than the empirical likelihood method, outperforming approximate approaches.

Submitted 21 December, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

Comments: Revised Version of the paper

arXiv:1812.07271 [pdf, ps, other]

On a flexible construction of a negative binomial model

Authors: Fabrizio Leisen, Ramsés H. Mena, Freddy Palma Mancilla, Luca Rossini

Abstract: This work presents a construction of stationary Markov models with negative-binomial marginal distributions. A simple closed form expression for the corresponding transition probabilities is given, linking the proposal to well-known classes of birth and death processes and thus revealing interesting characterizations. The advantage of having such closed form expressions is tested on simulated and real data.

Submitted 9 April, 2019; v1 submitted 18 December, 2018; originally announced December 2018.

Comments: Forthcoming in "Statistics & Probability Letters"

arXiv:1812.05531 [pdf, other]

A Loss-Based Prior for Gaussian Graphical Models

Authors: Laurentiu Catalin Hinoveanu, Fabrizio Leisen, Cristiano Villa

Abstract: Gaussian graphical models play an important role in various areas such as genetics, finance, statistical physics and others. They are a powerful modelling tool which allows one to describe the relationships among the variables of interest. From the Bayesian perspective, there are two sources of randomness: one is related to the multivariate distribution and the quantities that may parametrise the model, the other has to do with the underlying graph, $G$, equivalent to describing the conditional independence structure of the model under consideration. In this paper, we propose a prior on $G$ based on two loss components. One considers the loss in information one would incur in selecting the wrong graph, while the second penalises a large number of edges, favouring sparsity. We illustrate the prior on simulated data and on real datasets, and compare the results with other priors on $G$ used in the literature. Moreover, we present a default choice of the prior as well as discuss how it can be calibrated so as to reflect available prior information.

Submitted 18 April, 2020; v1 submitted 13 December, 2018; originally announced December 2018.

arXiv:1802.05292 [pdf, ps, other]

Loss-based approach to two-piece location-scale distributions with applications to dependent data

Authors: Fabrizio Leisen, Luca Rossini, Cristiano Villa

Abstract: Two-piece location-scale models are used for modeling data presenting departures from symmetry. In this paper, we propose an objective Bayesian methodology for the tail parameter of two particular distributions of the above family: the skewed exponential power distribution and the skewed generalised logistic distribution. We apply the proposed objective approach to time series models and linear regression models where the error terms follow the distributions under study. The performance of the proposed approach is illustrated through simulation experiments and real data analysis. The methodology yields improvements in density forecasts, as shown by the analysis we carry out on the electricity prices in Nordpool markets.

Submitted 28 November, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

Comments: 26 pages, 6 Figures

arXiv:1802.00796 [pdf, ps, other]

Bayes Calculations from Quantile Implied Likelihood

Authors: George Karabatsos, Fabrizio Leisen

Abstract: In statistical practice, a realistic Bayesian model for a given data set can be defined by a likelihood function that is analytically or computationally intractable, due to large data sample size, high parameter dimensionality, or complex likelihood functional form. This in turn poses challenges to the computation and inference of the posterior distribution of the model parameters. For such a model, a tractable likelihood function is introduced which approximates the exact likelihood through its quantile function. It is defined by an asymptotic chi-square confidence distribution for a pivotal quantity, which is generated by the asymptotic normal distribution of the sample quantiles given model parameters. This Quantile Implied Likelihood (QIL) gives rise to an approximate posterior distribution which can be estimated by using penalized log-likelihood maximization or any suitable Monte Carlo algorithm. The QIL approach to Bayesian Computation is illustrated through the Bayesian analysis of simulated and real data sets having sample sizes that reach the millions. The analyses involve various models for univariate or multivariate iid or non-iid data, with low or high parameter dimensionality, many of which are defined by intractable likelihoods. The probability models include the Student's t, g-and-h, and g-and-k distributions; the Bayesian logit regression model with many covariates; exponential random graph model, a doubly-intractable model for networks; the multivariate skew normal model, for robust inference of the inverse-covariance matrix when it is large relative to the sample size; and the Wallenius distribution model.

Submitted 16 March, 2019; v1 submitted 2 February, 2018; originally announced February 2018.

arXiv:1801.08495 [pdf, other]

Limiting behaviour of the stationary search cost distribution driven by a generalized gamma process

Authors: Alfred Kume, Fabrizio Leisen, Antonio Lijoi

Abstract: Consider a list of labeled objects that are organized in a heap. At each time, object $j$ is selected with probability $p_j$ and moved to the top of the heap. This procedure defines a Markov chain on the set of permutations which is referred to in the literature as the Move-to-Front rule. The present contribution focuses on the stationary search cost, namely the position of the requested item in the heap when the Markov chain is in equilibrium. We consider the scenario where the number of objects is infinite and the probabilities $p_j$ are defined as the normalization of the increments of a subordinator. In this setting, we provide an exact formula for the moments of any order of the stationary search cost distribution. We illustrate the new findings in the case of a generalized gamma subordinator and deal with an extension to the two-parameter Poisson-Dirichlet process, also known as the Pitman-Yor process.
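The Move-to-Front rule described in this abstract can be sketched for a finite list as follows. This is only an illustration under stated assumptions: the paper treats infinitely many objects with random $p_j$, whereas the sketch uses a fixed, finite probability vector; the function name `move_to_front_costs`, the seed, and the convention that the search cost is the 1-based position of the requested item are choices made for this example.

```python
import random


def move_to_front_costs(probs, n_requests, seed=0):
    """Simulate the Move-to-Front chain on a finite heap.

    probs[j] is the (fixed) request probability of object j. Returns the
    final heap order (index 0 is the top) and the list of search costs,
    where a cost is the 1-based position of the requested object just
    before it is moved to the top.
    """
    rng = random.Random(seed)
    objects = list(range(len(probs)))
    heap = objects[:]                     # initial order: 0 on top
    costs = []
    for _ in range(n_requests):
        j = rng.choices(objects, weights=probs, k=1)[0]
        position = heap.index(j)
        costs.append(position + 1)
        heap.pop(position)                # remove the requested object ...
        heap.insert(0, j)                 # ... and put it on top
    return heap, costs


if __name__ == "__main__":
    heap, costs = move_to_front_costs([0.5, 0.3, 0.15, 0.05], 10_000)
    burn_in = 1_000                       # discard early requests (assumed)
    tail = costs[burn_in:]
    print("approximate mean stationary search cost:", sum(tail) / len(tail))
```

Averaging the costs after a burn-in period gives a Monte Carlo estimate of the mean stationary search cost for the chosen finite $p_j$; the exact moment formulas are the subject of the paper itself.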

Submitted 25 January, 2018; originally announced January 2018.

arXiv:1708.05341 [pdf, ps, other]

An Approximate Likelihood Perspective on ABC Methods

Authors: George Karabatsos, Fabrizio Leisen

Abstract: We are living in the big data era, as current technologies and networks allow for the easy and routine collection of data sets in different disciplines. Bayesian Statistics offers a flexible modeling approach which is attractive for describing the complexity of these datasets. These models often exhibit a likelihood function which is intractable due to the large sample size, high number of parameters, or functional complexity. Approximate Bayesian Computational (ABC) methods provide likelihood-free methods for performing statistical inferences with Bayesian models defined by intractable likelihood functions. The vastness of the literature on ABC methods created a need to review and relate all ABC approaches so that scientists can more readily understand and apply them for their own work. This article provides a unifying review, general representation, and classification of all ABC methods from the view of approximate likelihood theory. This clarifies how ABC methods can be characterized, related, combined, improved, and applied for future research. Possible future research in ABC is then suggested.

Submitted 8 May, 2018; v1 submitted 17 August, 2017; originally announced August 2017.

arXiv:1707.06768 [pdf, ps, other]

Integrability conditions for Compound Random Measures

Authors: Alan Riva Palacio, Fabrizio Leisen

Abstract: Compound random measures (CoRM's) are a flexible and tractable framework for vectors of completely random measures. In this paper, we provide conditions to guarantee the existence of a CoRM. Furthermore, we prove some interesting properties of CoRM's when exponential scores and regularly varying Lévy intensities are considered.

Submitted 12 November, 2017; v1 submitted 21 July, 2017; originally announced July 2017.

arXiv:1706.00599 [pdf, other]

On a Class of Objective Priors from Scoring Rules

Authors: Fabrizio Leisen, Cristiano Villa, Stephen G. Walker

Abstract: Objective prior distributions represent an important tool that allows one to have the advantages of using the Bayesian framework even when information about the parameters of a model is not available. The usual objective approaches work off the chosen statistical model and in the majority of cases the resulting prior is improper, which can pose limitations to a practical implementation, even when the complexity of the model is moderate. In this paper we propose to take a novel look at the construction of objective prior distributions, where the connection with a chosen sampling distribution model is removed. We explore the notion of defining objective prior distributions which allow one to have some degree of flexibility, in particular in exhibiting some desirable features, such as being proper, or centered on specific values which would be of interest in nested model comparisons. The basic tools we use are proper scoring rules and the main result is a class of objective prior distributions that can be employed in scenarios where the usual model based priors fail, such as mixture models and model selection via Bayes factors. In addition, we show that the proposed class of priors is the result of minimising the information it contains, providing solid interpretation to the method.

Submitted 23 September, 2018; v1 submitted 2 June, 2017; originally announced June 2017.

arXiv:1704.07645 [pdf, other]

Bayesian nonparametric estimation of survival functions with multiple-samples information

Authors: Alan Riva Palacio, Fabrizio Leisen

Abstract: In many real problems, dependence structures more general than exchangeability are required. For instance, in some settings partial exchangeability is a more reasonable assumption. For this reason, vectors of dependent Bayesian nonparametric priors have recently gained popularity. They provide flexible models which are tractable from a computational and theoretical point of view. In this paper, we focus on their use for estimating survival functions with multiple-samples information. Our methodology allows one to model the dependence among survival times of different groups of observations and extends previous work to an arbitrary dimension. Theoretical results about the posterior behaviour of the underlying dependent vector of completely random measures are provided. The performance of the model is tested on a simulated dataset arising from a distributional Clayton copula.

Submitted 18 March, 2018; v1 submitted 25 April, 2017; originally announced April 2017.

arXiv:1702.05462 [pdf, other]

Objective Bayesian Analysis for Change Point Problems

Authors: Laurentiu Hinoveanu, Fabrizio Leisen, Cristiano Villa

Abstract: In this paper we present a loss-based approach to change point analysis. In particular, we look at the problem from two perspectives. The first focuses on the definition of a prior when the number of change points is known a priori. The second contribution aims to estimate the number of change points by using a loss-based approach recently introduced in the literature. The latter considers change point estimation as a model selection exercise. We show the performance of the proposed approach on simulated data and real data sets.

Submitted 7 January, 2018; v1 submitted 17 February, 2017; originally announced February 2017.

arXiv:1701.08142 [pdf, other]

Modelling Preference Data with the Wallenius Distribution

Authors: Clara Grazian, Fabrizio Leisen, Brunero Liseo

Abstract: The Wallenius distribution is a generalisation of the Hypergeometric distribution where weights are assigned to balls of different colours. This naturally defines a model for ranking categories which can be used for classification purposes. Since, in general, the resulting likelihood is not analytically available, we adopt an approximate Bayesian computational (ABC) approach for estimating the importance of the categories. We illustrate the performance of the estimation procedure on simulated datasets. Finally, we use the new model for analysing two datasets about movie ratings and Italian academic statisticians' journal preferences. The latter is a novel dataset collected by the authors.

Submitted 28 June, 2018; v1 submitted 27 January, 2017; originally announced January 2017.

Comments: 3 figures
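
The urn mechanism behind the Wallenius distribution mentioned in the abstract above can be sketched in a few lines. This is only an illustrative simulation, not the authors' ABC procedure: the function name `wallenius_draw`, the counts, and the weights are hypothetical choices. It assumes the standard urn description, in which balls are removed one at a time and a colour is picked with probability proportional to its weight times the number of its balls still in the urn.

```python
import random


def wallenius_draw(counts, weights, n, rng):
    """Draw n balls one at a time, without replacement, from a weighted urn.

    counts[i]  -- number of balls of colour i initially in the urn
    weights[i] -- positive weight of colour i (illustrative values only)
    At each step a colour is chosen with probability proportional to
    weights[i] * (balls of colour i still in the urn).
    Returns the number of balls drawn of each colour.
    """
    if n > sum(counts):
        raise ValueError("cannot draw more balls than the urn contains")
    remaining = list(counts)
    drawn = [0] * len(counts)
    for _ in range(n):
        # Exhausted colours get weight 0 and can no longer be selected.
        step_weights = [w * r for w, r in zip(weights, remaining)]
        colour = rng.choices(range(len(counts)), weights=step_weights)[0]
        remaining[colour] -= 1
        drawn[colour] += 1
    return drawn


# Hypothetical example: three colours, five balls each, decreasing weights.
rng = random.Random(0)
sample = wallenius_draw(counts=[5, 5, 5], weights=[3.0, 1.0, 0.5], n=6, rng=rng)
print(sample)
```

Repeating such draws for candidate weight vectors and comparing them with observed counts is the kind of forward simulation a likelihood-free method can build on, since only sampling from the model is required.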

arXiv:1608.00874 [pdf, other]

Modelling and computation using NCoRM mixtures for density regression

Authors: Jim Griffin, Fabrizio Leisen

Abstract: Normalized compound random measures are flexible nonparametric priors for related distributions. We consider building general nonparametric regression models using normalized compound random measure mixture models. Posterior inference is made using a novel pseudo-marginal Metropolis-Hastings sampler for normalized compound random measure mixture models. The algorithm makes use of a new general approach to the unbiased estimation of Laplace functionals of compound random measures (which includes completely random measures as a special case). The approach is illustrated on problems of density regression.

Submitted 31 August, 2017; v1 submitted 2 August, 2016; originally announced August 2016.

arXiv:1607.04796 [pdf, other]

Objective Bayesian modelling of insurance risks with the skewed Student-t distribution

Authors: Fabrizio Leisen, Juan Miguel Marin, Cristiano Villa

Abstract: Insurance risk data typically exhibit skewed behaviour. In this paper, we propose a Bayesian approach to capture the main features of these datasets. This work extends the methodology introduced in Villa and Walker (2014a) by considering an extra parameter which captures the skewness of the data. In particular, a skewed Student-t distribution is considered. Two datasets are analysed: the Danish fire losses and the US indemnity loss. The analysis is carried out with an objective Bayesian approach. For the discrete parameter representing the number of degrees of freedom, we adopt a novel prior recently introduced in Villa and Walker (2014b).

Submitted 16 July, 2016; originally announced July 2016.

arXiv:1604.07304 [pdf, ps, other]

doi 10.1080/00949655.2016.1255741

A Note on the Posterior Inference for the Yule-Simon Distribution

Authors: Fabrizio Leisen, Luca Rossini, Cristiano Villa

Abstract: The Yule-Simon distribution has so far been off the radar of the Bayesian community. In this note, we propose an explicit Gibbs sampling scheme when a Gamma prior is chosen for the shape parameter. The performance of the algorithm is illustrated with simulation studies, including count data regression, and a real data application to text analysis. We compare our proposal to the frequentist counterparts, showing better performance of our algorithm when a small sample size is considered.

Submitted 31 October, 2016; v1 submitted 25 April, 2016; originally announced April 2016.

Comments: Forthcoming in the "Journal of Statistical Computation and Simulation" - 12 pages, 4 Figures, 3 Tables

Journal ref: Journal of Statistical Computation and Simulation (2017), 87:6, 1179-1188

arXiv:1604.05661 [pdf, ps, other]

doi 10.1007/s00180-017-0735-1

Objective Bayesian Analysis of the Yule-Simon Distribution with Applications

Authors: Fabrizio Leisen, Luca Rossini, Cristiano Villa

Abstract: The Yule-Simon distribution is usually employed in the analysis of frequency data. As the Bayesian literature has so far ignored this distribution, here we show the derivation of two objective priors for the parameter of the Yule-Simon distribution. In particular, we discuss the Jeffreys prior and a loss-based prior, which has recently appeared in the literature. We illustrate the performance of the derived priors through a simulation study and the analysis of real datasets.

Submitted 19 April, 2016; originally announced April 2016.

Comments: 24 pages, 11 Figures, 7 Tables

arXiv:1603.03484 [pdf, ps, other]

Bayesian Nonparametric Conditional Copula Estimation of Twin Data

Authors: Luciana Dalla Valle, Fabrizio Leisen, Luca Rossini

Abstract: Several studies on heritability in twins aim at understanding the different contributions of environmental and genetic factors to specific traits. Considering the National Merit Twin Study, our purpose is to correctly analyse the influence of the socioeconomic status on the relationship between twins' cognitive abilities. Our methodology is based on conditional copulas, which allow us to model the effect of a covariate driving the strength of dependence between the main variables. We propose a flexible Bayesian nonparametric approach for the estimation of conditional copulas, which can model any conditional copula density. Our methodology extends the work of Wu et al. (2015) by introducing dependence on a covariate in an infinite mixture model. Our results suggest that environmental factors are more influential in families with lower socio-economic position.

Submitted 3 July, 2017; v1 submitted 10 March, 2016; originally announced March 2016.

Comments: Forthcoming in Journal of the Royal Statistical Society (Series C)

arXiv:1512.01496 [pdf, other]

Embarrassingly Parallel Sequential Markov-chain Monte Carlo for Large Sets of Time Series

Authors: Roberto Casarin, Radu V. Craiu, Fabrizio Leisen

Abstract: Bayesian computation crucially relies on Markov chain Monte Carlo (MCMC) algorithms. In the case of massive data sets, running the Metropolis-Hastings sampler to draw from the posterior distribution becomes prohibitive due to the large number of likelihood terms that need to be calculated at each iteration. In order to perform Bayesian inference for a large set of time series, we consider an algorithm that combines "divide and conquer" ideas previously used to design MCMC algorithms for big data with a sequential MCMC strategy. The performance of the method is illustrated using a large set of financial data.

Submitted 4 December, 2015; originally announced December 2015.

arXiv:1510.07287 [pdf, other]

A Bootstrap Likelihood approach to Bayesian Computation

Authors: Weixuan Zhu, Juan Miguel Marin, Fabrizio Leisen

Abstract: There is an increasing amount of literature focused on Bayesian computational methods to address problems with intractable likelihood. One approach is a set of algorithms known as Approximate Bayesian Computational (ABC) methods. One of the problems of these algorithms is that the performance depends on the tuning of some parameters, such as the summary statistics, distance and tolerance level. To bypass this problem, Mengersen, Pudlo and Robert (2013) introduced an alternative method based on empirical likelihood, which can be easily implemented when a set of constraints, related to the moments of the distribution, is known. However, the choice of the constraints is sometimes challenging. To overcome this problem, we propose an alternative method based on a bootstrap likelihood approach. The method is easy to implement and in some cases it is faster than the other approaches. The performance of the algorithm is illustrated with examples in Population Genetics, Time Series and Stochastic Differential Equations. Finally, we test the method on a real dataset.

Submitted 25 October, 2015; originally announced October 2015.

arXiv:1412.7391 [pdf, other]

Merging exchangeable occupancy models: $\mathcal{M}^{(a)}$-models and relation with the maximum entropy principle

Authors: Francesca Collet, Fabrizio Leisen, Fabio Spizzichino

Abstract: In this paper a new transformation of occupancy models, called merging, is introduced. In particular, we study the effect of merging on a class of occupancy models that was recently introduced in Collet et al. (2013). These results have an interesting interpretation in the so-called entropy maximization inference. The last part of the paper is devoted to highlighting the impact of our findings in this research area.

Submitted 23 December, 2014; originally announced December 2014.

Comments: 16 pages, 1 figure

arXiv:1410.0611 [pdf, other]

Compound random measures and their use in Bayesian nonparametrics

Authors: Jim E. Griffin, Fabrizio Leisen

Abstract: A new class of dependent random measures, which we call compound random measures, is proposed and the use of normalized versions of these random measures as priors in Bayesian nonparametric mixture models is considered. Their tractability allows the properties of both compound random measures and normalized compound random measures to be derived. In particular, we show how compound random measures can be constructed with gamma, $σ$-stable and generalized gamma process marginals. We also derive several forms of the Laplace exponent and characterize dependence through both the Lévy copula and correlation function. A slice sampler and an augmented Pólya urn scheme sampler are described for posterior inference when a normalized compound random measure is used as the mixing measure in a nonparametric mixture model and a data example is discussed.

Submitted 2 September, 2015; v1 submitted 2 October, 2014; originally announced October 2014.

arXiv:1409.1956 [pdf, ps, other]

A Bayesian Beta Markov Random Field Calibration of the Term Structure of Implied Risk Neutral Densities

Authors: Roberto Casarin, Fabrizio Leisen, German Molina, Enrique ter Horst

Abstract: We build on the work in Fackler and King (1990), and propose a more general calibration model for implied risk neutral densities. Our model allows for the joint calibration of a set of densities at different maturities and dates through a Bayesian dynamic Beta Markov Random Field. Our approach allows for possible time dependence between densities with the same maturity, and for dependence across maturities at the same point in time. This approach to the problem encompasses model flexibility, parameter parsimony and, more importantly, information pooling across densities.

Submitted 5 September, 2014; originally announced September 2014.

Comments: 27 pages, 4 figures

arXiv:1308.3779 [pdf, other]

Adaptive Independent Sticky MCMC algorithms

Authors: L. Martino, R. Casarin, F. Leisen, D. Luengo

Abstract: In this work, we introduce a novel class of adaptive Monte Carlo methods, called adaptive independent sticky MCMC algorithms, for efficient sampling from a generic target probability density function (pdf). The new class of algorithms employs adaptive non-parametric proposal densities which become closer and closer to the target as the number of iterations increases. The proposal pdf is built using interpolation procedures based on a set of support points which is constructed iteratively based on previously drawn samples. The algorithm's efficiency is ensured by a test that controls the evolution of the set of support points. This extra stage controls the computational cost and the convergence of the proposal density to the target. Each part of the novel family of algorithms is discussed and several examples are provided. Although the novel algorithms are presented for univariate target densities, we show that they can be easily extended to the multivariate context within a Gibbs-type sampler. The ergodicity is ensured and discussed. Exhaustive numerical examples illustrate the efficiency of sticky schemes, both as stand-alone methods to sample from complicated one-dimensional pdfs and within Gibbs in order to draw from multi-dimensional target distributions.

Submitted 2 January, 2016; v1 submitted 17 August, 2013; originally announced August 2013.

Comments: A preliminary Matlab code is provided at https://www.mathworks.com/matlabcentral/fileexchange/54701-adaptive-independent-sticky-metropolis--aism--algorithm

arXiv:1305.5385

New isometry of Krall-Laguerre orthogonal polynomials in martingale spaces

Authors: Edmundo J. Huertas, Nuria Torrado, Fabrizio Leisen

Abstract: Sets of orthogonal martingales are important because they can be used as stochastic integrators in a kind of chaotic representation property, see [20]. In this paper, we revisit the problem studied by W. Schoutens in [21], investigating how an inner product derived from an Uvarov transformation of the Laguerre weight function is used in the orthogonalization procedure of a sequence of martingales related to a certain Lévy process, called Teugels martingales. Since the Uvarov transformation depends on a parameter $c<0$, we are able to provide infinite sets of strongly orthogonal martingales, one for every $c$ in $(-\infty,0)$. In a fashion similar to [21], we introduce a suitable isometry between the space of polynomials and the space of linear combinations of Teugels martingales as well as the general orthogonalization procedure. Finally, the new construction is applied to the Gamma process.

Submitted 18 November, 2013; v1 submitted 23 May, 2013; originally announced May 2013.

Comments: 13 pages, no figures. This paper has been withdrawn by the authors due to one error in the proof of the isometry

MSC Class: 60G46; 42C05; 60G51; 33C47

arXiv:1112.0867 [pdf, ps, other]

Exchangeable Occupancy Models and Discrete Processes with the Generalized Uniform Order Statistics Property

Authors: Francesca Collet, Fabrizio Leisen, Fabio Spizzichino, Florentina Suter

Abstract: This work focuses on Exchangeable Occupancy Models (EOM) and their relations with the Uniform Order Statistics Property (UOSP) for point processes in discrete time. As our main purpose, we show how definitions and results presented in Shaked, Spizzichino and Suter (2004) can be unified and generalized in the framework of occupancy models. We first show some general facts about EOMs. Then we introduce a class of EOMs, called $\mathcal{M}^{(a)}$-models, and a concept of generalized Uniform Order Statistics Property in discrete time. For processes with this property, we prove a general characterization result in terms of $\mathcal{M}^{(a)}$-models. Our interest is also focused on properties of closure w.r.t. some natural transformations of EOMs.

Submitted 5 January, 2013; v1 submitted 5 December, 2011; originally announced December 2011.

Comments: 27 pages

MSC Class: 62G30; 60G09; 60G55

arXiv:1109.4777 [pdf, ps, other]

Beta-Product Poisson-Dirichlet Processes

Authors: Federico Bassetti, Roberto Casarin, Fabrizio Leisen

Abstract: Time series data may exhibit clustering over time and, in a multiple time series context, the clustering behavior may differ across the series. This paper is motivated by the Bayesian nonparametric modeling of the dependence between the clustering structures and the distributions of different time series. We follow a Dirichlet process mixture approach and introduce a new class of multivariate dependent Dirichlet processes (DDP). The proposed DDP are represented in terms of a vector of stick-breaking processes with dependent weights. The weights are beta random vectors that determine different and dependent clustering effects along the dimension of the DDP vector. We discuss some theoretical properties and provide an efficient Markov chain Monte Carlo algorithm for posterior computation. The effectiveness of the method is illustrated with a simulation study and an application to the United States and the European Union industrial production indexes.

Submitted 22 September, 2011; originally announced September 2011.

arXiv:1107.3991

Free Completely Random Measures

Authors: Francesca Collet, Fabrizio Leisen

Abstract: In this paper a free analogue of completely random measures is introduced. Furthermore, a representation theorem is proved for free completely random measures that are free infinitely divisible.

Submitted 13 July, 2020; v1 submitted 20 July, 2011; originally announced July 2011.

Comments: The manuscript has been superseded by arXiv article 2007.05336

MSC Class: 60G57; 60E07; 46L54

arXiv:1012.0866 [pdf, other]

Generalized Species Sampling Priors with Latent Beta reinforcements

Authors: Edoardo M. Airoldi, Thiago Costa, Federico Bassetti, Fabrizio Leisen, Michele Guindani

Abstract: Many popular Bayesian nonparametric priors can be characterized in terms of exchangeable species sampling sequences. However, in some applications, exchangeability may not be appropriate. We introduce a novel and probabilistically coherent family of non-exchangeable species sampling sequences characterized by a tractable predictive probability function with weights driven by a sequence of independent Beta random variables. We compare their theoretical clustering properties with those of the Dirichlet Process and the two-parameter Poisson-Dirichlet process. The proposed construction provides a complete characterization of the joint process, unlike existing work. We then propose the use of such a process as a prior distribution in a hierarchical Bayes modeling framework, and we describe a Markov Chain Monte Carlo sampler for posterior inference. We evaluate the performance of the prior and the robustness of the resulting inference in a simulation study, providing a comparison with popular Dirichlet Process mixtures and Hidden Markov Models. Finally, we develop an application to the detection of chromosomal aberrations in breast cancer by leveraging array CGH data.

Submitted 1 August, 2014; v1 submitted 3 December, 2010; originally announced December 2010.

Comments: To appear in the Journal of the American Statistical Association

arXiv:1011.1170 [pdf, ps, other]

doi 10.1007/s11222-011-9301-9

Interacting Multiple Try Algorithms with Different Proposal Distributions

Authors: Roberto Casarin, Radu V. Craiu, Fabrizio Leisen

Abstract: We propose a new class of interacting Markov chain Monte Carlo (MCMC) algorithms designed for increasing the efficiency of a modified multiple-try Metropolis (MTM) algorithm. The extension with respect to the existing MCMC literature is twofold. The sampler proposed extends the basic MTM algorithm by allowing different proposal distributions in the multiple-try generation step. We exploit the structure of the MTM algorithm with different proposal distributions to naturally introduce an interacting MTM mechanism (IMTM) that expands the class of population Monte Carlo methods. We show the validity of the algorithm and discuss the choice of the selection weights and of the different proposals. We provide numerical studies which show that the new algorithm can perform better than the basic MTM algorithm and that the interaction mechanism allows the IMTM to efficiently explore the state space.

Submitted 4 November, 2010; originally announced November 2010.

Journal ref: Statistics and Computing, 23, No. 2, 185-200, 2013

arXiv:1008.0121 [pdf, ps, other]

Bayesian Model Selection for Beta Autoregressive Processes

Authors: R. Casarin, L. Dalla Valle, F. Leisen

Abstract: We deal with Bayesian inference for Beta autoregressive processes. We restrict our attention to the class of conditionally linear processes. These processes are particularly suitable for forecasting purposes, but are difficult to estimate due to the constraints on the parameter space. We provide a full Bayesian approach to the estimation and include the parameter restrictions in the inference problem by a suitable specification of the prior distributions. Moreover, in a Bayesian framework, parameter estimation and model choice can be solved simultaneously. In particular, we suggest a Markov chain Monte Carlo (MCMC) procedure based on a Metropolis-Hastings within Gibbs algorithm and solve the model selection problem following a reversible jump MCMC approach.

Submitted 31 July, 2010; originally announced August 2010.

MSC Class: 62M10 (Primary) 91B84; 62F15 (Secondary)

arXiv:1006.5835 [pdf, ps, other]

Limiting behavior of the search cost distribution for the move-to-front rule in the stable case

Authors: Fabrizio Leisen, Antonio Lijoi, Christian Paroissin

Abstract: The move-to-front rule is a heuristic updating a list of n items according to requests. Items are requested with unknown probabilities (or popularities). The induced Markov chain is known to be ergodic. One main problem is the study of the distribution of the search cost, defined as the position of the requested item. Here we first establish the link between two recent papers that both extend results proved by Kingman on the expected stationary search cost. Combining results contained in these papers, we obtain the limiting behavior for any moment of the stationary search cost as n tends to infinity.

Submitted 30 June, 2010; originally announced June 2010.
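
The move-to-front rule described in the abstract above is simple enough to illustrate directly. The following is a minimal sketch, not code from the paper: the function names `move_to_front` and `total_search_cost` are hypothetical, and the search cost is assumed to be the 1-based position of the requested item in the list before it is moved.

```python
def move_to_front(items, request):
    """Serve one request and return (search_cost, updated_list).

    The search cost is taken to be the 1-based position of the requested
    item (an assumption about the indexing convention); the item is then
    moved to the front while the relative order of the others is kept.
    """
    pos = items.index(request)  # 0-based position of the requested item
    cost = pos + 1
    updated = [request] + items[:pos] + items[pos + 1:]
    return cost, updated


def total_search_cost(items, requests):
    """Apply the rule to a sequence of requests and sum the search costs."""
    current = list(items)
    total = 0
    for request in requests:
        cost, current = move_to_front(current, request)
        total += cost
    return total, current


# Small worked example with three items.
print(move_to_front(["a", "b", "c"], "c"))                 # (3, ['c', 'a', 'b'])
print(total_search_cost(["a", "b", "c"], ["c", "c", "b"]))  # (7, ['b', 'c', 'a'])
```

Frequently requested items drift toward the front of the list, which is why the distribution of the search cost depends on the unknown request probabilities.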

arXiv:0806.2724 [pdf, ps, other]

Conditionally identically distributed species sampling sequences

Authors: Federico Bassetti, Irene Crimaldi, Fabrizio Leisen

Abstract: Conditional identity in distribution (Berti et al. (2004)) is a new type of dependence for random variables, which generalizes the well-known notion of exchangeability. In this paper, a class of random sequences, called Generalized Species Sampling Sequences, is defined and a condition to have conditional identity in distribution is given. Moreover, a class of generalized species sampling sequences that are conditionally identically distributed is introduced and studied: the Generalized Ottawa sequences (GOS). This class contains a "randomly reinforced" version of the Pólya urn and of the Blackwell-MacQueen urn scheme. For the empirical means and the predictive means of a GOS, we prove two convergence results toward suitable mixtures of Gaussian distributions. The first one is in the sense of stable convergence and the second one in the sense of almost sure conditional convergence. In the last part of the paper we study the length of the partition induced by a GOS at time $n$, i.e. the random number of distinct values of a GOS until time $n$. Under suitable conditions, we prove a strong law of large numbers and a central limit theorem in the sense of stable convergence. All the given results in the paper are accompanied by some examples.

Submitted 17 June, 2008; originally announced June 2008.

Showing 1–42 of 42 results for author: Leisen, F