-
Easily Computed Marginal Likelihoods from Posterior Simulation Using the THAMES Estimator
Authors:
Martin Metodiev,
Marie Perrot-Dockès,
Sarah Ouadah,
Nicholas J. Irons,
Adrian E. Raftery
Abstract:
We propose an easily computed estimator of marginal likelihoods from posterior simulation output, via reciprocal importance sampling, combining earlier proposals of DiCiccio et al. (1997) and Robert and Wraith (2009). This involves only the unnormalized posterior densities from the sampled parameter values, and does not involve additional simulations beyond the main posterior simulation, or additional complicated calculations. It is unbiased for the reciprocal of the marginal likelihood, consistent, has finite variance, and is asymptotically normal. It involves one user-specified control parameter, and we derive an optimal way of specifying this. We illustrate it with several numerical examples.
Submitted 15 May, 2023;
originally announced May 2023.
-
Sign-consistent estimation in a sparse Poisson model
Authors:
Marina Gomtsyan,
Céline Lévy-Leduc,
Sarah Ouadah,
Laure Sansonnet
Abstract:
In this work, we consider an estimation method in sparse Poisson models inspired by [1] and provide novel sign consistency results under mild conditions.
Submitted 24 March, 2023;
originally announced March 2023.
-
Variable selection in sparse multivariate GLARMA models: Application to germination control by environment
Authors:
M. Gomtsyan,
C. Lévy-Leduc,
S. Ouadah,
L. Sansonnet,
C. Bailly,
L. Rajjou
Abstract:
We propose a novel and efficient iterative two-stage variable selection approach for multivariate sparse GLARMA models, which can be used for modelling multivariate discrete-valued time series. Our approach consists in iteratively combining two steps: the estimation of the autoregressive moving average (ARMA) coefficients of multivariate GLARMA models and the variable selection in the coefficients of the Generalized Linear Model (GLM) part of the model, performed by regularized methods. We explain how to implement our approach efficiently. Then we assess the performance of our methodology using synthetic data and compare it with alternative methods. Finally, we illustrate it on RNA-Seq data resulting from polyribosome profiling to determine translational status for all mRNAs in germinating seeds. Our approach, which is implemented in the MultiGlarmaVarSel R package and available on CRAN, is very attractive since it benefits from a low computational load and is able to outperform the other methods for recovering the null and non-null coefficients.
Submitted 31 August, 2022;
originally announced August 2022.
-
Variable selection in sparse GLARMA models
Authors:
Marina Gomtsyan,
Céline Lévy-Leduc,
Sarah Ouadah,
Laure Sansonnet,
Thomas Blein
Abstract:
In this paper, we propose a novel and efficient two-stage variable selection approach for sparse GLARMA models, which are pervasive for modeling discrete-valued time series. Our approach consists in iteratively combining the estimation of the autoregressive moving average (ARMA) coefficients of GLARMA models with regularized methods designed for performing variable selection in regression coefficients of Generalized Linear Models (GLM). We first establish the consistency of the ARMA part coefficient estimators in a specific case. Then, we explain how to efficiently implement our approach. Finally, we assess the performance of our methodology using synthetic data, compare it with alternative methods and illustrate it on a real-world application. Our approach, which is implemented in the GlarmaVarSel R package and available on CRAN, is very attractive since it benefits from a low computational load and is able to outperform the other methods in terms of coefficient estimation, particularly in recovering the non-null regression coefficients.
Submitted 30 August, 2022;
originally announced August 2022.
-
Motif-based tests for bipartite networks
Authors:
Sarah Ouadah,
Pierre Latouche,
Stéphane Robin
Abstract:
Bipartite networks are a natural representation of the interactions between entities of two different types. The organization (or topology) of such networks gives insight into the systems they describe as a whole. Here, we rely on motifs, which provide a meso-scale description of the topology. Moreover, we consider the bipartite expected degree distribution (B-EDD) model, which accounts for both the density of the network and possible imbalances between the degrees of the nodes. Under the B-EDD model, we prove the asymptotic normality of the count of any given motif, considering sparsity conditions. We also provide closed-form expressions for the mean and the variance of this count. This avoids computationally prohibitive resampling procedures. Based on these results, we define a goodness-of-fit test for the B-EDD model and propose a family of tests for network comparisons. We assess the asymptotic normality of the test statistics and the power of the proposed tests on synthetic experiments and illustrate their use on ecological data sets.
Submitted 16 October, 2023; v1 submitted 27 January, 2021;
originally announced January 2021.
-
Variable selection in sparse GLARMA models
Authors:
M. Gomtsyan,
C. Lévy-Leduc,
S. Ouadah,
L. Sansonnet
Abstract:
In this paper, we propose a novel and efficient two-stage variable selection approach for sparse GLARMA models, which are pervasive for modeling discrete-valued time series. Our approach consists in iteratively combining the estimation of the autoregressive moving average (ARMA) coefficients of GLARMA models with regularized methods designed for performing variable selection in regression coefficients of Generalized Linear Models (GLM). We first establish the consistency of the ARMA part coefficient estimators in a specific case. Then, we explain how to efficiently implement our approach. Finally, we assess the performance of our methodology using synthetic data and compare it with alternative methods. Our approach is very attractive since it benefits from a low computational load and is able to outperform the other methods in terms of coefficient estimation, particularly in recovering the non-null regression coefficients.
Submitted 15 July, 2020;
originally announced July 2020.
-
Variable selection in sparse high-dimensional GLARMA models
Authors:
Céline Lévy-Leduc,
Sarah Ouadah,
Laure Sansonnet
Abstract:
In this paper, we propose a novel variable selection approach in the framework of sparse high-dimensional GLARMA models. It consists in combining the estimation of the autoregressive moving average (ARMA) coefficients of these models with regularized methods designed for Generalized Linear Models (GLM). The properties of our approach are investigated both from a theoretical and a numerical point of view. More precisely, we establish in a specific case the consistency of the ARMA part coefficient estimators. We explain how to implement our approach and we show that it is very attractive since it benefits from a low computational load. We also assess the performance of our methodology using synthetic data and compare it with alternative approaches. Our numerical experiments show that combining the estimation of the ARMA part coefficients with regularized methods designed for GLM dramatically improves the variable selection performance.
Submitted 11 October, 2019; v1 submitted 16 July, 2019;
originally announced July 2019.
-
Nonparametric homogeneity tests and multiple change-point estimation for analyzing large Hi-C data matrices
Authors:
Vincent Brault,
Sarah Ouadah,
Laure Sansonnet,
Céline Lévy-Leduc
Abstract:
We propose a novel nonparametric approach for estimating the location of block boundaries (change-points) of non-overlapping blocks in a random symmetric matrix which consists of random variables whose distribution changes from one block to the other. Our method is based on a nonparametric two-sample homogeneity test for matrices that we extend to the more general case of several groups. We first provide some theoretical results for the two associated test statistics and we explain how to derive change-point location estimators. Then, some numerical experiments are given in order to support our claims. Finally, our approach is applied to Hi-C data, which are used in molecular biology to better understand the influence of chromosomal conformation on cell functioning.
Submitted 12 May, 2016;
originally announced May 2016.
-
Goodness of fit of logistic models for random graphs
Authors:
Pierre Latouche,
Stéphane Robin,
Sarah Ouadah
Abstract:
Logistic regression is a natural and simple tool to understand how covariates help explain the topology of a binary network. Once the model is fitted, the practitioner is interested in the goodness-of-fit of the regression in order to check whether the covariates are sufficient to explain the whole topology of the network and, if they are not, to analyze the residual structure. To address this problem, we introduce a generic model that combines logistic regression with a network-oriented residual term. This residual term takes the form of the graphon function of a W-graph. Using a variational Bayes framework, we infer the residual graphon by averaging over a series of blockwise constant functions. This approach allows us to define a generic goodness-of-fit criterion, which corresponds to the posterior probability for the residual graphon to be constant. Experiments on toy data are carried out to assess the accuracy of the procedure. Several networks from social sciences and ecology are studied to illustrate the proposed methodology.
Submitted 6 January, 2017; v1 submitted 2 August, 2015;
originally announced August 2015.
-
Degree-based goodness-of-fit tests for heterogeneous random graph models : independent and exchangeable cases
Authors:
Sarah Ouadah,
Stéphane Robin,
Pierre Latouche
Abstract:
The degrees are a classical and relevant way to study the topology of a network. They can be used to assess the goodness-of-fit for a given random graph model. In this paper we introduce goodness-of-fit tests for two classes of models. First, we consider the case of independent graph models such as the heterogeneous Erdős-Rényi model in which the edges have different connection probabilities. Second, we consider a generic model for exchangeable random graphs called the W-graph. The stochastic block model and the expected degree distribution model fall within this framework. We prove the asymptotic normality of the degree mean square under these independent and exchangeable models and derive formal tests. We study the power of the proposed tests and we prove the asymptotic normality under specific sparsity regimes. The tests are illustrated on real networks from social sciences and ecology, and their performance is assessed via a simulation study.
Submitted 29 July, 2019; v1 submitted 29 July, 2015;
originally announced July 2015.