
Easily Computed Marginal Likelihoods from Posterior Simulation Using the THAMES Estimator

Authors: Martin Metodiev, Marie Perrot-Dockès, Sarah Ouadah, Nicholas J. Irons, Adrian E. Raftery

Abstract: We propose an easily computed estimator of marginal likelihoods from posterior simulation output, via reciprocal importance sampling, combining earlier proposals of DiCiccio et al (1997) and Robert and Wraith (2009). This involves only the unnormalized posterior densities from the sampled parameter values, and does not involve additional simulations beyond the main posterior simulation, or additional complicated calculations. It is unbiased for the reciprocal of the marginal likelihood, consistent, has finite variance, and is asymptotically normal. It involves one user-specified control parameter, and we derive an optimal way of specifying this. We illustrate it with several numerical examples.

Submitted 15 May, 2023; originally announced May 2023.

arXiv:2303.14020

Sign-consistent estimation in a sparse Poisson model

Authors: Marina Gomtsyan, Céline Lévy-Leduc, Sarah Ouadah, Laure Sansonnet

Abstract: In this work, we consider an estimation method in sparse Poisson models inspired by [1] and provide novel sign consistency results under mild conditions.

Submitted 24 March, 2023; originally announced March 2023.

arXiv:2208.14721

Variable selection in sparse multivariate GLARMA models: Application to germination control by environment

Authors: M. Gomtsyan, C. Lévy-Leduc, S. Ouadah, L. Sansonnet, C. Bailly, L. Rajjou

Abstract: We propose a novel and efficient iterative two-stage variable selection approach for multivariate sparse GLARMA models, which can be used for modelling multivariate discrete-valued time series. Our approach consists in iteratively combining two steps: the estimation of the autoregressive moving average (ARMA) coefficients of multivariate GLARMA models and the variable selection in the coefficients of the Generalized Linear Model (GLM) part of the model performed by regularized methods. We explain how to implement our approach efficiently. Then we assess the performance of our methodology using synthetic data and compare it with alternative methods. Finally, we illustrate it on RNA-Seq data resulting from polyribosome profiling to determine translational status for all mRNAs in germinating seeds. Our approach, which is implemented in the MultiGlarmaVarSel R package and available on the CRAN, is very attractive since it benefits from a low computational load and is able to outperform the other methods for recovering the null and non-null coefficients.

Submitted 31 August, 2022; originally announced August 2022.

arXiv:2208.14168

doi 10.48550/arXiv.2007.08623

Variable selection in sparse GLARMA models

Authors: Marina Gomtsyan, Céline Lévy-Leduc, Sarah Ouadah, Laure Sansonnet, Thomas Blein

Abstract: In this paper, we propose a novel and efficient two-stage variable selection approach for sparse GLARMA models, which are pervasive for modeling discrete-valued time series. Our approach consists in iteratively combining the estimation of the autoregressive moving average (ARMA) coefficients of GLARMA models with regularized methods designed for performing variable selection in regression coefficients of Generalized Linear Models (GLM). We first establish the consistency of the ARMA part coefficient estimators in a specific case. Then, we explain how to efficiently implement our approach. Finally, we assess the performance of our methodology using synthetic data, compare it with alternative methods and illustrate it on an example of real-world application. Our approach, which is implemented in the GlarmaVarSel R package and available on the CRAN, is very attractive since it benefits from a low computational load and is able to outperform the other methods in terms of coefficient estimation, particularly in recovering the non null regression coefficients.

Submitted 30 August, 2022; originally announced August 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2007.08623, arXiv:1907.07085

arXiv:2101.11381

Motif-based tests for bipartite networks

Authors: Sarah Ouadah, Pierre Latouche, Stéphane Robin

Abstract: Bipartite networks are a natural representation of the interactions between entities from two different types. The organization (or topology) of such networks gives insight to understand the systems they describe as a whole. Here, we rely on motifs which provide a meso-scale description of the topology. Moreover, we consider the bipartite expected degree distribution (B-EDD) model which accounts for both the density of the network and possible imbalances between the degrees of the nodes. Under the B-EDD model, we prove the asymptotic normality of the count of any given motif, considering sparsity conditions. We also provide close-form expressions for the mean and the variance of this count. This allows to avoid computationally prohibitive resampling procedures. Based on these results, we define a goodness-of-fit test for the B-EDD model and propose a family of tests for network comparisons. We assess the asymptotic normality of the test statistics and the power of the proposed tests on synthetic experiments and illustrate their use on ecological data sets.

Submitted 16 October, 2023; v1 submitted 27 January, 2021; originally announced January 2021.

arXiv:2007.08623

Variable selection in sparse GLARMA models

Authors: M. Gomtsyan, C. Lévy-Leduc, S. Ouadah, L. Sansonnet

Abstract: In this paper, we propose a novel and efficient two-stage variable selection approach for sparse GLARMA models, which are pervasive for modeling discrete-valued time series. Our approach consists in iteratively combining the estimation of the autoregressive moving average (ARMA) coefficients of GLARMA models with regularized methods designed for performing variable selection in regression coefficients of Generalized Linear Models (GLM). We first establish the consistency of the ARMA part coefficient estimators in a specific case. Then, we explain how to efficiently implement our approach. Finally, we assess the performance of our methodology using synthetic data and compare it with alternative methods. Our approach is very attractive since it benefits from a low computational load and is able to outperform the other methods in terms of coefficient estimation, particularly in recovering the non null regression coefficients.

Submitted 15 July, 2020; originally announced July 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1907.07085

arXiv:1907.07085

Variable selection in sparse high-dimensional GLARMA models

Authors: Céline Lévy-Leduc, Sarah Ouadah, Laure Sansonnet

Abstract: In this paper, we propose a novel variable selection approach in the framework of sparse high-dimensional GLARMA models. It consists in combining the estimation of the autoregressive moving average (ARMA) coefficients of these models with regularized methods designed for Generalized Linear Models (GLM). The properties of our approach are investigated both from a theoretical and a numerical point of view. More precisely, we establish in a specific case the consistency of the ARMA part coefficient estimators. We explain how to implement our approach and we show that it is very attractive since it benefits from a low computational load. We also assess the performance of our methodology using synthetic data and compare it with alternative approaches. Our numerical experiments show that combining the estimation of the ARMA part coefficients with regularized methods designed for GLM dramatically improves the variable selection performance.

Submitted 11 October, 2019; v1 submitted 16 July, 2019; originally announced July 2019.

arXiv:1605.03751

Nonparametric homogeneity tests and multiple change-point estimation for analyzing large Hi-C data matrices

Authors: Vincent Brault, Sarah Ouadah, Laure Sansonnet, Céline Lévy-Leduc

Abstract: We propose a novel nonparametric approach for estimating the location of block boundaries (change-points) of non-overlapping blocks in a random symmetric matrix which consists of random variables having their distribution changing from one block to the other. Our method is based on a nonparametric two-sample homogeneity test for matrices that we extend to the more general case of several groups. We first provide some theoretical results for the two associated test statistics and we explain how to derive change-point location estimators. Then, some numerical experiments are given in order to support our claims. Finally, our approach is applied to Hi-C data which are used in molecular biology for better understanding the influence of the chromosomal conformation on the cells functioning.

Submitted 12 May, 2016; originally announced May 2016.

Comments: 25 pages, 9 figures

arXiv:1508.00286

Goodness of fit of logistic models for random graphs

Authors: Pierre Latouche, Stéphane Robin, Sarah Ouadah

Abstract: Logistic regression is a natural and simple tool to understand how covariates contribute to explain the topology of a binary network. Once the model fitted, the practitioner is interested in the goodness-of-fit of the regression in order to check if the covariates are sufficient to explain the whole topology of the network and, if they are not, to analyze the residual structure. To address this problem, we introduce a generic model that combines logistic regression with a network-oriented residual term. This residual term takes the form of the graphon function of a W-graph. Using a variational Bayes framework, we infer the residual graphon by averaging over a series of blockwise constant functions. This approach allows us to define a generic goodness-of-fit criterion, which corresponds to the posterior probability for the residual graphon to be constant. Experiments on toy data are carried out to assess the accuracy of the procedure. Several networks from social sciences and ecology are studied to illustrate the proposed methodology.

Submitted 6 January, 2017; v1 submitted 2 August, 2015; originally announced August 2015.

arXiv:1507.08140

Degree-based goodness-of-fit tests for heterogeneous random graph models : independent and exchangeable cases

Authors: Sarah Ouadah, Stéphane Robin, Pierre Latouche

Abstract: The degrees are a classical and relevant way to study the topology of a network. They can be used to assess the goodness-of-fit for a given random graph model. In this paper we introduce goodness-of-fit tests for two classes of models. First, we consider the case of independent graph models such as the heterogeneous Erdös-Rényi model in which the edges have different connection probabilities. Second, we consider a generic model for exchangeable random graphs called the W-graph. The stochastic block model and the expected degree distribution model fall within this framework. We prove the asymptotic normality of the degree mean square under these independent and exchangeable models and derive formal tests. We study the power of the proposed tests and we prove the asymptotic normality under specific sparsity regimes. The tests are illustrated on real networks from social sciences and ecology, and their performances are assessed via a simulation study.

Submitted 29 July, 2019; v1 submitted 29 July, 2015; originally announced July 2015.

Showing 1–10 of 10 results for author: Ouadah, S