Showing 1–22 of 22 results for author: Neuvial, P

  1. arXiv:2407.06892  [pdf, other]

    stat.ME

    When Knockoffs fail: diagnosing and fixing non-exchangeability of Knockoffs

    Authors: Alexandre Blain, Bertrand Thirion, Julia Linhart, Pierre Neuvial

    Abstract: Knockoffs are a popular statistical framework that addresses the challenging problem of conditional variable selection in high-dimensional settings with statistical control. Such statistical control is essential for the reliability of inference. However, knockoff guarantees rely on an exchangeability assumption that is difficult to test in practice, and there is little discussion in the literature…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2310.11822  [pdf, other]

    stat.ME math.ST stat.AP

    Post-clustering Inference under Dependency

    Authors: Javier González-Delgado, Juan Cortés, Pierre Neuvial

    Abstract: Recent work by Gao et al. has laid the foundations for post-clustering inference. For the first time, the authors established a theoretical framework for testing differences between the means of estimated clusters. Additionally, they studied the estimation of unknown parameters while controlling the selective type I error. However, their theory was developed for independent observations ident…

    Submitted 18 October, 2023; originally announced October 2023.

  3. arXiv:2310.10373  [pdf, other]

    stat.ME

    False Discovery Proportion control for aggregated Knockoffs

    Authors: Alexandre Blain, Bertrand Thirion, Olivier Grisel, Pierre Neuvial

    Abstract: Controlled variable selection is an important analytical step in various scientific fields, such as brain imaging or genomics. In these high-dimensional data settings, considering too many variables leads to poor models and high costs, hence the need for statistical guarantees on false positives. Knockoffs are a popular statistical tool for conditional variable selection in high dimension. However…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  4. arXiv:2309.01492  [pdf, other]

    stat.ME math.ST

    Selective inference after convex clustering with $\ell_1$ penalization

    Authors: François Bachoc, Cathy Maugis-Rabusseau, Pierre Neuvial

    Abstract: Classical inference methods notoriously fail when applied to data-driven test hypotheses or inference targets. Instead, dedicated methodologies are required to obtain statistical guarantees for these selective inference problems. Selective inference is particularly relevant post-clustering, typically when testing a difference in mean between two clusters. In this paper, we address convex clusterin…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: 40 pages, 8 figures

    MSC Class: 62F03; 62H30

  5. arXiv:2208.13724  [pdf, other]

    stat.ME math.ST

    FDP control in multivariate linear models using the bootstrap

    Authors: Samuel Davenport, Bertrand Thirion, Pierre Neuvial

    Abstract: In this article we develop a method for performing post hoc inference of the False Discovery Proportion (FDP) over multiple contrasts of interest in the multivariate linear model. To do so we use the bootstrap to simulate from the distribution of the null contrasts. We combine the bootstrap with the post hoc inference bounds of Blanchard (2020) and prove that doing so provides simultaneous asympto…

    Submitted 20 September, 2022; v1 submitted 29 August, 2022; originally announced August 2022.

  6. Notip: Non-parametric True Discovery Proportion control for brain imaging

    Authors: Alexandre Blain, Bertrand Thirion, Pierre Neuvial

    Abstract: Cluster-level inference procedures are widely used for brain mapping. These methods compare the size of clusters obtained by thresholding brain maps to an upper bound under the global null hypothesis, computed using Random Field Theory or permutations. However, the guarantees obtained by this type of inference - i.e. at least one voxel is truly activated in the cluster - are not informative with r…

    Submitted 21 July, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: NeuroImage (2022)

    Journal ref: NeuroImage (2022), 119492

  7. Two-sample goodness-of-fit tests on the flat torus based on Wasserstein distance and their relevance to structural biology

    Authors: Javier González-Delgado, Alberto González-Sanz, Juan Cortés, Pierre Neuvial

    Abstract: This work is motivated by the study of local protein structure, which is defined by two variable dihedral angles that take values from probability distributions on the flat torus. Our goal is to provide the space $\mathcal{P}(\mathbb{R}^2/\mathbb{Z}^2)$ with a metric that quantifies local structural modifications due to changes in the protein sequence, and to define associated two-sample goodness-…

    Submitted 11 September, 2023; v1 submitted 31 July, 2021; originally announced August 2021.

    Journal ref: J. González-Delgado, A. González-Sanz, J. Cortés, P. Neuvial. Two-sample goodness-of-fit tests on the flat torus based on Wasserstein distance and their relevance to structural biology. Electron. J. Statist., 17(1) 1547-1586, 2023

  8. arXiv:2105.00288  [pdf, other]

    stat.ME

    Post hoc false discovery proportion inference under a Hidden Markov Model

    Authors: Marie Perrot-Dockès, Gilles Blanchard, Pierre Neuvial, Etienne Roquain

    Abstract: We address the multiple testing problem under the assumption that the true/false hypotheses are driven by a Hidden Markov Model (HMM), which is recognized as a fundamental setting to model multiple testing under dependence since the seminal work of Sun and Cai (2009). While previous work has concentrated on deriving specific procedures with a controlled False Discovery Rate (FDR) under this mode…

    Submitted 1 May, 2021; originally announced May 2021.

  9. arXiv:2004.08312  [pdf, other]

    q-bio.MN q-bio.GN stat.AP

    Identification of deregulated transcription factors involved in subtypes of cancers

    Authors: Magali Champion, Julien Chiquet, Pierre Neuvial, Mohamed Elati, François Radvanyi, Etienne Birmelé

    Abstract: We propose a methodology for the identification of transcription factors involved in the deregulation of genes in tumoral cells. This strategy is based on the inference of a reference gene regulatory network that connects transcription factors to their downstream targets using gene expression data. The behavior of genes in tumor samples is then carefully compared to this network of reference to de…

    Submitted 17 April, 2020; originally announced April 2020.

    Journal ref: Proceedings of the 12th International Conference on Bioinformatics and Computational Biology, vol 70, pages 1–10

  10. arXiv:1910.11575  [pdf, other]

    math.ST stat.ME

    On agnostic post hoc approaches to false positive control

    Authors: Gilles Blanchard, Pierre Neuvial, Etienne Roquain

    Abstract: This document is a book chapter which gives a partial survey on post hoc approaches to false positive control.

    Submitted 25 October, 2019; originally announced October 2019.

  11. arXiv:1909.10923  [pdf, other]

    stat.ME

    Applicability and Interpretability of Hierarchical Agglomerative Clustering With or Without Contiguity Constraints

    Authors: Nathanaël Randriamihamison, Nathalie Vialaneix, Pierre Neuvial

    Abstract: Hierarchical Agglomerative Classification (HAC) with Ward's linkage has been widely used since its introduction in Ward (1963). The present article reviews the different extensions of the method to various input data and the constrained framework, while providing applicability conditions. In addition, various versions of the graphical representation of the results as a dendrogram are also presente…

    Submitted 24 September, 2019; originally announced September 2019.

  12. arXiv:1902.01596  [pdf, other]

    math.ST q-bio.QM

    Adjacency-constrained hierarchical clustering of a band similarity matrix with application to Genomics

    Authors: Christophe Ambroise, Alia Dehman, Pierre Neuvial, Guillem Rigaill, Nathalie Vialaneix

    Abstract: Motivation: Genomic data analyses such as Genome-Wide Association Studies (GWAS) or Hi-C studies are often faced with the problem of partitioning chromosomes into successive regions based on a similarity matrix of high-resolution, locus-level measurements. An intuitive way of doing this is to perform a modified Hierarchical Agglomerative Clustering (HAC), where only adjacent clusters (according to…

    Submitted 5 February, 2019; originally announced February 2019.

  13. arXiv:1807.01470  [pdf, other]

    math.ST

    Post hoc false positive control for spatially structured hypotheses

    Authors: Guillermo Durand, Gilles Blanchard, Pierre Neuvial, Etienne Roquain

    Abstract: In a high dimensional multiple testing framework, we present new confidence bounds on the false positives contained in subsets S of selected null hypotheses. The coverage probability holds simultaneously over all subsets S, which means that the obtained confidence bounds are post hoc. Therefore, S can be chosen arbitrarily, possibly by using the data set several times. We focus in this paper speci…

    Submitted 19 September, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

  14. On the Post Selection Inference constant under Restricted Isometry Properties

    Authors: François Bachoc, Gilles Blanchard, Pierre Neuvial

    Abstract: Uniformly valid confidence intervals post model selection in regression can be constructed based on Post-Selection Inference (PoSI) constants. PoSI constants are minimal for orthogonal design matrices, and can be upper bounded as a function of the sparsity of the set of models under consideration, for generic design matrices. In order to improve on these generic sparse upper bounds, we consider desi…

    Submitted 22 November, 2018; v1 submitted 20 April, 2018; originally announced April 2018.

    Comments: Electronic Journal of Statistics, Shaker Heights, OH: Institute of Mathematical Statistics, 2018

  15. arXiv:1703.02307  [pdf, other]

    math.ST

    Post hoc inference via joint family-wise error rate control

    Authors: Gilles Blanchard, Pierre Neuvial, Etienne Roquain

    Abstract: We introduce a general methodology for post hoc inference in a large-scale multiple testing framework. The approach is called "user-agnostic" in the sense that the statistical guarantee on the number of correct rejections holds for any set of candidate items selected by the user (after having seen the data). This task is investigated by defining a suitable criterion, named the joint-family-wise-er…

    Submitted 8 January, 2018; v1 submitted 7 March, 2017; originally announced March 2017.

  16. A model for gene deregulation detection using expression data

    Authors: Thomas Picchetti, Julien Chiquet, Mohamed Elati, Pierre Neuvial, Rémy Nicolle, Etienne Birmelé

    Abstract: In tumoral cells, gene regulation mechanisms are severely altered, and these modifications in the regulations may be characteristic of different subtypes of cancer. However, these alterations do not necessarily induce differential expressions between the subtypes. To answer this question, we propose a statistical methodology to identify the misregulated genes given a reference network and gene exp…

    Submitted 8 January, 2016; v1 submitted 21 May, 2015; originally announced May 2015.

    Report number: MAP5 2015-17

  17. arXiv:1402.7203  [pdf, other]

    q-bio.QM stat.AP

    Performance evaluation of DNA copy number segmentation methods

    Authors: Morgane Pierre-Jean, Guillem Rigaill, Pierre Neuvial

    Abstract: A number of bioinformatic or biostatistical methods are available for analyzing DNA copy number profiles measured from microarray or sequencing technologies. In the absence of rich enough gold standard data sets, the performance of these methods is generally assessed using unrealistic simulation studies, or based on small real data analyses. We have designed and implemented a framework to generate…

    Submitted 5 November, 2015; v1 submitted 28 February, 2014; originally announced February 2014.

    Journal ref: Briefings in Bioinformatics, Oxford University Press (OUP), 2015, 16 (4)

  18. arXiv:1206.6980  [pdf, ps, other]

    stat.AP q-bio.QM

    More power via graph-structured tests for differential expression of gene networks

    Authors: Laurent Jacob, Pierre Neuvial, Sandrine Dudoit

    Abstract: We consider multivariate two-sample tests of means, where the location shift between the two populations is expected to be related to a known graph structure. An important application of such tests is the detection of differentially expressed genes between two patient populations, as shifts in expression levels are expected to be coherent with the structure of graphs reflecting gene properties suc…

    Submitted 29 June, 2012; originally announced June 2012.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/11-AOAS528. arXiv admin note: substantial text overlap with arXiv:1009.5173

    Report number: IMS-AOAS-AOAS528

    Journal ref: Annals of Applied Statistics 2012, Vol. 6, No. 2, 561-600

  19. On false discovery rate thresholding for classification under sparsity

    Authors: Pierre Neuvial, Etienne Roquain

    Abstract: We study the properties of false discovery rate (FDR) thresholding, viewed as a classification procedure. The "0"-class (null) is assumed to have a known density while the "1"-class (alternative) is obtained from the "0"-class either by translation or by scaling. Furthermore, the "1"-class is assumed to have a small number of elements w.r.t. the "0"-class (sparsity). We focus on densities of the S…

    Submitted 5 March, 2013; v1 submitted 30 June, 2011; originally announced June 2011.

    Journal ref: Annals of Statistics (2012) Vol. 40, No. 5, 2572-2600

  20. arXiv:1009.5173  [pdf, ps, other]

    q-bio.QM stat.AP

    Gains in Power from Structured Two-Sample Tests of Means on Graphs

    Authors: Laurent Jacob, Pierre Neuvial, Sandrine Dudoit

    Abstract: We consider multivariate two-sample tests of means, where the location shift between the two populations is expected to be related to a known graph structure. An important application of such tests is the detection of differentially expressed genes between two patient populations, as shifts in expression levels are expected to be coherent with the structure of graphs reflecting gene properties suc…

    Submitted 27 September, 2010; originally announced September 2010.

    Journal ref: Annals of Applied Statistics 2012, Vol. 6, No. 2, 561-600

  21. arXiv:1003.0747  [pdf, other]

    math.ST physics.data-an q-bio.QM stat.AP stat.ME

    Asymptotic Results on Adaptive False Discovery Rate Controlling Procedures Based on Kernel Estimators

    Authors: Pierre Neuvial

    Abstract: The False Discovery Rate (FDR) is a commonly used type I error rate in multiple testing problems. It is defined as the expected False Discovery Proportion (FDP), that is, the expected fraction of false positives among rejected hypotheses. When the hypotheses are independent, the Benjamini-Hochberg procedure achieves FDR control at any pre-specified level. By construction, FDR control offers no gua…

    Submitted 20 April, 2013; v1 submitted 3 March, 2010; originally announced March 2010.

    Journal ref: Journal of Machine Learning Research 14 (2013) 1423-1459

  22. Asymptotic properties of false discovery rate controlling procedures under independence

    Authors: Pierre Neuvial

    Abstract: We investigate the performance of a family of multiple comparison procedures for strong control of the False Discovery Rate ($\mathsf{FDR}$). The $\mathsf{FDR}$ is the expected False Discovery Proportion ($\mathsf{FDP}$), that is, the expected fraction of false rejections among all rejected hypotheses. A number of refinements to the original Benjamini-Hochberg procedure [1] have been proposed, t…

    Submitted 21 November, 2008; v1 submitted 14 March, 2008; originally announced March 2008.

    Comments: Published in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-EJS207

    Report number: IMS-EJS-EJS_2008_207

    MSC Class: 62G10; 62H15; 60F05 (Primary)

    Journal ref: Electronic Journal of Statistics 2008, Vol. 2, 1065-1110