Showing 1–19 of 19 results for author: Noiry, N

  1. arXiv:2401.11842  [pdf, other]

    stat.ME stat.AP stat.ML

    Subgroup analysis methods for time-to-event outcomes in heterogeneous randomized controlled trials

    Authors: Valentine Perrin, Nathan Noiry, Nicolas Loiseau, Alex Nowak

    Abstract: Non-significant randomized control trials can hide subgroups of good responders to experimental drugs, thus hindering subsequent development. Identifying such heterogeneous treatment effects is key for precision medicine and many post-hoc analysis methods have been developed for that purpose. While several benchmarks have been carried out to identify the strengths and weaknesses of these methods,…

    Submitted 23 January, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 9 pages, 8 figures, 2 tables. Code available at https://github.com/owkin/hte . Comments are welcome!

  2. arXiv:2310.14001  [pdf, other]

    cs.CL

    Toward Stronger Textual Attack Detectors

    Authors: Pierre Colombo, Marine Picot, Nathan Noiry, Guillaume Staerman, Pablo Piantanida

    Abstract: The landscape of available textual adversarial attacks keeps growing, posing severe threats and raising concerns regarding the deep NLP system's integrity. However, the crucial problem of defending against malicious attacks has only drawn the attention of the NLP community. The latter is nonetheless instrumental in developing robust and trustworthy systems. This paper makes two important contribut…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: Findings EMNLP 2023

  3. arXiv:2310.13990  [pdf, other]

    cs.LG cs.CL

    A Novel Information-Theoretic Objective to Disentangle Representations for Fair Classification

    Authors: Pierre Colombo, Nathan Noiry, Guillaume Staerman, Pablo Piantanida

    Abstract: One of the pursued objectives of deep learning is to provide tools that learn abstract representations of reality from the observation of multiple contextual situations. More precisely, one wishes to extract disentangled representations which are (i) low dimensional and (ii) whose components are independent and correspond to concepts capturing the essence of the objects under consideration (Locate…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: Findings AACL 2023

  4. arXiv:2306.07891  [pdf, other]

    cs.DS

    Online Matching in Geometric Random Graphs

    Authors: Flore Sentenac, Nathan Noiry, Matthieu Lerasle, Laurent Ménard, Vianney Perchet

    Abstract: We investigate online maximum cardinality matching, a central problem in ad allocation. In this problem, users are revealed sequentially, and each new user can be paired with any previously unmatched campaign that it is compatible with. Despite the limited theoretical guarantees, the greedy algorithm, which matches incoming users with any available campaign, exhibits outstanding performance in pra…

    Submitted 5 October, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

  5. arXiv:2306.03522  [pdf, other]

    cs.LG cs.CV stat.ML

    A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection

    Authors: Eduardo Dadalto, Pierre Colombo, Guillaume Staerman, Nathan Noiry, Pablo Piantanida

    Abstract: A key feature of out-of-distribution (OOD) detection is to exploit a trained neural network by extracting statistical patterns and relationships through the multi-layer classifier to detect shifts in the expected input data distribution. Despite achieving solid results, several state-of-the-art methods rely on the penultimate or last layer outputs only, leaving behind valuable information for OOD…

    Submitted 6 June, 2023; originally announced June 2023.

  6. arXiv:2305.10284  [pdf, other]

    cs.CL cs.AI

    Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks

    Authors: Anas Himmi, Ekhine Irurozki, Nathan Noiry, Stephan Clemencon, Pierre Colombo

    Abstract: The evaluation of natural language processing (NLP) systems is crucial for advancing the field, but current benchmarking approaches often assume that all systems have scores available for all tasks, which is not always practical. In reality, several factors such as the cost of running baseline, private systems, computational limitations, or incomplete data may prevent some systems from being evalu…

    Submitted 17 May, 2023; originally announced May 2023.

  7. arXiv:2211.13527  [pdf, other]

    cs.CL

    Beyond Mahalanobis-Based Scores for Textual OOD Detection

    Authors: Pierre Colombo, Eduardo D. C. Gomes, Guillaume Staerman, Nathan Noiry, Pablo Piantanida

    Abstract: Deep learning methods have boosted the adoption of NLP systems in real-life applications. However, they turn out to be vulnerable to distribution shifts over time which may cause severe dysfunctions in production systems, urging practitioners to develop tools to detect out-of-distribution (OOD) samples through the lens of the neural network. In this paper, we introduce TRUSTED, a new OOD detector…

    Submitted 24 November, 2022; originally announced November 2022.

    Journal ref: NeurIPS 2022

  8. arXiv:2210.13664  [pdf, other]

    cs.CV cs.AI

    Mitigating Gender Bias in Face Recognition Using the von Mises-Fisher Mixture Model

    Authors: Jean-Rémy Conti, Nathan Noiry, Vincent Despiegel, Stéphane Gentric, Stéphan Clémençon

    Abstract: In spite of the high performance and reliability of deep learning algorithms in a wide range of everyday applications, many investigations tend to show that a lot of models exhibit biases, discriminating against specific subgroups of the population (e.g. gender, ethnicity). This urges the practitioner to develop fair systems with a uniform/comparable performance across sensitive groups. In this wo…

    Submitted 22 February, 2024; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted to ICML 2022

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:4344-4369, 2022

  9. arXiv:2208.14585  [pdf, other]

    cs.CL

    The Glass Ceiling of Automatic Evaluation in Natural Language Generation

    Authors: Pierre Colombo, Maxime Peyrard, Nathan Noiry, Robert West, Pablo Piantanida

    Abstract: Automatic evaluation metrics capable of replacing human judgments are critical to allowing fast development of new methods. Thus, numerous research efforts have focused on crafting such metrics. In this work, we take a step back and analyze recent progress by comparing the body of existing automatic metrics and human metrics altogether. As metrics are used based on how they rank systems, we compar…

    Submitted 7 October, 2022; v1 submitted 30 August, 2022; originally announced August 2022.

  10. arXiv:2205.03589  [pdf, other]

    cs.CL

    Learning Disentangled Textual Representations via Statistical Measures of Similarity

    Authors: Pierre Colombo, Guillaume Staerman, Nathan Noiry, Pablo Piantanida

    Abstract: When working with textual data, a natural application of disentangled representations is fair classification where the goal is to make predictions without being biased (or influenced) by sensitive attributes that may be present in the data (e.g., age, gender or race). Dominant approaches to disentangle a sensitive attribute from textual representations rely on learning simultaneously a penalizatio…

    Submitted 7 October, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: ACL 2022

  11. arXiv:2202.03799  [pdf, other]

    cs.CL cs.AI

    What are the best systems? New perspectives on NLP Benchmarking

    Authors: Pierre Colombo, Nathan Noiry, Ekhine Irurozki, Stephan Clemencon

    Abstract: In Machine Learning, a benchmark refers to an ensemble of datasets associated with one or multiple metrics together with a way to aggregate different systems performances. They are instrumental in (i) assessing the progress of new methods along different axes and (ii) selecting the best systems for practical use. This is particularly the case for NLP with the development of large pre-trained model…

    Submitted 7 October, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

  12. arXiv:2109.09590  [pdf, other]

    math.ST stat.ML

    Learning to Rank Anomalies: Scalar Performance Criteria and Maximization of Two-Sample Rank Statistics

    Authors: Myrto Limnios, Nathan Noiry, Stéphan Clémençon

    Abstract: The ability to collect and store ever more massive databases has been accompanied by the need to process them efficiently. In many cases, most observations have the same behavior, while a probable small proportion of these observations are abnormal. Detecting the latter, defined as outliers, is one of the major challenges for machine learning applications (e.g. in fraud detection or in predictive…

    Submitted 20 September, 2021; originally announced September 2021.

  13. arXiv:2107.00995  [pdf, other]

    cs.DS stat.ML

    Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm

    Authors: Nathan Noiry, Flore Sentenac, Vianney Perchet

    Abstract: Motivated by sequential budgeted allocation problems, we investigate online matching problems where connections between vertices are not i.i.d., but they have fixed degree distributions -- the so-called configuration model. We estimate the competitive ratio of the simplest algorithm, GREEDY, by approximating some relevant stochastic discrete processes by their continuous counterparts, that are sol…

    Submitted 2 July, 2021; originally announced July 2021.

  14. arXiv:2106.11130  [pdf, other]

    math.PR math.CO

    Long induced paths in a configuration model

    Authors: Nathanaël Enriquez, Gabriel Faraud, Laurent Ménard, Nathan Noiry

    Abstract: In an article published in 1987 in Combinatorica, Frieze and Jackson established a lower bound on the length of the longest induced path (and cycle) in a sparse random graph. Their bound is obtained through a rough analysis of a greedy algorithm. In the present work, we provide a sharp asymptotic for the length of the induced path constructed by their algorithm. To this end, we int…

    Submitted 21 June, 2021; originally announced June 2021.

    Comments: 16 pages

    MSC Class: 60K35; 82C21; 60J20; 60F10

  15. arXiv:2009.12541  [pdf, other]

    math.PR

    Large deviations for spectral measures of some spiked matrices

    Authors: Nathan Noiry, Alain Rouault

    Abstract: We prove large deviations principles for spectral measures of perturbed (or spiked) matrix models in the direction of an eigenvector of the perturbation. In each model under study, we provide two approaches, one of which relying on large deviations principle of unperturbed models derived in the previous work "Sum rules via large deviations" (Gamboa-Nagel-Rouault, JFA, 2016).

    Submitted 23 September, 2021; v1 submitted 26 September, 2020; originally announced September 2020.

    Comments: 28 pages, corrected and updated version

    MSC Class: 60F10; 60G57; 15B52; 47B36

  16. arXiv:2003.13053  [pdf, other]

    math.PR

    A solvable class of renewal processes

    Authors: Nathanaël Enriquez, Nathan Noiry

    Abstract: When the distribution of the inter-arrival times of a renewal process is a mixture of geometric laws, we prove that the renewal function of the process is given by the moments of a probability measure which is explicitly related to the mixture distribution. We also present an analogous result in the continuous case when the inter-arrival law is a mixture of exponential laws. We then observe that t…

    Submitted 5 September, 2020; v1 submitted 29 March, 2020; originally announced March 2020.

    Comments: 13 pages

    MSC Class: 60K05; 82D60; 30D35

  17. arXiv:1911.10083  [pdf, other]

    math.PR math.CO

    Depth First Exploration of a Configuration Model

    Authors: Nathanaël Enriquez, Gabriel Faraud, Laurent Ménard, Nathan Noiry

    Abstract: We introduce an algorithm that constructs a random uniform graph with prescribed degree sequence together with a depth first exploration of it. In the so-called supercritical regime where the graph contains a giant component, we prove that the renormalized contour process of the Depth First Search Tree has a deterministic limiting profile that we identify. The proof goes through a detailed analysi…

    Submitted 3 September, 2022; v1 submitted 22 November, 2019; originally announced November 2019.

    Comments: 30 pages

    MSC Class: 60K35; 82C21; 60J20; 60F10

  18. arXiv:1903.11731  [pdf, other]

    math.PR

    Spectral Measures of Spiked Random Matrices

    Authors: Nathan Noiry

    Abstract: We study two spiked models of random matrices under general frameworks corresponding respectively to additive deformation of random symmetric matrices and multiplicative perturbation of random covariance matrices. In both cases, the limiting spectral measure in the direction of an eigenvector of the perturbation leads to old and new results on the coordinates of eigenvectors.

    Submitted 13 October, 2020; v1 submitted 27 March, 2019; originally announced March 2019.

    Comments: 23 pages, 3 figures

    MSC Class: 60B20

  19. arXiv:1710.06355  [pdf, other]

    math.PR

    Spectra of Wishart Matrices with size-dependent entries

    Authors: Nathan Noiry

    Abstract: We prove the convergence of the empirical spectral measure of Wishart matrices with size-dependent entries and characterize the limiting law by its moments. We apply our result to the cases where the entries are Bernoulli variables with parameter c/n or truncated heavy-tailed random variables. In both cases, when c goes to infinity or when the truncation is small, the limiting spectrum is a pertur…

    Submitted 17 October, 2017; originally announced October 2017.

    Comments: 17 pages, 6 figures

    MSC Class: 05C80; 60B20