Showing 1–18 of 18 results for author: del Barrio, E

  1. arXiv:2209.14455

    math.ST stat.ME

    Using the Sinkhorn divergence in permutation tests for the multivariate two-sample problem

    Authors: E. del Barrio, J. S. Osorio, A. J. Quiroz

    Abstract: In order to adapt the Wasserstein distance to the large sample multivariate non-parametric two-sample problem, making its application computationally feasible, permutation tests based on the Sinkhorn divergence between probability vectors associated to data dependent partitions are considered. Different ways of implementing these tests are evaluated and the asymptotic distribution of the underlyin…

    Submitted 28 September, 2022; originally announced September 2022.

    Comments: 33 pages, 14 figures

    MSC Class: 62G10; 62H15; 62E20

  2. arXiv:2204.11756

    stat.ME math.ST

    Nonparametric Multiple-Output Center-Outward Quantile Regression

    Authors: Eustasio del Barrio, Alberto Gonzalez Sanz, Marc Hallin

    Abstract: Based on the novel concept of multivariate center-outward quantiles introduced recently in Chernozhukov et al. (2017) and Hallin et al. (2021), we are considering the problem of nonparametric multiple-output quantile regression. Our approach defines nested conditional center-outward quantile regression contours and regions with given conditional probability content irrespective of the underlying d…

    Submitted 26 April, 2022; v1 submitted 25 April, 2022; originally announced April 2022.

    Comments: 36 pages

    MSC Class: G.3

  3. arXiv:2102.02572

    math.ST stat.ME

    The complex behaviour of Galton rank order statistic

    Authors: E. del Barrio, J. A. Cuesta-Albertos, C. Matran

    Abstract: Galton's rank order statistic is one of the oldest statistical tools for two-sample comparisons. It is also a very natural index to measure departures from stochastic dominance. Yet, its asymptotic behaviour has been investigated only partially, under restrictive assumptions. This work provides a comprehensive study of this behaviour, based on the analysis of the so-called contact set (a modific…

    Submitted 4 February, 2021; originally announced February 2021.

    Comments: 35 pages. No figures

    MSC Class: 60E15

  4. arXiv:2006.06520

    cs.LG stat.ML

    Achieving robustness in classification using optimal transport with hinge regularization

    Authors: Mathieu Serrurier, Franck Mamalet, Alberto González-Sanz, Thibaut Boissin, Jean-Michel Loubes, Eustasio del Barrio

    Abstract: Adversarial examples have pointed out the vulnerability of Deep Neural Networks to small local noise. It has been shown that constraining their Lipschitz constant should enhance robustness, but makes them harder to learn with classical loss functions. We propose a new framework for binary classification, based on optimal transport, which integrates this Lipschitz constraint as a theoretical requirement. W…

    Submitted 26 April, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: Accepted by CVPR 2021

  5. arXiv:2005.13755

    stat.ML cs.LG

    Review of Mathematical frameworks for Fairness in Machine Learning

    Authors: Eustasio del Barrio, Paula Gordaliza, Jean-Michel Loubes

    Abstract: A review of the main fairness definitions and fair learning methodologies proposed in the literature over the last years is presented from a mathematical point of view. Following our independence-based approach, we consider how to build fair algorithms and the consequences on the degradation of their performance compared to the possibly unfair case. This corresponds to the price for fairness given…

    Submitted 26 May, 2020; originally announced May 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:2001.07864, arXiv:1911.04322, arXiv:1906.05082 by other authors

  6. arXiv:2003.14263

    stat.ML cs.CY cs.LG

    A survey of bias in Machine Learning through the prism of Statistical Parity for the Adult Data Set

    Authors: Philippe Besse, Eustasio del Barrio, Paula Gordaliza, Jean-Michel Loubes, Laurent Risser

    Abstract: Applications based on Machine Learning models have now become an indispensable part of everyday life and the professional world. A critical question has recently arisen among the population: Do algorithmic decisions convey any type of discrimination against specific groups of population or minorities? In this paper, we show the importance of understanding how a bias can be introduced into aut…

    Submitted 6 April, 2020; v1 submitted 31 March, 2020; originally announced March 2020.

  7. arXiv:1907.08006

    stat.ML cs.LG

    optimalFlow: Optimal-transport approach to flow cytometry gating and population matching

    Authors: Eustasio del Barrio, Hristo Inouzhe, Jean-Michel Loubes, Carlos Matrán, Agustín Mayo-Íscar

    Abstract: Data obtained from Flow Cytometry present pronounced variability due to biological and technical reasons. Biological variability is a well-known phenomenon produced by measurements on different individuals, with different characteristics such as illness, age, sex, etc. The use of different settings for measurement, the variation of the conditions during experiments and the different types of flow…

    Submitted 29 April, 2020; v1 submitted 18 July, 2019; originally announced July 2019.

    Comments: 26 pages, 6 figures, 5 tables

  8. arXiv:1904.05254

    stat.ML cs.LG

    Attraction-Repulsion clustering with applications to fairness

    Authors: Eustasio del Barrio, Hristo Inouzhe, Jean-Michel Loubes

    Abstract: We consider the problem of diversity enhancing clustering, i.e., developing clustering methods which produce clusters that favour diversity with respect to a set of protected attributes such as race, sex, age, etc. In the context of fair clustering, diversity plays a major role when fairness is understood as demographic parity. To promote diversity, we introduce perturbations to the distance in the…

    Submitted 26 October, 2021; v1 submitted 10 April, 2019; originally announced April 2019.

    Comments: 35 pages, 11 figures, 5 tables

    MSC Class: 62H30; 68T10

  9. On approximate validation of models: A Kolmogorov-Smirnov based approach

    Authors: Eustasio del Barrio, Hristo Inouzhe, Carlos Matrán

    Abstract: Classical tests of fit typically reject a model for large enough real data samples. In contrast, often in statistical practice a model offers a good description of the data even though it is not the "true" random generator. We consider a more flexible approach based on contamination neighbourhoods around a model. Using trimming methods and the Kolmogorov metric we introduce a functional statistic…

    Submitted 20 March, 2019; originally announced March 2019.

    Comments: 14 figures, 32 pages

  10. arXiv:1807.06362

    stat.ML cs.LG math.ST

    Confidence Intervals for Testing Disparate Impact in Fair Learning

    Authors: Philippe Besse, Eustasio del Barrio, Paula Gordaliza, Jean-Michel Loubes

    Abstract: We provide the asymptotic distribution of the major indexes used in the statistical literature to quantify disparate treatment in machine learning. We aim at promoting the use of confidence intervals when testing the so-called group disparate impact. We illustrate on some examples the importance of using confidence intervals and not a single value.

    Submitted 17 July, 2018; originally announced July 2018.

  11. arXiv:1806.01238

    stat.ME

    Center-Outward Distribution Functions, Quantiles, Ranks, and Signs in $\mathbb{R}^d$

    Authors: Eustasio del Barrio, Juan A. Cuesta-Albertos, Marc Hallin, Carlos Matrán

    Abstract: Univariate concepts such as quantile and distribution functions involving ranks and signs do not canonically extend to $\mathbb{R}^d, d\geq 2$. Palliating that has generated an abundant literature. Chapter 1 shows that, unlike the many definitions that have been proposed so far, the measure transportation-based ones introduced in Chernozhukov et al. (2017) enjoy all the properties that make univariate…

    Submitted 27 February, 2020; v1 submitted 4 June, 2018; originally announced June 2018.

    Comments: 66 pages, 6 figures

  12. arXiv:1804.02905

    stat.ME

    Invariant measures of disagreement with stochastic dominance

    Authors: E. del Barrio, J. A. Cuesta-Albertos, C. Matran

    Abstract: An essential feature of stochastic order is its invariance against increasing maps. In this paper, we analyze a family of invariant indices of disagreement with respect to stochastic dominance. The indices in this family admit the representation $θ(F,G)=P(X>Y)$, where $(X,Y)$ is a random vector with marginal distribution functions $F$ and $G$. This includes the case of independent marginals, but a…

    Submitted 25 March, 2022; v1 submitted 9 April, 2018; originally announced April 2018.

    Comments: 36 pages, 4 figures

    MSC Class: 60E15

  13. arXiv:1705.01788

    stat.ME

    An optimal transportation approach for assessing almost stochastic order

    Authors: E. del Barrio, J. A. Cuesta-Albertos, C. Matrán

    Abstract: When stochastic dominance $F\leq_{st}G$ does not hold, we can improve agreement to stochastic order by suitably trimming both distributions. In this work we consider the $L_2-$Wasserstein distance, $\mathcal W_2$, to stochastic order of these trimmed versions. Our characterization for that distance naturally leads to consider a $\mathcal W_2$-based index of disagreement with stochastic order,…

    Submitted 4 May, 2017; originally announced May 2017.

  14. Models for the assessment of treatment improvement: the ideal and the feasible

    Authors: P. C. Álvarez-Esteban, E. del Barrio, J. A. Cuesta-Albertos, C. Matrán

    Abstract: Comparisons of different treatments or production processes are the goals of a significant fraction of applied research. Unsurprisingly, two-sample problems play a main role in Statistics through natural questions such as `Is the new treatment significantly better than the old?'. However, this is only partially answered by some of the usual statistical tools for this task. More importantly, of…

    Submitted 18 April, 2017; v1 submitted 5 December, 2016; originally announced December 2016.

  15. Robust clustering tools based on optimal transportation

    Authors: E. del Barrio, J. A. Cuesta-Albertos, C. Matrán, A. Mayo-Íscar

    Abstract: A robust clustering method for probabilities in Wasserstein space is introduced. This new "trimmed $k$-barycenters" approach relies on recent results on barycenters in Wasserstein space that allow intensive computation, as required by clustering algorithms. The possibility of trimming the most discrepant distributions results in a gain in stability and robustness, highly convenient in this setting…

    Submitted 23 November, 2016; v1 submitted 5 July, 2016; originally announced July 2016.

    Journal ref: Statistics and Computing (2019), 29, 139-160. The final publication is available at link.springer.com

  16. arXiv:1511.05355

    stat.CO math.PR

    A fixed-point approach to barycenters in Wasserstein space

    Authors: Pedro C. Álvarez-Esteban, E. del Barrio, J. A. Cuesta-Albertos, C. Matrán

    Abstract: Let $\mathcal{P}_{2,ac}$ be the set of Borel probabilities on $\mathbb{R}^d$ with finite second moment and absolutely continuous with respect to Lebesgue measure. We consider the problem of finding the barycenter (or Fréchet mean) of a finite set of probabilities $ν_1,\ldots,ν_k \in \mathcal{P}_{2,ac}$ with respect to the $L_2-$Wasserstein metric. For this task we introduce an operator on…

    Submitted 22 April, 2016; v1 submitted 17 November, 2015; originally announced November 2015.

    Comments: 18 pages, 2 figures

    MSC Class: 60B05 (Primary); 47H10; 47J25; 65D99 (Secondary)

  17. arXiv:1511.05350

    stat.ME

    Wide Consensus for Parallelized Inference

    Authors: P. C. Álvarez-Esteban, E. del Barrio, J. A. Cuesta-Albertos, C. Matrán

    Abstract: We develop a general theory to address a consensus-based combination of estimations in a parallelized or distributed estimation setting. Taking into account the possibility of very discrepant estimations, instead of a full consensus we consider a "wide consensus" procedure. The approach is based on the consideration of trimmed barycenters in the Wasserstein space of probability distributions on R^…

    Submitted 11 May, 2017; v1 submitted 17 November, 2015; originally announced November 2015.

    Comments: 27 pages, 3 figures

    MSC Class: Primary: 62F35; Secondary 62H12

  18. arXiv:1412.1920

    stat.ME

    A contamination model for approximate stochastic order: extended version

    Authors: Pedro C. Alvarez-Esteban, Eustasio del Barrio, Juan A. Cuesta-Albertos, Carlos Matran

    Abstract: Stochastic ordering among distributions has been considered in a variety of scenarios. Economic studies often involve research about the ordering of investment strategies or social welfare. However, as noted in the literature, stochastic orderings are often too strong an assumption, one which is not supported by the data even in cases in which the researcher tends to believe that a certain variable is s…

    Submitted 5 December, 2014; originally announced December 2014.

    MSC Class: 60E15; 62G10