Showing 1–27 of 27 results for author: Dalalyan, A S

  1. arXiv:2304.03590  [pdf, other]

    math.ST cs.LG

    Graphon Estimation in bipartite graphs with observable edge labels and unobservable node labels

    Authors: Etienne Donier-Meroz, Arnak S. Dalalyan, Francis Kramarz, Philippe Choné, Xavier D'Haultfoeuille

    Abstract: Many real-world data sets can be presented in the form of a matrix whose entries correspond to the interaction between two entities of different natures (number of times a web user visits a web page, a student's grade in a subject, a patient's rating of a doctor, etc.). We assume in this paper that the mentioned interaction is determined by unobservable latent variables describing each entity. Our…

    Submitted 4 September, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

  2. arXiv:2212.12950  [pdf, ps, other]

    math.ST

    Simple proof of the risk bound for denoising by exponential weights for asymmetric noise distributions

    Authors: Arnak S. Dalalyan

    Abstract: In this note, we consider the problem of aggregation of estimators in order to denoise a signal. The main contribution is a short proof of the fact that the exponentially weighted aggregate satisfies a sharp oracle inequality. While this result was already known for a wide class of symmetric noise distributions, the extension to asymmetric distributions presented in this note is new.

    Submitted 25 December, 2022; originally announced December 2022.

  3. arXiv:2204.02323  [pdf, other]

    math.ST cs.LG

    Nearly minimax robust estimator of the mean vector by iterative spectral dimension reduction

    Authors: Amir-Hossein Bateni, Arshak Minasyan, Arnak S. Dalalyan

    Abstract: We study the problem of robust estimation of the mean vector of a sub-Gaussian distribution. We introduce an estimator based on spectral dimension reduction (SDR) and establish a finite sample upper bound on its error that is minimax-optimal up to a logarithmic factor. Furthermore, we prove that the breakdown point of the SDR estimator is equal to $1/2$, the highest possible value of the breakdown…

    Submitted 5 April, 2022; originally announced April 2022.

  4. arXiv:2112.11086  [pdf, other]

    math.ST

    Risk bounds for aggregated shallow neural networks using Gaussian prior

    Authors: Laura Tinsi, Arnak S. Dalalyan

    Abstract: Analysing statistical properties of neural networks is a central topic in statistics and machine learning. However, most results in the literature focus on the properties of the neural network minimizing the training error. The goal of this paper is to consider aggregated neural networks using a Gaussian prior. The departure point of our approach is an arbitrary aggregate satisfying the PAC-Bayesi…

    Submitted 2 February, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

  5. arXiv:2006.13998  [pdf, other]

    math.ST cs.LG stat.ML

    Penalized Langevin dynamics with vanishing penalty for smooth and log-concave targets

    Authors: Avetik Karagulyan, Arnak S. Dalalyan

    Abstract: We study the problem of sampling from a probability distribution on $\mathbb R^p$ defined via a convex and smooth potential function. We consider a continuous-time diffusion-type process, termed Penalized Langevin dynamics (PLD), the drift of which is the negative gradient of the potential plus a linear penalty that vanishes when time goes to infinity. An upper bound on the Wasserstein-2 distance…

    Submitted 24 June, 2020; originally announced June 2020.

  6. All-In-One Robust Estimator of the Gaussian Mean

    Authors: Arnak S. Dalalyan, Arshak Minasyan

    Abstract: The goal of this paper is to show that a single robust estimator of the mean of a multivariate Gaussian distribution can enjoy five desirable properties. First, it is computationally tractable in the sense that it can be computed in a time which is at most polynomial in dimension, sample size and the logarithm of the inverse of the contamination rate. Second, it is equivariant by translations, uni…

    Submitted 4 March, 2021; v1 submitted 4 February, 2020; originally announced February 2020.

    Comments: 41 pages, 5 figures; added sub-Gaussian case with unknown Sigma or eps

    Journal ref: Ann. Statist. 50(2): 1193-1219 (April 2022)

  7. arXiv:1906.08530  [pdf, other]

    math.ST cs.LG

    Bounding the error of discretized Langevin algorithms for non-strongly log-concave targets

    Authors: Arnak S. Dalalyan, Avetik Karagulyan, Lionel Riou-Durand

    Abstract: In this paper, we provide non-asymptotic upper bounds on the error of sampling from a target density using three schemes of discretized Langevin diffusions. The first scheme is the Langevin Monte Carlo (LMC) algorithm, the Euler discretization of the Langevin diffusion. The second and the third schemes are, respectively, the kinetic Langevin Monte Carlo (KLMC) for differentiable potentials and the…

    Submitted 5 December, 2021; v1 submitted 20 June, 2019; originally announced June 2019.

  8. arXiv:1904.06288  [pdf, other]

    math.ST cs.LG

    Outlier-robust estimation of a sparse linear model using $\ell_1$-penalized Huber's $M$-estimator

    Authors: Arnak S. Dalalyan, Philip Thompson

    Abstract: We study the problem of estimating a $p$-dimensional $s$-sparse vector in a linear model with Gaussian design and additive noise. In the case where the labels are contaminated by at most $o$ adversarial outliers, we prove that the $\ell_1$-penalized Huber's $M$-estimator based on $n$ samples attains the optimal rate of convergence $(s/n)^{1/2} + (o/n)$, up to a logarithmic factor. For more general…

    Submitted 19 November, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: This is a follow up paper of arXiv:1805.08020

  9. arXiv:1903.06576  [pdf, other]

    math.ST cs.LG

    A nonasymptotic law of iterated logarithm for general M-estimators

    Authors: Victor-Emmanuel Brunel, Arnak S. Dalalyan, Nicolas Schreuder

    Abstract: M-estimators are ubiquitous in machine learning and statistical learning theory. They are used both for defining prediction strategies and for evaluating their precision. In this paper, we propose the first non-asymptotic "any-time" deviation bounds for general M-estimators, where "any-time" means that the bound holds with a prescribed probability for every sample size. These bounds are nonasympto…

    Submitted 24 May, 2019; v1 submitted 15 March, 2019; originally announced March 2019.

  10. arXiv:1902.04650  [pdf, other]

    math.ST cs.LG

    Confidence regions and minimax rates in outlier-robust estimation on the probability simplex

    Authors: Amir-Hossein Bateni, Arnak S. Dalalyan

    Abstract: We consider the problem of estimating the mean of a distribution supported by the $k$-dimensional probability simplex in the setting where an $\varepsilon$ fraction of observations are subject to adversarial corruption. A simple particular example is the problem of estimating the distribution of a discrete random variable. Assuming that the discrete variable takes $k$ values, the unknown parameter…

    Submitted 1 February, 2020; v1 submitted 12 February, 2019; originally announced February 2019.

  11. arXiv:1807.09382  [pdf, other]

    math.PR cs.LG math.ST

    On sampling from a log-concave density using kinetic Langevin diffusions

    Authors: Arnak S. Dalalyan, Lionel Riou-Durand

    Abstract: Langevin diffusion processes and their discretizations are often used for sampling from a target density. The most convenient framework for assessing the quality of such a sampling scheme corresponds to smooth and strongly log-concave densities defined on $\mathbb R^p$. The present work focuses on this framework and studies the behavior of Monte Carlo algorithms based on discretizations of the kin…

    Submitted 26 December, 2018; v1 submitted 24 July, 2018; originally announced July 2018.

    Comments: In this version, the bound in Theorem 3 is better than in previous versions, in terms of its dependence on the condition number

  12. arXiv:1806.09405  [pdf, other]

    math.ST

    Exponential weights in multivariate regression and a low-rankness favoring prior

    Authors: Arnak S. Dalalyan

    Abstract: We establish theoretical guarantees for the expected prediction error of the exponential weighting aggregate in the case of multivariate regression, that is, when the label vector is multidimensional. We consider the regression model with fixed design and bounded noise. The first new feature uncovered by our guarantees is that it is not necessary to require independence of the observations: a symmet…

    Submitted 25 June, 2018; originally announced June 2018.

  13. arXiv:1805.08020  [pdf, ps, other]

    math.ST

    Restricted eigenvalue property for corrupted Gaussian designs

    Authors: Philip Thompson, Arnak S. Dalalyan

    Abstract: Motivated by the construction of tractable robust estimators via convex relaxations, we present conditions on the sample size which guarantee an augmented notion of Restricted Eigenvalue-type condition for Gaussian designs. Such a notion is suitable for high-dimensional robust inference in a Gaussian linear model and a multivariate Gaussian model when samples are corrupted by outliers either in th…

    Submitted 30 November, 2018; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: 21 pages. Some updates and corrections

  14. arXiv:1712.05495  [pdf, ps, other]

    math.ST

    Minimax estimation of a p-dimensional linear functional in sparse Gaussian models and robust estimation of the mean

    Authors: Olivier Collier, Arnak S. Dalalyan

    Abstract: We consider two problems of estimation in high-dimensional Gaussian models. The first problem is that of estimating a linear functional of the means of $n$ independent $p$-dimensional Gaussian vectors, under the assumption that most of these means are equal to zero. We show that, up to a logarithmic factor, the minimax rate of estimation in squared Euclidean norm is between $(s^2\wedge n) +sp$ and…

    Submitted 8 November, 2018; v1 submitted 14 December, 2017; originally announced December 2017.

  15. arXiv:1710.00095  [pdf, other]

    math.ST cs.LG math.PR stat.CO stat.ML

    User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient

    Authors: Arnak S. Dalalyan, Avetik G. Karagulyan

    Abstract: In this paper, we study the problem of sampling from a given probability density function that is known to be smooth and strongly log-concave. We analyze several methods of approximate sampling based on discretizations of the (highly overdamped) Langevin diffusion and establish guarantees on its error measured in the Wasserstein-2 distance. Our guarantees improve or extend the state-of-the-art res…

    Submitted 23 February, 2024; v1 submitted 29 September, 2017; originally announced October 2017.

    Journal ref: Stochastic Processes and their Applications, Volume 129, Issue 12, December 2019, Pages 5278-5311

  16. arXiv:1704.04752  [pdf, other]

    math.ST

    Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent

    Authors: Arnak S. Dalalyan

    Abstract: In this paper, we revisit the recently established theoretical guarantees for the convergence of the Langevin Monte Carlo algorithm of sampling from a smooth and (strongly) log-concave density. We improve the existing results when the convergence is measured in the Wasserstein distance and provide further insights on the very tight relations between, on the one hand, the Langevin Monte Carlo for s…

    Submitted 28 July, 2017; v1 submitted 16 April, 2017; originally announced April 2017.

    Comments: Updated version of the COLT 2017 paper, some typos are corrected and Theorem 3 slightly improved

  17. arXiv:1701.05009  [pdf, ps, other]

    math.ST

    Optimal Kullback-Leibler Aggregation in Mixture Density Estimation by Maximum Likelihood

    Authors: Arnak S. Dalalyan, Mehdi Sebbar

    Abstract: We study the maximum likelihood estimator of density of $n$ independent observations, under the assumption that it is well approximated by a mixture with a large number of components. The main focus is on statistical properties with respect to the Kullback-Leibler loss. We establish risk bounds taking the form of sharp oracle inequalities both in deviation and in expectation. A simple consequence…

    Submitted 18 January, 2017; originally announced January 2017.

  18. arXiv:1611.08483  [pdf, ps, other]

    math.ST cs.LG

    On the Exponentially Weighted Aggregate with the Laplace Prior

    Authors: Arnak S. Dalalyan, Edwin Grappin, Quentin Paris

    Abstract: In this paper, we study the statistical behaviour of the Exponentially Weighted Aggregate (EWA) in the problem of high-dimensional regression with fixed design. Under the assumption that the underlying regression vector is sparse, it is reasonable to use the Laplace distribution as a prior. The resulting estimator and, specifically, a particular instance of it referred to as the Bayesian lasso, wa…

    Submitted 25 November, 2016; originally announced November 2016.

    Comments: 30 pages, 2 figures

  19. arXiv:1606.06179  [pdf, ps, other]

    math.ST stat.ML

    On the prediction loss of the lasso in the partially labeled setting

    Authors: Pierre C. Bellec, Arnak S. Dalalyan, Edwin Grappin, Quentin Paris

    Abstract: In this paper we revisit the risk bounds of the lasso estimator in the context of transductive and semi-supervised learning. In other terms, the setting under consideration is that of regression with random design under partial labeling. The main goal is to obtain user-friendly bounds on the off-sample prediction risk. To this end, the simple setting of bounded response variable and bounded (high-…

    Submitted 8 November, 2016; v1 submitted 20 June, 2016; originally announced June 2016.

    Comments: 25 pages

  20. arXiv:1504.04696  [pdf, other]

    math.ST stat.CO stat.ME

    On estimation of the diagonal elements of a sparse precision matrix

    Authors: Samuel Balmand, Arnak S. Dalalyan

    Abstract: In this paper, we present several estimators of the diagonal elements of the inverse of the covariance matrix, called precision matrix, of a sample of iid random vectors. The focus is on high dimensional vectors having a sparse precision matrix. It is now well understood that when the underlying distribution is Gaussian, the columns of the precision matrix can be estimated independently from one a…

    Submitted 25 May, 2016; v1 submitted 18 April, 2015; originally announced April 2015.

    Comments: Companion R package at http://cran.r-project.org/web/packages/DESP/index.html

    Journal ref: Electron. J. Statist. Volume 10, Number 1, 1551-1579 (2016)

  21. Discussion on the paper: Hypotheses testing by convex optimization by Goldenshluger, Juditsky and Nemirovski

    Authors: Arnak S. Dalalyan

    Abstract: We briefly discuss some interesting questions related to the paper "Hypotheses testing by convex optimization" by Goldenshluger, Juditsky and Nemirovski.

    Submitted 2 February, 2015; originally announced February 2015.

    Comments: To appear in the EJS

    Journal ref: Electron. J. Statist. Volume 9, Number 2 (2015), 1733-1737

  22. arXiv:1412.7392  [pdf, other]

    stat.CO math.ST stat.ML

    Theoretical guarantees for approximate sampling from smooth and log-concave densities

    Authors: Arnak S. Dalalyan

    Abstract: Sampling from various kinds of distributions is an issue of paramount importance in statistics since it is often the key ingredient for constructing estimators, test procedures or confidence intervals. In many situations, the exact sampling from a given distribution is impossible or computationally expensive and, therefore, one needs to resort to approximate sampling strategies. However, there is…

    Submitted 3 December, 2016; v1 submitted 23 December, 2014; originally announced December 2014.

    Comments: To appear in JRSS B

  23. arXiv:1402.1700  [pdf, ps, other]

    math.ST stat.ML

    On the Prediction Performance of the Lasso

    Authors: Arnak S. Dalalyan, Mohamed Hebiri, Johannes Lederer

    Abstract: Although the Lasso has been extensively studied, the relationship between its prediction performance and the correlations of the covariates is not fully understood. In this paper, we give new insights into this relationship in the context of multiple linear regression. We show, in particular, that the incorporation of a simple correlation measure into the tuning parameter can lead to a nearly opti…

    Submitted 8 November, 2016; v1 submitted 7 February, 2014; originally announced February 2014.

    Journal ref: Bernoulli 23(1), 2017, 552-581

  24. arXiv:1310.4661  [pdf, other]

    math.ST cs.LG

    Minimax rates in permutation estimation for feature matching

    Authors: Olivier Collier, Arnak S. Dalalyan

    Abstract: The problem of matching two sets of features appears in various tasks of computer vision and can be often formalized as a problem of permutation estimation. We address this problem from a statistical point of view and provide a theoretical analysis of the accuracy of several natural estimators. To this end, the minimax rate of separation is investigated and its expression is obtained as a function…

    Submitted 2 February, 2015; v1 submitted 17 October, 2013; originally announced October 2013.

  25. arXiv:1104.4210  [pdf, ps, other]

    math.ST

    Curve registration by nonparametric goodness-of-fit testing

    Authors: Olivier Collier, Arnak S. Dalalyan

    Abstract: The problem of curve registration appears in many different areas of applications ranging from neuroscience to road traffic modeling. In the present work, we propose a nonparametric testing framework in which we develop a generalized likelihood ratio test to perform curve registration. We first prove that, under the null hypothesis, the resulting test statistic is asymptotically distributed as a c…

    Submitted 19 February, 2015; v1 submitted 21 April, 2011; originally announced April 2011.

  26. Mirror averaging with sparsity priors

    Authors: Arnak S. Dalalyan, Alexandre B. Tsybakov

    Abstract: We consider the problem of aggregating the elements of a possibly infinite dictionary for building a decision procedure that aims at minimizing a given criterion. Along with the dictionary, an independent identically distributed training sample is available, on which the performance of a given procedure can be tested. In a fairly general set-up, we establish an oracle inequality for the Mirror Ave…

    Submitted 17 August, 2012; v1 submitted 5 March, 2010; originally announced March 2010.

    Comments: Published at http://dx.doi.org/10.3150/11-BEJ361 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    Report number: IMS-BEJ-BEJ361

    Journal ref: Bernoulli 18, 3 (2012) 914-944

  27. Penalized maximum likelihood and semiparametric second-order efficiency

    Authors: A. S. Dalalyan, G. K. Golubev, A. B. Tsybakov

    Abstract: We consider the problem of estimation of a shift parameter of an unknown symmetric function in Gaussian white noise. We introduce a notion of semiparametric second-order efficiency and propose estimators that are semiparametrically efficient and second-order efficient in our model. These estimators are of a penalized maximum likelihood type with an appropriately chosen penalty. We argue that sec…

    Submitted 16 May, 2006; originally announced May 2006.

    Comments: Published at http://dx.doi.org/10.1214/009053605000000895 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS0107

    MSC Class: 62G05; 62G20 (Primary)

    Journal ref: Annals of Statistics 2006, Vol. 34, No. 1, 169-201