Showing 1–9 of 9 results for author: Gatmiry, K

Searching in archive math.
  1. arXiv:2404.18869  [pdf, ps, other]

    cs.LG cs.DS math.PR math.ST stat.ML

    Learning Mixtures of Gaussians Using Diffusion Models

    Authors: Khashayar Gatmiry, Jonathan Kelner, Holden Lee

    Abstract: We give a new algorithm for learning mixtures of $k$ Gaussians (with identity covariance in $\mathbb{R}^n$) to TV error $\varepsilon$, with quasi-polynomial ($O(n^{\text{poly log}\left(\frac{n+k}{\varepsilon}\right)})$) time and sample complexity, under a minimum weight assumption. Unlike previous approaches, most of which are algebraic in nature, our approach is analytic and relies on the framewo…

    Submitted 29 April, 2024; originally announced April 2024.

  2. arXiv:2306.11121  [pdf, ps, other]

    math.OC cs.LG

    Projection-Free Online Convex Optimization via Efficient Newton Iterations

    Authors: Khashayar Gatmiry, Zakaria Mhammedi

    Abstract: This paper presents new projection-free algorithms for Online Convex Optimization (OCO) over a convex domain $\mathcal{K} \subset \mathbb{R}^d$. Classical OCO algorithms (such as Online Gradient Descent) typically need to perform Euclidean projections onto the convex set $\mathcal{K}$ to ensure feasibility of their iterates. Alternative algorithms, such as those based on the Frank-Wolfe method, swap poten…

    Submitted 19 June, 2023; originally announced June 2023.

  3. arXiv:2303.00480  [pdf, other]

    cs.DS cs.LO math.FA math.NA stat.ML

    Sampling with Barriers: Faster Mixing via Lewis Weights

    Authors: Khashayar Gatmiry, Jonathan Kelner, Santosh S. Vempala

    Abstract: We analyze Riemannian Hamiltonian Monte Carlo (RHMC) for sampling a polytope defined by $m$ inequalities in $\mathbb{R}^n$ endowed with the metric defined by the Hessian of a convex barrier function. The advantage of RHMC over Euclidean methods such as the ball walk, hit-and-run and the Dikin walk is in its ability to take longer steps. However, in all previous work, the mixing rate has a linear dependenc…

    Submitted 19 April, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  4. arXiv:2212.13669  [pdf, ps, other]

    cs.LG math.OC

    Optimal algorithms for group distributionally robust optimization and beyond

    Authors: Tasuku Soma, Khashayar Gatmiry, Stefanie Jegelka

    Abstract: Distributionally robust optimization (DRO) can improve the robustness and fairness of learning methods. In this paper, we devise stochastic algorithms for a class of DRO problems including group DRO, subpopulation fairness, and empirical conditional value at risk (CVaR) optimization. Our new algorithms achieve faster convergence rates than existing algorithms for multiple DRO settings. We also pro…

    Submitted 27 December, 2022; originally announced December 2022.

  5. arXiv:2211.01357  [pdf, ps, other]

    math.OC cs.LG

    Quasi-Newton Steps for Efficient Online Exp-Concave Optimization

    Authors: Zakaria Mhammedi, Khashayar Gatmiry

    Abstract: The aim of this paper is to design computationally-efficient and optimal algorithms for the online and stochastic exp-concave optimization settings. Typical algorithms for these settings, such as the Online Newton Step (ONS), can guarantee an $O(d\ln T)$ bound on their regret after $T$ rounds, where $d$ is the dimension of the feasible set. However, such algorithms perform so-called generalized pro…

    Submitted 14 February, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: First revision: presentation improvements

  6. arXiv:2208.07951  [pdf, other]

    cs.LG math.DS math.OC stat.ML

    On the generalization of learning algorithms that do not converge

    Authors: Nisha Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka

    Abstract: Generalization analyses of deep learning typically assume that the training converges to a fixed point. But recent results indicate that, in practice, the weights of deep neural networks optimized with stochastic gradient descent often oscillate indefinitely. To reduce this discrepancy between theory and practice, this paper focuses on the generalization of neural networks whose training dynamics…

    Submitted 19 August, 2022; v1 submitted 16 August, 2022; originally announced August 2022.

    Comments: 27 pages, under review

  7. arXiv:2204.10818  [pdf, ps, other]

    cs.LG math.DG math.PR

    Convergence of the Riemannian Langevin Algorithm

    Authors: Khashayar Gatmiry, Santosh S. Vempala

    Abstract: We study the Riemannian Langevin Algorithm for the problem of sampling from a distribution with density $\nu$ with respect to the natural measure on a manifold with metric $g$. We assume that the target density satisfies a log-Sobolev inequality with respect to the metric and prove that the manifold generalization of the Unadjusted Langevin Algorithm converges rapidly to $\nu$ for Hessian manifolds. T…

    Submitted 22 April, 2022; originally announced April 2022.

    MSC Class: 60K60; 58D17

    ACM Class: F.2; G.3

  8. arXiv:2008.03650  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Testing Determinantal Point Processes

    Authors: Khashayar Gatmiry, Maryam Aliakbarpour, Stefanie Jegelka

    Abstract: Determinantal point processes (DPPs) are popular probabilistic models of diversity. In this paper, we investigate DPPs from a new perspective: property testing of distributions. Given sample access to an unknown distribution $q$ over the subsets of a ground set, we aim to distinguish whether $q$ is a DPP distribution, or $\varepsilon$-far from all DPP distributions in $\ell_1$-distance. In this work, we pro…

    Submitted 9 August, 2020; originally announced August 2020.

  9. arXiv:1811.07307  [pdf, ps, other]

    math.ST cs.IT

    Information Theoretic Bounds on Optimal Worst-case Error in Binary Mixture Identification

    Authors: Khashayar Gatmiry, Seyed Abolfazl Motahari

    Abstract: Identification of latent binary sequences from a pool of noisy observations has a wide range of applications in both statistical learning and population genetics. Each observed sequence is the result of passing one of the latent mother-sequences through a binary symmetric channel, which makes this configuration analogous to a special case of Bernoulli Mixture Models. This paper aims to attain an a…

    Submitted 27 November, 2018; v1 submitted 18 November, 2018; originally announced November 2018.