Showing 1–10 of 10 results for author: Canonne, C L

  1. arXiv:2310.06333

    cs.LG cs.DS math.PR math.ST stat.ML

    Learning bounded-degree polytrees with known skeleton

    Authors: Davin Choo, Joy Qiping Yang, Arnab Bhattacharyya, Clément L. Canonne

    Abstract: We establish finite-sample guarantees for efficient proper learning of bounded-degree polytrees, a rich class of high-dimensional probability distributions and a subclass of Bayesian networks, a widely-studied type of graphical model. Recently, Bhattacharyya et al. (2021) obtained finite-sample guarantees for recovering tree-structured Bayesian networks, i.e., 1-polytrees. We extend their results…

    Submitted 21 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Fixed some typos. Added some discussions. Accepted to ALT 2024

  2. arXiv:2308.06239

    cs.LG cs.CR stat.ML

    Private Distribution Learning with Public Data: The View from Sample Compression

    Authors: Shai Ben-David, Alex Bie, Clément L. Canonne, Gautam Kamath, Vikrant Singhal

    Abstract: We study the problem of private distribution learning with access to public data. In this setup, which we refer to as public-private learning, the learner is given public and private samples drawn from an unknown distribution $p$ belonging to a class $\mathcal Q$, with the goal of outputting an estimate of $p$ while adhering to privacy constraints (here, pure differential privacy) only with respec…

    Submitted 14 August, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

    Comments: 31 pages

  3. arXiv:2302.06869

    stat.ML cs.DM cs.IT cs.LG math.PR

    Concentration Bounds for Discrete Distribution Estimation in KL Divergence

    Authors: Clément L. Canonne, Ziteng Sun, Ananda Theertha Suresh

    Abstract: We study the problem of discrete distribution estimation in KL divergence and provide concentration bounds for the Laplace estimator. We show that the deviation from mean scales as $\sqrt{k}/n$ when $n \ge k$, improving upon the best prior result of $k/n$. We also establish a matching lower bound that shows that our bounds are tight up to polylogarithmic factors.

    Submitted 12 June, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: Updated discussion of previous work

  4. arXiv:2207.03652

    math.ST cs.CR cs.LG stat.ME

    Private independence testing across two parties

    Authors: Praneeth Vepakomma, Mohammad Mohammadi Amiri, Clément L. Canonne, Ramesh Raskar, Alex Pentland

    Abstract: We introduce $π$-test, a privacy-preserving algorithm for testing statistical independence between data distributed across multiple parties. Our algorithm relies on privately estimating the distance correlation between datasets, a quantitative measure of independence introduced in Székely et al. [2007]. We establish both additive and multiplicative error bounds on the utility of our differentially…

    Submitted 26 September, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

  5. arXiv:2205.07488

    cs.IT cs.LG math.ST stat.ML

    Robust Testing in High-Dimensional Sparse Models

    Authors: Anand Jerry George, Clément L. Canonne

    Abstract: We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models. In the first model, we are given $n$ i.i.d. samples from the distribution $\mathcal{N}\left(θ,I_d\right)$ (with unknown $θ$), of which a small fraction has been arbitrarily corrupted. Under the promise that $\|θ\|_0\le s$, we want to correctly distinguish whether…

    Submitted 4 November, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

    Comments: Fixed typos, added a figure and discussion section

  6. arXiv:2108.08987

    cs.DS cs.CR cs.DM stat.ML

    Uniformity Testing in the Shuffle Model: Simpler, Better, Faster

    Authors: Clément L. Canonne, Hongyi Lyu

    Abstract: Uniformity testing, or testing whether independent observations are uniformly distributed, is the prototypical question in distribution testing. Over the past years, a line of work has been focusing on uniformity testing under privacy constraints on the data, and obtained private and data-efficient algorithms under various privacy models such as central differential privacy (DP), local privacy (LD…

    Submitted 18 October, 2021; v1 submitted 19 August, 2021; originally announced August 2021.

    Comments: Accepted to the SIAM Symposium on Simplicity in Algorithms (SOSA 2022). Added some details and discussions

  7. arXiv:2106.13414

    cs.DS cs.IT math.PR math.ST stat.ML

    The Price of Tolerance in Distribution Testing

    Authors: Clément L. Canonne, Ayush Jain, Gautam Kamath, Jerry Li

    Abstract: We revisit the problem of tolerant distribution testing. That is, given samples from an unknown distribution $p$ over $\{1, \dots, n\}$, is it $\varepsilon_1$-close to or $\varepsilon_2$-far from a reference distribution $q$ (in total variation distance)? Despite significant interest over the past decade, this problem is well understood only in the extreme cases. In the noiseless setting (i.e.,…

    Submitted 8 November, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

    Comments: Added a result on instance-optimal testing, and further discussion in the introduction

  8. arXiv:2004.00010

    cs.DS cs.CR stat.ML

    The Discrete Gaussian for Differential Privacy

    Authors: Clément L. Canonne, Gautam Kamath, Thomas Steinke

    Abstract: A key tool for building differentially private systems is adding Gaussian noise to the output of a function evaluated on a sensitive dataset. Unfortunately, using a continuous distribution presents several practical challenges. First and foremost, finite computers cannot exactly represent samples from continuous distributions, and previous work has demonstrated that seemingly innocuous numerical e…

    Submitted 18 January, 2021; v1 submitted 31 March, 2020; originally announced April 2020.

    Comments: Improved time analysis, and generalisation to the multivariate case

  9. arXiv:1905.11947

    cs.DS cs.CR cs.IT cs.LG stat.ML

    Private Identity Testing for High-Dimensional Distributions

    Authors: Clément L. Canonne, Gautam Kamath, Audra McMillan, Jonathan Ullman, Lydia Zakynthinou

    Abstract: In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in $\mathbb{R}^d$ with known covariance and product distributions over $\{\pm 1\}^{d}$. Our testers have improved sample complexity compared to those derived from previous techniques, and are the first testers whose sample c…

    Submitted 3 March, 2022; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: Discussing a mistake in the proof of one of the algorithms (Theorem 1.2, computationally inefficient tester), and pointing to follow-up work by Narayanan (2022) who improves upon our results and fixes this mistake

  10. arXiv:1811.11148

    cs.DS cs.CR cs.IT cs.LG stat.ML

    The Structure of Optimal Private Tests for Simple Hypotheses

    Authors: Clément L. Canonne, Gautam Kamath, Audra McMillan, Adam Smith, Jonathan Ullman

    Abstract: Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. This work answers a basic question about privately testing simple hypotheses: given two distributions $P$ and $Q$, and a privacy level $\varepsilon$, how many i.i.d. samples are needed to distinguish $P$ from $Q$ subject to $\varepsilon$-differential privacy, and wha…

    Submitted 2 April, 2019; v1 submitted 27 November, 2018; originally announced November 2018.

    Comments: To appear in STOC 2019