
Private PAC Learning May be Harder than Online Learning

Authors: Mark Bun, Aloni Cohen, Rathin Desai

Abstract: We continue the study of the computational complexity of differentially private PAC learning and how it is situated within the foundations of machine learning. A recent line of work uncovered a qualitative equivalence between the private PAC model and Littlestone's mistake-bounded model of online learning, in particular, showing that any concept class of Littlestone dimension $d$ can be privately PAC learned using $\mathrm{poly}(d)$ samples. This raises the natural question of whether there might be a generic conversion from online learners to private PAC learners that also preserves computational efficiency. We give a negative answer to this question under reasonable cryptographic assumptions (roughly, those from which it is possible to build indistinguishability obfuscation for all circuits). We exhibit a concept class that admits an online learner running in polynomial time with a polynomial mistake bound, but for which there is no computationally-efficient differentially private PAC learner. Our construction and analysis strengthens and generalizes that of Bun and Zhandry (TCC 2016-A), who established such a separation between private and non-private PAC learners.

Submitted 16 February, 2024; originally announced February 2024.

arXiv:2402.09483 [pdf, ps, other]

Oracle-Efficient Differentially Private Learning with Public Data

Authors: Adam Block, Mark Bun, Rathin Desai, Abhishek Shetty, Steven Wu

Abstract: Due to statistical lower bounds on the learnability of many function classes under privacy constraints, there has been recent interest in leveraging public data to improve the performance of private learning algorithms. In this model, algorithms must always guarantee differential privacy with respect to the private samples while also ensuring learning guarantees when the private data distribution is sufficiently close to that of the public data. Previous work has demonstrated that when sufficient public, unlabelled data is available, private learning can be made statistically tractable, but the resulting algorithms have all been computationally inefficient. In this work, we present the first computationally efficient algorithms to provably leverage public data to learn privately whenever a function class is learnable non-privately, where our notion of computational efficiency is with respect to the number of calls to an optimization oracle for the function class. In addition to this general result, we provide specialized algorithms with improved sample complexities in the special cases when the function class is convex or when the task is binary classification.

Submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.00267 [pdf, ps, other]

Not All Learnable Distribution Classes are Privately Learnable

Authors: Mark Bun, Gautam Kamath, Argyris Mouzakis, Vikrant Singhal

Abstract: We give an example of a class of distributions that is learnable in total variation distance with a finite number of samples, but not learnable under $(\varepsilon, δ)$-differential privacy. This refutes a conjecture of Ashtiani.

Submitted 5 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

Comments: To appear in ALT 2024. Added a minor clarification to the construction and an acknowledgement of the Fields Institute

arXiv:2309.04642 [pdf, ps, other]

doi 10.1109/CSF54842.2022.9919653

The Complexity of Verifying Boolean Programs as Differentially Private

Authors: Mark Bun, Marco Gaboardi, Ludmila Glinskih

Abstract: We study the complexity of the problem of verifying differential privacy for while-like programs working over boolean values and making probabilistic choices. Programs in this class can be interpreted into finite-state discrete-time Markov Chains (DTMC). We show that the problem of deciding whether a program is differentially private for specific values of the privacy parameters is PSPACE-complete. To show that this problem is in PSPACE, we adapt classical results about computing hitting probabilities for DTMC. To show PSPACE-hardness we use a reduction from the problem of checking whether a program almost surely terminates or not. We also show that the problem of approximating the privacy parameters that a program provides is PSPACE-hard. Moreover, we investigate the complexity of similar problems also for several relaxations of differential privacy: Rényi differential privacy, concentrated differential privacy, and truncated concentrated differential privacy. For these notions, we consider gap-versions of the problem of deciding whether a program is private or not and we show that all of them are PSPACE-complete.

Submitted 8 September, 2023; originally announced September 2023.

Comments: Appeared in CSF 2022

arXiv:2306.07884 [pdf, other]

doi 10.1145/3651595

Continual Release of Differentially Private Synthetic Data from Longitudinal Data Collections

Authors: Mark Bun, Marco Gaboardi, Marcel Neunhoeffer, Wanrong Zhang

Abstract: Motivated by privacy concerns in long-term longitudinal studies in medical and social science research, we study the problem of continually releasing differentially private synthetic data from longitudinal data collections. We introduce a model where, in every time step, each individual reports a new data element, and the goal of the synthesizer is to incrementally update a synthetic dataset in a consistent way to capture a rich class of statistical properties. We give continual synthetic data generation algorithms that preserve two basic types of queries: fixed time window queries and cumulative time queries. We show nearly tight upper bounds on the error rates of these algorithms and demonstrate their empirical performance on realistically sized datasets from the U.S. Census Bureau's Survey of Income and Program Participation.

Submitted 24 May, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

Journal ref: Proc. ACM Manag. Data 2, 2, Article 94 (May 2024), 26 pages (2024)

arXiv:2303.12921 [pdf, ps, other]

Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization

Authors: Mark Bun, Marco Gaboardi, Max Hopkins, Russell Impagliazzo, Rex Lei, Toniann Pitassi, Satchit Sivakumar, Jessica Sorrell

Abstract: The notion of replicable algorithms was introduced in Impagliazzo et al. [STOC '22] to describe randomized algorithms that are stable under the resampling of their inputs. More precisely, a replicable algorithm gives the same output with high probability when its randomness is fixed and it is run on a new i.i.d. sample drawn from the same distribution. Using replicable algorithms for data analysis can facilitate the verification of published results by ensuring that the results of an analysis will be the same with high probability, even when that analysis is performed on a new data set. In this work, we establish new connections and separations between replicability and standard notions of algorithmic stability. In particular, we give sample-efficient algorithmic reductions between perfect generalization, approximate differential privacy, and replicability for a broad class of statistical problems. Conversely, we show any such equivalence must break down computationally: there exist statistical problems that are easy under differential privacy, but that cannot be solved replicably without breaking public-key cryptography. Furthermore, these results are tight: our reductions are statistically optimal, and we show that any computational separation between DP and replicability must imply the existence of one-way functions. Our statistical reductions give a new algorithmic framework for translating between notions of stability, which we instantiate to answer several open questions in replicability and privacy. This includes giving sample-efficient replicable algorithms for various PAC learning, distribution estimation, and distribution testing problems, algorithmic amplification of $δ$ in approximate DP, conversions from item-level to user-level privacy, and the existence of private agnostic-to-realizable learning reductions under structured distributions.

Submitted 24 March, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: STOC 2023, minor typos fixed

arXiv:2303.03921 [pdf, ps, other]

Approximate degree lower bounds for oracle identification problems

Authors: Mark Bun, Nadezhda Voronova

Abstract: The approximate degree of a Boolean function is the minimum degree of a real polynomial that approximates it pointwise. For any Boolean function, its approximate degree serves as a lower bound on its quantum query complexity, and generically lifts to a quantum communication lower bound for a related function. We introduce a framework for proving approximate degree lower bounds for certain oracle identification problems, where the goal is to recover a hidden binary string $x \in \{0, 1\}^n$ given possibly non-standard oracle access to it. Our lower bounds apply to decision versions of these problems, where the goal is to compute the parity of $x$. We apply our framework to the ordered search and hidden string problems, proving nearly tight approximate degree lower bounds of $Ω(n/\log^2 n)$ for each. These lower bounds generalize to the weakly unbounded error setting, giving a new quantum query lower bound for the hidden string problem in this regime. Our lower bounds are driven by randomized communication upper bounds for the greater-than and equality functions.

Submitted 22 May, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

Comments: 39 Pages. To appear at TQC 2023. v2: This update adds the generalization of our results to the weakly unbounded error setting

arXiv:2301.08324 [pdf, other]

doi 10.1214/24-EJS2234

Differentially Private Confidence Intervals for Proportions under Stratified Random Sampling

Authors: Shurong Lin, Mark Bun, Marco Gaboardi, Eric D. Kolaczyk, Adam Smith

Abstract: Confidence intervals are a fundamental tool for quantifying the uncertainty of parameters of interest. With the increase of data privacy awareness, developing a private version of confidence intervals has gained growing attention from both statisticians and computer scientists. Differential privacy is a state-of-the-art framework for analyzing privacy loss when releasing statistics computed from sensitive data. Recent work has been done around differentially private confidence intervals, yet to the best of our knowledge, rigorous methodologies on differentially private confidence intervals in the context of survey sampling have not been studied. In this paper, we propose three differentially private algorithms for constructing confidence intervals for proportions under stratified random sampling. We articulate two variants of differential privacy that make sense for data from stratified sampling designs, analyzing each of our algorithms within one of these two variants. We establish analytical privacy guarantees and asymptotic properties of the estimators. In addition, we conduct simulation studies to evaluate the proposed private confidence intervals, and two applications to the 1940 Census data are provided.

Submitted 11 April, 2024; v1 submitted 19 January, 2023; originally announced January 2023.

Comments: 39 pages, 4 figures

MSC Class: 68P27; 62G15; 62Dxx

Journal ref: Electronic Journal of Statistics, Electron. J. Statist. 18(1), 1455-1494, (2024)

arXiv:2206.04743 [pdf, ps, other]

Strong Memory Lower Bounds for Learning Natural Models

Authors: Gavin Brown, Mark Bun, Adam Smith

Abstract: We give lower bounds on the amount of memory required by one-pass streaming algorithms for solving several natural learning problems. In a setting where examples lie in $\{0,1\}^d$ and the optimal classifier can be encoded using $κ$ bits, we show that algorithms which learn using a near-minimal number of examples, $\tilde O(κ)$, must use $\tilde Ω(dκ)$ bits of space. Our space bounds match the dimension of the ambient space of the problem's natural parametrization, even when it is quadratic in the size of examples and the final classifier. For instance, in the setting of $d$-sparse linear classifiers over degree-2 polynomial features, for which $κ=Θ(d\log d)$, our space lower bound is $\tilde Ω(d^2)$. Our bounds degrade gracefully with the stream length $N$, generally having the form $\tilde Ω\left(dκ\cdot \frac{κ}{N}\right)$. Bounds of the form $Ω(dκ)$ were known for learning parity and other problems defined over finite fields. Bounds that apply in a narrow range of sample sizes are also known for linear regression. Ours are the first such bounds for problems of the type commonly seen in recent learning applications that apply for a large range of input sizes.

Submitted 9 June, 2022; originally announced June 2022.

Comments: 39 Pages. To appear at COLT 2022

arXiv:2107.10870 [pdf, other]

Multiclass versus Binary Differentially Private PAC Learning

Authors: Mark Bun, Marco Gaboardi, Satchit Sivakumar

Abstract: We show a generic reduction from multiclass differentially private PAC learning to binary private PAC learning. We apply this transformation to a recently proposed binary private PAC learner to obtain a private multiclass learner with sample complexity that has a polynomial dependence on the multiclass Littlestone dimension and a poly-logarithmic dependence on the number of classes. This yields an exponential improvement in the dependence on both parameters over learners from previous work. Our proof extends the notion of $Ψ$-dimension defined in work of Ben-David et al. [JCSS '95] to the online setting and explores its general properties.

Submitted 22 July, 2021; originally announced July 2021.

arXiv:2105.08346 [pdf, ps, other]

Identification robust inference for moments based analysis of linear dynamic panel data models

Authors: Maurice J. G. Bun, Frank Kleibergen

Abstract: We use identification robust tests to show that difference, level and non-linear moment conditions, as proposed by Arellano and Bond (1991), Arellano and Bover (1995), Blundell and Bond (1998) and Ahn and Schmidt (1995) for the linear dynamic panel data model, do not separately identify the autoregressive parameter when its true value is close to one and the variance of the initial observations is large. We prove that combinations of these moment conditions, however, do so when there are more than three time series observations. This identification then solely results from a set of, so-called, robust moment conditions. These robust moments are spanned by the combined difference, level and non-linear moment conditions and only depend on differenced data. We show that, when only the robust moments contain identifying information on the autoregressive parameter, the discriminatory power of the Kleibergen (2005) LM test using the combined moments is identical to the largest rejection frequencies that can be obtained from solely using the robust moments. This shows that the KLM test implicitly uses the robust moments when only they contain information on the autoregressive parameter.

Submitted 18 May, 2021; originally announced May 2021.

Comments: 66 pages, 14 figures

arXiv:2102.08885 [pdf, other]

Differentially Private Correlation Clustering

Authors: Mark Bun, Marek Eliáš, Janardhan Kulkarni

Abstract: Correlation clustering is a widely used technique in unsupervised machine learning. Motivated by applications where individual privacy is a concern, we initiate the study of differentially private correlation clustering. We propose an algorithm that achieves subquadratic additive error compared to the optimal cost. In contrast, straightforward adaptations of existing non-private algorithms all lead to a trivial quadratic error. Finally, we give a lower bound showing that any pure differentially private algorithm for correlation clustering requires additive error of $Ω(n)$.

Submitted 17 February, 2021; originally announced February 2021.

arXiv:2012.06421 [pdf, other]

doi 10.1145/3406325.3451131

When is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?

Authors: Gavin Brown, Mark Bun, Vitaly Feldman, Adam Smith, Kunal Talwar

Abstract: Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning. Our problems are simple and fairly natural variants of the next-symbol prediction and the cluster labeling tasks. These tasks can be seen as abstractions of text- and image-related prediction problems. To establish our results, we reduce from a family of one-way communication problems for which we prove new information complexity lower bounds. Additionally, we present synthetic-data experiments demonstrating successful attacks on logistic regression and neural network classifiers.

Submitted 21 July, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

Journal ref: STOC 2021 Pages 123-132

arXiv:2007.12674 [pdf, other]

Controlling Privacy Loss in Sampling Schemes: an Analysis of Stratified and Cluster Sampling

Authors: Mark Bun, Jörg Drechsler, Marco Gaboardi, Audra McMillan, Jayshree Sarathy

Abstract: Sampling schemes are fundamental tools in statistics, survey design, and algorithm design. A fundamental result in differential privacy is that a differentially private mechanism run on a simple random sample of a population provides stronger privacy guarantees than the same algorithm run on the entire population. However, in practice, sampling designs are often more complex than the simple, data-independent sampling schemes that are addressed in prior work. In this work, we extend the study of privacy amplification results to more complex, data-dependent sampling schemes. We find that not only do these sampling schemes often fail to amplify privacy, they can actually result in privacy degradation. We analyze the privacy implications of the pervasive cluster sampling and stratified sampling paradigms, as well as provide some insight into the study of more general sampling designs.

Submitted 21 June, 2023; v1 submitted 24 July, 2020; originally announced July 2020.

Comments: Appeared at FORC 2022

arXiv:2007.05665 [pdf, ps, other]

A Computational Separation between Private Learning and Online Learning

Authors: Mark Bun

Abstract: A recent line of work has shown a qualitative equivalence between differentially private PAC learning and online learning: A concept class is privately learnable if and only if it is online learnable with a finite mistake bound. However, both directions of this equivalence incur significant losses in both sample and computational efficiency. Studying a special case of this connection, Gonen, Hazan, and Moran (NeurIPS 2019) showed that uniform or highly sample-efficient pure-private learners can be time-efficiently compiled into online learners. We show that, assuming the existence of one-way functions, such an efficient conversion is impossible even for general pure-private learners with polynomial sample complexity. This resolves a question of Neel, Roth, and Wu (FOCS 2019).

Submitted 10 July, 2020; originally announced July 2020.

Comments: 15 pages

arXiv:2007.05453 [pdf, other]

New Oracle-Efficient Algorithms for Private Synthetic Data Release

Authors: Giuseppe Vietri, Grace Tian, Mark Bun, Thomas Steinke, Zhiwei Steven Wu

Abstract: We present three new algorithms for constructing differentially private synthetic data, a sanitized version of a sensitive dataset that approximately preserves the answers to a large collection of statistical queries. All three algorithms are oracle-efficient in the sense that they are computationally efficient when given access to an optimization oracle. Such an oracle can be implemented using many existing (non-private) optimization tools such as sophisticated integer program solvers. While the accuracy of the synthetic data is contingent on the oracle's optimization performance, the algorithms satisfy differential privacy even in the worst case. For all three algorithms, we provide theoretical guarantees for both accuracy and privacy. Through empirical evaluation, we demonstrate that our methods scale well with both the dimensionality of the data and the number of queries. Compared to the state-of-the-art method High-Dimensional Matrix Mechanism \cite{McKennaMHM18}, our algorithms provide better accuracy in the large workload and high privacy regime (corresponding to low privacy loss $\varepsilon$).

Submitted 10 July, 2020; originally announced July 2020.

arXiv:2003.00563 [pdf, other]

An Equivalence Between Private Classification and Online Prediction

Authors: Mark Bun, Roi Livni, Shay Moran

Abstract: We prove that every concept class with finite Littlestone dimension can be learned by an (approximate) differentially-private algorithm. This answers an open question of Alon et al. (STOC 2019) who proved the converse statement (this question was also asked by Neel et al. (FOCS 2019)). Together these two results yield an equivalence between online learnability and private PAC learnability. We introduce a new notion of algorithmic stability called "global stability" which is essential to our proof and may be of independent interest. We also discuss an application of our results to boosting the privacy and accuracy parameters of differentially-private learners.

Submitted 22 June, 2021; v1 submitted 1 March, 2020; originally announced March 2020.

Comments: An earlier version of this manuscript claimed an upper bound on the sample complexity that is exponential in the Littlestone dimension. The argument was erroneous, and the current version contains a correction, which leads to a double-exponential dependence on the Littlestone dimension

arXiv:2002.01100 [pdf, other]

Efficient, Noise-Tolerant, and Private Learning via Boosting

Authors: Mark Bun, Marco Leandro Carmosino, Jessica Sorrell

Abstract: We introduce a simple framework for designing private boosting algorithms. We give natural conditions under which these algorithms are differentially private, efficient, and noise-tolerant PAC learners. To demonstrate our framework, we use it to construct noise-tolerant and private PAC learners for large-margin halfspaces whose sample complexity does not depend on the dimension. We give two sample complexity bounds for our large-margin halfspace learner. One bound is based only on differential privacy, and uses this guarantee as an asset for ensuring generalization. This first bound illustrates a general methodology for obtaining PAC learners from privacy, which may be of independent interest. The second bound uses standard techniques from the theory of large-margin classification (the fat-shattering dimension) to match the best known sample complexity for differentially private learning of large-margin halfspaces, while additionally tolerating random label noise.

Submitted 3 February, 2020; originally announced February 2020.

Comments: 33 pages

arXiv:1906.02830 [pdf, other]

Average-Case Averages: Private Algorithms for Smooth Sensitivity and Mean Estimation

Authors: Mark Bun, Thomas Steinke

Abstract: The simplest and most widely applied method for guaranteeing differential privacy is to add instance-independent noise to a statistic of interest that is scaled to its global sensitivity. However, global sensitivity is a worst-case notion that is often too conservative for realized dataset instances. We provide methods for scaling noise in an instance-dependent way and demonstrate that they provide greater accuracy under average-case distributional assumptions. Specifically, we consider the basic problem of privately estimating the mean of a real distribution from i.i.d. samples. The standard empirical mean estimator can have arbitrarily-high global sensitivity. We propose the trimmed mean estimator, which interpolates between the mean and the median, as a way of attaining much lower sensitivity on average while losing very little in terms of statistical accuracy. To privately estimate the trimmed mean, we revisit the smooth sensitivity framework of Nissim, Raskhodnikova, and Smith (STOC 2007), which provides a framework for using instance-dependent sensitivity. We propose three new additive noise distributions which provide concentrated differential privacy when scaled to smooth sensitivity. We provide theoretical and experimental evidence showing that our noise distributions compare favorably to others in the literature, in particular, when applied to the mean estimation problem.

Submitted 6 June, 2019; originally announced June 2019.
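
Illustration (not from the paper): a minimal sketch of a non-private trimmed mean, to make concrete the abstract's remark that it interpolates between the mean and the median. It assumes symmetric trimming of `m` points from each end; the function name and the parameter `m` are hypothetical, and no noise is added, so this sketch provides no privacy guarantee on its own.

```python
def trimmed_mean(samples, m):
    """Average the samples after discarding the m smallest and m largest.

    Assumption for illustration: symmetric trimming. With m = 0 this is
    the empirical mean; with m = (n - 1) // 2 and odd n it is the median.
    """
    n = len(samples)
    if m < 0 or 2 * m >= n:
        raise ValueError("need 0 <= m < n / 2")
    kept = sorted(samples)[m:n - m]  # drop m values from each end
    return sum(kept) / len(kept)


data = [1, 2, 3, 4, 100]
mean_value = trimmed_mean(data, 0)    # plain mean, pulled up by the outlier
median_value = trimmed_mean(data, 2)  # only the middle element remains
```

Here a single outlier moves the untrimmed mean substantially, while the trimmed variants ignore it, which is the intuition behind the lower sensitivity on typical inputs.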

arXiv:1905.13229 [pdf, ps, other]

Private Hypothesis Selection

Authors: Mark Bun, Gautam Kamath, Thomas Steinke, Zhiwei Steven Wu

Abstract: We provide a differentially private algorithm for hypothesis selection. Given samples from an unknown probability distribution $P$ and a set of $m$ probability distributions $\mathcal{H}$, the goal is to output, in a $\varepsilon$-differentially private manner, a distribution from $\mathcal{H}$ whose total variation distance to $P$ is comparable to that of the best such distribution (which we denote by $α$). The sample complexity of our basic algorithm is $O\left(\frac{\log m}{α^2} + \frac{\log m}{α\varepsilon}\right)$, representing a minimal cost for privacy when compared to the non-private algorithm. We also can handle infinite hypothesis classes $\mathcal{H}$ by relaxing to $(\varepsilon,δ)$-differential privacy. We apply our hypothesis selection algorithm to give learning algorithms for a number of natural distribution classes, including Gaussians, product distributions, sums of independent random variables, piecewise polynomials, and mixture classes. Our hypothesis selection procedure allows us to generically convert a cover for a class to a learning algorithm, complementing known learning lower bounds which are in terms of the size of the packing number of the class. As the covering and packing numbers are often closely related, for constant $α$, our algorithms achieve the optimal sample complexity for many classes of interest. Finally, we describe an application to private distribution-free PAC learning.

Submitted 4 January, 2021; v1 submitted 30 May, 2019; originally announced May 2019.

Comments: Appeared in NeurIPS 2019. Final version to appear in IEEE Transactions on Information Theory

arXiv:1903.00544 [pdf, ps, other]

Sign-Rank Can Increase Under Intersection

Authors: Mark Bun, Nikhil S. Mande, Justin Thaler

Abstract: The communication class $\mathbf{UPP}^{\text{cc}}$ is a communication analog of the Turing Machine complexity class $\mathbf{PP}$. It is characterized by a matrix-analytic complexity measure called sign-rank (also called dimension complexity), and is essentially the most powerful communication class against which we know how to prove lower bounds. For a communication problem $f$, let $f \wedge f$ denote the function that evaluates $f$ on two disjoint inputs and outputs the AND of the results. We exhibit a communication problem $f$ with $\mathbf{UPP}(f)= O(\log n)$, and $\mathbf{UPP}(f \wedge f) = Θ(\log^2 n)$. This is the first result showing that $\mathbf{UPP}$ communication complexity can increase by more than a constant factor under intersection. We view this as a first step toward showing that $\mathbf{UPP}^{\text{cc}}$, the class of problems with polylogarithmic-cost $\mathbf{UPP}$ communication protocols, is not closed under intersection. Our result shows that the function class consisting of intersections of two majorities on $n$ bits has dimension complexity $n^{Ω(\log n)}$. This matches an upper bound of (Klivans, O'Donnell, and Servedio, FOCS 2002), who used it to give a quasipolynomial time algorithm for PAC learning intersections of polylogarithmically many majorities. Hence, fundamentally new techniques will be needed to learn this class of functions in polynomial time.

Submitted 1 March, 2019; originally announced March 2019.

Comments: 18 pages

MSC Class: 68Q17

arXiv:1811.03763 [pdf, ps, other]

Towards Instance-Optimal Private Query Release

Authors: Jaroslaw Blasiok, Mark Bun, Aleksandar Nikolov, Thomas Steinke

Abstract: We study efficient mechanisms for the query release problem in differential privacy: given a workload of $m$ statistical queries, output approximate answers to the queries while satisfying the constraints of differential privacy. In particular, we are interested in mechanisms that optimally adapt to the given workload. Building on the projection mechanism of Nikolov, Talwar, and Zhang, and using the ideas behind Dudley's chaining inequality, we propose new efficient algorithms for the query release problem, and prove that they achieve optimal sample complexity for the given workload (up to constant factors, in certain parameter regimes) with respect to the class of mechanisms that satisfy concentrated differential privacy. We also give variants of our algorithms that satisfy local differential privacy, and prove that they also achieve optimal sample complexity among all local sequentially interactive private mechanisms.

Submitted 8 November, 2018; originally announced November 2018.

Comments: To appear in SODA 2019

arXiv:1809.02254 [pdf, other]

doi 10.22331/q-2021-09-16-543

Quantum algorithms and approximating polynomials for composed functions with shared inputs

Authors: Mark Bun, Robin Kothari, Justin Thaler

Abstract: We give new quantum algorithms for evaluating composed functions whose inputs may be shared between bottom-level gates. Let $f$ be an $m$-bit Boolean function and consider an $n$-bit function $F$ obtained by applying $f$ to conjunctions of possibly overlapping subsets of $n$ variables. If $f$ has quantum query complexity $Q(f)$, we give an algorithm for evaluating $F$ using $\tilde{O}(\sqrt{Q(f) \cdot n})$ quantum queries. This improves on the bound of $O(Q(f) \cdot \sqrt{n})$ that follows by treating each conjunction independently, and our bound is tight for worst-case choices of $f$. Using completely different techniques, we prove a similar tight composition theorem for the approximate degree of $f$. By recursively applying our composition theorems, we obtain a nearly optimal $\tilde{O}(n^{1-2^{-d}})$ upper bound on the quantum query complexity and approximate degree of linear-size depth-$d$ AC$^0$ circuits. As a consequence, such circuits can be PAC learned in subexponential time, even in the challenging agnostic setting. Prior to our work, a subexponential-time algorithm was not known even for linear-size depth-3 AC$^0$ circuits. As an additional consequence, we show that AC$^0 \circ \oplus$ circuits of depth $d+1$ require size $\tildeΩ(n^{1/(1- 2^{-d})}) \geq ω(n^{1+ 2^{-d}} )$ to compute the Inner Product function even on average. The previous best size lower bound was $Ω(n^{1+4^{-(d+1)}})$ and only held in the worst case (Cheraghchi et al., JCSS 2018).

Submitted 10 September, 2021; v1 submitted 6 September, 2018; originally announced September 2018.

Comments: v2: 31 pages; 1 figure. This update includes an additional result on lower bounds for AC$^0 \circ \oplus$ computing the Inner Product function on average. v3: Minor changes. Accepted to Quantum

Journal ref: Quantum 5, 543 (2021)

arXiv:1711.04740 [pdf, other]

Heavy Hitters and the Structure of Local Privacy

Authors: Mark Bun, Jelani Nelson, Uri Stemmer

Abstract: We present a new locally differentially private algorithm for the heavy hitters problem which achieves optimal worst-case error as a function of all standardly considered parameters. Prior work obtained error rates which depend optimally on the number of users, the size of the domain, and the privacy parameter, but depend sub-optimally on the failure probability. We strengthen existing lower bounds on the error to incorporate the failure probability, and show that our new upper bound is tight with respect to this parameter as well. Our lower bound is based on a new understanding of the structure of locally private protocols. We further develop these ideas to obtain the following general results beyond heavy hitters. $\bullet$ Advanced Grouposition: In the local model, group privacy for $k$ users degrades proportionally to $\approx \sqrt{k}$, instead of linearly in $k$ as in the central model. Stronger group privacy yields improved max-information guarantees, as well as stronger lower bounds (via "packing arguments"), over the central model. $\bullet$ Building on a transformation of Bassily and Smith (STOC 2015), we give a generic transformation from any non-interactive approximate-private local protocol into a pure-private local protocol. Again in contrast with the central model, this shows that we cannot obtain more accurate algorithms by moving from pure to approximate local privacy.

Submitted 13 November, 2017; originally announced November 2017.

arXiv:1710.09079 [pdf, other]

doi 10.1145/3188745.3188784

The Polynomial Method Strikes Back: Tight Quantum Query Bounds via Dual Polynomials

Authors: Mark Bun, Robin Kothari, Justin Thaler

Abstract: The approximate degree of a Boolean function f is the least degree of a real polynomial that approximates f pointwise to error at most 1/3. Approximate degree is known to be a lower bound on quantum query complexity. We resolve or nearly resolve the approximate degree and quantum query complexities of the following basic functions: $\bullet$ $k$-distinctness: For any constant $k$, the approximate degree and quantum query complexity of $k$-distinctness is $Ω(n^{3/4-1/(2k)})$. This is nearly tight for large $k$ (Belovs, FOCS 2012). $\bullet$ Image size testing: The approximate degree and quantum query complexity of testing the size of the image of a function $[n] \to [n]$ is $\tildeΩ(n^{1/2})$. This proves a conjecture of Ambainis et al. (SODA 2016), and it implies the following lower bounds: $-$ $k$-junta testing: A tight $\tildeΩ(k^{1/2})$ lower bound, answering the main open question of Ambainis et al. (SODA 2016). $-$ Statistical Distance from Uniform: A tight $\tildeΩ(n^{1/2})$ lower bound, answering the main question left open by Bravyi et al. (STACS 2010 and IEEE Trans. Inf. Theory 2011). $-$ Shannon entropy: A tight $\tildeΩ(n^{1/2})$ lower bound, answering a question of Li and Wu (2017). $\bullet$ Surjectivity: The approximate degree of the Surjectivity function is $\tildeΩ(n^{3/4})$. The best prior lower bound was $Ω(n^{2/3})$. Our result matches an upper bound of $\tilde{O}(n^{3/4})$ due to Sherstov, which we reprove using different techniques. The quantum query complexity of this function is known to be $Θ(n)$ (Beame and Machmouchi, QIC 2012 and Sherstov, FOCS 2015). Our upper bound for Surjectivity introduces new techniques for approximating Boolean functions by low-degree polynomials. Our lower bounds are proved by significantly refining techniques recently introduced by Bun and Thaler (FOCS 2017).

Submitted 16 August, 2019; v1 submitted 25 October, 2017; originally announced October 2017.

Comments: v1: 67 pages, 2 figures. Abstract shortened to fit within the arXiv limit; v2: Minor changes; v3: Minor fixes incorporating suggestions from referees

Journal ref: Proceedings of the 50th ACM Symposium on Theory of Computing (STOC 2018), pp. 297-310 (2018)

arXiv:1703.05784 [pdf, other]

A Nearly Optimal Lower Bound on the Approximate Degree of AC$^0$

Authors: Mark Bun, Justin Thaler

Abstract: The approximate degree of a Boolean function $f \colon \{-1, 1\}^n \rightarrow \{-1, 1\}$ is the least degree of a real polynomial that approximates $f$ pointwise to error at most $1/3$. We introduce a generic method for increasing the approximate degree of a given function, while preserving its computability by constant-depth circuits. Specifically, we show how to transform any Boolean function $f$ with approximate degree $d$ into a function $F$ on $O(n \cdot \operatorname{polylog}(n))$ variables with approximate degree at least $D = Ω(n^{1/3} \cdot d^{2/3})$. In particular, if $d= n^{1-Ω(1)}$, then $D$ is polynomially larger than $d$. Moreover, if $f$ is computed by a polynomial-size Boolean circuit of constant depth, then so is $F$. By recursively applying our transformation, for any constant $δ> 0$ we exhibit an AC$^0$ function of approximate degree $Ω(n^{1-δ})$. This improves over the best previous lower bound of $Ω(n^{2/3})$ due to Aaronson and Shi (J. ACM 2004), and nearly matches the trivial upper bound of $n$ that holds for any function. Our lower bounds also apply to (quasipolynomial-size) DNFs of polylogarithmic width. We describe several applications of these results. We give: * For any constant $δ> 0$, an $Ω(n^{1-δ})$ lower bound on the quantum communication complexity of a function in AC$^0$. * A Boolean function $f$ with approximate degree at least $C(f)^{2-o(1)}$, where $C(f)$ is the certificate complexity of $f$. This separation is optimal up to the $o(1)$ term in the exponent. * Improved secret sharing schemes with reconstruction procedures in AC$^0$.

Submitted 16 March, 2017; originally announced March 2017.

Comments: 40 pages, 1 figure

arXiv:1605.02065 [pdf, other]

Concentrated Differential Privacy: Simplifications, Extensions, and Lower Bounds

Authors: Mark Bun, Thomas Steinke

Abstract: "Concentrated differential privacy" was recently introduced by Dwork and Rothblum as a relaxation of differential privacy, which permits sharper analyses of many privacy-preserving computations. We present an alternative formulation of the concept of concentrated differential privacy in terms of the Renyi divergence between the distributions obtained by running an algorithm on neighboring inputs. With this reformulation in hand, we prove sharper quantitative results, establish lower bounds, and raise a few new questions. We also unify this approach with approximate differential privacy by giving an appropriate definition of "approximate concentrated differential privacy."

Submitted 6 May, 2016; originally announced May 2016.

arXiv:1604.04618 [pdf, other]

Make Up Your Mind: The Price of Online Queries in Differential Privacy

Authors: Mark Bun, Thomas Steinke, Jonathan Ullman

Abstract: We consider the problem of answering queries about a sensitive dataset subject to differential privacy. The queries may be chosen adversarially from a larger set Q of allowable queries in one of three ways, which we list in order from easiest to hardest to answer: Offline: The queries are chosen all at once and the differentially private mechanism answers the queries in a single batch. Online: The queries are chosen all at once, but the mechanism only receives the queries in a streaming fashion and must answer each query before seeing the next query. Adaptive: The queries are chosen one at a time and the mechanism must answer each query before the next query is chosen. In particular, each query may depend on the answers given to previous queries. Many differentially private mechanisms are just as efficient in the adaptive model as they are in the offline model. Meanwhile, most lower bounds for differential privacy hold in the offline setting. This suggests that the three models may be equivalent. We prove that these models are all, in fact, distinct. Specifically, we show that there is a family of statistical queries such that exponentially more queries from this family can be answered in the offline model than in the online model. We also exhibit a family of search queries such that exponentially more queries from this family can be answered in the online model than in the adaptive model. We also investigate whether such separations might hold for simple queries like threshold queries over the real line.

Submitted 15 April, 2016; originally announced April 2016.

arXiv:1511.08552 [pdf, other]

doi 10.1145/2840728.2840747

Simultaneous Private Learning of Multiple Concepts

Authors: Mark Bun, Kobbi Nissim, Uri Stemmer

Abstract: We investigate the direct-sum problem in the context of differentially private PAC learning: What is the sample complexity of solving $k$ learning tasks simultaneously under differential privacy, and how does this cost compare to that of solving $k$ learning tasks without privacy? In our setting, an individual example consists of a domain element $x$ labeled by $k$ unknown concepts $(c_1,\ldots,c_k)$. The goal of a multi-learner is to output $k$ hypotheses $(h_1,\ldots,h_k)$ that generalize the input examples. Without concern for privacy, the sample complexity needed to simultaneously learn $k$ concepts is essentially the same as needed for learning a single concept. Under differential privacy, the basic strategy of learning each hypothesis independently yields sample complexity that grows polynomially with $k$. For some concept classes, we give multi-learners that require fewer samples than the basic strategy. Unfortunately, however, we also give lower bounds showing that even for very simple concept classes, the sample cost of private multi-learning must grow polynomially in $k$.

Submitted 26 November, 2015; originally announced November 2015.

Comments: 29 pages. To appear in ITCS '16

arXiv:1505.00388 [pdf, other]

Order-Revealing Encryption and the Hardness of Private Learning

Authors: Mark Bun, Mark Zhandry

Abstract: An order-revealing encryption scheme gives a public procedure by which two ciphertexts can be compared to reveal the ordering of their underlying plaintexts. We show how to use order-revealing encryption to separate computationally efficient PAC learning from efficient $(ε, δ)$-differentially private PAC learning. That is, we construct a concept class that is efficiently PAC learnable, but for which every efficient learner fails to be differentially private. This answers a question of Kasiviswanathan et al. (FOCS '08, SIAM J. Comput. '11). To prove our result, we give a generic transformation from an order-revealing encryption scheme into one with strongly correct comparison, which enables the consistent comparison of ciphertexts that are not obtained as the valid encryption of any message. We believe this construction may be of independent interest.

Submitted 2 May, 2015; originally announced May 2015.

Comments: 28 pages

arXiv:1504.07553 [pdf, other]

Differentially Private Release and Learning of Threshold Functions

Authors: Mark Bun, Kobbi Nissim, Uri Stemmer, Salil Vadhan

Abstract: We prove new upper and lower bounds on the sample complexity of $(ε, δ)$ differentially private algorithms for releasing approximate answers to threshold functions. A threshold function $c_x$ over a totally ordered domain $X$ evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise. We give the first nontrivial lower bound for releasing thresholds with $(ε,δ)$ differential privacy, showing that the task is impossible over an infinite domain $X$, and moreover requires sample complexity $n \ge Ω(\log^*|X|)$, which grows with the size of the domain. Inspired by the techniques used to prove this lower bound, we give an algorithm for releasing thresholds with $n \le 2^{(1+ o(1))\log^*|X|}$ samples. This improves the previous best upper bound of $8^{(1 + o(1))\log^*|X|}$ (Beimel et al., RANDOM '13). Our sample complexity upper and lower bounds also apply to the tasks of learning distributions with respect to Kolmogorov distance and of properly PAC learning thresholds with differential privacy. The lower bound gives the first separation between the sample complexity of properly learning a concept class with $(ε,δ)$ differential privacy and learning without privacy. For properly learning thresholds in $\ell$ dimensions, this lower bound extends to $n \ge Ω(\ell \cdot \log^*|X|)$. To obtain our results, we give reductions in both directions from releasing and properly learning thresholds and the simpler interior point problem. Given a database $D$ of elements from $X$, the interior point problem asks for an element between the smallest and largest elements in $D$. We introduce new recursive constructions for bounding the sample complexity of the interior point problem, as well as further reductions and techniques for proving impossibility results for other basic problems in differential privacy.

Submitted 28 April, 2015; originally announced April 2015.

Comments: 43 pages
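
Illustration (not from the paper): a minimal sketch of the two definitions stated in the abstract above, namely the threshold function $c_x$ and the success condition for the interior point problem. The function names are hypothetical, the domain is assumed to be integers, and "between" is assumed to be inclusive. The sketch only checks whether a candidate is a valid answer; it does not produce one privately.

```python
def threshold(x):
    """Return c_x, where c_x(y) = 1 if y <= x and 0 otherwise."""
    return lambda y: 1 if y <= x else 0


def is_interior_point(candidate, database):
    """Check that candidate lies between the smallest and largest
    elements of the database (inclusive, an assumption made here)."""
    return min(database) <= candidate <= max(database)


c_5 = threshold(5)
labels = [c_5(y) for y in (3, 5, 8)]  # 3 <= 5, 5 <= 5, but 8 > 5

D = [4, 9, 7]
inside = is_interior_point(6, D)    # 4 <= 6 <= 9
outside = is_interior_point(10, D)  # 10 > 9
```

Note that the candidate need not be an element of the database itself, which is what makes the interior point problem weaker than, say, releasing a median.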

arXiv:1503.07261 [pdf, ps, other]

Dual Polynomials for Collision and Element Distinctness

Authors: Mark Bun, Justin Thaler

Abstract: The approximate degree of a Boolean function $f: \{-1, 1\}^n \to \{-1, 1\}$ is the minimum degree of a real polynomial that approximates $f$ to within error $1/3$ in the $\ell_\infty$ norm. In an influential result, Aaronson and Shi (J. ACM 2004) proved tight $\tildeΩ(n^{1/3})$ and $\tildeΩ(n^{2/3})$ lower bounds on the approximate degree of the Collision and Element Distinctness functions, respectively. Their proof was non-constructive, using a sophisticated symmetrization argument and tools from approximation theory. More recently, several open problems in the study of approximate degree have been resolved via the construction of dual polynomials. These are explicit dual solutions to an appropriate linear program that captures the approximate degree of any function. We reprove Aaronson and Shi's results by constructing explicit dual polynomials for the Collision and Element Distinctness functions.

Submitted 24 March, 2015; originally announced March 2015.

Comments: 25 pages

arXiv:1412.2457 [pdf, ps, other]

Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness

Authors: Mark Bun, Thomas Steinke

Abstract: Polynomial approximations to boolean functions have led to many positive results in computer science. In particular, polynomial approximations to the sign function underlie algorithms for agnostically learning halfspaces, as well as pseudorandom generators for halfspaces. In this work, we investigate the limits of these techniques by proving inapproximability results for the sign function. Firstly, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput. 2008) shows that halfspaces can be learned with respect to log-concave distributions on $\mathbb{R}^n$ in the challenging agnostic learning model. The power of this algorithm relies on the fact that under log-concave distributions, halfspaces can be approximated arbitrarily well by low-degree polynomials. We ask whether this technique can be extended beyond log-concave distributions, and establish a negative result. We show that polynomials of any degree cannot approximate the sign function to within arbitrarily low error for a large class of non-log-concave distributions on the real line, including those with densities proportional to $\exp(-|x|^{0.99})$. Secondly, we investigate the derandomization of Chernoff-type concentration inequalities. Chernoff-type tail bounds on sums of independent random variables have pervasive applications in theoretical computer science. Schmidt et al. (SIAM J. Discrete Math. 1995) showed that these inequalities can be established for sums of random variables with only $O(\log(1/δ))$-wise independence, for a tail probability of $δ$. We show that their results are tight up to constant factors. These results rely on techniques from weighted approximation theory, which studies how well functions on the real line can be approximated by polynomials under various distributions. We believe that these techniques will have further applications in other areas of computer science.

Submitted 8 December, 2014; originally announced December 2014.

Comments: 22 pages

arXiv:1311.3158 [pdf, other]

Fingerprinting Codes and the Price of Approximate Differential Privacy

Authors: Mark Bun, Jonathan Ullman, Salil Vadhan

Abstract: We show new lower bounds on the sample complexity of $(\varepsilon, δ)$-differentially private algorithms that accurately answer large sets of counting queries. A counting query on a database $D \in (\{0,1\}^d)^n$ has the form "What fraction of the individual records in the database satisfy the property $q$?" We show that in order to answer an arbitrary set $\mathcal{Q}$ of $\gg nd$ counting queries on $D$ to within error $\pm α$ it is necessary that $$ n \geq \tildeΩ\Bigg(\frac{\sqrt{d} \log |\mathcal{Q}|}{α^2 \varepsilon} \Bigg). $$ This bound is optimal up to poly-logarithmic factors, as demonstrated by the Private Multiplicative Weights algorithm (Hardt and Rothblum, FOCS'10). In particular, our lower bound is the first to show that the sample complexity required for accuracy and $(\varepsilon, δ)$-differential privacy is asymptotically larger than what is required merely for accuracy, which is $O(\log |\mathcal{Q}| / α^2)$. In addition, we show that our lower bound holds for the specific case of $k$-way marginal queries (where $|\mathcal{Q}| = 2^k \binom{d}{k}$) when $α$ is not too small compared to $d$ (e.g. when $α$ is any fixed constant). Our results rely on the existence of short \emph{fingerprinting codes} (Boneh and Shaw, CRYPTO'95, Tardos, STOC'03), which we show are closely connected to the sample complexity of differentially private data release. We also give a new method for combining certain types of sample complexity lower bounds into stronger lower bounds.

Submitted 23 October, 2018; v1 submitted 13 November, 2013; originally announced November 2013.

Comments: Full version of our STOC 2014 paper. To appear in SIAM Journal on Computing
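
Illustration (not from the paper): a minimal sketch of a counting query as defined in the abstract above (the fraction of records satisfying a property $q$), together with the count $2^k \binom{d}{k}$ of $k$-way marginal queries. The function names and the example predicate are hypothetical, and the query is evaluated exactly, with no privacy mechanism applied.

```python
from math import comb


def counting_query(database, q):
    """Fraction of records in the database that satisfy predicate q."""
    return sum(1 for record in database if q(record)) / len(database)


def num_k_way_marginals(d, k):
    """Number of k-way marginal queries on d binary attributes:
    choose k of the d attributes, then one of 2^k value patterns."""
    return 2 ** k * comb(d, k)


# A toy database with n = 4 records over d = 2 binary attributes.
D = [(0, 1), (1, 1), (1, 0), (1, 1)]

# Example 2-way marginal: both attributes equal to 1.
answer = counting_query(D, lambda r: r[0] == 1 and r[1] == 1)
```

For this toy database two of the four records match, so `answer` is 0.5; a private mechanism would release a noisy version of such values for every query in the set.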

arXiv:1311.1616 [pdf, ps, other]

Hardness Amplification and the Approximate Degree of Constant-Depth Circuits

Authors: Mark Bun, Justin Thaler

Abstract: We establish a generic form of hardness amplification for the approximability of constant-depth Boolean circuits by polynomials. Specifically, we show that if a Boolean circuit cannot be pointwise approximated by low-degree polynomials to within constant error in a certain one-sided sense, then an OR of disjoint copies of that circuit cannot be pointwise approximated even with very high error. As our main application, we show that for every sequence of degrees $d(n)$, there is an explicit depth-three circuit $F: \{-1,1\}^n \to \{-1,1\}$ of polynomial-size such that any degree-$d$ polynomial cannot pointwise approximate $F$ to error better than $1-\exp\left(-\tildeΩ(nd^{-3/2})\right)$. As a consequence of our main result, we obtain an $\exp\left(-\tildeΩ(n^{2/5})\right)$ upper bound on the discrepancy of a function in AC$^0$, and an $\exp\left(\tildeΩ(n^{2/5})\right)$ lower bound on the threshold weight of AC$^0$, improving over the previous best results of $\exp\left(-Ω(n^{1/3})\right)$ and $\exp\left(Ω(n^{1/3})\right)$ respectively. Our techniques also yield a new lower bound of $Ω\left(n^{1/2}/\log^{(d-2)/2}(n)\right)$ on the approximate degree of the AND-OR tree of depth $d$, which is tight up to polylogarithmic factors for any constant $d$, as well as new bounds for read-once DNF formulas. In turn, these results imply new lower bounds on the communication and circuit complexity of these classes, and demonstrate strong limitations on existing PAC learning algorithms.

Submitted 28 April, 2014; v1 submitted 7 November, 2013; originally announced November 2013.

Comments: 47 pages

arXiv:1302.6191 [pdf, ps, other]

Dual Lower Bounds for Approximate Degree and Markov-Bernstein Inequalities

Authors: Mark Bun, Justin Thaler

Abstract: The $ε$-approximate degree of a Boolean function $f: \{-1, 1\}^n \to \{-1, 1\}$ is the minimum degree of a real polynomial that approximates $f$ to within $ε$ in the $\ell_\infty$ norm. We prove several lower bounds on this important complexity measure by explicitly constructing solutions to the dual of an appropriate linear program. Our first result resolves the $ε$-approximate degree of the two-level AND-OR tree for any constant $ε > 0$. We show that this quantity is $Θ(\sqrt{n})$, closing a line of incrementally larger lower bounds. The same lower bound was recently obtained independently by Sherstov using related techniques. Our second result gives an explicit dual polynomial that witnesses a tight lower bound for the approximate degree of any symmetric Boolean function, addressing a question of Špalek. Our final contribution is to reprove several Markov-type inequalities from approximation theory by constructing explicit dual solutions to natural linear programs. These inequalities underlie the proofs of many of the best-known approximate degree lower bounds, and have important uses throughout theoretical computer science.

Submitted 21 March, 2014; v1 submitted 25 February, 2013; originally announced February 2013.

Comments: 34 pages. Appears in ICALP 2013. To appear in the special issue of Information and Computation for ICALP 2013
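
Note: The linear program mentioned in the abstract above is commonly written as follows. This is a sketch of the standard formulation, not text from the listing; the symbols $c_S$, $χ_S$, and $ψ$ are notation assumed here, with $χ_S(x) = \prod_{i \in S} x_i$ for $S \subseteq \{1, \dots, n\}$. For a fixed degree bound $d$, the smallest pointwise error achievable by polynomials of degree at most $d$ is the value of

$$\min_{ε,\,(c_S)} \; ε \quad \text{subject to} \quad \Big| f(x) - \sum_{|S| \le d} c_S \, χ_S(x) \Big| \le ε \quad \text{for all } x \in \{-1,1\}^n,$$

and by linear programming duality this equals the value of

$$\max_{ψ} \; \sum_{x \in \{-1,1\}^n} ψ(x) f(x) \quad \text{subject to} \quad \sum_{x} |ψ(x)| \le 1, \qquad \sum_{x} ψ(x) \, χ_S(x) = 0 \ \text{ for all } |S| \le d.$$

Under this formulation, exhibiting a feasible $ψ$ (a "dual polynomial") whose objective value exceeds $ε$ certifies that the $ε$-approximate degree of $f$ is greater than $d$.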

Showing 1–36 of 36 results for author: Bun, M