Search | arXiv e-print repository

arXiv:2208.04245 [pdf, other]

Differentially Private Fréchet Mean on the Manifold of Symmetric Positive Definite (SPD) Matrices with log-Euclidean Metric

Authors: Saiteja Utpala, Praneeth Vepakomma, Nina Miolane

Abstract: Differential privacy has become crucial in the real-world deployment of statistical and machine learning algorithms with rigorous privacy guarantees. The earliest statistical queries, for which differential privacy mechanisms have been developed, were for the release of the sample mean. In Geometric Statistics, the sample Fréchet mean represents one of the most fundamental statistical summaries, a… ▽ More Differential privacy has become crucial in the real-world deployment of statistical and machine learning algorithms with rigorous privacy guarantees. The earliest statistical queries, for which differential privacy mechanisms have been developed, were for the release of the sample mean. In Geometric Statistics, the sample Fréchet mean represents one of the most fundamental statistical summaries, as it generalizes the sample mean for data belonging to nonlinear manifolds. In that spirit, the only geometric statistical query for which a differential privacy mechanism has been developed, so far, is for the release of the sample Fréchet mean: the \emph{Riemannian Laplace mechanism} was recently proposed to privatize the Fréchet mean on complete Riemannian manifolds. In many fields, the manifold of Symmetric Positive Definite (SPD) matrices is used to model data spaces, including in medical imaging where privacy requirements are key. We propose a novel, simple and fast mechanism - the \emph{tangent Gaussian mechanism} - to compute a differentially private Fréchet mean on the SPD manifold endowed with the log-Euclidean Riemannian metric. We show that our new mechanism has significantly better utility and is computationally efficient -- as confirmed by extensive experiments. △ Less

Submitted 10 February, 2023; v1 submitted 8 August, 2022; originally announced August 2022.

Comments: Accepted at TMLR

Journal ref: Transactions on Machine Learning Research (TMLR), 2022

arXiv:2207.03652 [pdf, other]

Private independence testing across two parties

Authors: Praneeth Vepakomma, Mohammad Mohammadi Amiri, Clément L. Canonne, Ramesh Raskar, Alex Pentland

Abstract: We introduce $π$-test, a privacy-preserving algorithm for testing statistical independence between data distributed across multiple parties. Our algorithm relies on privately estimating the distance correlation between datasets, a quantitative measure of independence introduced in Székely et al. [2007]. We establish both additive and multiplicative error bounds on the utility of our differentially… ▽ More We introduce $π$-test, a privacy-preserving algorithm for testing statistical independence between data distributed across multiple parties. Our algorithm relies on privately estimating the distance correlation between datasets, a quantitative measure of independence introduced in Székely et al. [2007]. We establish both additive and multiplicative error bounds on the utility of our differentially private test, which we believe will find applications in a variety of distributed hypothesis testing settings involving sensitive data. △ Less

Submitted 26 September, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

arXiv:2201.10115 [pdf, other]

Effects of Privacy-Inducing Noise on Welfare and Influence of Referendum Systems

Authors: Suat Evren, Praneeth Vepakomma

Abstract: Social choice functions help aggregate individual preferences while differentially private mechanisms provide formal privacy guarantees to release answers of queries operating on sensitive data. However, preserving differential privacy requires introducing noise to the system, and therefore may lead to undesired byproducts. Does an increase in the level of differential privacy for releasing the ou… ▽ More Social choice functions help aggregate individual preferences while differentially private mechanisms provide formal privacy guarantees to release answers of queries operating on sensitive data. However, preserving differential privacy requires introducing noise to the system, and therefore may lead to undesired byproducts. Does an increase in the level of differential privacy for releasing the outputs of social choice functions increase or decrease the level of influence and welfare, and at what rate? In this paper, we mainly address this question in more precise terms in a referendum setting with two candidates when the celebrated randomized response mechanism is used. We show that there is an inversely-proportional relation between welfare and privacy, and also influence and privacy. △ Less

Submitted 21 September, 2023; v1 submitted 25 January, 2022; originally announced January 2022.

Comments: 22 pages, 2 figures

arXiv:2108.08758 [pdf, other]

Parallel Quasi-concave set optimization: A new frontier that scales without needing submodularity

Authors: Praneeth Vepakomma, Yulia Kempner, Ramesh Raskar

Abstract: Classes of set functions along with a choice of ground set are a bedrock to determine and develop corresponding variants of greedy algorithms to obtain efficient solutions for combinatorial optimization problems. The class of approximate constrained submodular optimization has seen huge advances at the intersection of good computational efficiency, versatility and approximation guarantees while ex… ▽ More Classes of set functions along with a choice of ground set are a bedrock to determine and develop corresponding variants of greedy algorithms to obtain efficient solutions for combinatorial optimization problems. The class of approximate constrained submodular optimization has seen huge advances at the intersection of good computational efficiency, versatility and approximation guarantees while exact solutions for unconstrained submodular optimization are NP-hard. What is an alternative to situations when submodularity does not hold? Can efficient and globally exact solutions be obtained? We introduce one such new frontier: The class of quasi-concave set functions induced as a dual class to monotone linkage functions. We provide a parallel algorithm with a time complexity over $n$ processors of $\mathcal{O}(n^2g) +\mathcal{O}(\log{\log{n}})$ where $n$ is the cardinality of the ground set and $g$ is the complexity to compute the monotone linkage function that induces a corresponding quasi-concave set function via a duality. The complexity reduces to $\mathcal{O}(gn\log(n))$ on $n^2$ processors and to $\mathcal{O}(gn)$ on $n^3$ processors. Our algorithm provides a globally optimal solution to a maxi-min problem as opposed to submodular optimization which is approximate. We show a potential for widespread applications via an example of diverse feature subset selection with exact global maxi-min guarantees upon showing that a statistical dependency measure called distance correlation can be used to induce a quasi-concave set function. △ Less

Submitted 19 August, 2021; originally announced August 2021.

Comments: SubSetML: Subset Selection in Machine Learning: From Theory to Practice

Showing 1–4 of 4 results for author: Vepakomma, P