Differentially Private Gradient Flow based on the Sliced Wasserstein Distance for Non-Parametric Generative Modeling

Authors: Ilana Sebag, Muni Sreenivas PYDI, Jean-Yves Franceschi, Alain Rakotomamonjy, Mike Gartrell, Jamal Atif, Alexandre Allauzen

Abstract: Safeguarding privacy in sensitive training data is paramount, particularly in the context of generative modeling. This is done through either differentially private stochastic gradient descent, or with a differentially private metric for training models or generators. In this paper, we introduce a novel differentially private generative modeling approach based on parameter-free gradient flows in the space of probability measures. The proposed algorithm is a new discretized flow which operates through a particle scheme, utilizing drift derived from the sliced Wasserstein distance and computed in a private manner. Our experiments show that compared to a generator-based model, our proposed model can generate higher-fidelity data at a low privacy budget, offering a viable alternative to generator-based approaches.

Submitted 13 December, 2023; originally announced December 2023.
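
The sliced Wasserstein distance named in the abstract above can be illustrated with a minimal, non-private sketch. It assumes two equal-sized 2-D point sets and uniformly random projection directions; the function name `sliced_wasserstein_sq` and its parameters are hypothetical, and the privacy mechanism and the particle flow from the paper are not reproduced here.

```python
import math
import random


def sliced_wasserstein_sq(xs, ys, n_projections=200, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance
    between two equal-sized 2-D point sets (lists of (x, y) tuples)."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-sized point sets")
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_projections):
        # Draw a random unit direction in the plane.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        ux, uy = math.cos(theta), math.sin(theta)
        # Project both sets onto the direction; in 1-D, optimal transport
        # between equal-sized samples matches sorted values in order.
        px = sorted(a * ux + b * uy for a, b in xs)
        py = sorted(a * ux + b * uy for a, b in ys)
        total += sum((p - q) ** 2 for p, q in zip(px, py)) / len(px)
    return total / n_projections
```

Each projection reduces the problem to one dimension, where sorting gives the optimal matching, which is why the sliced distance is cheap to evaluate on particles.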

arXiv:2311.00460

Optimal Budgeted Rejection Sampling for Generative Models

Authors: Alexandre Verine, Muni Sreenivas Pydi, Benjamin Negrevergne, Yann Chevaleyre

Abstract: Rejection sampling methods have recently been proposed to improve the performance of discriminator-based generative models. However, these methods are only optimal under an unlimited sampling budget, and are usually applied to a generator trained independently of the rejection procedure. We first propose an Optimal Budgeted Rejection Sampling (OBRS) scheme that is provably optimal with respect to \textit{any} $f$-divergence between the true distribution and the post-rejection distribution, for a given sampling budget. Second, we propose an end-to-end method that incorporates the sampling scheme into the training procedure to further enhance the model's overall performance. Through experiments and supporting theory, we show that the proposed methods are effective in significantly improving the quality and diversity of the samples.

Submitted 1 March, 2024; v1 submitted 1 November, 2023; originally announced November 2023.
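
To make the idea of rejection sampling under a budget concrete, here is a minimal sketch on a finite support. It assumes a capped density-ratio acceptance rule, a_i = min(1, c * p_i / q_i), with the constant c tuned by bisection so that the average acceptance rate equals 1 / budget. That rule, the function name `budgeted_acceptance`, and its parameters are illustrative assumptions and may not match the OBRS scheme of the paper.

```python
def budgeted_acceptance(p, q, budget, iters=100):
    """Return acceptance probabilities a_i = min(1, c * p_i / q_i), with c
    chosen so that sum_i q_i * a_i is (approximately) 1 / budget.

    p: target probabilities, q: proposal probabilities (all q_i > 0 and
    all p_i > 0 assumed), budget: average proposals per accepted sample.
    """
    ratios = [pi / qi for pi, qi in zip(p, q)]
    target = 1.0 / budget

    def rate(c):
        # Expected acceptance rate when proposals are drawn from q.
        return sum(qi * min(1.0, c * r) for qi, r in zip(q, ratios))

    # rate(c) is non-decreasing in c; at hi every sample is accepted.
    lo, hi = 0.0, 1.0 / min(ratios)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if rate(mid) < target:
            lo = mid
        else:
            hi = mid
    return [min(1.0, hi * r) for r in ratios]
```

For example, with p = [0.5, 0.3, 0.2], q = [0.2, 0.3, 0.5] and a budget of 2, the bisection settles on c = 0.6, so the most under-represented point is always accepted and the others are thinned.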

arXiv:2305.18910

Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows

Authors: Alexandre Verine, Benjamin Negrevergne, Muni Sreenivas Pydi, Yann Chevaleyre

Abstract: Achieving a balance between image quality (precision) and diversity (recall) is a significant challenge in the domain of generative models. Current state-of-the-art models primarily rely on optimizing heuristics, such as the Fréchet Inception Distance. While recent developments have introduced principled methods for evaluating precision and recall, they have yet to be successfully integrated into the training of generative models. Our main contribution is a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows, which explicitly optimizes a user-defined trade-off between precision and recall. More precisely, we show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the \textit{PR-divergences}. Conversely, any $f$-divergence can be written as a linear combination of PR-divergences and corresponds to a weighted precision-recall trade-off. Through comprehensive evaluations, we show that our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.

Submitted 1 November, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

Comments: NeurIPS 2023

arXiv:2302.07221

On the Role of Randomization in Adversarially Robust Classification

Authors: Lucas Gnecco-Heredia, Yann Chevaleyre, Benjamin Negrevergne, Laurent Meunier, Muni Sreenivas Pydi

Abstract: Deep neural networks are known to be vulnerable to small adversarial perturbations in test data. To defend against adversarial attacks, probabilistic classifiers have been proposed as an alternative to deterministic ones. However, the literature has conflicting findings on the effectiveness of probabilistic classifiers in comparison to deterministic ones. In this paper, we clarify the role of randomization in building adversarially robust classifiers. Given a base hypothesis set of deterministic classifiers, we show the conditions under which a randomized ensemble outperforms the hypothesis set in adversarial risk, extending previous results. Additionally, we show that for any probabilistic binary classifier (including randomized ensembles), there exists a deterministic classifier that outperforms it. Finally, we give an explicit description of the deterministic hypothesis set that contains such a deterministic classifier for many types of commonly used probabilistic classifiers, i.e. randomized ensembles and parametric/input noise injection.

Submitted 28 November, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

Comments: 10 pages main paper (27 total), 2 figures in main paper. NeurIPS 2023

arXiv:2302.00628

Training Normalizing Flows with the Precision-Recall Divergence

Authors: Alexandre Verine, Benjamin Negrevergne, Muni Sreenivas Pydi, Yann Chevaleyre

Abstract: Generative models can have distinct modes of failure like mode dropping and low quality samples, which cannot be captured by a single scalar metric. To address this, recent works propose evaluating generative models using precision and recall, where precision measures quality of samples and recall measures the coverage of the target distribution. Although a variety of discrepancy measures between the target and estimated distribution are used to train generative models, it is unclear what precision-recall trade-offs are achieved by various choices of the discrepancy measures. In this paper, we show that achieving a specified precision-recall trade-off corresponds to minimising $f$-divergences from a family we call the {\em PR-divergences}. Conversely, any $f$-divergence can be written as a linear combination of PR-divergences and therefore corresponds to minimising a weighted precision-recall trade-off. Further, we propose a novel generative model that is able to train a normalizing flow to minimise any $f$-divergence, and in particular, achieve a given precision-recall trade-off.

Submitted 2 February, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

arXiv:2301.13192

Robust empirical risk minimization via Newton's method

Authors: Eirini Ioannou, Muni Sreenivas Pydi, Po-Ling Loh

Abstract: A new variant of Newton's method for empirical risk minimization is studied, where at each iteration of the optimization algorithm, the gradient and Hessian of the objective function are replaced by robust estimators taken from existing literature on robust mean estimation for multivariate data. After proving a general theorem about the convergence of successive iterates to a small ball around the population-level minimizer, consequences of the theory in generalized linear models are studied when data are generated from Huber's epsilon-contamination model and/or heavy-tailed distributions. An algorithm for obtaining robust Newton directions based on the conjugate gradient method is also proposed, which may be more appropriate for high-dimensional settings, and conjectures about the convergence of the resulting algorithm are offered. Compared to robust gradient descent, the proposed algorithm enjoys the faster rates of convergence for successive iterates often achieved by second-order algorithms for convex problems, i.e., quadratic convergence in a neighborhood of the optimum, with a stepsize that may be chosen adaptively via backtracking linesearch.

Submitted 17 July, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

arXiv:2201.08956

The Many Faces of Adversarial Risk

Authors: Muni Sreenivas Pydi, Varun Jog

Abstract: Adversarial risk quantifies the performance of classifiers on adversarially perturbed data. Numerous definitions of adversarial risk -- not all mathematically rigorous and differing subtly in the details -- have appeared in the literature. In this paper, we revisit these definitions, make them rigorous, and critically examine their similarities and differences. Our technical tools derive from optimal transport, robust statistics, functional analysis, and game theory. Our contributions include the following: generalizing Strassen's theorem to the unbalanced optimal transport setting with applications to adversarial classification with unequal priors; showing an equivalence between adversarial robustness and robust hypothesis testing with $\infty$-Wasserstein uncertainty sets; proving the existence of a pure Nash equilibrium in the two-player game between the adversary and the algorithm; and characterizing adversarial risk by the minimum Bayes error between a pair of distributions belonging to the $\infty$-Wasserstein uncertainty sets. Our results generalize and deepen recently discovered connections between optimal transport and adversarial robustness and reveal new connections to Choquet capacities and game theory.

Submitted 21 January, 2022; originally announced January 2022.

Comments: A version of this paper was presented at NeurIPS 2021

arXiv:1912.02794

Adversarial Risk via Optimal Transport and Optimal Couplings

Authors: Muni Sreenivas Pydi, Varun Jog

Abstract: Modern machine learning algorithms perform poorly on adversarially manipulated data. Adversarial risk quantifies the error of classifiers in adversarial settings; adversarial classifiers minimize adversarial risk. In this paper, we analyze adversarial risk and adversarial classifiers from an optimal transport perspective. We show that the optimal adversarial risk for binary classification with 0-1 loss is determined by an optimal transport cost between the probability distributions of the two classes. We develop optimal transport plans (probabilistic couplings) for univariate distributions such as the normal, the uniform, and the triangular distribution. We also derive optimal adversarial classifiers in these settings. Our analysis leads to algorithm-independent fundamental limits on adversarial risk, which we calculate for several real-world datasets. We extend our results to general loss functions under convexity and smoothness assumptions.

Submitted 22 December, 2020; v1 submitted 5 December, 2019; originally announced December 2019.

Comments: Revised version with 43 pages and 8 figures

arXiv:1910.04371

Active Learning with Importance Sampling

Authors: Muni Sreenivas Pydi, Vishnu Suresh Lokhande

Abstract: We consider an active learning setting where the algorithm has access to a large pool of unlabeled data and a small pool of labeled data. In each iteration, the algorithm chooses a few unlabeled data points and obtains their labels from an oracle. In this paper, we consider a probabilistic querying procedure to choose the points to be labeled. We propose an algorithm for Active Learning with Importance Sampling (ALIS), and derive upper bounds on the true loss incurred by the algorithm for any arbitrary probabilistic sampling procedure. Further, we propose an optimal sampling distribution that minimizes the upper bound on the true loss.

Submitted 10 October, 2019; originally announced October 2019.

Comments: NeurIPS 2019 Workshop on Machine Learning with Guarantees, Vancouver, Canada
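
The probabilistic querying described in the abstract above can be illustrated with a standard inverse-probability-weighted loss estimate. This is a minimal sketch of generic importance weighting, not the ALIS algorithm itself; the function name `importance_weighted_loss` and its parameters are hypothetical, and every query probability is assumed to be strictly positive.

```python
import random


def importance_weighted_loss(losses, probs, seed=0):
    """Estimate the mean loss over a pool when point i is labeled (queried)
    independently with probability probs[i].

    Each queried loss is divided by its query probability, which keeps the
    estimate unbiased for the full-pool mean loss.
    """
    rng = random.Random(seed)
    total = 0.0
    for loss, p in zip(losses, probs):
        if rng.random() < p:   # the oracle is asked for this label
            total += loss / p  # reweight by the inverse query probability
    return total / len(losses)
```

When every point is queried with probability 1, the estimate reduces to the plain average of the losses; smaller probabilities trade labeling cost for variance.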

arXiv:1802.04475

Graph-Based Ascent Algorithms for Function Maximization

Authors: Muni Sreenivas Pydi, Varun Jog, Po-Ling Loh

Abstract: We study the problem of finding the maximum of a function defined on the nodes of a connected graph. The goal is to identify a node where the function obtains its maximum. We focus on local iterative algorithms, which traverse the nodes of the graph along a path, and the next iterate is chosen from the neighbors of the current iterate with probability distribution determined by the function values at the current iterate and its neighbors. We study two algorithms corresponding to a Metropolis-Hastings random walk with different transition kernels: (i) The first algorithm is an exponentially weighted random walk governed by a parameter $γ$. (ii) The second algorithm is defined with respect to the graph Laplacian and a smoothness parameter $k$. We derive convergence rates for the two algorithms in terms of total variation distance and hitting times. We also provide simulations showing the relative convergence rates of our algorithms in comparison to an unbiased random walk, as a function of the smoothness of the graph function. Our algorithms may be categorized as a new class of "descent-based" methods for function maximization on the nodes of a graph.

Submitted 13 February, 2018; originally announced February 2018.
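
A Metropolis-Hastings random walk of the kind mentioned in the abstract above can be sketched as follows. The sketch assumes a target distribution proportional to exp(γ f(v)) and a proposal that picks a neighbor uniformly at random; both choices, along with the function name `metropolis_ascent` and its parameters, are illustrative assumptions and may differ from the transition kernels analyzed in the paper.

```python
import math
import random


def metropolis_ascent(neighbors, f, start, gamma, steps, seed=0):
    """Walk on a graph and return the best node visited.

    neighbors: dict mapping each node to a list of adjacent nodes
    f: function values indexed by node
    gamma: larger values concentrate the walk on high-f nodes
    """
    rng = random.Random(seed)
    current = best = start
    for _ in range(steps):
        proposal = rng.choice(neighbors[current])
        # Metropolis-Hastings ratio for target exp(gamma * f) with a
        # uniform-neighbor proposal; the degree terms correct for the
        # asymmetry of the proposal on irregular graphs.
        ratio = (math.exp(gamma * (f[proposal] - f[current]))
                 * len(neighbors[current]) / len(neighbors[proposal]))
        if rng.random() < min(1.0, ratio):
            current = proposal
            if f[current] > f[best]:
                best = current
    return best
```

On a path graph with an increasing function, uphill proposals are always accepted and downhill ones rarely, so the walk drifts to the maximizing node.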

arXiv:1702.03522

On Consistency of Compressive Spectral Clustering

Authors: Muni Sreenivas Pydi, Ambedkar Dukkipati

Abstract: Spectral clustering is one of the most popular methods for community detection in graphs. A key step in spectral clustering algorithms is the eigen decomposition of the $n{\times}n$ graph Laplacian matrix to extract its $k$ leading eigenvectors, where $k$ is the desired number of clusters among $n$ objects. This is prohibitively complex to implement for very large datasets. However, it has recently been shown that it is possible to bypass the eigen decomposition by computing an approximate spectral embedding through graph filtering of random signals. In this paper, we analyze the working of spectral clustering performed via graph filtering on the stochastic block model. Specifically, we characterize the effects of sparsity, dimensionality and filter approximation error on the consistency of the algorithm in recovering planted clusters.

Submitted 29 May, 2018; v1 submitted 12 February, 2017; originally announced February 2017.

Comments: Accepted for publication at the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, Colorado, USA

Journal ref: ISIT 2018

arXiv:1701.04548

Analytic Connectivity in General Hypergraphs

Authors: Ashwin Guha, Muni Sreenivas Pydi, Biswajit Paria, Ambedkar Dukkipati

Abstract: In this paper we extend the known results of analytic connectivity to non-uniform hypergraphs. We prove a modified Cheeger's inequality and also give a bound on analytic connectivity with respect to the degree sequence and diameter of a hypergraph.

Submitted 17 January, 2017; originally announced January 2017.

Showing 1–12 of 12 results for author: PYDI, M S