On the Growth of Mistakes in Differentially Private Online Learning: A Lower Bound Perspective

Authors: Daniil Dmitriev, Kristóf Szabó, Amartya Sanyal

Abstract: In this paper, we provide lower bounds for Differentially Private (DP) Online Learning algorithms. Our result shows that, for a broad class of $(\varepsilon,\delta)$-DP online algorithms, for $T$ such that $\log T \leq O(1/\delta)$, the expected number of mistakes incurred by the algorithm grows as $\Omega(\log \frac{T}{\delta})$. This matches the upper bound obtained by Golowich and Livni (2021) and is in contrast to non-private online learning where the number of mistakes is independent of $T$. To the best of our knowledge, our work is the first result towards settling lower bounds for DP-Online learning and partially addresses the open question in Sanyal and Ramponi (2022).

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.13999 [pdf, other]

Asymptotics of Learning with Deep Structured (Random) Features

Authors: Dominik Schröder, Daniil Dmitriev, Hugo Cui, Bruno Loureiro

Abstract: For a large class of feature maps we provide a tight asymptotic characterisation of the test error associated with learning the readout layer, in the high-dimensional limit where the input dimension, hidden layer widths, and number of training samples are proportionally large. This characterization is formulated in terms of the population covariance of the features. Our work is partially motivated by the problem of learning with Gaussian rainbow neural networks, namely deep non-linear fully-connected networks with random but structured weights, whose row-wise covariances are further allowed to depend on the weights of previous layers. For such networks we also derive a closed-form formula for the feature covariance in terms of the weight matrices. We further find that in some cases our results can capture feature maps learned by deep, finite-width neural networks trained under gradient descent.

Submitted 10 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: ICML camera-ready version

arXiv:2305.05565 [pdf, ps, other]

Greedy Heuristics and Linear Relaxations for the Random Hitting Set Problem

Authors: Gabriel Arpino, Daniil Dmitriev, Nicolo Grometto

Abstract: Consider the Hitting Set problem where, for a given universe $\mathcal{X} = \left\{ 1, ... , n \right\}$ and a collection of subsets $\mathcal{S}_1, ... , \mathcal{S}_m$, one seeks to identify the smallest subset of $\mathcal{X}$ which has nonempty intersection with every element in the collection. We study a probabilistic formulation of this problem, where the underlying subsets are formed by including each element of the universe with probability $p$, independently of one another. For large enough values of $n$, we rigorously analyse the average case performance of Lovász's celebrated greedy algorithm (Lovász, 1975) with respect to the chosen input distribution. In addition, we study integrality gaps between linear programming and integer programming solutions of the problem.

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2302.00401 [pdf, other]

Deterministic equivalent and error universality of deep random features learning

Authors: Dominik Schröder, Hugo Cui, Daniil Dmitriev, Bruno Loureiro

Abstract: This manuscript considers the problem of learning a random Gaussian network function using a fully connected network with frozen intermediate layers and trainable readout layer. This problem can be seen as a natural generalization of the widely studied random features model to deeper architectures. First, we prove Gaussian universality of the test error in a ridge regression setting where the learner and target networks share the same intermediate layers, and provide a sharp asymptotic formula for it. Establishing this result requires proving a deterministic equivalent for traces of the deep random features sample covariance matrices which can be of independent interest. Second, we conjecture the asymptotic Gaussian universality of the test error in the more general setting of arbitrary convex losses and generic learner/target architectures. We provide extensive numerical evidence for this conjecture, which requires the derivation of closed-form expressions for the layer-wise post-activation population covariances. In light of our results, we investigate the interplay between architecture design and implicit regularization.

Submitted 1 February, 2023; originally announced February 2023.

arXiv:2006.07253 [pdf, other]

Dynamic Model Pruning with Feedback

Authors: Tao Lin, Sebastian U. Stich, Luis Barba, Daniil Dmitriev, Martin Jaggi

Abstract: Deep neural networks often have millions of parameters. This can hinder their deployment to low-end devices, not only due to high memory requirements but also because of increased latency at inference. We propose a novel model compression method that generates a sparse trained model without additional overhead: by allowing (i) dynamic allocation of the sparsity pattern and (ii) incorporating feedback signal to reactivate prematurely pruned weights we obtain a performant sparse model in one single training pass (retraining is not needed, but can further improve the performance). We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models. Moreover, their performance surpasses that of models generated by all previously proposed pruning schemes.

Submitted 12 June, 2020; originally announced June 2020.

Comments: appearing at ICLR 2020

arXiv:1410.3120 [pdf]

Efficient randomized algorithms for PageRank problem

Authors: Alexander Gasnikov, Denis Dmitriev

Abstract: In this paper we compare well-known numerical methods for finding the PageRank vector. We propose a Markov Chain Monte Carlo method and obtain a new estimate for this method. We also propose a new method for the PageRank problem based on the reduction of this problem to a matrix game. We solve this (sparse) matrix game with randomized mirror descent. It should be mentioned that we use a non-standard randomization (in the KL-projection) that goes back to Grigoriadis and Khachiyan (1995).

Submitted 26 May, 2016; v1 submitted 12 October, 2014; originally announced October 2014.

Comments: 31 pages, in Russian

Journal ref: Comp. Math. and Math. Phys. 2015. V. 55. no. 3. P.355-371

Showing 1–6 of 6 results for author: Dmitriev, D