Upper Bounds on the Generalization Error of Private Algorithms for Discrete Data

Rodríguez-Gálvez, Borja; Bassi, Germán; Skoglund, Mikael

doi:10.1109/TIT.2021.3111480


arXiv:2005.05889 (cs)

[Submitted on 12 May 2020 (v1), last revised 13 Sep 2021 (this version, v3)]

Abstract: In this work, we study the generalization capability of algorithms from an information-theoretic perspective. It has been shown that the expected generalization error of an algorithm is bounded from above by a function of the relative entropy between the conditional probability distribution of the algorithm's output hypothesis, given the dataset with which it was trained, and its marginal probability distribution. We build upon this fact and introduce a mathematical formulation to obtain upper bounds on this relative entropy. Assuming that the data is discrete, we then develop a strategy using this formulation, based on the method of types and typicality, to find explicit upper bounds on the generalization error of stable algorithms, i.e., algorithms that produce similar output hypotheses given similar input datasets. In particular, we show the bounds obtained with this strategy for the case of $\epsilon$-DP and $\mu$-GDP algorithms.
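
For context, the known result that the abstract builds on is commonly stated in the following form (due to Xu and Raginsky, 2017). It is given here as background under stated assumptions, not as a quotation from this paper, whose exact statements and constants may differ. The notation is introduced only for illustration: $S$ is a dataset of $n$ i.i.d. samples, $W$ is the output hypothesis, $\mathrm{gen}(W,S)$ is the generalization error, and the loss is assumed to be $\sigma$-subgaussian under the data distribution.

$$
\left| \mathbb{E}\left[ \mathrm{gen}(W,S) \right] \right| \le \sqrt{\frac{2\sigma^2}{n}\, I(S;W)},
\qquad
I(S;W) = \mathbb{E}_{S}\left[ D\left( P_{W|S} \,\middle\|\, P_{W} \right) \right]
$$

The stability notion named at the end of the abstract can be illustrated with the standard definition of $\epsilon$-DP: for all datasets $s$ and $s'$ that differ in a single sample, and all measurable sets $A$ of hypotheses,

$$
P_{W|S=s}(A) \le e^{\epsilon}\, P_{W|S=s'}(A)
$$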

Comments:	18 pages (double column), 4 figures, accepted at IEEE Transactions on Information Theory
Subjects:	Information Theory (cs.IT); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2005.05889 [cs.IT]
	(or arXiv:2005.05889v3 [cs.IT] for this version)
	https://doi.org/10.48550/arXiv.2005.05889
Journal reference:	IEEE Trans. Inf. Theory, vol. 67, no. 11, pp. 7362-7379, Nov. 2021
Related DOI:	https://doi.org/10.1109/TIT.2021.3111480

Submission history

From: Borja Rodríguez Gálvez
[v1] Tue, 12 May 2020 16:05:39 UTC (642 KB)
[v2] Thu, 12 Nov 2020 17:27:24 UTC (381 KB)
[v3] Mon, 13 Sep 2021 17:23:08 UTC (951 KB)
