Generalization in Deep Learning

Kawaguchi, Kenji; Kaelbling, Leslie Pack; Bengio, Yoshua

arXiv:1710.05468v2 (stat)

[Submitted on 16 Oct 2017 (v1), revised 24 Dec 2017 (this version, v2), latest version 22 Aug 2023 (v9)]

Abstract: This paper explains why deep learning can generalize well, despite large capacity and possible algorithmic instability, nonrobustness, and sharp minima, effectively addressing an open problem in the literature. Based on our theoretical insight, this paper also proposes a family of new regularization methods. Its simplest member was empirically shown to improve base models and achieve competitive performance on MNIST and CIFAR-10 benchmarks. Moreover, this paper presents both data-dependent and data-independent generalization guarantees with improved convergence rates. Our results suggest several new open areas of research.

Subjects:	Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1710.05468 [stat.ML]
	(or arXiv:1710.05468v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1710.05468

Submission history

From: Kenji Kawaguchi
[v1] Mon, 16 Oct 2017 02:21:24 UTC (659 KB)
[v2] Sun, 24 Dec 2017 19:44:43 UTC (223 KB)
[v3] Thu, 22 Feb 2018 23:39:50 UTC (258 KB)
[v4] Tue, 1 Jan 2019 00:07:45 UTC (724 KB)
[v5] Fri, 10 May 2019 18:41:13 UTC (724 KB)
[v6] Mon, 27 Jul 2020 23:01:04 UTC (723 KB)
[v7] Sun, 11 Dec 2022 10:00:06 UTC (723 KB)
[v8] Mon, 21 Aug 2023 05:06:26 UTC (723 KB)
[v9] Tue, 22 Aug 2023 03:04:22 UTC (723 KB)
