On the Benefits of Large Learning Rates for Kernel Methods

Beugnot, Gaspard; Mairal, Julien; Rudi, Alessandro

arXiv:2202.13733 (stat)

[Submitted on 28 Feb 2022 (v1), last revised 3 Jun 2022 (this version, v2)]

Abstract: This paper studies an intriguing phenomenon: estimators obtained by using large learning rates within gradient descent algorithms often generalize well. The phenomenon was first observed in the deep learning literature; we show that it can be precisely characterized in the context of kernel methods, even though the resulting optimization problem is convex. Specifically, we consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution on the Hessian's eigenvectors. This extends an intuition described by Nakkiran (2020) on a two-dimensional toy problem to realistic learning scenarios such as kernel ridge regression. While large learning rates may be proven beneficial as soon as there is a mismatch between the train and test objectives, we further explain why this benefit already occurs in classification tasks without assuming any particular mismatch between train and test data distributions.
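The two-dimensional intuition mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's experimental setup: the Hessian eigenvalues, the two step sizes, the stopping tolerance, and the function name `gd_error_at_early_stop` are all illustrative assumptions. The sketch runs gradient descent on a diagonal quadratic, written directly in the Hessian eigenbasis, and stops as soon as the objective falls below a tolerance. It then compares how the remaining error is split across the two eigendirections.

```python
import numpy as np


def gd_error_at_early_stop(hessian_eigs, lr, err0, tol, max_iter=100_000):
    """Gradient descent on f(e) = 0.5 * sum_i h_i * e_i**2, stopped once f <= tol.

    `e` is the error (iterate minus minimizer) expressed in the Hessian
    eigenbasis, so each coordinate evolves as e_i <- (1 - lr * h_i) * e_i.
    Returns the error vector at the stopping time and the number of steps.
    """
    h = np.asarray(hessian_eigs, dtype=float)
    e = np.asarray(err0, dtype=float).copy()
    for step in range(max_iter):
        if 0.5 * np.sum(h * e**2) <= tol:
            return e, step
        e = e - lr * h * e  # gradient of f with respect to e is h * e
    return e, max_iter


# Assumed toy problem: one large and one small Hessian eigenvalue.
eigs = np.array([1.0, 0.1])
start = np.array([1.0, 1.0])
tol = 1e-3

# Small step: the large-eigenvalue direction is fitted first.
err_small, steps_small = gd_error_at_early_stop(eigs, lr=0.1, err0=start, tol=tol)
# Large step, just below the stability limit 2 / max(eigs): the
# large-eigenvalue direction oscillates and is fitted last.
err_large, steps_large = gd_error_at_early_stop(eigs, lr=1.95, err0=start, tol=tol)

print("small lr:", np.abs(err_small), "steps:", steps_small)
print("large lr:", np.abs(err_large), "steps:", steps_large)
```

Both runs stop at the same objective value, but in this toy setting the residual error sits mostly along the small-eigenvalue direction for the small step and along the large-eigenvalue direction for the large step. When the test objective weights those directions differently from the train objective, the two early-stopped solutions can therefore have different test errors.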

Comments:	Accepted at COLT 2022. To be published in the Proceedings of Machine Learning Research (PMLR)
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2202.13733 [stat.ML]
	(or arXiv:2202.13733v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2202.13733

Submission history

From: Gaspard Beugnot
[v1] Mon, 28 Feb 2022 13:01:04 UTC (1,258 KB)
[v2] Fri, 3 Jun 2022 14:55:49 UTC (1,320 KB)
