Random Reshuffling with Momentum for Nonconvex Problems: Iteration Complexity and Last Iterate Convergence

Qiu, Junwen; Milzarek, Andre

arXiv:2404.18452 (math)

[Submitted on 29 Apr 2024]

Abstract: Random reshuffling with momentum (RRM) corresponds to the SGD optimizer with the momentum option enabled, as found in popular machine learning libraries like PyTorch and TensorFlow. Despite its widespread use in practical applications, the understanding of its convergence properties in nonconvex scenarios remains limited. Under a Lipschitz smoothness assumption, this paper provides one of the first iteration complexity bounds for RRM. Specifically, we prove that RRM achieves the iteration complexity $O(n^{-1/3}((1-\beta^n)T)^{-2/3})$, where $n$ denotes the number of component functions $f(\cdot;i)$ and $\beta \in [0,1)$ is the momentum parameter. Furthermore, every accumulation point of a sequence of iterates $\{x^k\}_k$ generated by RRM is shown to be a stationary point of the problem. In addition, under the Kurdyka-Łojasiewicz inequality, a local geometric property, the iterates $\{x^k\}_k$ provably converge to a unique stationary point $x^*$ of the objective function. Importantly, in our analysis, this last iterate convergence is obtained without requiring convexity or a priori boundedness of the iterates. Finally, for polynomial step size schemes, convergence rates of the form $\|x^k - x^*\| = O(k^{-p})$, $\|\nabla f(x^k)\|^2 = O(k^{-q})$, and $|f(x^k) - f(x^*)| = O(k^{-q})$, with $p \in (0,1]$ and $q \in (0,2]$, are derived.
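
To make the scheme described in the abstract concrete, here is a minimal pure-Python sketch of random reshuffling with momentum. It is an illustration under stated assumptions, not the paper's exact algorithm: the update uses the common "buffer = beta * buffer + gradient" convention, the step size is held fixed within each epoch, and the names `rrm`, `grad`, and `step_size`, as well as the toy problem and its constants, are hypothetical choices made for this example.

```python
import random

def rrm(grad, x0, n, epochs, step_size, beta, seed=0):
    """Random reshuffling with momentum (illustrative sketch).

    grad(x, i) returns the gradient of the i-th component f(.; i) at x
    as a list of floats. The paper's exact update may differ from the
    buffer-style momentum used here (assumption).
    """
    rng = random.Random(seed)
    x = list(x0)
    m = [0.0] * len(x)            # momentum buffer, carried across epochs
    for k in range(epochs):
        alpha = step_size(k)      # one step size per epoch (assumption)
        perm = list(range(n))
        rng.shuffle(perm)         # fresh permutation: every component is used once per epoch
        for i in perm:
            g = grad(x, i)
            m = [beta * mj + gj for mj, gj in zip(m, g)]
            x = [xj - alpha * mj for xj, mj in zip(x, m)]
    return x

# Toy problem (hypothetical): f(x; i) = 0.5 * (x - c[i])**2, so the average
# objective is minimized at the mean of c, which is 3.0 here.
c = [1.0, 2.0, 3.0, 6.0]
x_final = rrm(
    grad=lambda x, i: [x[0] - c[i]],
    x0=[0.0],
    n=len(c),
    epochs=1000,
    step_size=lambda k: 0.05 / (k + 1) ** (2 / 3),  # a polynomial step size schedule
    beta=0.5,
)
```

The toy objective is convex and serves only to exercise the loop; the results in the abstract concern the nonconvex setting. Setting `beta=0` in this sketch recovers plain random reshuffling.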

Comments:	51 pages, 10 figures
Subjects:	Optimization and Control (math.OC)
MSC classes:	90C26, 90C15
Cite as:	arXiv:2404.18452 [math.OC]
	(or arXiv:2404.18452v1 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2404.18452

Submission history

From: Junwen Qiu
[v1] Mon, 29 Apr 2024 06:23:28 UTC (336 KB)