Stochastic Bregman Subgradient Methods for Nonsmooth Nonconvex Optimization Problems

Ding, Kuangyu; Toh, Kim-Chuan

Abstract: This paper focuses on the problem of minimizing a locally Lipschitz continuous function. Motivated by the effectiveness of Bregman gradient methods in training nonsmooth deep neural networks and by recent progress in stochastic subgradient methods for nonsmooth nonconvex optimization problems \cite{bolte2021conservative,bolte2022subgradient,xiao2023adam}, we investigate the long-term behavior of stochastic Bregman subgradient methods in this setting, especially when the objective function lacks Clarke regularity. We begin by exploring a general framework for Bregman-type methods and establish their convergence through a differential inclusion approach. For practical applications, we develop a stochastic Bregman subgradient method that allows the subproblems to be solved inexactly. Furthermore, we demonstrate how single-timescale momentum can be integrated into the Bregman subgradient method with slight modifications to the momentum update. Additionally, we introduce a Bregman proximal subgradient method for solving composite optimization problems, possibly with constraints, whose convergence is guaranteed by the general framework. Numerical experiments on training nonsmooth neural networks are conducted to validate the effectiveness of the proposed methods.
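For readers unfamiliar with the basic update, the following is a minimal sketch of an unconstrained stochastic Bregman subgradient step, \nabla h(x_{k+1}) = \nabla h(x_k) - \eta_k g_k, where g_k is a noisy subgradient. It is not the paper's algorithm: the kernel h(x) = \frac{1}{2}\|x\|^2 + \frac{1}{4}\|x\|^4, the toy objective f(x) = \|x - \text{target}\|_1 (convex, chosen only so the behavior is easy to check), the Gaussian noise model, the step size \eta_k = \eta_0/\sqrt{k+1}, and all function names are assumptions made for illustration. The subproblem is solved exactly here, and no momentum or proximal term is included.

```python
import numpy as np


def grad_h(x):
    """Gradient of the assumed kernel h(x) = 0.5*||x||^2 + 0.25*||x||^4."""
    return (1.0 + np.dot(x, x)) * x


def grad_h_inverse(v):
    """Solve grad_h(y) = v for y.

    Writing r = ||y||, the norm satisfies r^3 + r = ||v||, whose unique
    real root is given by Cardano's formula; then y = v / (1 + r^2).
    """
    a = np.linalg.norm(v)
    disc = np.sqrt(a * a / 4.0 + 1.0 / 27.0)
    r = np.cbrt(a / 2.0 + disc) + np.cbrt(a / 2.0 - disc)
    return v / (1.0 + r * r)


def noisy_subgradient(x, target, rng, sigma=0.1):
    """Subgradient of the toy objective ||x - target||_1 plus Gaussian noise (assumed noise model)."""
    return np.sign(x - target) + sigma * rng.standard_normal(x.shape)


def stochastic_bregman_subgradient(x0, target, steps=5000, eta0=0.5, seed=0):
    """Iterate grad_h(x_next) = grad_h(x) - eta_k * g_k with eta_k = eta0 / sqrt(k + 1)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    target = np.asarray(target, dtype=float)
    best = np.abs(x - target).sum()  # best objective value seen so far
    for k in range(steps):
        g = noisy_subgradient(x, target, rng)
        eta = eta0 / np.sqrt(k + 1.0)
        x = grad_h_inverse(grad_h(x) - eta * g)
        best = min(best, np.abs(x - target).sum())
    return x, best
```

Calling `stochastic_bregman_subgradient([2.0, -1.0], [1.0, 1.0])` returns the final iterate and the best objective value seen. With the quartic kernel, the effective step in x shrinks as \|x\| grows, which is the usual reason for choosing such a kernel when gradients are not globally Lipschitz.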

Comments:	28 pages, 6 figures
Subjects:	Optimization and Control (math.OC)
Cite as:	arXiv:2404.17386 [math.OC]
	(or arXiv:2404.17386v1 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2404.17386

