Revisiting LARS for Large Batch Training Generalization of Neural Networks

Do, Khoi; Nguyen, Duong; Nguyen, Hoa; Tran-Thanh, Long; Tran, Nguyen-Hoang; Pham, Quoc-Viet

arXiv:2309.14053 (cs)

[Submitted on 25 Sep 2023 (v1), last revised 15 Feb 2024 (this version, v4)]

Abstract: This paper studies large-batch training with layer-wise adaptive rate scaling (LARS) across diverse settings and reports two findings. First, LARS algorithms with warm-up tend to become trapped in sharp minimizers early in training because of redundant ratio scaling. Second, a fixed steep decline in the later phase restricts deep neural networks from effectively navigating past those early-phase sharp minimizers. Building on these findings, we propose Time Varying LARS (TVLARS), a novel algorithm that replaces warm-up with a configurable sigmoid-like function for robust training in the initial phase. TVLARS promotes gradient exploration early on, which helps the model move past sharp minimizers, and then gradually transitions to LARS for robustness in later phases. Extensive experiments demonstrate that TVLARS outperforms LARS and LAMB in most cases, with up to 2% improvement in classification scenarios. Notably, in all self-supervised learning cases, TVLARS outperforms both LARS and LAMB, with performance improvements of up to 10%.
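The abstract does not state the TVLARS update rule, so the following is only a minimal sketch of the idea it describes: a LARS-style layer-wise ratio multiplied by a sigmoid-like factor that starts high and decays toward a floor, used in place of warm-up. The function names, the parameters (`decay_rate`, `delay_steps`, `alpha`, `floor`), and the exact functional form are illustrative assumptions, not the authors' definitions.

```python
import math


def lars_trust_ratio(weight_norm, grad_norm, weight_decay=0.0, eps=1e-9):
    """LARS-style layer-wise ratio: ||w|| / (||g|| + weight_decay * ||w||)."""
    if weight_norm == 0.0 or grad_norm == 0.0:
        # Fall back to a neutral ratio when either norm vanishes.
        return 1.0
    return weight_norm / (grad_norm + weight_decay * weight_norm + eps)


def sigmoid_like_decay(step, decay_rate=1e-3, delay_steps=0, alpha=1.0, floor=1.0):
    """Hypothetical time-varying multiplier (an assumption, not the paper's formula).

    It is largest at the start of training and decays toward `floor`,
    at which point the update reduces to plain LARS scaling.
    """
    # Clamp the exponent so math.exp cannot overflow for very large steps.
    exponent = min(decay_rate * (step - delay_steps), 700.0)
    return 1.0 / (alpha + math.exp(exponent)) + floor


def layer_learning_rate(base_lr, step, weight_norm, grad_norm, **decay_kwargs):
    """Per-layer learning rate: base rate x time-varying factor x layer-wise ratio."""
    factor = sigmoid_like_decay(step, **decay_kwargs)
    return base_lr * factor * lars_trust_ratio(weight_norm, grad_norm)
```

With these illustrative defaults the multiplier falls monotonically from its initial value toward `floor`, so early steps take larger layer-wise steps (more exploration) and later steps behave like LARS. This contrasts with warm-up, which starts small and ramps up.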

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2309.14053 [cs.LG]
	(or arXiv:2309.14053v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2309.14053

Submission history

From: Khoi Do
[v1] Mon, 25 Sep 2023 11:35:10 UTC (16,680 KB)
[v2] Tue, 28 Nov 2023 05:18:31 UTC (22,450 KB)
[v3] Sun, 28 Jan 2024 11:01:35 UTC (23,695 KB)
[v4] Thu, 15 Feb 2024 17:37:56 UTC (23,692 KB)
