On skip connections and normalisation layers in deep optimisation

MacDonald, Lachlan Ewen; Valmadre, Jack; Saratchandran, Hemanth; Lucey, Simon

arXiv:2210.05371 (cs)

[Submitted on 10 Oct 2022 (v1), last revised 4 Dec 2023 (this version, v4)]

Abstract: We introduce a general theoretical framework, designed for the study of gradient optimisation of deep neural networks, that encompasses ubiquitous architecture choices including batch normalisation, weight normalisation and skip connections. Our framework determines the curvature and regularity properties of multilayer loss landscapes in terms of their constituent layers, thereby elucidating the roles played by normalisation layers and skip connections in globalising these properties. We then demonstrate the utility of this framework in two respects. First, we give the only proof of which we are aware that a class of deep neural networks can be trained using gradient descent to global optima even when such optima only exist at infinity, as is the case for the cross-entropy cost. Second, we identify a novel causal mechanism by which skip connections accelerate training, which we verify predictively with ResNets on MNIST, CIFAR10, CIFAR100 and ImageNet.
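
The phrase "optima only exist at infinity" can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a one-parameter logistic model on four hand-picked, linearly separable points, and the names `cross_entropy`, `gradient` and `run_gd` are hypothetical. On such data the cross-entropy cost is strictly positive for every finite weight, so gradient descent keeps lowering the cost while the weight grows without settling at a finite minimiser.

```python
import numpy as np

def cross_entropy(w, x, y):
    # Mean logistic (binary cross-entropy) loss with labels in {-1, +1}:
    # log(1 + exp(-y * w * x)), computed stably via logaddexp.
    return float(np.mean(np.logaddexp(0.0, -y * w * x)))

def gradient(w, x, y):
    # Derivative of the mean loss with respect to the scalar weight w.
    margin = y * w * x
    return float(np.mean(-y * x / (1.0 + np.exp(margin))))

def run_gd(steps=2000, lr=1.0):
    # Toy separable data (an assumption for illustration only):
    # negative inputs carry label -1, positive inputs carry label +1.
    x = np.array([-2.0, -1.0, 1.0, 2.0])
    y = np.array([-1.0, -1.0, 1.0, 1.0])
    w = 0.0
    history = []
    for _ in range(steps):
        w -= lr * gradient(w, x, y)  # plain gradient descent step
        history.append((w, cross_entropy(w, x, y)))
    return history

history = run_gd()
```

In this toy setting the recorded weights increase at every step and the recorded losses decrease at every step while remaining positive, which is the one-dimensional analogue of an infimum that is approached only as the parameter tends to infinity. The deep-network setting treated in the paper is substantially more general than this sketch.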

Comments:	NeurIPS 2023
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2210.05371 [cs.LG]
	(or arXiv:2210.05371v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2210.05371

Submission history

From: Lachlan MacDonald
[v1] Mon, 10 Oct 2022 06:22:46 UTC (187 KB)
[v2] Fri, 28 Oct 2022 07:48:27 UTC (371 KB)
[v3] Thu, 10 Nov 2022 06:53:58 UTC (1,116 KB)
[v4] Mon, 4 Dec 2023 15:37:47 UTC (280 KB)
