Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning

Mihaylova, Tsvetomila; Niculae, Vlad; Martins, André F. T.

arXiv:2010.02357 (cs)

[Submitted on 5 Oct 2020]

Abstract: Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data. One challenge with end-to-end training of these models is the argmax operation, which has null gradient. In this paper, we focus on surrogate gradients, a popular strategy to deal with this problem. We explore latent structure learning through the angle of pulling back the downstream learning objective. In this paradigm, we discover a principled motivation for both the straight-through estimator (STE) and the recently proposed SPIGOT, a variant of STE for structured models. Our perspective leads to new algorithms in the same family. We empirically compare the known and the novel pulled-back estimators against the popular alternatives, yielding new insight for practitioners and revealing intriguing failure cases.
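
To make the "null gradient" problem concrete, the following is a minimal sketch in plain Python and is not taken from the paper. It uses a toy, unstructured setting: three scores, a one-hot argmax, and a squared-error downstream loss. All function names, the loss, and the numbers are illustrative assumptions. The sketch compares the numerically estimated gradient through the argmax, which is zero wherever the argmax does not change, with an identity-style straight-through surrogate that passes the downstream gradient back to the scores unchanged.

```python
def argmax_one_hot(scores):
    """Forward pass: one-hot vector selecting the highest-scoring option."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(scores))]


def downstream_loss(z, target):
    """Toy downstream loss (an assumption): half squared distance to a target."""
    return 0.5 * sum((zi - ti) ** 2 for zi, ti in zip(z, target))


def loss_grad_wrt_z(z, target):
    """Exact gradient of the toy loss with respect to the discrete choice z."""
    return [zi - ti for zi, ti in zip(z, target)]


def finite_difference_grad(scores, target, eps=1e-4):
    """Central-difference gradient of the loss w.r.t. the scores, through argmax."""
    grad = []
    for i in range(len(scores)):
        up = list(scores)
        up[i] += eps
        down = list(scores)
        down[i] -= eps
        diff = (downstream_loss(argmax_one_hot(up), target)
                - downstream_loss(argmax_one_hot(down), target))
        grad.append(diff / (2 * eps))
    return grad


def ste_identity_grad(scores, target):
    """Straight-through surrogate: treat dz/dscores as the identity matrix."""
    z = argmax_one_hot(scores)
    return loss_grad_wrt_z(z, target)


scores = [0.2, 1.5, -0.3]   # argmax picks index 1
target = [1.0, 0.0, 0.0]    # the downstream loss prefers index 0

true_grad = finite_difference_grad(scores, target)
surrogate = ste_identity_grad(scores, target)

print("gradient through argmax:", true_grad)   # all zeros: small changes never flip the argmax
print("straight-through surrogate:", surrogate)
```

In this toy case a gradient-descent step along the surrogate raises the score of the option the loss prefers and lowers the score of the currently selected one, whereas the gradient through the argmax gives the scores no signal at all.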

Comments:	EMNLP 2020
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2010.02357 [cs.CL]
	(or arXiv:2010.02357v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.02357

Submission history

From: Tsvetomila Mihaylova
[v1] Mon, 5 Oct 2020 21:56:00 UTC (98 KB)
