Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative

Dery, Lucio M.; Michel, Paul; Talwalkar, Ameet; Neubig, Graham


arXiv:2109.07437v1 (cs)

[Submitted on 15 Sep 2021 (this version), latest version 7 Feb 2022 (v2)]


Abstract: Pre-training, where models are trained on an auxiliary objective with abundant data before being fine-tuned on data from the downstream task, is now the dominant paradigm in NLP. In general, the pre-training step relies on little to no direct knowledge of the task on which the model will be fine-tuned, even when the end-task is known in advance. Our work challenges this status quo of end-task agnostic pre-training. First, on three different low-resource NLP tasks from two domains, we demonstrate that multi-tasking the end-task and auxiliary objectives results in significantly better downstream task performance than the widely used task-agnostic continued pre-training paradigm of Gururangan et al. (2020). We next introduce an online meta-learning algorithm that learns a set of multi-task weights to better balance among our multiple auxiliary objectives, achieving further improvements on end-task performance and data efficiency.
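
The idea of multi-tasking the end-task with auxiliary objectives under learned weights can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify how the weights are parameterized or updated, so the softmax parameterization, the gradient-alignment update, and the names `multitask_loss` and `update_logits` are all assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Turn unconstrained logits into positive weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(end_task_loss, aux_losses, logits):
    """Weighted sum of the end-task loss and the auxiliary losses.

    Assumption: logits[0] belongs to the end-task and logits[1:] to the
    auxiliary objectives, in order. Returns (combined_loss, weights).
    """
    if len(logits) != 1 + len(aux_losses):
        raise ValueError("need one logit per objective")
    weights = softmax(logits)
    losses = [end_task_loss] + list(aux_losses)
    combined = sum(w * loss for w, loss in zip(weights, losses))
    return combined, weights


def update_logits(logits, task_grads, end_task_grad, lr=0.1):
    """Hypothetical weight update (an assumption, not the paper's rule).

    Raises the logit of each objective whose gradient points in the same
    direction as the end-task gradient (positive dot product) and lowers
    it when the two gradients conflict.
    """
    updated = []
    for logit, grad in zip(logits, task_grads):
        alignment = sum(g * e for g, e in zip(grad, end_task_grad))
        updated.append(logit + lr * alignment)
    return updated
```

With equal logits every objective receives the same weight, so the combined loss is the plain average of the individual losses. Repeated calls to `update_logits` would then shift weight toward objectives whose gradients agree with the end-task gradient.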

Comments:	16 pages, 4 figures
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL)
Cite as:	arXiv:2109.07437 [cs.LG]
	(or arXiv:2109.07437v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2109.07437

Submission history

From: Lucio Dery
[v1] Wed, 15 Sep 2021 17:13:18 UTC (586 KB)
[v2] Mon, 7 Feb 2022 03:12:49 UTC (289 KB)

