Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Lauscher, Anne; Vulić, Ivan; Ponti, Edoardo Maria; Korhonen, Anna; Glavaš, Goran

arXiv:1909.02339 (cs)

[Submitted on 5 Sep 2019 (v1), last revised 20 Apr 2020 (this version, v2)]

Abstract: Unsupervised pretraining models have been shown to facilitate a wide range of downstream NLP applications. These models, however, retain some of the limitations of traditional static word embeddings. In particular, they encode only the distributional knowledge available in raw text corpora, incorporated through language modeling objectives. In this work, we complement such distributional knowledge with external lexical knowledge; that is, we integrate discrete knowledge of word-level semantic similarity into pretraining. To this end, we generalize the standard BERT model to a multi-task learning setting in which we couple BERT's masked language modeling and next sentence prediction objectives with an auxiliary task of binary word relation classification. Our experiments suggest that our "Lexically Informed" BERT (LIBERT), specialized for word-level semantic similarity, yields better performance than the lexically blind "vanilla" BERT on several language understanding tasks. Concretely, LIBERT outperforms BERT in 9 out of 10 tasks of the GLUE benchmark and is on a par with BERT in the remaining one. Moreover, we show consistent gains on 3 benchmarks for lexical simplification, a task where knowledge about word-level semantic similarity is paramount.
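The multi-task setup described in the abstract can be illustrated with a small sketch. The abstract does not say how the three objectives are combined, so the weighted sum below is an assumption for illustration only, not the authors' training procedure. The names `binary_cross_entropy`, `joint_loss`, and `relation_weight` are hypothetical.

```python
import math


def binary_cross_entropy(prob_related: float, label: int) -> float:
    """Loss for one word pair in a binary relation classifier.

    prob_related: predicted probability that the two words stand in the
                  target lexical relation (e.g. are semantically similar).
    label:        1 if the pair is related, 0 otherwise.
    """
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(prob_related, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def joint_loss(mlm_loss: float,
               nsp_loss: float,
               relation_loss: float,
               relation_weight: float = 1.0) -> float:
    """Combine the two standard BERT losses with the auxiliary loss.

    ASSUMPTION: a weighted sum. The abstract only states that the
    objectives are coupled in a multi-task setting; other schemes
    (e.g. alternating batches per task) are equally plausible.
    """
    return mlm_loss + nsp_loss + relation_weight * relation_loss


if __name__ == "__main__":
    # Hypothetical per-batch loss values, chosen only to show the call shape.
    aux = binary_cross_entropy(prob_related=0.8, label=1)
    print(joint_loss(mlm_loss=2.1, nsp_loss=0.4, relation_loss=aux))
```

Setting `relation_weight` to 0 in this sketch recovers the sum of the two standard BERT objectives, which is one way to picture the "lexically blind" baseline next to the lexically informed variant.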

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1909.02339 [cs.CL]
	(or arXiv:1909.02339v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1909.02339

Submission history

From: Anne Lauscher
[v1] Thu, 5 Sep 2019 11:49:40 UTC (95 KB)
[v2] Mon, 20 Apr 2020 15:31:20 UTC (110 KB)
