An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs

Zhang, Jiarui; Ilievski, Filip; Ma, Kaixin; Francis, Jonathan; Oltramari, Alessandro

arXiv:2205.10661 (cs)

[Submitted on 21 May 2022]

Abstract: Self-supervision based on information extracted from large knowledge graphs has been shown to improve the generalization of language models in zero-shot evaluation on various downstream language reasoning tasks. Because these improvements are reported in aggregate, however, little is known about (i) how to select the appropriate knowledge for solid performance across tasks, (ii) how to combine this knowledge with neural language models, and (iii) how these pairings affect granular task performance. In this paper, we study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models. We also study the effect of different synthetic datasets on language models with various architectures and sizes. The resulting models are evaluated against four task properties: domain overlap, answer similarity, vocabulary overlap, and answer length. Our experiments show that encoder-decoder models benefit from more data to learn from, whereas sampling strategies that balance across different aspects yield the best performance. Most of the improvement occurs on questions with short answers and dissimilar answer candidates, which corresponds to the characteristics of the data used for pre-training.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2205.10661 [cs.CL]
	(or arXiv:2205.10661v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2205.10661

Submission history

From: Filip Ilievski
[v1] Sat, 21 May 2022 19:49:04 UTC (442 KB)
