Structural Supervision Improves Learning of Non-Local Grammatical Dependencies

Wilcox, Ethan; Qian, Peng; Futrell, Richard; Ballesteros, Miguel; Levy, Roger

arXiv:1903.00943 (cs)

[Submitted on 3 Mar 2019 (v1), last revised 6 Apr 2019 (this version, v2)]

Abstract: State-of-the-art LSTM language models trained on large corpora learn sequential contingencies in impressive detail and have been shown to acquire a number of non-local grammatical dependencies with some success. Here we investigate whether supervision with hierarchical structure enhances learning of a range of grammatical dependencies, a question that has previously been addressed only for subject-verb agreement. Using controlled experimental methods from psycholinguistics, we compare the performance of word-based LSTM models versus two models that represent hierarchical structure and deploy it in left-to-right processing: Recurrent Neural Network Grammars (RNNGs) (Dyer et al., 2016) and an incrementalized version of the Parsing-as-Language-Modeling configuration from Charniak et al. (2016). Models are tested on a diverse range of configurations for two classes of non-local grammatical dependencies in English: Negative Polarity licensing and Filler-Gap Dependencies. Using the same training data across models, we find that structurally-supervised models outperform the LSTM, with the RNNG demonstrating the best results on both types of grammatical dependencies and even learning many of the Island Constraints on the filler-gap dependency. Structural supervision thus provides data efficiency advantages over purely string-based training of neural language models in acquiring human-like generalizations about non-local grammatical dependencies.

Comments:	To appear: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1903.00943 [cs.CL]
	(or arXiv:1903.00943v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1903.00943

Submission history

From: Ethan Wilcox
[v1] Sun, 3 Mar 2019 17:08:00 UTC (1,197 KB)
[v2] Sat, 6 Apr 2019 17:48:38 UTC (2,948 KB)
