Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment

Davis, Forrest; van Schijndel, Marten

arXiv:2005.00165 (cs)

[Submitted on 1 May 2020 (v1), last revised 7 May 2020 (this version, v3)]

Abstract: A standard approach to evaluating language models analyzes how models assign probabilities to valid versus invalid syntactic constructions (i.e., is a grammatical sentence more probable than an ungrammatical sentence). Our work uses ambiguous relative clause attachment to extend such evaluations to cases of multiple simultaneous valid interpretations, where stark grammaticality differences are absent. We compare model performance in English and Spanish to show that non-linguistic biases in RNN LMs advantageously overlap with syntactic structure in English but not Spanish. Thus, English models may appear to acquire human-like syntactic preferences, while models trained on Spanish fail to acquire comparable human-like preferences. We conclude by relating these results to broader concerns about the relationship between comprehension (i.e., typical language model use cases) and production (which generates the training data for language models), suggesting that necessary linguistic biases are not present in the training signal at all.

Comments:	Proceedings of 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020); v3 updated references and added additional corpus stats in discussion
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2005.00165 [cs.CL]
	(or arXiv:2005.00165v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2005.00165

Submission history

From: Forrest Davis
[v1] Fri, 1 May 2020 01:21:47 UTC (132 KB)
[v2] Wed, 6 May 2020 17:13:23 UTC (134 KB)
[v3] Thu, 7 May 2020 15:21:58 UTC (133 KB)
