Search | arXiv e-print repository

arXiv:2009.11523

Grounded Compositional Outputs for Adaptive Language Modeling

Authors: Nikolaos Pappas, Phoebe Mulcaire, Noah A. Smith

Abstract: Language models have emerged as a central component across NLP, and a great deal of progress depends on the ability to cheaply adapt them (e.g., through finetuning) to new domains and tasks. A language model's vocabulary, typically selected before training and permanently fixed later, affects its size and is part of what makes it resistant to such adaptation. Prior work has used compositional input embeddings based on surface forms to ameliorate this issue. In this work, we go one step beyond and propose a fully compositional output embedding layer for language models, which is further grounded in information from a structured lexicon (WordNet), namely semantically related words and free-text definitions. To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary. We evaluate the model on conventional language modeling as well as challenging cross-domain settings with an open vocabulary, finding that it matches or outperforms previous state-of-the-art output embedding methods and adaptation approaches. Our analysis attributes the improvements to sample efficiency: our model is more accurate for low-frequency words.

Submitted 5 October, 2020; v1 submitted 24 September, 2020; originally announced September 2020.

Comments: EMNLP 2020

arXiv:2004.02709

Evaluating Models' Local Decision Boundaries via Contrast Sets

Authors: Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, et al. (1 additional author not shown)

Abstract: Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities. We propose a new annotation paradigm for NLP that helps to close systematic gaps in the test data. In particular, after a dataset is constructed, we recommend that the dataset authors manually perturb the test instances in small but meaningful ways that (typically) change the gold label, creating contrast sets. Contrast sets provide a local view of a model's decision boundary, which can be used to more accurately evaluate a model's true linguistic capabilities. We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets (e.g., DROP reading comprehension, UD parsing, IMDb sentiment analysis). Although our contrast sets are not explicitly adversarial, model performance is significantly lower on them than on the original test sets (up to 25% in some cases). We release our contrast sets as new evaluation benchmarks and encourage future dataset construction efforts to follow similar annotation processes.

Submitted 1 October, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

arXiv:1909.08744

Low-Resource Parsing with Crosslingual Contextualized Representations

Authors: Phoebe Mulcaire, Jungo Kasai, Noah A. Smith

Abstract: Despite advances in dependency parsing, languages with small treebanks still present challenges. We assess recent approaches to multilingual contextual word representations (CWRs), and compare them for crosslingual transfer from a language with a large treebank to a language with a small or nonexistent treebank, by sharing parameters between languages in the parser itself. We experiment with a diverse selection of languages in both simulated and truly low-resource scenarios, and show that multilingual CWRs greatly facilitate low-resource dependency parsing even without crosslingual supervision such as dictionaries or parallel text. Furthermore, we examine the non-contextual part of the learned language models (which we call a "decontextual probe") to demonstrate that polyglot language models better encode crosslingual lexical correspondence compared to aligned monolingual language models. This analysis provides further evidence that polyglot training is an effective approach to crosslingual transfer.

Submitted 18 September, 2019; originally announced September 2019.

Comments: CoNLL 2019

arXiv:1902.09697

Polyglot Contextual Representations Improve Crosslingual Transfer

Authors: Phoebe Mulcaire, Jungo Kasai, Noah A. Smith

Abstract: We introduce Rosita, a method to produce multilingual contextual word representations by training a single language model on text from multiple languages. Our method combines the advantages of contextual word representations with those of multilingual representation learning. We produce language models from dissimilar language pairs (English/Arabic and English/Chinese) and use them in dependency parsing, semantic role labeling, and named entity recognition, with comparisons to monolingual and non-contextual variants. Our results provide further evidence for the benefits of polyglot learning, in which representations are shared across multiple languages.

Submitted 18 March, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

Comments: NAACL 2019

arXiv:1812.09383

Technology-Enabled Disinformation: Summary, Lessons, and Recommendations

Authors: John Akers, Gagan Bansal, Gabriel Cadamuro, Christine Chen, Quanze Chen, Lucy Lin, Phoebe Mulcaire, Rajalakshmi Nandakumar, Matthew Rockett, Lucy Simko, John Toman, Tongshuang Wu, Eric Zeng, Bill Zorn, Franziska Roesner

Abstract: Technology is increasingly used -- unintentionally (misinformation) or intentionally (disinformation) -- to spread false information at scale, with potentially broad-reaching societal effects. For example, technology enables increasingly realistic false images and videos, and hyper-personal targeting means different people may see different versions of reality. This report is the culmination of a PhD-level special topics course (https://courses.cs.washington.edu/courses/cse599b/18au/) in Computer Science & Engineering at the University of Washington's Paul G. Allen School in the fall of 2018. The goals of this course were to study (1) how technologies and today's technical platforms enable and support the creation and spread of such mis- and disinformation, as well as (2) how technical approaches could be used to mitigate these issues. In this report, we summarize the space of technology-enabled mis- and disinformation based on our investigations, and then surface our lessons and recommendations for technologists, researchers, platform designers, policymakers, and users.

Submitted 3 January, 2019; v1 submitted 21 December, 2018; originally announced December 2018.

arXiv:1805.11598

Polyglot Semantic Role Labeling

Authors: Phoebe Mulcaire, Swabha Swayamdipta, Noah Smith

Abstract: Previous approaches to multilingual semantic dependency parsing treat languages independently, without exploiting the similarities between semantic structures across languages. We experiment with a new approach where we combine resources from a pair of languages in the CoNLL 2009 shared task to build a polyglot semantic role labeler. Notwithstanding the absence of parallel data, and the dissimilarity in annotations between languages, our approach results in an improvement in SRL performance on multiple languages over a monolingual baseline. Analysis of the polyglot model shows it to be advantageous in lower-resource settings.

Submitted 29 May, 2018; originally announced May 2018.

Comments: To appear at ACL 2018

Showing 1–6 of 6 results for author: Mulcaire, P