
Showing 1–38 of 38 results for author: Swayamdipta, S

  1. arXiv:2407.01878  [pdf, other]

    cs.CL

    Compare without Despair: Reliable Preference Evaluation with Generation Separability

    Authors: Sayan Ghosh, Tejas Srinivasan, Swabha Swayamdipta

    Abstract: Human evaluation of generated language through pairwise preference judgments is pervasive. However, under common scenarios, such as when generations from a model pair are very similar, or when stochastic decoding results in large variations in generations, it results in inconsistent preference ratings. We address these challenges by introducing a meta-evaluation measure, separability, which estima…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.14883  [pdf, other]

    cs.CL cs.CY

    OATH-Frames: Characterizing Online Attitudes Towards Homelessness with LLM Assistants

    Authors: Jaspreet Ranjit, Brihi Joshi, Rebecca Dorn, Laura Petry, Olga Koumoundouros, Jayne Bottarini, Peichen Liu, Eric Rice, Swabha Swayamdipta

    Abstract: Warning: Contents of this paper may be upsetting. Public attitudes towards key societal issues, expressed on online media, are of immense value in policy and reform efforts, yet challenging to understand at scale. We study one such social issue: homelessness in the U.S., by leveraging the remarkable capabilities of large language models to assist social work experts in analyzing millions of post…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Project website: https://dill-lab.github.io/oath-frames/

  3. arXiv:2406.04834  [pdf, other]

    cs.CL

    Annotating FrameNet via Structure-Conditioned Language Generation

    Authors: Xinyue Cui, Swabha Swayamdipta

    Abstract: Despite the remarkable generative capabilities of language models in producing naturalistic language, their effectiveness on explicit manipulation and generation of linguistic structures remain understudied. In this paper, we investigate the task of generating new sentences preserving a given semantic structure, following the FrameNet formalism. We propose a framework to produce novel frame-semant…

    Submitted 24 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted to ACL 2024

  4. arXiv:2403.09539  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Logits of API-Protected LLMs Leak Proprietary Information

    Authors: Matthew Finlayson, Xiang Ren, Swabha Swayamdipta

    Abstract: The commercialization of large language models (LLMs) has led to the common practice of high-level API-only access to proprietary models. In this work, we show that even with a conservative assumption about the model architecture, it is possible to learn a surprisingly large amount of non-public information about an API-protected LLM from a relatively small number of API queries (e.g., costing und…

    Submitted 14 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    MSC Class: 68T50 ACM Class: I.2.7

  5. arXiv:2403.03429  [pdf, other]

    cs.PL

    Generative Explanations for Program Synthesizers

    Authors: Amirmohammad Nazari, Souti Chattopadhyay, Swabha Swayamdipta, Mukund Raghothaman

    Abstract: Despite great advances in program synthesis techniques, they remain algorithmic black boxes. Although they guarantee that when synthesis is successful, the implementation satisfies the specification, they provide no additional information regarding how the implementation works or the manner in which the specification is realized. One possibility to answer these questions is to use large language m…

    Submitted 5 March, 2024; originally announced March 2024.

  6. arXiv:2310.01693  [pdf, other]

    cs.CL

    Closing the Curious Case of Neural Text Degeneration

    Authors: Matthew Finlayson, John Hewitt, Alexander Koller, Swabha Swayamdipta, Ashish Sabharwal

    Abstract: Despite their ubiquity in language generation, it remains unknown why truncation sampling heuristics like nucleus sampling are so effective. We provide a theoretical explanation for the effectiveness of the truncation sampling by proving that truncation methods that discard tokens below some probability threshold (the most common type of truncation) can guarantee that all sampled tokens have nonze…

    Submitted 2 October, 2023; originally announced October 2023.

    MSC Class: 68T50 ACM Class: I.2.7

  7. arXiv:2309.09405  [pdf, other]

    cs.AI cs.CL cs.CV

    Does Video Summarization Require Videos? Quantifying the Effectiveness of Language in Video Summarization

    Authors: Yoonsoo Nam, Adam Lehavi, Daniel Yang, Digbalay Bose, Swabha Swayamdipta, Shrikanth Narayanan

    Abstract: Video summarization remains a huge challenge in computer vision due to the size of the input videos to be summarized. We propose an efficient, language-only video summarizer that achieves competitive accuracy with high data efficiency. Using only textual captions obtained via a zero-shot approach, we train a language transformer model and forego image representations. This method allows us to perf…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  8. arXiv:2306.01985  [pdf, other]

    cs.CL

    COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements

    Authors: Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, Maarten Sap

    Abstract: Warning: This paper contains content that may be offensive or upsetting. Understanding the harms and offensiveness of statements requires reasoning about the social and situational context in which statements are made. For example, the utterance "your English is very good" may implicitly signal an insult when uttered by a white man to a non-white colleague, but uttered by an ESL teacher to their s…

    Submitted 8 June, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: Accepted to Findings of ACL 2023

  9. arXiv:2305.04978  [pdf, other]

    cs.CL

    NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge

    Authors: Phillip Howard, Junlin Wang, Vasudev Lal, Gadi Singer, Yejin Choi, Swabha Swayamdipta

    Abstract: Comparative knowledge (e.g., steel is stronger and heavier than styrofoam) is an essential component of our world knowledge, yet understudied in prior literature. In this paper, we harvest the dramatic improvements in knowledge capabilities of language models into a large-scale comparative knowledge base. While the ease of acquisition of such comparative knowledge is much higher from extreme-scale…

    Submitted 5 April, 2024; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: Accepted to NAACL 2024 Findings

  10. arXiv:2304.14399  [pdf, other]

    cs.CL

    We're Afraid Language Models Aren't Modeling Ambiguity

    Authors: Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi

    Abstract: Ambiguity is an intrinsic feature of natural language. Managing ambiguity is a key part of human language understanding, allowing us to anticipate misunderstanding as communicators and revise our interpretations as listeners. As language models (LMs) are increasingly employed as dialogue interfaces and writing aids, handling ambiguous language is critical to their success. We characterize ambiguit…

    Submitted 20 October, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: EMNLP 2023 camera-ready

  11. arXiv:2212.14578  [pdf, other]

    cs.LG cs.AI cs.CL

    MAUVE Scores for Generative Models: Theory and Practice

    Authors: Krishna Pillutla, Lang Liu, John Thickstun, Sean Welleck, Swabha Swayamdipta, Rowan Zellers, Sewoong Oh, Yejin Choi, Zaid Harchaoui

    Abstract: Generative artificial intelligence has made significant strides, producing text indistinguishable from human prose and remarkably photorealistic images. Automatically measuring how close the generated data distribution is to the target distribution is central to diagnosing existing models and developing better ones. We present MAUVE, a family of comparison measures between pairs of distributions s…

    Submitted 7 December, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

    Comments: Published in Journal of Machine Learning Research

  12. arXiv:2212.09246  [pdf, other]

    cs.CL

    I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation

    Authors: Chandra Bhagavatula, Jena D. Hwang, Doug Downey, Ronan Le Bras, Ximing Lu, Lianhui Qin, Keisuke Sakaguchi, Swabha Swayamdipta, Peter West, Yejin Choi

    Abstract: Commonsense capabilities of pre-trained language models dramatically improve with scale, leading many to believe that scale is the only winning recipe. But is it? Here, we investigate an alternative that a priori seems impossible: can smaller language models (e.g., GPT-2) win over models that are orders of magnitude larger and better (e.g., GPT-3), if powered with novel commonsense distillation al…

    Submitted 26 May, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  13. arXiv:2210.12365  [pdf, other]

    cs.CL

    NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation

    Authors: Phillip Howard, Gadi Singer, Vasudev Lal, Yejin Choi, Swabha Swayamdipta

    Abstract: While counterfactual data augmentation offers a promising step towards robust generalization in natural language processing, producing a set of counterfactuals that offer valuable inductive bias for models remains a challenge. Most existing approaches for producing counterfactuals, manual or automated, rely on small perturbations via minimal edits, resulting in simplistic changes. We introduce Neu…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: Findings of EMNLP 2022

  14. arXiv:2210.04982  [pdf, other]

    cs.CL

    REV: Information-Theoretic Evaluation of Free-Text Rationales

    Authors: Hanjie Chen, Faeze Brahman, Xiang Ren, Yangfeng Ji, Yejin Choi, Swabha Swayamdipta

    Abstract: Generating free-text rationales is a promising step towards explainable NLP, yet evaluating such rationales remains a challenge. Existing metrics have mostly focused on measuring the association between the rationale and a given label. We argue that an ideal metric should focus on the new information uniquely provided in the rationale that is otherwise not provided in the input or the label. We in…

    Submitted 2 June, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: ACL 2023

  15. arXiv:2206.11083  [pdf, other]

    cs.CL cs.AI

    Investigating the Benefits of Free-Form Rationales

    Authors: Jiao Sun, Swabha Swayamdipta, Jonathan May, Xuezhe Ma

    Abstract: Free-form rationales aim to aid model interpretability by supplying the background knowledge that can help understand model decisions. Crowdsourced rationales are provided for commonsense QA instances in popular datasets such as CoS-E and ECQA, but their utility remains under-investigated. We present human studies which show that ECQA rationales indeed provide additional background information to…

    Submitted 25 October, 2022; v1 submitted 25 May, 2022; originally announced June 2022.

    Comments: EMNLP 2022, Findings

  16. arXiv:2201.05955  [pdf, other]

    cs.CL

    WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation

    Authors: Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi

    Abstract: A recurring challenge of crowdsourcing NLP datasets at scale is that human writers often rely on repetitive patterns when crafting examples, leading to a lack of linguistic diversity. We introduce a novel approach for dataset creation based on worker and AI collaboration, which brings together the generative strength of language models and the evaluative strength of humans. Starting with an existi…

    Submitted 14 November, 2022; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: EMNLP Findings camera-ready

  17. arXiv:2112.08674  [pdf, other]

    cs.CL

    Reframing Human-AI Collaboration for Generating Free-Text Explanations

    Authors: Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi

    Abstract: Large language models are increasingly capable of generating fluent-appearing text with relatively little task-specific supervision. But can these models accurately explain classification decisions? We consider the task of generating free-text explanations using human-written examples in a few-shot manner. We find that (1) authoring higher quality prompts results in higher quality generations; and…

    Submitted 4 May, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: NAACL 2022 Camera-ready. 13 pages main + references, 14 pages appendix

  18. arXiv:2111.07997  [pdf, other]

    cs.CL cs.HC

    Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection

    Authors: Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith

    Abstract: The perceived toxicity of language can vary based on someone's identity and beliefs, but this variation is often ignored when collecting toxic language datasets, resulting in dataset and model biases. We seek to understand the who, why, and what behind biases in toxicity annotations. In two online studies with demographically and politically diverse participants, we investigate the effect of annot…

    Submitted 9 May, 2022; v1 submitted 15 November, 2021; originally announced November 2021.

    Comments: NAACL 2022 Camera Ready

  19. arXiv:2110.08420  [pdf, other]

    cs.CL cs.AI cs.LG

    Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information

    Authors: Kawin Ethayarajh, Yejin Choi, Swabha Swayamdipta

    Abstract: Estimating the difficulty of a dataset typically involves comparing state-of-the-art models to humans; the bigger the performance gap, the harder the dataset is said to be. However, this comparison provides little understanding of how difficult each instance in a given distribution is, or what attributes make the dataset difficult for a given model. To address these questions, we frame dataset dif…

    Submitted 14 June, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ICML 2022 (long talk)

  20. arXiv:2109.07725  [pdf, other]

    cs.CL

    Sister Help: Data Augmentation for Frame-Semantic Role Labeling

    Authors: Ayush Pancholy, Miriam R. L. Petruck, Swabha Swayamdipta

    Abstract: While FrameNet is widely regarded as a rich resource of semantics in natural language processing, a major criticism concerns its lack of coverage and the relative paucity of its labeled data compared to other commonly used lexical resources such as PropBank and VerbNet. This paper reports on a pilot study to address these gaps. We propose a data augmentation approach, which uses existing frame-spe…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: Accepted to LAW-DMR at EMNLP 2021

  21. arXiv:2105.03023  [pdf, other]

    cs.CL

    DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts

    Authors: Alisa Liu, Maarten Sap, Ximing Lu, Swabha Swayamdipta, Chandra Bhagavatula, Noah A. Smith, Yejin Choi

    Abstract: Despite recent advances in natural language generation, it remains challenging to control attributes of generated text. We propose DExperts: Decoding-time Experts, a decoding-time method for controlled text generation that combines a pretrained language model with "expert" LMs and/or "anti-expert" LMs in a product of experts. Intuitively, under the ensemble, tokens only get high probability if the…

    Submitted 3 June, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

    Comments: ACL 2021 camera-ready

  22. arXiv:2103.01378  [pdf, other]

    cs.CL cs.AI cs.LG

    Contrastive Explanations for Model Interpretability

    Authors: Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg

    Abstract: Contrastive explanations clarify why an event occurred in contrast to another. They are more inherently intuitive to humans to both produce and comprehend. We propose a methodology to produce contrastive explanations for classification models by modifying the representation to disregard non-contrastive information, and modifying model behavior to only be based on contrastive reasoning. Our method…

    Submitted 14 September, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: Accepted to EMNLP 2021 as a long paper

  23. arXiv:2102.01454  [pdf, other]

    cs.CL

    MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers

    Authors: Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, John Thickstun, Sean Welleck, Yejin Choi, Zaid Harchaoui

    Abstract: As major progress is made in open-ended text generation, measuring how close machine-generated text is to human language remains a critical open problem. We introduce MAUVE, a comparison measure for open-ended text generation, which directly compares the learnt distribution from a text generation model to the distribution of human-written text using divergence frontiers. MAUVE scales up to modern…

    Submitted 23 November, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

    Comments: NeurIPS 2021 (Oral Presentation). Package: https://github.com/krishnap25/mauve

  24. arXiv:2102.00086  [pdf, other]

    cs.CL

    Challenges in Automated Debiasing for Toxic Language Detection

    Authors: Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, Noah A. Smith, Yejin Choi

    Abstract: Biased associations have been a challenge in the development of classifiers for detecting toxic language, hindering both fairness and accuracy. As potential solutions, we investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection. Our focus is on lexical (e.g., swear words, slurs, identity mentions) and dialectal markers (s…

    Submitted 29 January, 2021; originally announced February 2021.

    Comments: EACL 2021

  25. arXiv:2009.10795  [pdf, other]

    cs.CL

    Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics

    Authors: Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, Yejin Choi

    Abstract: Large datasets have become commonplace in NLP research. However, the increased emphasis on data quantity has made it challenging to assess the quality of data. We introduce Data Maps---a model-based tool to characterize and diagnose datasets. We leverage a largely ignored source of information: the behavior of the model on individual instances during training (training dynamics) for building data…

    Submitted 15 October, 2020; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: Proceedings of EMNLP 2020

  26. Generative Data Augmentation for Commonsense Reasoning

    Authors: Yiben Yang, Chaitanya Malaviya, Jared Fernandez, Swabha Swayamdipta, Ronan Le Bras, Ji-Ping Wang, Chandra Bhagavatula, Yejin Choi, Doug Downey

    Abstract: Recent advances in commonsense reasoning depend on large-scale human-annotated training data to achieve peak performance. However, manual curation of training examples is expensive and has been shown to introduce annotation artifacts that neural models can readily exploit and overfit on. We investigate G-DAUG^C, a novel generative data augmentation method that aims to achieve more accurate and rob…

    Submitted 16 November, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Findings of the Association for Computational Linguistics: EMNLP 2020

  27. arXiv:2004.10964  [pdf, other]

    cs.CL cs.LG

    Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

    Authors: Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith

    Abstract: Language models pretrained on text from a wide variety of sources form the foundation of today's NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target task. We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, s…

    Submitted 5 May, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

    Comments: ACL 2020

  28. arXiv:2004.07453  [pdf, other]

    cs.CL cs.LG

    The Right Tool for the Job: Matching Model and Instance Complexities

    Authors: Roy Schwartz, Gabriel Stanovsky, Swabha Swayamdipta, Jesse Dodge, Noah A. Smith

    Abstract: As NLP models become larger, executing a trained model requires significant computational resources incurring monetary and environmental costs. To better respect a given inference budget, we propose a modification to contextual representation fine-tuning which, during inference, allows for an early (and fast) "exit" from neural network calculations for simple instances, and late (and accurate) exi…

    Submitted 8 May, 2020; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: ACL 2020; 12 pages; code available in https://github.com/allenai/sledgehammer

  29. arXiv:2002.04108  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Adversarial Filters of Dataset Biases

    Authors: Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew E. Peters, Ashish Sabharwal, Yejin Choi

    Abstract: Large neural models have demonstrated human-level performance on language and vision benchmarks, while their performance degrades considerably on adversarial or out-of-distribution samples. This raises the question of whether these models have learned to solve a dataset rather than the underlying task by overfitting to spurious dataset biases. We investigate one recently proposed approach, AFLite,…

    Submitted 10 July, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: Accepted to ICML 2020

  30. arXiv:1908.11047  [pdf, other]

    cs.CL

    Shallow Syntax in Deep Water

    Authors: Swabha Swayamdipta, Matthew Peters, Brendan Roof, Chris Dyer, Noah A. Smith

    Abstract: Shallow syntax provides an approximation of phrase-syntactic structure of sentences; it can be produced with high accuracy, and is computationally cheap to obtain. We investigate the role of shallow syntax-aware representations for NLP tasks using two techniques. First, we enhance the ELMo architecture to allow pretraining on predicted shallow syntactic parses, instead of just raw text, so that co…

    Submitted 29 August, 2019; originally announced August 2019.

  31. arXiv:1808.10485  [pdf, other]

    cs.CL

    Syntactic Scaffolds for Semantic Structures

    Authors: Swabha Swayamdipta, Sam Thomson, Kenton Lee, Luke Zettlemoyer, Chris Dyer, Noah A. Smith

    Abstract: We introduce the syntactic scaffold, an approach to incorporating syntactic information into semantic tasks. Syntactic scaffolds avoid expensive syntactic processing at runtime, only making use of a treebank during training, through a multitask objective. We improve over strong baselines on PropBank semantics, frame semantics, and coreference resolution, achieving competitive performance on all th…

    Submitted 30 August, 2018; originally announced August 2018.

    Comments: Accepted at EMNLP 2018

  32. arXiv:1805.11598  [pdf, other]

    cs.CL

    Polyglot Semantic Role Labeling

    Authors: Phoebe Mulcaire, Swabha Swayamdipta, Noah Smith

    Abstract: Previous approaches to multilingual semantic dependency parsing treat languages independently, without exploiting the similarities between semantic structures across languages. We experiment with a new approach where we combine resources from a pair of languages in the CoNLL 2009 shared task to build a polyglot semantic role labeler. Notwithstanding the absence of parallel data, and the dissimilar…

    Submitted 29 May, 2018; originally announced May 2018.

    Comments: To appear at ACL 2018

  33. arXiv:1804.05990  [pdf, other]

    cs.CL

    Learning Joint Semantic Parsers from Disjoint Data

    Authors: Hao Peng, Sam Thomson, Swabha Swayamdipta, Noah A. Smith

    Abstract: We present a new approach to learning semantic parsers from multiple datasets, even when the target semantic formalisms are drastically different, and the underlying corpora do not overlap. We handle such "disjoint" data by treating annotations for unobserved formalisms as latent structured variables. Building on state-of-the-art baselines, we show improvements both in frame-semantic parsing and s…

    Submitted 16 April, 2018; originally announced April 2018.

    Comments: NAACL 2018

  34. arXiv:1803.02324  [pdf, other]

    cs.CL cs.AI

    Annotation Artifacts in Natural Language Inference Data

    Authors: Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, Noah A. Smith

    Abstract: Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to. We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hy…

    Submitted 16 April, 2018; v1 submitted 6 March, 2018; originally announced March 2018.

    Comments: 6 pages, 1 figure, NAACL 2018

  35. arXiv:1711.00894  [pdf, other]

    cs.CL

    Multi-Mention Learning for Reading Comprehension with Neural Cascades

    Authors: Swabha Swayamdipta, Ankur P. Parikh, Tom Kwiatkowski

    Abstract: Reading comprehension is a challenging task, especially when executed across longer or across multiple evidence documents, where the answer is likely to reoccur. Existing neural architectures typically do not scale to the entire evidence, and hence, resort to selecting a single passage in the document (either via truncation or other means), and carefully searching for the answer within that passag…

    Submitted 30 May, 2018; v1 submitted 2 November, 2017; originally announced November 2017.

    Comments: Proceedings of ICLR 2018

  36. arXiv:1706.09528  [pdf, other]

    cs.CL

    Frame-Semantic Parsing with Softmax-Margin Segmental RNNs and a Syntactic Scaffold

    Authors: Swabha Swayamdipta, Sam Thomson, Chris Dyer, Noah A. Smith

    Abstract: We present a new, efficient frame-semantic parser that labels semantic arguments to FrameNet predicates. Built using an extension to the segmental RNN that emphasizes recall, our basic system achieves competitive performance without any calls to a syntactic parser. We then introduce a method that uses phrase-syntactic annotations from the Penn Treebank during training only, through a multitask obj…

    Submitted 28 June, 2017; originally announced June 2017.

  37. arXiv:1701.03980  [pdf, other]

    stat.ML cs.CL cs.MS

    DyNet: The Dynamic Neural Network Toolkit

    Authors: Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

    Abstract: We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure. In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its deriva…

    Submitted 14 January, 2017; originally announced January 2017.

    Comments: 33 pages

  38. arXiv:1606.08954  [pdf, other]

    cs.CL cs.AI

    Greedy, Joint Syntactic-Semantic Parsing with Stack LSTMs

    Authors: Swabha Swayamdipta, Miguel Ballesteros, Chris Dyer, Noah A. Smith

    Abstract: We present a transition-based parser that jointly produces syntactic and semantic dependencies. It learns a representation of the entire algorithm state, using stack long short-term memories. Our greedy inference algorithm has linear time, including feature extraction. On the CoNLL 2008--9 English shared tasks, we obtain the best published parsing performance among models that jointly learn syntax…

    Submitted 4 July, 2018; v1 submitted 29 June, 2016; originally announced June 2016.

    Comments: Proceedings of CoNLL 2016; 13 pages, 5 figures