Showing 1–18 of 18 results for author: Rajani, N F

  1. arXiv:2110.07166  [pdf, other]

    cs.CL

    CaPE: Contrastive Parameter Ensembling for Reducing Hallucination in Abstractive Summarization

    Authors: Prafulla Kumar Choubey, Alexander R. Fabbri, Jesse Vig, Chien-Sheng Wu, Wenhao Liu, Nazneen Fatema Rajani

    Abstract: Hallucination is a known issue for neural abstractive summarization models. Recent work suggests that the degree of hallucination may depend on errors in the training data. In this work, we propose a new method called Contrastive Parameter Ensembling (CaPE) to use training data more effectively, utilizing variations in noise in training samples to reduce hallucination. We first select clean and no…

    Submitted 20 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

  2. arXiv:2110.04400  [pdf, other]

    cs.CL

    HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models

    Authors: Tanya Goyal, Nazneen Fatema Rajani, Wenhao Liu, Wojciech Kryściński

    Abstract: Summarization systems make numerous "decisions" about summary properties during inference, e.g. degree of copying, specificity and length of outputs, etc. However, these are implicitly encoded within model parameters and specific styles cannot be enforced. To address this, we introduce HydraSum, a new summarization architecture that extends the single decoder framework of current models to a mixtu…

    Submitted 21 October, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: EMNLP2022

  3. arXiv:2104.07605  [pdf, other]

    cs.CL

    SummVis: Interactive Visual Analysis of Models, Data, and Evaluation for Text Summarization

    Authors: Jesse Vig, Wojciech Kryściński, Karan Goel, Nazneen Fatema Rajani

    Abstract: Novel neural architectures, training strategies, and the availability of large-scale corpora have been the driving force behind recent progress in abstractive text summarization. However, due to the black-box nature of neural models, uninformative evaluation metrics, and scarce tooling for model and data analysis, the true performance and failure modes of summarization models remain largely unkno…

    Submitted 26 July, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: Accepted to ACL 2021 System Demonstrations

  4. arXiv:2012.15781  [pdf, other]

    cs.LG cs.AI cs.CL

    FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging

    Authors: Han Guo, Nazneen Fatema Rajani, Peter Hase, Mohit Bansal, Caiming Xiong

    Abstract: Influence functions approximate the "influences" of training data-points for test predictions and have a wide variety of applications. Despite the popularity, their computational cost does not scale well with model and training data size. We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time. We use k-Nearest Neighbors (kNN) to narrow th…

    Submitted 9 September, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: 18 pages

  5. arXiv:2012.00195  [pdf, other]

    cs.LG q-bio.BM

    Profile Prediction: An Alignment-Based Pre-Training Task for Protein Sequence Models

    Authors: Pascal Sturmfels, Jesse Vig, Ali Madani, Nazneen Fatema Rajani

    Abstract: For protein sequence datasets, unlabeled data has greatly outpaced labeled data due to the high cost of wet-lab characterization. Recent deep-learning approaches to protein prediction have shown that pre-training on unlabeled data can yield useful representations for downstream tasks. However, the optimal pre-training strategy remains an open question. Instead of strictly borrowing from natural la…

    Submitted 30 November, 2020; originally announced December 2020.

  6. arXiv:2010.09030  [pdf, other]

    cs.CL cs.LG

    Explaining and Improving Model Behavior with k Nearest Neighbor Representations

    Authors: Nazneen Fatema Rajani, Ben Krause, Wenpeng Yin, Tong Niu, Richard Socher, Caiming Xiong

    Abstract: Interpretability techniques in NLP have mainly focused on understanding individual predictions using attention visualization or gradient-based saliency maps over tokens. We propose using k nearest neighbor (kNN) representations to identify training examples responsible for a model's predictions and obtain a corpus-level understanding of the model's behavior. Apart from interpretability, we show th…

    Submitted 18 October, 2020; originally announced October 2020.

  7. arXiv:2010.07126  [pdf]

    cs.AI

    Explaining Creative Artifacts

    Authors: Lav R. Varshney, Nazneen Fatema Rajani, Richard Socher

    Abstract: Human creativity is often described as the mental process of combining associative elements into a new form, but emerging computational creativity algorithms may not operate in this manner. Here we develop an inverse problem formulation to deconstruct the products of combinatorial and compositional creativity into associative chains as a form of post-hoc interpretation that matches the human creat…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 2020 Workshop on Human Interpretability in Machine Learning (WHI), at ICML 2020

  8. arXiv:2010.06119  [pdf, other]

    cs.CL cs.AI

    ReviewRobot: Explainable Paper Review Generation based on Knowledge Synthesis

    Authors: Qingyun Wang, Qi Zeng, Lifu Huang, Kevin Knight, Heng Ji, Nazneen Fatema Rajani

    Abstract: To assist human review process, we build a novel ReviewRobot to automatically assign a review score and write comments for multiple categories such as novelty and meaningful comparison. A good review needs to be knowledgeable, namely that the comments should be constructive and informative to help improve the paper; and explainable by providing detailed evidence. ReviewRobot achieves these goals v…

    Submitted 3 December, 2020; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: 14 pages. Accepted by The 14th International Conference on Natural Language Generation (INLG 2020). Code and resources are available at https://github.com/EagleW/ReviewRobot

  9. arXiv:2010.02584  [pdf, other]

    cs.CL

    Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start

    Authors: Wenpeng Yin, Nazneen Fatema Rajani, Dragomir Radev, Richard Socher, Caiming Xiong

    Abstract: A standard way to address different NLP problems is by first constructing a problem-specific dataset, then building a model to fit this dataset. To build the ultimate artificial intelligence, we desire a single machine that can handle diverse new problems, for which task-specific annotations are limited. We bring up textual entailment as a unified solver for such NLP problems. However, current res…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: EMNLP2020 Long, camera-ready

  10. arXiv:2009.06367  [pdf, other]

    cs.CL cs.LG

    GeDi: Generative Discriminator Guided Sequence Generation

    Authors: Ben Krause, Akhilesh Deepak Gotmare, Bryan McCann, Nitish Shirish Keskar, Shafiq Joty, Richard Socher, Nazneen Fatema Rajani

    Abstract: While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate. This is especially problematic because datasets used for training large LMs usually contain significant toxicity, hate, bias, and negativity. We propose GeDi as an efficient method for us…

    Submitted 22 October, 2020; v1 submitted 14 September, 2020; originally announced September 2020.

  11. arXiv:2007.02871  [pdf, other]

    cs.CL

    DART: Open-Domain Structured Data Record to Text Generation

    Authors: Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, Yangxiaokang Liu, Nadia Irwanto, Jessica Pan, Faiaz Rahman, Ahmad Zaidi, Mutethia Mutuma, Yasin Tarabar, Ankit Gupta, Tao Yu, Yi Chern Tan, Xi Victoria Lin, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani

    Abstract: We present DART, an open domain structured DAta Record to Text generation dataset with over 82k instances (DARTs). Data-to-Text annotations can be a costly process, especially when dealing with tables which are the major source of structured data and contain nontrivial structures. To this end, we propose a procedure of extracting semantic triples from tables that encodes their structures by exploi…

    Submitted 12 April, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: NAACL 2021

  12. arXiv:2006.15222  [pdf, other]

    cs.CL cs.LG q-bio.BM

    BERTology Meets Biology: Interpreting Attention in Protein Language Models

    Authors: Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani

    Abstract: Transformer architectures have proven to learn useful representations for protein classification and generation tasks. However, these representations present challenges in interpretability. In this work, we demonstrate a set of methods for analyzing protein Transformer models through the lens of attention. We show that attention: (1) captures the folding structure of proteins, connecting amino aci…

    Submitted 28 March, 2021; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: To appear in ICLR 2021

    ACM Class: I.2

  13. arXiv:2005.00965  [pdf, other]

    cs.CL cs.LG

    Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation

    Authors: Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong

    Abstract: Word embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by downstream models. Some commonly adopted debiasing approaches, including the seminal Hard Debias algorithm, apply post-processing procedures that project pre-trained word embeddings into a subspace orthogonal to an inferred gender subspace. We discover that semantic-agnostic corpus reg…

    Submitted 2 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL 2020

  14. arXiv:2005.00730  [pdf, other]

    cs.CL cs.LG

    ESPRIT: Explaining Solutions to Physical Reasoning Tasks

    Authors: Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir Radev

    Abstract: Neural networks lack the ability to reason about qualitative physics and so cannot generalize to scenarios and tasks unseen during training. We propose ESPRIT, a framework for commonsense reasoning about qualitative physics in natural language that generates interpretable descriptions of physical events. We use a two-step approach of first identifying the pivotal physical events in an environment…

    Submitted 13 May, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  15. arXiv:1911.03429  [pdf, other]

    cs.CL cs.AI cs.LG

    ERASER: A Benchmark to Evaluate Rationalized NLP Models

    Authors: Jay DeYoung, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, Byron C. Wallace

    Abstract: State-of-the-art models in NLP are now predominantly based on deep neural networks that are opaque in terms of how they come to make predictions. This limitation has increased interest in designing more interpretable deep models for NLP that reveal the `reasoning' behind model outputs. But work in this direction has been conducted on different datasets and tasks with correspondingly unique aims an…

    Submitted 24 April, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Accepted as a long paper to ACL 2020. Website and leaderboard available at http://www.eraserbenchmark.com/. Code available at https://github.com/jayded/eraserbenchmark

  16. arXiv:1906.02361  [pdf, other]

    cs.CL

    Explain Yourself! Leveraging Language Models for Commonsense Reasoning

    Authors: Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong, Richard Socher

    Abstract: Deep learning models perform poorly on tasks that require commonsense reasoning, which often necessitates some form of world-knowledge or reasoning over information not immediately present in the input. We collect human explanations for commonsense reasoning in the form of natural language sequences and highlighted annotations in a new dataset called Common Sense Explanations (CoS-E). We use CoS-E…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: Accepted at ACL, 11 pages total

    Journal ref: In Proceedings of the Association for Computational Linguistics (ACL), 2019. Florence, Italy

  17. arXiv:1605.08764  [pdf, other]

    cs.CL cs.CV cs.LG

    Stacking With Auxiliary Features

    Authors: Nazneen Fatema Rajani, Raymond J. Mooney

    Abstract: Ensembling methods are well known for improving prediction accuracy. However, they are limited in the sense that they cannot discriminate among component models effectively. In this paper, we propose stacking with auxiliary features that learns to fuse relevant information from multiple systems to improve performance. Auxiliary features enable the stacker to rely on systems that not just agree on…

    Submitted 27 May, 2016; originally announced May 2016.

    Comments: arXiv admin note: substantial text overlap with arXiv:1604.04802

  18. arXiv:1604.04802  [pdf, other]

    cs.CL cs.LG

    Supervised and Unsupervised Ensembling for Knowledge Base Population

    Authors: Nazneen Fatema Rajani, Raymond J. Mooney

    Abstract: We present results on combining supervised and unsupervised methods to ensemble multiple systems for two popular Knowledge Base Population (KBP) tasks, Cold Start Slot Filling (CSSF) and Tri-lingual Entity Discovery and Linking (TEDL). We demonstrate that our combined system along with auxiliary features outperforms the best performing system for both tasks in the 2015 competition, several ensembl…

    Submitted 16 April, 2016; originally announced April 2016.