
Showing 1–31 of 31 results for author: Moosavi, N S

  1. arXiv:2407.00894  [pdf, other]

    cs.CL

    How to Leverage Digit Embeddings to Represent Numbers?

    Authors: Jasivan Alex Sivakumar, Nafise Sadat Moosavi

    Abstract: Apart from performing arithmetic operations, understanding numbers themselves is still a challenge for existing language models. Simple generalisations, such as solving 100+200 instead of 1+2, can substantially affect model performance (Sivakumar and Moosavi, 2023). Among various techniques, character-level embeddings of numbers have emerged as a promising approach to improve number representation…

    Submitted 30 June, 2024; originally announced July 2024.

  2. arXiv:2402.13818  [pdf, other]

    cs.CL

    Beyond Hate Speech: NLP's Challenges and Opportunities in Uncovering Dehumanizing Language

    Authors: Hezhao Zhang, Lasana Harris, Nafise Sadat Moosavi

    Abstract: Dehumanization, characterized as a subtle yet harmful manifestation of hate speech, involves denying individuals of their human qualities and often results in violence against marginalized groups. Despite significant progress in Natural Language Processing across various domains, its application in detecting dehumanizing language is limited, largely due to the scarcity of publicly available annota…

    Submitted 21 February, 2024; originally announced February 2024.

  3. arXiv:2402.11621  [pdf, other]

    cs.CL

    Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection

    Authors: Valeria Pastorino, Jasivan A. Sivakumar, Nafise Sadat Moosavi

    Abstract: Previous studies on framing have relied on manual analysis or fine-tuning models with limited annotated datasets. However, pre-trained models, with their diverse training backgrounds, offer a promising alternative. This paper presents a comprehensive analysis of GPT-4, GPT-3.5 Turbo, and FLAN-T5 models in detecting framing in news headlines. We evaluated these models in various scenarios: zero-sho…

    Submitted 15 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  4. arXiv:2311.09766  [pdf, other]

    cs.CL

    LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores

    Authors: Yiqi Liu, Nafise Sadat Moosavi, Chenghua Lin

    Abstract: Automatic evaluation of generated textual content presents an ongoing challenge within the field of NLP. Given the impressive capabilities of modern language models (LMs) across diverse NLP tasks, there is a growing trend to employ these models in creating innovative evaluation metrics for automated assessment of generation tasks. This paper investigates a pivotal question: Do language model-drive…

    Submitted 7 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  5. arXiv:2310.15758  [pdf, other]

    cs.CL

    Learning From Free-Text Human Feedback -- Collect New Datasets Or Extend Existing Ones?

    Authors: Dominic Petrak, Nafise Sadat Moosavi, Ye Tian, Nikolai Rozanov, Iryna Gurevych

    Abstract: Learning from free-text human feedback is essential for dialog systems, but annotated data is scarce and usually covers only a small fraction of error types known in conversational AI. Instead of collecting and annotating new datasets from scratch, recent advances in synthetic dialog generation could be used to augment existing dialog datasets with the necessary annotations. However, to assess the…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted to be presented at EMNLP 2023

  6. arXiv:2305.17491  [pdf, other]

    cs.CL

    FERMAT: An Alternative to Accuracy for Numerical Reasoning

    Authors: Jasivan Alex Sivakumar, Nafise Sadat Moosavi

    Abstract: While pre-trained language models achieve impressive performance on various NLP benchmarks, they still struggle with tasks that require numerical reasoning. Recent advances in improving numerical reasoning are mostly achieved using very large language models that contain billions of parameters and are not accessible to everyone. In addition, numerical reasoning is measured using a single score on…

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  7. arXiv:2304.12836  [pdf, other]

    cs.CL

    Lessons Learned from a Citizen Science Project for Natural Language Processing

    Authors: Jan-Christoph Klie, Ji-Ung Lee, Kevin Stowe, Gözde Gül Şahin, Nafise Sadat Moosavi, Luke Bates, Dominic Petrak, Richard Eckart de Castilho, Iryna Gurevych

    Abstract: Many Natural Language Processing (NLP) systems use annotated corpora for training and evaluation. However, labeled data is often costly to obtain and scaling annotation projects is difficult, which is why annotation tasks are often outsourced to paid crowdworkers. Citizen Science is an alternative to crowdsourcing that is relatively unexplored in the context of NLP. To investigate whether and how…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted to EACL 2023. Code will be published on github: https://github.com/UKPLab/eacl2023-citizen-science-lessons-learned

  8. arXiv:2209.02317  [pdf, other]

    cs.CL

    Layer or Representation Space: What makes BERT-based Evaluation Metrics Robust?

    Authors: Doan Nam Long Vu, Nafise Sadat Moosavi, Steffen Eger

    Abstract: The evaluation of recent embedding-based evaluation metrics for text generation is primarily based on measuring their correlation with human evaluations on standard benchmarks. However, these benchmarks are mostly from similar domains to those used for pretraining word embeddings. This raises concerns about the (lack of) generalization of embedding-based metrics to new and noisy domains that conta…

    Submitted 7 September, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: COLING 2022 camera-ready version

  9. arXiv:2208.14111  [pdf, other]

    cs.CL

    Transformers with Learnable Activation Functions

    Authors: Haishuo Fang, Ji-Ung Lee, Nafise Sadat Moosavi, Iryna Gurevych

    Abstract: Activation functions can have a significant impact on reducing the topological complexity of input data and therefore improve the performance of the model. Selecting a suitable activation function is an essential step in neural model design. However, the choice of activation function is seldom discussed or explored in Transformer-based language models. Their activation functions are chosen beforeh…

    Submitted 14 February, 2023; v1 submitted 30 August, 2022; originally announced August 2022.

    Comments: Accepted by EACL2023 findings

  10. arXiv:2205.12323  [pdf, ps, other]

    cs.CL

    Scoring Coreference Chains with Split-Antecedent Anaphors

    Authors: Silviu Paun, Juntao Yu, Nafise Sadat Moosavi, Massimo Poesio

    Abstract: Anaphoric reference is an aspect of language interpretation covering a variety of types of interpretation beyond the simple case of identity reference to entities introduced via nominal expressions covered by the traditional coreference task in its most recent incarnation in ONTONOTES and similar datasets. One of these cases that go beyond simple coreference is anaphoric reference to entities that…

    Submitted 24 May, 2022; originally announced May 2022.

  11. arXiv:2205.06733  [pdf, other]

    cs.CL

    Arithmetic-Based Pretraining -- Improving Numeracy of Pretrained Language Models

    Authors: Dominic Petrak, Nafise Sadat Moosavi, Iryna Gurevych

    Abstract: State-of-the-art pretrained language models tend to perform below their capabilities when applied out-of-the-box on tasks that require understanding and working with numbers. Recent work suggests two main reasons for this: (1) popular tokenisation algorithms have limited expressiveness for numbers, and (2) common pretraining objectives do not target numeracy. Approaches that address these shortcom…

    Submitted 9 June, 2023; v1 submitted 13 May, 2022; originally announced May 2022.

    Comments: Published at StarSEM2023

  12. arXiv:2205.06009  [pdf, other]

    cs.CL

    Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization

    Authors: Prasetya Ajie Utama, Joshua Bambrick, Nafise Sadat Moosavi, Iryna Gurevych

    Abstract: Neural abstractive summarization models are prone to generate summaries which are factually inconsistent with their source documents. Previous work has introduced the task of recognizing such factual inconsistency as a downstream application of natural language inference (NLI). However, state-of-the-art NLI models perform poorly in this context due to their inability to generalize to the target ta…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: Accepted to appear at NAACL 2022

  13. arXiv:2205.01549  [pdf, other]

    cs.CL cs.AI

    Adaptable Adapters

    Authors: Nafise Sadat Moosavi, Quentin Delfosse, Kristian Kersting, Iryna Gurevych

    Abstract: State-of-the-art pretrained NLP models contain a hundred million to trillion parameters. Adapters provide a parameter-efficient alternative for the full finetuning in which we can only finetune lightweight neural network layers on top of pretrained weights. Adapter layers are initialized randomly. However, existing work uses the same adapter architecture -- i.e., the same adapter layer on top of e…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL-2022 main conference

  14. arXiv:2112.02721  [pdf, other]

    cs.CL cs.AI cs.LG

    NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

    Authors: Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, et al. (101 additional authors not shown)

    Abstract: Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data split…

    Submitted 11 October, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: 39 pages, repository at https://github.com/GEM-benchmark/NL-Augmenter

  15. arXiv:2109.04144  [pdf, other]

    cs.CL cs.AI

    Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning

    Authors: Prasetya Ajie Utama, Nafise Sadat Moosavi, Victor Sanh, Iryna Gurevych

    Abstract: Recent prompt-based approaches allow pretrained language models to achieve strong performances on few-shot finetuning by reformulating downstream tasks as a language modeling problem. In this work, we demonstrate that, despite its advantages on low data regimes, finetuned prompt-based models for sentence pair classification tasks still suffer from a common pitfall of adopting inference heuristics…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021

  16. arXiv:2105.15115  [pdf]

    cs.CY

    Adoption of Precision Medicine; Limitations and Considerations

    Authors: Nasim Sadat Mosavi, Manuel Filipe Santos

    Abstract: Research is ongoing all over the world for identifying the barriers and finding effective solutions to accelerate the projection of Precision Medicine (PM) in the healthcare industry. Yet there has not been a valid and practical model to tackle the several challenges that have slowed down the widespread of this clinical practice. This study aimed to highlight the major limitations and consideratio…

    Submitted 22 May, 2021; originally announced May 2021.

    Comments: 10 pages

  17. arXiv:2104.08296  [pdf, other]

    cs.CL

    Learning to Reason for Text Generation from Scientific Tables

    Authors: Nafise Sadat Moosavi, Andreas Rücklé, Dan Roth, Iryna Gurevych

    Abstract: In this paper, we introduce SciGen, a new challenge dataset for the task of reasoning-aware data-to-text generation consisting of tables from scientific articles and their corresponding descriptions. Describing scientific tables goes beyond the surface realization of the table content and requires reasoning over table values. The unique properties of SciGen are that (1) tables mostly contain numer…

    Submitted 16 April, 2021; originally announced April 2021.

  18. arXiv:2104.05320  [pdf, other]

    cs.CL

    Stay Together: A System for Single and Split-antecedent Anaphora Resolution

    Authors: Juntao Yu, Nafise Sadat Moosavi, Silviu Paun, Massimo Poesio

    Abstract: The state-of-the-art on basic, single-antecedent anaphora has greatly improved in recent years. Researchers have therefore started to pay more attention to more complex cases of anaphora such as split-antecedent anaphora, as in Time-Warner is considering a legal challenge to Telecommunications Inc's plan to buy half of Showtime Networks Inc -- a move that could lead to all-out war between the two pow…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: accepted at NAACL 2021

  19. arXiv:2012.15573  [pdf, other]

    cs.CL

    Coreference Reasoning in Machine Reading Comprehension

    Authors: Mingzhu Wu, Nafise Sadat Moosavi, Dan Roth, Iryna Gurevych

    Abstract: Coreference resolution is essential for natural language understanding and has been long studied in NLP. In recent years, as the format of Question Answering (QA) became a standard for machine reading comprehension (MRC), there have been data collection efforts, e.g., Dasigi et al. (2019), that attempt to evaluate the ability of MRC models to reason about coreference. However, as we show, corefere…

    Submitted 9 June, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: Accepted at ACL-IJCNLP 2021

  20. arXiv:2011.00245  [pdf, ps, other]

    cs.CL

    Free the Plural: Unrestricted Split-Antecedent Anaphora Resolution

    Authors: Juntao Yu, Nafise Sadat Moosavi, Silviu Paun, Massimo Poesio

    Abstract: Now that the performance of coreference resolvers on the simpler forms of anaphoric reference has greatly improved, more attention is devoted to more complex aspects of anaphora. One limitation of virtually all coreference resolution models is the focus on single-antecedent anaphors. Plural anaphors with multiple antecedents -- so-called split-antecedent anaphors (as in John met Mary. They went to th…

    Submitted 31 October, 2020; originally announced November 2020.

    Comments: accepted at COLING 2020

  21. arXiv:2010.12510  [pdf, other]

    cs.CL

    Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures

    Authors: Nafise Sadat Moosavi, Marcel de Boer, Prasetya Ajie Utama, Iryna Gurevych

    Abstract: Existing NLP datasets contain various biases, and models tend to quickly learn those biases, which in turn limits their robustness. Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective so that models learn less from biased examples. Besides, they mostly focus on addressing a specific bias, and while they improve the performance on adversa…

    Submitted 23 October, 2020; originally announced October 2020.

  22. arXiv:2010.03338  [pdf, other]

    cs.CL

    Improving QA Generalization by Concurrent Modeling of Multiple Biases

    Authors: Mingzhu Wu, Nafise Sadat Moosavi, Andreas Rücklé, Iryna Gurevych

    Abstract: Existing NLP datasets contain various biases that models can easily exploit to achieve high performances on the corresponding evaluation sets. However, focusing on dataset-specific biases limits their ability to learn more generalizable knowledge about the task from more general data patterns. In this paper, we investigate the impact of debiasing methods for improving generalization and propose a…

    Submitted 7 October, 2020; originally announced October 2020.

  23. arXiv:2009.12303  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards Debiasing NLU Models from Unknown Biases

    Authors: Prasetya Ajie Utama, Nafise Sadat Moosavi, Iryna Gurevych

    Abstract: NLU models often exploit biases to achieve high dataset-specific performance without properly learning the intended task. Recently proposed debiasing methods are shown to be effective in mitigating this tendency. However, these methods rely on a major assumption that the types of bias should be known a-priori, which limits their application to many NLU tasks and datasets. In this work, we present…

    Submitted 13 October, 2020; v1 submitted 25 September, 2020; originally announced September 2020.

    Comments: Accepted at EMNLP 2020

  24. arXiv:2005.00315  [pdf, other]

    cs.CL

    Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance

    Authors: Prasetya Ajie Utama, Nafise Sadat Moosavi, Iryna Gurevych

    Abstract: Models for natural language understanding (NLU) tasks often rely on the idiosyncratic biases of the dataset, which make them brittle against test cases outside the training distribution. Recently, several proposed debiasing methods are shown to be very effective in improving out-of-distribution performance. However, their improvements come at the expense of performance drop when models are evaluat…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: to appear at ACL 2020

  25. arXiv:1911.05594  [pdf, other]

    cs.CL cs.LG

    Neural Duplicate Question Detection without Labeled Training Data

    Authors: Andreas Rücklé, Nafise Sadat Moosavi, Iryna Gurevych

    Abstract: Supervised training of neural models to duplicate question detection in community Question Answering (cQA) requires large amounts of labeled question pairs, which are costly to obtain. To minimize this cost, recent works thus often used alternative methods, e.g., adversarial domain adaptation. In this work, we propose two novel methods: (1) the automatic generation of duplicate questions, and (2)…

    Submitted 19 September, 2020; v1 submitted 13 November, 2019; originally announced November 2019.

    Comments: Accepted as long paper at EMNLP-2019

  26. arXiv:1909.08940  [pdf, ps, other]

    cs.CL

    Improving Generalization by Incorporating Coverage in Natural Language Inference

    Authors: Nafise Sadat Moosavi, Prasetya Ajie Utama, Andreas Rücklé, Iryna Gurevych

    Abstract: The task of natural language inference (NLI) is to identify the relation between the given premise and hypothesis. While recent NLI models achieve very high performance on individual datasets, they fail to generalize across similar datasets. This indicates that they are solving NLI datasets instead of the task itself. In order to improve generalization, we propose to extend the input representatio…

    Submitted 19 September, 2019; originally announced September 2019.

  27. arXiv:1906.06703  [pdf, other]

    cs.CL

    Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection

    Authors: Nafise Sadat Moosavi, Leo Born, Massimo Poesio, Michael Strube

    Abstract: The common practice in coreference resolution is to identify and evaluate the maximum span of mentions. The use of maximum spans tangles coreference evaluation with the challenges of mention boundary detection like prepositional phrase attachment. To address this problem, minimum spans are manually annotated in smaller corpora. However, this additional annotation is costly and therefore, this solu…

    Submitted 16 June, 2019; originally announced June 2019.

    Comments: ACL 2019

  28. arXiv:1708.00160  [pdf, ps, other]

    cs.CL

    Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers

    Authors: Nafise Sadat Moosavi, Michael Strube

    Abstract: Coreference resolution is an intermediate step for text understanding. It is used in tasks and domains for which we do not necessarily have coreference annotated corpora. Therefore, generalization is of special importance for coreference resolution. However, while recent coreference resolvers have notable improvements on the CoNLL dataset, they struggle to generalize properly to new domains or dat…

    Submitted 12 October, 2018; v1 submitted 1 August, 2017; originally announced August 2017.

    Comments: EMNLP 2018 long paper

  29. arXiv:1707.06456  [pdf, other]

    cs.CL

    Revisiting Selectional Preferences for Coreference Resolution

    Authors: Benjamin Heinzerling, Nafise Sadat Moosavi, Michael Strube

    Abstract: Selectional preferences have long been claimed to be essential for coreference resolution. However, they are mainly modeled only implicitly by current coreference resolvers. We propose a dependency-based embedding model of selectional preferences which allows fine-grained compatibility judgments with high coverage. We show that the incorporation of our model improves coreference resolution perform…

    Submitted 20 July, 2017; originally announced July 2017.

    Comments: EMNLP 2017 - short paper

  30. arXiv:1704.06779  [pdf, other]

    cs.CL

    Lexical Features in Coreference Resolution: To be Used With Caution

    Authors: Nafise Sadat Moosavi, Michael Strube

    Abstract: Lexical features are a major source of information in state-of-the-art coreference resolvers. Lexical features implicitly model some of the linguistic phenomena at a fine granularity level. They are especially useful for representing the context of mentions. In this paper we investigate a drawback of using many lexical features in state-of-the-art coreference resolvers. We show that if coreference…

    Submitted 22 April, 2017; originally announced April 2017.

    Comments: 6 pages, ACL 2017

  31. arXiv:1702.07507  [pdf, ps, other]

    cs.CL

    Use Generalized Representations, But Do Not Forget Surface Features

    Authors: Nafise Sadat Moosavi, Michael Strube

    Abstract: Only a year ago, all state-of-the-art coreference resolvers were using an extensive amount of surface features. Recently, there was a paradigm shift towards using word embeddings and deep neural networks, where the use of surface features is very limited. In this paper, we show that a simple SVM model with surface features outperforms more complex neural models for detecting anaphoric mentions. Ou…

    Submitted 24 February, 2017; originally announced February 2017.

    Comments: CORBON workshop@EACL 2017