
Showing 1–11 of 11 results for author: Dubossarsky, H

  1. arXiv:2404.18570  [pdf, other]

    cs.CL

    Analyzing Semantic Change through Lexical Replacements

    Authors: Francesco Periti, Pierluigi Cassotti, Haim Dubossarsky, Nina Tahmasebi

    Abstract: Modern language models are capable of contextualizing words based on their surrounding context. However, this capability is often compromised due to semantic change that leads to words being used in new, unexpected contexts not encountered during pre-training. In this paper, we model semantic change by studying the effect of unexpected contexts introduced by lexical replacements.…

    Submitted 29 April, 2024; originally announced April 2024.

  2. arXiv:2401.14040  [pdf, other]

    cs.CL

    (Chat)GPT v BERT: Dawn of Justice for Semantic Change Detection

    Authors: Francesco Periti, Haim Dubossarsky, Nina Tahmasebi

    Abstract: In the universe of Natural Language Processing, Transformer-based language models like BERT and (Chat)GPT have emerged as lexical superheroes with great power to solve open research problems. In this paper, we specifically focus on the temporal problem of semantic change, and evaluate their ability to solve two diachronic extensions of the Word-in-Context (WiC) task: TempoWiC and HistoWiC. In part…

    Submitted 29 April, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted to the Findings of EACL 2024 (https://aclanthology.org/2024.findings-eacl.29.pdf)

  3. arXiv:2305.13214  [pdf, other]

    cs.CL

    Logical Reasoning for Natural Language Inference Using Generated Facts as Atoms

    Authors: Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Oana-Maria Camburu, Marek Rei

    Abstract: State-of-the-art neural models can now reach human performance levels across various natural language understanding tasks. However, despite this impressive performance, models are known to learn from annotation artefacts at the expense of the underlying task. While interpretability methods can identify influential features for each prediction, there are no guarantees that these features are respon…

    Submitted 22 May, 2023; originally announced May 2023.

    ACM Class: I.2.7

  4. arXiv:2304.06337  [pdf, ps, other]

    cs.CL

    Computational modeling of semantic change

    Authors: Nina Tahmasebi, Haim Dubossarsky

    Abstract: In this chapter we provide an overview of computational modeling for semantic change using large and semi-large textual corpora. We aim to provide a key for the interpretation of relevant methods and evaluation techniques, and also provide insights into important aspects of the computational study of semantic change. We discuss the pros and cons of different classes of models with respect to the p…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: This chapter is submitted to Routledge Handbook of Historical Linguistics, 2nd Edition

  5. arXiv:2205.11432  [pdf, other]

    cs.CL cs.LG

    Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models

    Authors: Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Marek Rei

    Abstract: Current Natural Language Inference (NLI) models achieve impressive results, sometimes outperforming humans when evaluating on in-distribution test sets. However, as these models are known to learn from annotation artefacts and dataset biases, it is unclear to what extent the models are learning the task of NLI instead of learning from shallow heuristics in their training data. We address this issu…

    Submitted 21 October, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: Accepted at EMNLP 2022

  6. arXiv:2104.08540  [pdf, other]

    cs.CL

    DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages

    Authors: Dominik Schlechtweg, Nina Tahmasebi, Simon Hengchen, Haim Dubossarsky, Barbara McGillivray

    Abstract: Word meaning is notoriously difficult to capture, both synchronically and diachronically. In this paper, we describe the creation of the largest resource of graded contextualized, diachronic word meaning annotation in four different languages, based on 100,000 human semantic proximity judgments. We thoroughly describe the multi-round incremental annotation process, the choice for a clustering algo…

    Submitted 8 July, 2024; v1 submitted 17 April, 2021; originally announced April 2021.

    Comments: Dominik Schlechtweg, Nina Tahmasebi, Simon Hengchen, Haim Dubossarsky, and Barbara McGillivray. 2021. DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7079–7091, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.

  7. arXiv:2101.07668  [pdf, other]

    cs.CL

    Challenges for Computational Lexical Semantic Change

    Authors: Simon Hengchen, Nina Tahmasebi, Dominik Schlechtweg, Haim Dubossarsky

    Abstract: The computational study of lexical semantic change (LSC) has taken off in the past few years and we are seeing increasing interest in the field, from both computational sciences and linguistics. Most of the research so far has focused on methods for modelling and detecting semantic change using large diachronic textual data, with the majority of the approaches employing neural embeddings. While me…

    Submitted 19 January, 2021; originally announced January 2021.

    Comments: To appear in: Nina Tahmasebi, Lars Borin, Adam Jatowt, Yang Xu, Simon Hengchen (eds). Computational Approaches to Semantic Change. Berlin: Language Science Press. [preliminary page numbering]

  8. arXiv:2007.11464  [pdf, other]

    cs.CL

    SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection

    Authors: Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky, Nina Tahmasebi

    Abstract: Lexical Semantic Change detection, i.e., the task of identifying words that change meaning over time, is a very active research area, with applications in NLP, lexicography, and linguistics. Evaluation is currently the most pressing problem in Lexical Semantic Change detection, as no gold standards are available to the community, which hinders progress. We present the results of the first shared t…

    Submitted 28 August, 2020; v1 submitted 22 July, 2020; originally announced July 2020.

    Comments: SemEval@COLING2020, 12 pages

  9. arXiv:2004.07790  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training

    Authors: Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Sebastian Riedel, Tim Rocktäschel

    Abstract: Natural Language Inference (NLI) datasets contain annotation artefacts resulting in spurious correlations between the natural language utterances and their respective entailment classes. These artefacts are exploited by neural networks even when only considering the hypothesis and ignoring the premise, leading to unwanted biases. Belinkov et al. (2019b) proposed tackling this problem via adversari…

    Submitted 27 May, 2021; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: Accepted at EMNLP 2020

  10. arXiv:2001.11136  [pdf, other]

    cs.CL

    The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures

    Authors: Haim Dubossarsky, Ivan Vulić, Roi Reichart, Anna Korhonen

    Abstract: Performance in cross-lingual NLP tasks is impacted by the (dis)similarity of languages at hand: e.g., previous work has suggested there is a connection between the expected success of bilingual lexicon induction (BLI) and the assumption of (approximate) isomorphism between monolingual embedding spaces. In this work we present a large-scale study focused on the correlations between monolingual embe…

    Submitted 12 October, 2020; v1 submitted 29 January, 2020; originally announced January 2020.

    Comments: EMNLP 2020: Long paper

  11. Time-Out: Temporal Referencing for Robust Modeling of Lexical Semantic Change

    Authors: Haim Dubossarsky, Simon Hengchen, Nina Tahmasebi, Dominik Schlechtweg

    Abstract: State-of-the-art models of lexical semantic change detection suffer from noise stemming from vector space alignment. We have empirically tested the Temporal Referencing method for lexical semantic change and show that, by avoiding alignment, it is less affected by this noise. We show that, trained on a diachronic corpus, the skip-gram with negative sampling architecture with temporal referencing o…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: To appear in the 57th Annual Meeting of the Association for Computational Linguistics (ACL2019)