Showing 1–20 of 20 results for author: Lenci, A

Searching in archive cs.

  1. arXiv:2403.14859  [pdf, other]

    cs.CL cs.AI

    Comparing Plausibility Estimates in Base and Instruction-Tuned Large Language Models

    Authors: Carina Kauf, Emmanuele Chersoni, Alessandro Lenci, Evelina Fedorenko, Anna A. Ivanova

    Abstract: Instruction-tuned LLMs can respond to explicit queries formulated as prompts, which greatly facilitates interaction with human users. However, prompt-based approaches might not always be able to tap into the wealth of implicit knowledge acquired by LLMs during pre-training. This paper presents a comprehensive study of ways to evaluate semantic plausibility in LLMs. We compare base and instruction-…

    Submitted 21 March, 2024; originally announced March 2024.

  2. arXiv:2307.02910  [pdf]

    cs.CL

    Agentività e telicità in GilBERTo: implicazioni cognitive

    Authors: Agnese Lombardi, Alessandro Lenci

    Abstract: The goal of this study is to investigate whether a Transformer-based neural language model infers lexical semantics and uses this information for the completion of morphosyntactic patterns. The semantic properties considered are telicity (also combined with definiteness) and agentivity. Both act at the interface between semantics and morphosyntax: they are semantically determined and syntactically…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: 7 pages, in Italian language

  3. arXiv:2303.04229  [pdf]

    cs.AI cs.CL

    Understanding Natural Language Understanding Systems. A Critical Analysis

    Authors: Alessandro Lenci

    Abstract: The development of machines that «talk like us», also known as Natural Language Understanding (NLU) systems, is the Holy Grail of Artificial Intelligence (AI), since language is the quintessence of human intelligence. The brief but intense life of NLU research in AI and Natural Language Processing (NLP) is full of ups and downs, with periods of high hopes that the Grail is finally within reach, ty…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: to appear in Sistemi Intelligenti

  4. arXiv:2212.01488  [pdf]

    cs.CL cs.AI

    Event knowledge in large language models: the gap between the impossible and the unlikely

    Authors: Carina Kauf, Anna A. Ivanova, Giulia Rambelli, Emmanuele Chersoni, Jingyuan Selena She, Zawad Chowdhury, Evelina Fedorenko, Alessandro Lenci

    Abstract: Word co-occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs' semantic abilities is whether they acquire generalized knowledge of co…

    Submitted 26 October, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: The two lead authors have contributed equally to this work

  5. arXiv:2211.04427  [pdf, other]

    cs.CL cs.LG cs.NE

    Word Order Matters when you Increase Masking

    Authors: Karim Lasri, Alessandro Lenci, Thierry Poibeau

    Abstract: Word order, an essential property of natural languages, is injected in Transformer-based neural language models using position encoding. However, recent experiments have shown that explicit position encoding is not always useful, since some models without such a feature managed to achieve state-of-the-art performance on some tasks. To better understand this phenomenon, we examine the effect of remov…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: Accepted at EMNLP 2022 (main conference)

  6. arXiv:2209.10538  [pdf, other]

    cs.CL

    Subject Verb Agreement Error Patterns in Meaningless Sentences: Humans vs. BERT

    Authors: Karim Lasri, Olga Seminck, Alessandro Lenci, Thierry Poibeau

    Abstract: Both humans and neural language models are able to perform subject-verb number agreement (SVA). In principle, semantics shouldn't interfere with this task, which only requires syntactic knowledge. In this work we test whether meaning interferes with this type of agreement in English in syntactic structures of various complexities. To do so, we generate both semantically well-formed and nonsensical…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: COLING 2022 Main Conference (The 29th international conference on computational linguistics)

  7. Probing for the Usage of Grammatical Number

    Authors: Karim Lasri, Tiago Pimentel, Alessandro Lenci, Thierry Poibeau, Ryan Cotterell

    Abstract: A central quest of probing is to uncover how pre-trained models encode a linguistic property within their representations. An encoding, however, might be spurious, i.e., the model might not rely on it when making predictions. In this paper, we try to find encodings that the model actually uses, introducing a usage-based probing setup. We first choose a behavioral task which cannot be solved without…

    Submitted 22 May, 2024; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: ACL 2022 (Main Conference) The discussion section had been inadvertently removed before the article was published on arxiv

  8. Does BERT really agree ? Fine-grained Analysis of Lexical Dependence on a Syntactic Task

    Authors: Karim Lasri, Alessandro Lenci, Thierry Poibeau

    Abstract: Although transformer-based Neural Language Models demonstrate impressive performance on a variety of tasks, their generalization abilities are not well understood. They have been shown to perform strongly on subject-verb number agreement in a wide array of settings, suggesting that they learned to track syntactic dependencies during their training even without explicit supervision. In this paper,…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: ACL 2022 (Findings)

  9. arXiv:2107.10922  [pdf, other]

    cs.CL

    Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge

    Authors: Paolo Pedinotti, Giulia Rambelli, Emmanuele Chersoni, Enrico Santus, Alessandro Lenci, Philippe Blache

    Abstract: Prior research has explored the ability of computational models to predict a word's semantic fit with a given predicate. While much work has been devoted to modeling the typicality relation between verbs and arguments in isolation, in this paper we take a broader perspective by assessing whether and to what extent computational approaches have access to the information about the typicality of entire…

    Submitted 22 July, 2021; originally announced July 2021.

  10. arXiv:2105.09825  [pdf]

    cs.CL cs.AI

    A comparative evaluation and analysis of three generations of Distributional Semantic Models

    Authors: Alessandro Lenci, Magnus Sahlgren, Patrick Jeuniaux, Amaru Cuba Gyllensten, Martina Miliani

    Abstract: Distributional semantics has deeply changed in the last decades. First, predict models stole the thunder from traditional count ones, and more recently both of them were replaced in many NLP applications by contextualized vectors produced by Transformer neural language models. Although an extensive body of research has been devoted to Distributional Semantic Model (DSM) evaluation, we still lack a…

    Submitted 1 April, 2022; v1 submitted 20 May, 2021; originally announced May 2021.

    Comments: Language Resources and Evaluation

  11. arXiv:1906.07280  [pdf, other]

    cs.CL

    A Structured Distributional Model of Sentence Meaning and Processing

    Authors: Emmanuele Chersoni, Enrico Santus, Ludovica Pannitto, Alessandro Lenci, Philippe Blache, Chu-Ren Huang

    Abstract: Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a Structured Distributional Model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from Discourse Representation Th…

    Submitted 17 June, 2019; originally announced June 2019.

    Comments: accepted at JLNE; Journal of Natural Language Engineering; 26 pages, thematic fit, selectional preference, natural language processing, nlp, ai

  12. arXiv:1710.00998  [pdf, other]

    cs.CL

    Is Structure Necessary for Modeling Argument Expectations in Distributional Semantics?

    Authors: Emmanuele Chersoni, Enrico Santus, Philippe Blache, Alessandro Lenci

    Abstract: Despite the number of NLP studies dedicated to thematic fit estimation, little attention has been paid to the related task of composing and updating verb argument expectations. The few exceptions have mostly modeled this phenomenon with structured distributional models, implicitly assuming a similarly structured representation of events. Recent experimental evidence, however, suggests that human p…

    Submitted 3 October, 2017; originally announced October 2017.

    Comments: conference paper, IWCS

  13. arXiv:1707.05967  [pdf, other]

    cs.CL

    Measuring Thematic Fit with Distributional Feature Overlap

    Authors: Enrico Santus, Emmanuele Chersoni, Alessandro Lenci, Philippe Blache

    Abstract: In this paper, we introduce a new distributional method for modeling predicate-argument thematic fit judgments. We use a syntax-based DSM to build a prototypical representation of verb-specific roles: for every verb, we extract the most salient second order contexts for each of its roles (i.e. the most salient dimensions of typical role fillers), and then we compute thematic fit as a weighted over…

    Submitted 26 July, 2017; v1 submitted 19 July, 2017; originally announced July 2017.

    Comments: 9 pages, 2 figures, 5 tables, EMNLP, 2017, thematic fit, selectional preference, semantic role, DSMs, Distributional Semantic Models, Vector Space Models, VSMs, cosine, APSyn, similarity, prototype

  14. arXiv:1609.08293  [pdf, ps, other]

    cs.CL

    The Effects of Data Size and Frequency Range on Distributional Semantic Models

    Authors: Magnus Sahlgren, Alessandro Lenci

    Abstract: This paper investigates the effects of data size and frequency range on distributional semantic models. We compare the performance of a number of representative models for several test settings over data of varying sizes, and over test items of various frequency. Our results show that neural network-based models underperform when the data is small, and that the most reliable model over data of var…

    Submitted 27 September, 2016; originally announced September 2016.

    Comments: Accepted at EMNLP 2016

  15. arXiv:1608.07738  [pdf, other]

    cs.CL

    Testing APSyn against Vector Cosine on Similarity Estimation

    Authors: Enrico Santus, Emmanuele Chersoni, Alessandro Lenci, Chu-Ren Huang, Philippe Blache

    Abstract: In Distributional Semantic Models (DSMs), Vector Cosine is widely used to estimate similarity between word vectors, although this measure was noticed to suffer from several shortcomings. The recent literature has proposed other methods which attempt to mitigate such biases. In this paper, we intend to investigate APSyn, a measure that computes the extent of the intersection between the most associ…

    Submitted 5 October, 2016; v1 submitted 27 August, 2016; originally announced August 2016.

    Comments: 8 pages, 1 figure, 4 tables, PACLIC, cosine, vectors, DSMs

  16. arXiv:1607.02061  [pdf, ps, other]

    cs.CL cs.AI

    Representing Verbs with Rich Contexts: an Evaluation on Verb Similarity

    Authors: Emmanuele Chersoni, Enrico Santus, Alessandro Lenci, Philippe Blache, Chu-Ren Huang

    Abstract: Several studies on sentence processing suggest that the mental lexicon keeps track of the mutual expectations between words. Current DSMs, however, represent context words as separate features, thereby losing important information for word expectations, such as word interrelations. In this paper, we present a DSM that addresses this issue by defining verb contexts as joint syntactic dependencies.…

    Submitted 5 October, 2016; v1 submitted 7 July, 2016; originally announced July 2016.

    Comments: 5 pages

  17. arXiv:1603.09054  [pdf]

    cs.CL

    Unsupervised Measure of Word Similarity: How to Outperform Co-occurrence and Vector Cosine in VSMs

    Authors: Enrico Santus, Tin-Shing Chiu, Qin Lu, Alessandro Lenci, Chu-Ren Huang

    Abstract: In this paper, we claim that vector cosine, which is generally considered among the most efficient unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by an unsupervised measure that calculates the extent of the intersection among the most mutually dependent contexts of the target words. To prove it, we describe and evaluate APSyn, a variant of the Ave…

    Submitted 30 March, 2016; originally announced March 2016.

    Comments: in AAAI 2016. arXiv admin note: substantial text overlap with arXiv:1603.08701

  18. arXiv:1603.08705  [pdf]

    cs.CL

    ROOT13: Spotting Hypernyms, Co-Hyponyms and Randoms

    Authors: Enrico Santus, Tin-Shing Chiu, Qin Lu, Alessandro Lenci, Chu-Ren Huang

    Abstract: In this paper, we describe ROOT13, a supervised system for the classification of hypernyms, co-hyponyms and random words. The system relies on a Random Forest algorithm and 13 unsupervised corpus-based features. We evaluate it with a 10-fold cross validation on 9,600 pairs, equally distributed among the three classes and involving several Parts-Of-Speech (i.e. adjectives, nouns and verbs). When al…

    Submitted 29 March, 2016; originally announced March 2016.

    Comments: in AAAI 2016

  19. arXiv:1603.08702  [pdf]

    cs.CL

    Nine Features in a Random Forest to Learn Taxonomical Semantic Relations

    Authors: Enrico Santus, Alessandro Lenci, Tin-Shing Chiu, Qin Lu, Chu-Ren Huang

    Abstract: ROOT9 is a supervised system for the classification of hypernyms, co-hyponyms and random words that is derived from the already introduced ROOT13 (Santus et al., 2016). It relies on a Random Forest algorithm and nine unsupervised corpus-based features. We evaluate it with a 10-fold cross validation on 9,600 pairs, equally distributed among the three classes and involving several Parts-Of-Speech (i…

    Submitted 29 March, 2016; originally announced March 2016.

    Comments: in LREC 2016

  20. arXiv:1603.08701  [pdf]

    cs.CL

    What a Nerd! Beating Students and Vector Cosine in the ESL and TOEFL Datasets

    Authors: Enrico Santus, Tin-Shing Chiu, Qin Lu, Alessandro Lenci, Chu-Ren Huang

    Abstract: In this paper, we claim that Vector Cosine, which is generally considered one of the most efficient unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by a completely unsupervised measure that evaluates the extent of the intersection among the most associated contexts of two target words, weighting such intersection according to the rank of the shared…

    Submitted 29 March, 2016; originally announced March 2016.

    Comments: in LREC 2016