Showing 1–11 of 11 results for author: Ranaldi, L

  1. arXiv:2405.00402  [pdf, other]

    cs.CL

    Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models

    Authors: Leonardo Ranaldi, Andrè Freitas

    Abstract: The alignments of reasoning abilities between smaller and larger Language Models are largely conducted via Supervised Fine-Tuning (SFT) using demonstrations generated from robust Large Language Models (LLMs). Although these approaches deliver more performant models, they do not show sufficiently strong generalization ability as the training only relies on the provided demonstrations. In this pap…

    Submitted 1 May, 2024; originally announced May 2024.

  2. arXiv:2402.08100  [pdf, other]

    cs.CL cs.LG

    Investigating the Impact of Data Contamination of Large Language Models in Text-to-SQL Translation

    Authors: Federico Ranaldi, Elena Sofia Ruzzetti, Dario Onorati, Leonardo Ranaldi, Cristina Giannone, Andrea Favalli, Raniero Romagnoli, Fabio Massimo Zanzotto

    Abstract: Understanding textual descriptions to generate code seems to be an achieved capability of instruction-following Large Language Models (LLMs) in a zero-shot scenario. However, there is a severe possibility that this translation ability may be influenced by having seen target textual descriptions and the related code. This effect is known as Data Contamination. In this study, we investigate the impac…

    Submitted 12 February, 2024; originally announced February 2024.

  3. arXiv:2311.09410  [pdf, other]

    cs.CL cs.AI

    When Large Language Models contradict humans? Large Language Models' Sycophantic Behaviour

    Authors: Leonardo Ranaldi, Giulia Pucci

    Abstract: Large Language Models have been demonstrating the ability to solve complex tasks by delivering answers that are positively evaluated by humans due in part to the intensive use of human feedback that refines responses. However, the suggestibility transmitted through human feedback increases the inclination to produce responses that correspond to the users' beliefs or misleading prompts as opposed t…

    Submitted 28 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

  4. arXiv:2311.08097  [pdf, other]

    cs.CL cs.AI

    Empowering Multi-step Reasoning across Languages via Tree-of-Thoughts

    Authors: Leonardo Ranaldi, Giulia Pucci, Federico Ranaldi, Elena Sofia Ruzzetti, Fabio Massimo Zanzotto

    Abstract: Reasoning methods, best exemplified by the well-known Chain-of-Thought (CoT), empower the reasoning abilities of Large Language Models (LLMs) by eliciting them to solve complex tasks in a step-by-step manner. Although they are achieving significant success, the ability to deliver multi-step reasoning remains limited to English because of the imbalance in the distribution of pre-training data, whic…

    Submitted 21 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Findings of the Association for Computational Linguistics: NAACL 2024

    Report number: 2024.findings-naacl.78

    Journal ref: 2024.findings-naacl.78

  5. arXiv:2309.12481   

    cs.CL cs.AI

    HANS, are you clever? Clever Hans Effect Analysis of Neural Systems

    Authors: Leonardo Ranaldi, Fabio Massimo Zanzotto

    Abstract: Instruction-tuned Large Language Models (It-LLMs) have been exhibiting outstanding abilities to reason around cognitive states, intentions, and reactions of all people involved, letting humans guide and comprehend day-to-day social interactions effectively. In fact, several multiple-choice questions (MCQ) benchmarks have been proposed to construct solid assessments of the models' abilities. Howeve…

    Submitted 2 May, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: This paper contains erroneous evaluations and we would like to withdraw it

  6. arXiv:2308.14186  [pdf, other]

    cs.CL cs.AI

    Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations

    Authors: Leonardo Ranaldi, Giulia Pucci, Andre Freitas

    Abstract: The language ability of Large Language Models (LLMs) is often unbalanced towards English because of the imbalance in the distribution of the pre-training data. This disparity carries over into further fine-tuning and affects the cross-lingual abilities of LLMs. In this paper, we propose to empower Instruction-tuned LLMs (It-LLMs) in languages other than English by building semantic alignment between…

    Submitted 27 August, 2023; originally announced August 2023.

  7. arXiv:2305.13862  [pdf, other]

    cs.CL

    A Trip Towards Fairness: Bias and De-Biasing in Large Language Models

    Authors: Leonardo Ranaldi, Elena Sofia Ruzzetti, Davide Venditti, Dario Onorati, Fabio Massimo Zanzotto

    Abstract: Cheap-to-Build Very Large-Language Models (CtB-LLMs) with affordable training are emerging as the next big revolution in natural language processing and understanding. These CtB-LLMs are democratizing access to trainable Very Large-Language Models (VLLMs) and, thus, may represent the building blocks of many NLP systems solving downstream tasks. Hence, a little or a large bias in CtB-LLMs may cause…

    Submitted 29 August, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  8. PreCog: Exploring the Relation between Memorization and Performance in Pre-trained Language Models

    Authors: Leonardo Ranaldi, Elena Sofia Ruzzetti, Fabio Massimo Zanzotto

    Abstract: Pre-trained Language Models such as BERT are impressive machines with the ability to memorize, possibly generalized learning examples. We present here a small, focused contribution to the analysis of the interplay between memorization and performance of BERT in downstream tasks. We propose PreCog, a measure for evaluating memorization from pre-training, and we analyze its correlation with the BERT…

    Submitted 9 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Report number: 2023.ranlp-1.103

    Journal ref: 2023.ranlp-1.103

  9. arXiv:2305.02215  [pdf, other]

    cs.CL cs.AI

    Exploring Linguistic Properties of Monolingual BERTs with Typological Classification among Languages

    Authors: Elena Sofia Ruzzetti, Federico Ranaldi, Felicia Logozzo, Michele Mastromattei, Leonardo Ranaldi, Fabio Massimo Zanzotto

    Abstract: The impressive achievements of transformers force NLP researchers to delve into how these models represent the underlying structure of natural language. In this paper, we propose a novel standpoint to investigate the above issue: using typological similarities among languages to observe how their respective monolingual models encode structural information. We aim to layer-wise compare transformers…

    Submitted 29 February, 2024; v1 submitted 3 May, 2023; originally announced May 2023.

  10. The Dark Side of the Language: Pre-trained Transformers in the DarkNet

    Authors: Leonardo Ranaldi, Aria Nourbakhsh, Arianna Patrizi, Elena Sofia Ruzzetti, Dario Onorati, Francesca Fallucchi, Fabio Massimo Zanzotto

    Abstract: Pre-trained Transformers are challenging human performances in many NLP tasks. The massive datasets used for pre-training seem to be the key to their success on existing tasks. In this paper, we explore how a range of pre-trained Natural Language Understanding models perform on definitely unseen sentences provided by classification tasks over a DarkNet corpus. Surprisingly, results show that synta…

    Submitted 17 November, 2023; v1 submitted 14 January, 2022; originally announced January 2022.

    Report number: 2023.ranlp-1.102

    Journal ref: 2023.ranlp-1.102

  11. Lacking the embedding of a word? Look it up into a traditional dictionary

    Authors: Elena Sofia Ruzzetti, Leonardo Ranaldi, Michele Mastromattei, Francesca Fallucchi, Fabio Massimo Zanzotto

    Abstract: Word embeddings are powerful dictionaries, which may easily capture language variations. However, these dictionaries fail to give sense to rare words, which are surprisingly often covered by traditional dictionaries. In this paper, we propose to use definitions retrieved in traditional dictionaries to produce word embeddings for rare words. For this purpose, we introduce two methods: Definition Ne…

    Submitted 24 September, 2021; originally announced September 2021.

    Journal ref: Findings of the Association for Computational Linguistics: ACL 2022