
Showing 1–8 of 8 results for author: Stolfo, A

  1. arXiv:2406.16254 [pdf, other]

    cs.LG cs.AI cs.CL

    Confidence Regulation Neurons in Language Models

    Authors: Alessandro Stolfo, Ben Wu, Wes Gurnee, Yonatan Belinkov, Xingyi Song, Mrinmaya Sachan, Neel Nanda

    Abstract: Despite their widespread use, the mechanisms by which large language models (LLMs) represent and regulate uncertainty in next-token predictions remain largely unexplored. This study investigates two critical components believed to influence this uncertainty: the recently discovered entropy neurons and a new set of components that we term token frequency neurons. Entropy neurons are characterized b…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 25 pages, 14 figures

  2. arXiv:2404.07060 [pdf, other]

    cs.CL cs.LG

    Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study

    Authors: Alessandro Stolfo

    Abstract: We present an empirical study of groundedness in long-form question answering (LFQA) by retrieval-augmented large language models (LLMs). In particular, we evaluate whether every generated sentence is grounded in the retrieved documents or the model's pre-training data. Across 3 datasets and 4 model families, our findings reveal that a significant fraction of generated sentences are consistently u…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 (Findings)

  3. arXiv:2401.18070 [pdf, other]

    cs.CL cs.AI cs.LG

    Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?

    Authors: Andreas Opedal, Alessandro Stolfo, Haruki Shirakami, Ying Jiao, Ryan Cotterell, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan

    Abstract: There is increasing interest in employing large language models (LLMs) as cognitive models. For such purposes, it is central to understand which properties of human cognition are well-modeled by LLMs, and which are not. In this work, we study the biases of LLMs in relation to those known in children when solving arithmetic word problems. Surveying the learning science literature, we posit that the…

    Submitted 17 June, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

    Comments: Accepted at ICML 2024

  4. arXiv:2310.14491 [pdf, other]

    cs.CL

    Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models

    Authors: Yifan Hou, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan

    Abstract: Recent work has shown that language models (LMs) have strong multi-step (i.e., procedural) reasoning capabilities. However, it is unclear whether LMs perform these tasks by cheating with answers memorized from pretraining corpus, or, via a multi-step reasoning mechanism. In this paper, we try to answer this question by exploring a mechanistic interpretation of LMs for multi-step reasoning tasks. C…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: This work is published in EMNLP 2023

  5. arXiv:2305.15054 [pdf, other]

    cs.CL cs.LG

    A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis

    Authors: Alessandro Stolfo, Yonatan Belinkov, Mrinmaya Sachan

    Abstract: Mathematical reasoning in large language models (LMs) has garnered significant attention in recent work, but there is a limited understanding of how these models process and store information related to arithmetic tasks within their architecture. In order to improve our understanding of this aspect of language models, we present a mechanistic interpretation of Transformer-based LMs on arithmetic q…

    Submitted 20 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023. 18 pages, 19 figures

  6. arXiv:2212.00193 [pdf, other]

    cs.LG cs.CL

    Distilling Reasoning Capabilities into Smaller Language Models

    Authors: Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan

    Abstract: Step-by-step reasoning approaches like chain of thought (CoT) have proved to be very effective in inducing reasoning capabilities in large language models. However, the success of the CoT approach is fundamentally tied to the model size, and billion parameter-scale models are often needed to get CoT to work. In this paper, we propose a knowledge distillation approach that leverages the step-by-ste…

    Submitted 18 May, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

    Comments: Accepted at ACL 2023 (Findings)

  7. arXiv:2210.12023 [pdf, other]

    cs.CL cs.LG

    A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models

    Authors: Alessandro Stolfo, Zhijing Jin, Kumar Shridhar, Bernhard Schölkopf, Mrinmaya Sachan

    Abstract: We have recently witnessed a number of impressive results on hard mathematical reasoning problems with language models. At the same time, the robustness of these models has also been called into question; recent works have shown that models can rely on shallow patterns in the problem description when generating a solution. Building on the idea of behavioral testing, we propose a novel framework, w…

    Submitted 7 June, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: ACL 2023. A shorter version of the paper was accepted at the MATH-AI Workshop at NeurIPS 2022. 15 pages, 8 figures

  8. arXiv:2210.03650 [pdf, other]

    cs.CL cs.LG

    Longtonotes: OntoNotes with Longer Coreference Chains

    Authors: Kumar Shridhar, Nicholas Monath, Raghuveer Thirukovalluru, Alessandro Stolfo, Manzil Zaheer, Andrew McCallum, Mrinmaya Sachan

    Abstract: Ontonotes has served as the most important benchmark for coreference resolution. However, for ease of annotation, several long documents in Ontonotes were split into smaller parts. In this work, we build a corpus of coreference-annotated documents of significantly longer length than what is currently available. We do so by providing an accurate, manually-curated, merging of annotations from docume…

    Submitted 7 October, 2022; originally announced October 2022.