Showing 1–20 of 20 results for author: Stoehr, N

  1. arXiv:2404.04633

    cs.CL

    Context versus Prior Knowledge in Language Models

    Authors: Kevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell

    Abstract: To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to…

    Submitted 16 June, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

    Comments: Long paper accepted at ACL 2024

  2. arXiv:2403.19851

    cs.CL cs.CR cs.LG stat.ML

    Localizing Paragraph Memorization in Language Models

    Authors: Niklas Stoehr, Mitchell Gordon, Chiyuan Zhang, Owen Lewis

    Abstract: Can we localize the weights and mechanisms used by a language model to memorize and recite entire paragraphs of its training data? In this paper, we show that while memorization is spread across multiple layers and model components, gradients of memorized paragraphs have a distinguishable spatial pattern, being larger in lower model layers than gradients of non-memorized examples. Moreover, the me…

    Submitted 28 March, 2024; originally announced March 2024.

  3. arXiv:2309.06991

    cs.LG cs.CL stat.ML

    Unsupervised Contrast-Consistent Ranking with Language Models

    Authors: Niklas Stoehr, Pengxiang Cheng, Jing Wang, Daniel Preotiuc-Pietro, Rajarshi Bhowmik

    Abstract: Language models contain ranking-based knowledge and are powerful solvers of in-context ranking tasks. For instance, they may have parametric knowledge about the ordering of countries by size or may be able to rank product reviews by sentiment. We compare pairwise, pointwise and listwise prompting techniques to elicit a language model's ranking knowledge. However, we find that even with careful cal…

    Submitted 3 February, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: Long Paper at EACL 2024

  4. arXiv:2307.06954

    cs.CL cs.AI

    ACTI at EVALITA 2023: Overview of the Conspiracy Theory Identification Task

    Authors: Giuseppe Russo, Niklas Stoehr, Manoel Horta Ribeiro

    Abstract: The Conspiracy Theory Identification task is a new shared task proposed for the first time at Evalita 2023. The ACTI challenge, based exclusively on comments published on conspiratorial channels of Telegram, is divided into two subtasks: (i) Conspiratorial Content Classification: identifying conspiratorial content and (ii) Conspiratorial Category Classification about specific conspiracy theory class…

    Submitted 2 September, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: Accepted at the Evalita Workshop 2023

  5. arXiv:2307.03056

    cs.LG cs.AI cs.CL

    Generalizing Backpropagation for Gradient-Based Interpretability

    Authors: Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alexander Warstadt, Ryan Cotterell

    Abstract: Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs. While these methods can indicate which input features may be important for the model's prediction, they reveal little about the inner workings of the model itself. In this paper, we observe that the gradient computation of a model is a speci…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: Long paper accepted at ACL 2023

  6. arXiv:2306.04347

    cs.CL

    World Models for Math Story Problems

    Authors: Andreas Opedal, Niklas Stoehr, Abulhair Saparov, Mrinmaya Sachan

    Abstract: Solving math story problems is a complex task for students and NLP models alike, requiring them to understand the world as described in the story and reason over it to compute an answer. Recent years have seen impressive performance on automatically solving these problems with large pre-trained language models and innovative techniques to prompt them. However, it remains unclear if these models po…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: ACL Findings 2023

  7. arXiv:2302.12367

    cs.CL cs.AI cs.LG

    Extracting Victim Counts from Text

    Authors: Mian Zhong, Shehzaad Dhuliawala, Niklas Stoehr

    Abstract: Decision-makers in the humanitarian sector rely on timely and exact information during crisis events. Knowing how many civilians were injured during an earthquake is vital to allocate aid properly. Information about such victim counts is often only available within full-text event descriptions from newspapers and other reports. Extracting numbers from text is challenging: numbers have different f…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: Long paper accepted at EACL 2023 main conference

    ACM Class: I.2.7; J.0

  8. arXiv:2212.04130

    stat.ML cs.LG cs.SI stat.AP

    The Ordered Matrix Dirichlet for State-Space Models

    Authors: Niklas Stoehr, Benjamin J. Radford, Ryan Cotterell, Aaron Schein

    Abstract: Many dynamical systems in the real world are naturally described by latent states with intrinsic orderings, such as "ally", "neutral", and "enemy" relationships in international relations. These latent states manifest through countries' cooperative versus conflictual interactions over time. State-space models (SSMs) explicitly relate the dynamics of observed measurements to transitions in latent s…

    Submitted 25 February, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: Presented at the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023

  9. arXiv:2211.11360

    cs.CL cs.AI cs.LG

    Extended Multilingual Protest News Detection -- Shared Task 1, CASE 2021 and 2022

    Authors: Ali Hürriyetoğlu, Osman Mutlu, Fırat Duruşan, Onur Uca, Alaeddin Selçuk Gürel, Benjamin Radford, Yaoyao Dai, Hansi Hettiarachchi, Niklas Stoehr, Tadashi Nomoto, Milena Slavcheva, Francielle Vargas, Aaqib Javid, Fatih Beyhan, Erdem Yörük

    Abstract: We report results of the CASE 2022 Shared Task 1 on Multilingual Protest Event Detection. This task is a continuation of CASE 2021 that consists of four subtasks that are i) document classification, ii) sentence classification, iii) event sentence coreference identification, and iv) event extraction. The CASE 2022 extension consists of expanding the test data with more data in previously available…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: To appear in CASE 2022 @ EMNLP 2022

  10. arXiv:2211.06420

    cs.CL cs.LG

    The Architectural Bottleneck Principle

    Authors: Tiago Pimentel, Josef Valvoda, Niklas Stoehr, Ryan Cotterell

    Abstract: In this paper, we seek to measure how much information a component in a neural network could extract from the representations fed into it. Our work stands in contrast to prior probing work, most of which investigates how much information a model's representations contain. This shift in perspective leads us to propose a new principle for probing, the architectural bottleneck principle: In order to…

    Submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted at EMNLP 2022. Tiago Pimentel and Josef Valvoda contributed equally to this work. Code available at https://github.com/rycolab/attentional-probe

  11. arXiv:2210.05257

    cs.CL cs.HC cs.LG

    Rethinking the Event Coding Pipeline with Prompt Entailment

    Authors: Clément Lefebvre, Niklas Stoehr

    Abstract: For monitoring crises, political events are extracted from the news. The large amount of unstructured full-text event descriptions makes a case-by-case analysis unmanageable, particularly for low-resource humanitarian aid organizations. This creates a demand to classify events into event types, a task referred to as event coding. Typically, domain experts craft an event type ontology, annotators l…

    Submitted 5 May, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Revised from published version in the proceedings of the FEVER workshop at EACL 2023

  12. arXiv:2210.03971

    cs.LG stat.AP

    An Ordinal Latent Variable Model of Conflict Intensity

    Authors: Niklas Stoehr, Lucas Torroba Hennigen, Josef Valvoda, Robert West, Ryan Cotterell, Aaron Schein

    Abstract: Measuring the intensity of events is crucial for monitoring and tracking armed conflict. Advances in automated event extraction have yielded massive data sets of "who did what to whom" micro-records that enable data-driven approaches to monitoring conflict. The Goldstein scale is a widely-used expert-based measure that scores events on a conflictual-cooperative scale. It is based only on the actio…

    Submitted 4 June, 2023; v1 submitted 8 October, 2022; originally announced October 2022.

    Comments: Long Paper at ACL 2023

  13. arXiv:2205.03608

    cs.CL

    UniMorph 4.0: Universal Morphology

    Authors: Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Benoît Sagot, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, et al. (71 additional authors not shown)

    Abstract: The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. This pa…

    Submitted 19 June, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: LREC 2022; The first two authors made equal contributions

  14. arXiv:2109.12860

    cs.CL cs.LG cs.SI

    Classifying Dyads for Militarized Conflict Analysis

    Authors: Niklas Stoehr, Lucas Torroba Hennigen, Samin Ahbab, Robert West, Ryan Cotterell

    Abstract: Understanding the origins of militarized conflict is a complex, yet important undertaking. Existing research seeks to build this understanding by considering bi-lateral relationships between entity pairs (dyadic causes) and multi-lateral relationships among multiple entities (systemic causes). The aim of this work is to compare these two causes in terms of how they correlate with conflict between…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)

  15. arXiv:2107.12443

    cs.HC

    SeismographAPI: Visualising Temporal-Spatial Crisis Data

    Authors: Raphael Lepuschitz, Niklas Stoehr

    Abstract: Effective decision-making for crisis mitigation increasingly relies on visualisation of large amounts of data. While interactive dashboards are more informative than static visualisations, their development is far more time-demanding and requires a range of technical and financial capabilities. There are few open-source libraries available, which is blocking contributions from low-resource environ…

    Submitted 26 July, 2021; originally announced July 2021.

    Comments: Published in the KDD Workshop on Data-driven Humanitarian Mapping held with the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '21), August 14, 2021

    ACM Class: H.5.2; J.0

  16. arXiv:2105.00997

    cs.SI cs.LG stat.ML

    Recovering Barabási-Albert Parameters of Graphs through Disentanglement

    Authors: Cristina Guzman, Daphna Keidar, Tristan Meynier, Andreas Opedal, Niklas Stoehr

    Abstract: Classical graph modeling approaches such as Erdős-Rényi (ER) random graphs or Barabási-Albert (BA) graphs, here referred to as stylized models, aim to reproduce properties of real-world graphs in an interpretable way. While useful, graph generation with stylized models requires domain knowledge and iterative trial and error simulation. Previous work by Stoehr et al. (2019) addresses these issues b…

    Submitted 4 May, 2021; v1 submitted 3 May, 2021; originally announced May 2021.

    Comments: Accepted at the 9th International Conference on Learning Representations (ICLR 2021), Workshop on Geometrical and Topological Representation Learning

  17. arXiv:2104.12133

    cs.CY

    What About the Precedent: An Information-Theoretic Analysis of Common Law

    Authors: Josef Valvoda, Tiago Pimentel, Niklas Stoehr, Ryan Cotterell, Simone Teufel

    Abstract: In common law, the outcome of a new case is determined mostly by precedent cases, rather than by existing statutes. However, how exactly does the precedent influence the outcome of a new case? Answering this question is crucial for guaranteeing fair and consistent judicial decision-making. We are the first to approach this question computationally by comparing two longstanding jurisprudential view…

    Submitted 25 April, 2021; originally announced April 2021.

  18. arXiv:2003.12432

    econ.GN

    The CoRisk-Index: A data-mining approach to identify industry-specific risk assessments related to COVID-19 in real-time

    Authors: Fabian Stephany, Niklas Stoehr, Philipp Darius, Leonie Neuhäuser, Ole Teutloff, Fabian Braesemann

    Abstract: While the coronavirus spreads, governments are attempting to reduce contagion rates at the expense of negative economic effects. Market expectations plummeted, foreshadowing the risk of a global economic crisis and mass unemployment. Governments provide huge financial aid programmes to mitigate the economic shocks. To achieve higher effectiveness with such policy measures, it is key to identify th…

    Submitted 27 April, 2020; v1 submitted 27 March, 2020; originally announced March 2020.

  19. arXiv:1912.10097

    cs.SI econ.GN

    Mining the Automotive Industry: A Network Analysis of Corporate Positioning and Technological Trends

    Authors: Niklas Stoehr, Fabian Braesemann, Michael Frommelt, Shi Zhou

    Abstract: The digital transformation is driving revolutionary innovations and new market entrants threaten established sectors of the economy such as the automotive industry. Following the need for monitoring shifting industries, we present a network-centred analysis of car manufacturer web pages. Solely exploiting publicly-available information, we construct large networks from web pages and hyperlinks. Th…

    Submitted 8 January, 2020; v1 submitted 20 December, 2019; originally announced December 2019.

    Comments: Preprint version to be published in Springer Nature (presented at CompleNet 2020)

  20. arXiv:1910.05639

    cs.LG cs.SI nlin.CD stat.ML

    Disentangling Interpretable Generative Parameters of Random and Real-World Graphs

    Authors: Niklas Stoehr, Emine Yilmaz, Marc Brockschmidt, Jan Stuehmer

    Abstract: While a wide range of interpretable generative procedures for graphs exist, matching observed graph topologies with such procedures and choices for its parameters remains an open problem. Devising generative models that closely reproduce real-world graphs requires domain knowledge and time-consuming simulation. While existing deep learning approaches rely on less manual modelling, they offer littl…

    Submitted 6 November, 2019; v1 submitted 12 October, 2019; originally announced October 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Workshop on Graph Representation Learning