Search | arXiv e-print repository

Context versus Prior Knowledge in Language Models

Authors: Kevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell

Abstract: To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to… ▽ More To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to higher exposure in the training corpus, and be more easily persuaded by some contexts than others. To formalize this problem, we propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity: first, the persuasion score of a given context represents how much a model depends on the context in its decision, and second, the susceptibility score of a given entity represents how much the model can be swayed away from its original answer distribution about an entity. We empirically test our metrics for their validity and reliability. Finally, we explore and find a relationship between the scores and the model's expected familiarity with an entity, and provide two use cases to illustrate their benefits. △ Less

Submitted 16 June, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

Comments: Long paper accepted at ACL 2024

arXiv:2403.19851 [pdf, other]

Localizing Paragraph Memorization in Language Models

Authors: Niklas Stoehr, Mitchell Gordon, Chiyuan Zhang, Owen Lewis

Abstract: Can we localize the weights and mechanisms used by a language model to memorize and recite entire paragraphs of its training data? In this paper, we show that while memorization is spread across multiple layers and model components, gradients of memorized paragraphs have a distinguishable spatial pattern, being larger in lower model layers than gradients of non-memorized examples. Moreover, the me… ▽ More Can we localize the weights and mechanisms used by a language model to memorize and recite entire paragraphs of its training data? In this paper, we show that while memorization is spread across multiple layers and model components, gradients of memorized paragraphs have a distinguishable spatial pattern, being larger in lower model layers than gradients of non-memorized examples. Moreover, the memorized examples can be unlearned by fine-tuning only the high-gradient weights. We localize a low-layer attention head that appears to be especially involved in paragraph memorization. This head is predominantly focusing its attention on distinctive, rare tokens that are least frequent in a corpus-level unigram distribution. Next, we study how localized memorization is across the tokens in the prefix by perturbing tokens and measuring the caused change in the decoding. A few distinctive tokens early in a prefix can often corrupt the entire continuation. Overall, memorized continuations are not only harder to unlearn, but also to corrupt than non-memorized ones. △ Less

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2309.06991 [pdf, other]

Unsupervised Contrast-Consistent Ranking with Language Models

Authors: Niklas Stoehr, Pengxiang Cheng, **g Wang, Daniel Preotiuc-Pietro, Rajarshi Bhowmik

Abstract: Language models contain ranking-based knowledge and are powerful solvers of in-context ranking tasks. For instance, they may have parametric knowledge about the ordering of countries by size or may be able to rank product reviews by sentiment. We compare pairwise, pointwise and listwise prompting techniques to elicit a language model's ranking knowledge. However, we find that even with careful cal… ▽ More Language models contain ranking-based knowledge and are powerful solvers of in-context ranking tasks. For instance, they may have parametric knowledge about the ordering of countries by size or may be able to rank product reviews by sentiment. We compare pairwise, pointwise and listwise prompting techniques to elicit a language model's ranking knowledge. However, we find that even with careful calibration and constrained decoding, prompting-based techniques may not always be self-consistent in the rankings they produce. This motivates us to explore an alternative approach that is inspired by an unsupervised probing method called Contrast-Consistent Search (CCS). The idea is to train a probe guided by a logical constraint: a language model's representation of a statement and its negation must be mapped to contrastive true-false poles consistently across multiple statements. We hypothesize that similar constraints apply to ranking tasks where all items are related via consistent, pairwise or listwise comparisons. To this end, we extend the binary CCS method to Contrast-Consistent Ranking (CCR) by adapting existing ranking methods such as the Max-Margin Loss, Triplet Loss and an Ordinal Regression objective. Across different models and datasets, our results confirm that CCR probing performs better or, at least, on a par with prompting. △ Less

Submitted 3 February, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: Long Paper at EACL 2024

arXiv:2307.06954 [pdf, other]

ACTI at EVALITA 2023: Overview of the Conspiracy Theory Identification Task

Authors: Giuseppe Russo, Niklas Stoehr, Manoel Horta Ribeiro

Abstract: Conspiracy Theory Identication task is a new shared task proposed for the first time at the Evalita 2023. The ACTI challenge, based exclusively on comments published on conspiratorial channels of telegram, is divided into two subtasks: (i) Conspiratorial Content Classification: identifying conspiratorial content and (ii) Conspiratorial Category Classification about specific conspiracy theory class… ▽ More Conspiracy Theory Identication task is a new shared task proposed for the first time at the Evalita 2023. The ACTI challenge, based exclusively on comments published on conspiratorial channels of telegram, is divided into two subtasks: (i) Conspiratorial Content Classification: identifying conspiratorial content and (ii) Conspiratorial Category Classification about specific conspiracy theory classification. A total of fifteen teams participated in the task for a total of 81 submissions. We illustrate the best performing approaches were based on the utilization of large language models. We finally draw conclusions about the utilization of these models for counteracting the spreading of misinformation in online platforms. △ Less

Submitted 2 September, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

Comments: Accepted at the Evalita Workshop 2023

arXiv:2307.03056 [pdf, other]

Generalizing Backpropagation for Gradient-Based Interpretability

Authors: Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alexander Warstadt, Ryan Cotterell

Abstract: Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs. While these methods can indicate which input features may be important for the model's prediction, they reveal little about the inner workings of the model itself. In this paper, we observe that the gradient computation of a model is a speci… ▽ More Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs. While these methods can indicate which input features may be important for the model's prediction, they reveal little about the inner workings of the model itself. In this paper, we observe that the gradient computation of a model is a special case of a more general formulation using semirings. This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics about the gradient graph of a neural network, such as the highest-weighted path and entropy. We implement this generalized algorithm, evaluate it on synthetic datasets to better understand the statistics it computes, and apply it to study BERT's behavior on the subject-verb number agreement task (SVA). With this method, we (a) validate that the amount of gradient flow through a component of a model reflects its importance to a prediction and (b) for SVA, identify which pathways of the self-attention mechanism are most important. △ Less

Submitted 6 July, 2023; originally announced July 2023.

Comments: Long paper accepted at ACL 2023

arXiv:2306.04347 [pdf, other]

World Models for Math Story Problems

Authors: Andreas Opedal, Niklas Stoehr, Abulhair Saparov, Mrinmaya Sachan

Abstract: Solving math story problems is a complex task for students and NLP models alike, requiring them to understand the world as described in the story and reason over it to compute an answer. Recent years have seen impressive performance on automatically solving these problems with large pre-trained language models and innovative techniques to prompt them. However, it remains unclear if these models po… ▽ More Solving math story problems is a complex task for students and NLP models alike, requiring them to understand the world as described in the story and reason over it to compute an answer. Recent years have seen impressive performance on automatically solving these problems with large pre-trained language models and innovative techniques to prompt them. However, it remains unclear if these models possess accurate representations of mathematical concepts. This leads to lack of interpretability and trustworthiness which impedes their usefulness in various applications. In this paper, we consolidate previous work on categorizing and representing math story problems and develop MathWorld, which is a graph-based semantic formalism specific for the domain of math story problems. With MathWorld, we can assign world models to math story problems which represent the situations and actions introduced in the text and their mathematical relationships. We combine math story problems from several existing datasets and annotate a corpus of 1,019 problems and 3,204 logical forms with MathWorld. Using this data, we demonstrate the following use cases of MathWorld: (1) prompting language models with synthetically generated question-answer pairs to probe their reasoning and world modeling abilities, and (2) generating new problems by using the world models as a design space. △ Less

Submitted 7 June, 2023; originally announced June 2023.

Comments: ACL Findings 2023

arXiv:2302.12367 [pdf, other]

Extracting Victim Counts from Text

Authors: Mian Zhong, Shehzaad Dhuliawala, Niklas Stoehr

Abstract: Decision-makers in the humanitarian sector rely on timely and exact information during crisis events. Knowing how many civilians were injured during an earthquake is vital to allocate aids properly. Information about such victim counts is often only available within full-text event descriptions from newspapers and other reports. Extracting numbers from text is challenging: numbers have different f… ▽ More Decision-makers in the humanitarian sector rely on timely and exact information during crisis events. Knowing how many civilians were injured during an earthquake is vital to allocate aids properly. Information about such victim counts is often only available within full-text event descriptions from newspapers and other reports. Extracting numbers from text is challenging: numbers have different formats and may require numeric reasoning. This renders purely string matching-based approaches insufficient. As a consequence, fine-grained counts of injured, displaced, or abused victims beyond fatalities are often not extracted and remain unseen. We cast victim count extraction as a question answering (QA) task with a regression or classification objective. We compare regex, dependency parsing, semantic role labeling-based approaches, and advanced text-to-text models. Beyond model accuracy, we analyze extraction reliability and robustness which are key for this sensitive task. In particular, we discuss model calibration and investigate few-shot and out-of-distribution performance. Ultimately, we make a comprehensive recommendation on which model to select for different desiderata and data domains. Our work is among the first to apply numeracy-focused large language models in a real-world use case with a positive impact. △ Less

Submitted 23 February, 2023; originally announced February 2023.

Comments: Long paper accepted at EACL 2023 main conference

ACM Class: I.2.7; J.0

arXiv:2212.04130 [pdf, other]

The Ordered Matrix Dirichlet for State-Space Models

Authors: Niklas Stoehr, Benjamin J. Radford, Ryan Cotterell, Aaron Schein

Abstract: Many dynamical systems in the real world are naturally described by latent states with intrinsic orderings, such as "ally", "neutral", and "enemy" relationships in international relations. These latent states manifest through countries' cooperative versus conflictual interactions over time. State-space models (SSMs) explicitly relate the dynamics of observed measurements to transitions in latent s… ▽ More Many dynamical systems in the real world are naturally described by latent states with intrinsic orderings, such as "ally", "neutral", and "enemy" relationships in international relations. These latent states manifest through countries' cooperative versus conflictual interactions over time. State-space models (SSMs) explicitly relate the dynamics of observed measurements to transitions in latent states. For discrete data, SSMs commonly do so through a state-to-action emission matrix and a state-to-state transition matrix. This paper introduces the Ordered Matrix Dirichlet (OMD) as a prior distribution over ordered stochastic matrices wherein the discrete distribution in the kth row stochastically dominates the (k+1)th, such that probability mass is shifted to the right when moving down rows. We illustrate the OMD prior within two SSMs: a hidden Markov model, and a novel dynamic Poisson Tucker decomposition model tailored to international relations data. We find that models built on the OMD recover interpretable ordered latent structure without forfeiting predictive performance. We suggest future applications to other domains where models with stochastic matrices are popular (e.g., topic modeling), and publish user-friendly code. △ Less

Submitted 25 February, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

Comments: Presented at the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023

arXiv:2211.11360 [pdf, other]

Extended Multilingual Protest News Detection -- Shared Task 1, CASE 2021 and 2022

Authors: Ali Hürriyetoğlu, Osman Mutlu, Fırat Duruşan, Onur Uca, Alaeddin Selçuk Gürel, Benjamin Radford, Yaoyao Dai, Hansi Hettiarachchi, Niklas Stoehr, Tadashi Nomoto, Milena Slavcheva, Francielle Vargas, Aaqib Javid, Fatih Beyhan, Erdem Yörük

Abstract: We report results of the CASE 2022 Shared Task 1 on Multilingual Protest Event Detection. This task is a continuation of CASE 2021 that consists of four subtasks that are i) document classification, ii) sentence classification, iii) event sentence coreference identification, and iv) event extraction. The CASE 2022 extension consists of expanding the test data with more data in previously available… ▽ More We report results of the CASE 2022 Shared Task 1 on Multilingual Protest Event Detection. This task is a continuation of CASE 2021 that consists of four subtasks that are i) document classification, ii) sentence classification, iii) event sentence coreference identification, and iv) event extraction. The CASE 2022 extension consists of expanding the test data with more data in previously available languages, namely, English, Hindi, Portuguese, and Spanish, and adding new test data in Mandarin, Turkish, and Urdu for Sub-task 1, document classification. The training data from CASE 2021 in English, Portuguese and Spanish were utilized. Therefore, predicting document labels in Hindi, Mandarin, Turkish, and Urdu occurs in a zero-shot setting. The CASE 2022 workshop accepts reports on systems developed for predicting test data of CASE 2021 as well. We observe that the best systems submitted by CASE 2022 participants achieve between 79.71 and 84.06 F1-macro for new languages in a zero-shot setting. The winning approaches are mainly ensembling models and merging data in multiple languages. The best two submissions on CASE 2021 data outperform submissions from last year for Subtask 1 and Subtask 2 in all languages. Only the following scenarios were not outperformed by new submissions on CASE 2021: Subtask 3 Portuguese \& Subtask 4 English. △ Less

Submitted 21 November, 2022; originally announced November 2022.

Comments: To appear in CASE 2022 @ EMNLP 2022

arXiv:2211.06420 [pdf, other]

The Architectural Bottleneck Principle

Authors: Tiago Pimentel, Josef Valvoda, Niklas Stoehr, Ryan Cotterell

Abstract: In this paper, we seek to measure how much information a component in a neural network could extract from the representations fed into it. Our work stands in contrast to prior probing work, most of which investigates how much information a model's representations contain. This shift in perspective leads us to propose a new principle for probing, the architectural bottleneck principle: In order to… ▽ More In this paper, we seek to measure how much information a component in a neural network could extract from the representations fed into it. Our work stands in contrast to prior probing work, most of which investigates how much information a model's representations contain. This shift in perspective leads us to propose a new principle for probing, the architectural bottleneck principle: In order to estimate how much information a given component could extract, a probe should look exactly like the component. Relying on this principle, we estimate how much syntactic information is available to transformers through our attentional probe, a probe that exactly resembles a transformer's self-attention head. Experimentally, we find that, in three models (BERT, ALBERT, and RoBERTa), a sentence's syntax tree is mostly extractable by our probe, suggesting these models have access to syntactic information while composing their contextual representations. Whether this information is actually used by these models, however, remains an open question. △ Less

Submitted 11 November, 2022; originally announced November 2022.

Comments: Accepted at EMNLP 2022. Tiago Pimentel and Josef Valvoda contributed equally to this work. Code available in https://github.com/rycolab/attentional-probe

arXiv:2210.05257 [pdf, other]

Rethinking the Event Coding Pipeline with Prompt Entailment

Authors: Clément Lefebvre, Niklas Stoehr

Abstract: For monitoring crises, political events are extracted from the news. The large amount of unstructured full-text event descriptions makes a case-by-case analysis unmanageable, particularly for low-resource humanitarian aid organizations. This creates a demand to classify events into event types, a task referred to as event coding. Typically, domain experts craft an event type ontology, annotators l… ▽ More For monitoring crises, political events are extracted from the news. The large amount of unstructured full-text event descriptions makes a case-by-case analysis unmanageable, particularly for low-resource humanitarian aid organizations. This creates a demand to classify events into event types, a task referred to as event coding. Typically, domain experts craft an event type ontology, annotators label a large dataset and technical experts develop a supervised coding system. In this work, we propose PR-ENT, a new event coding approach that is more flexible and resource-efficient, while maintaining competitive accuracy: first, we extend an event description such as "Military injured two civilians'' by a template, e.g. "People were [Z]" and prompt a pre-trained (cloze) language model to fill the slot Z. Second, we select answer candidates Z* = {"injured'', "hurt"...} by treating the event description as premise and the filled templates as hypothesis in a textual entailment task. This allows domain experts to draft the codebook directly as labeled prompts and interpretable answer candidates. This human-in-the-loop process is guided by our interactive codebook design tool. We evaluate PR-ENT in several robustness checks: perturbing the event description and prompt template, restricting the vocabulary and removing contextual information. △ Less

Submitted 5 May, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

Comments: Revised from published version in the proceedings of the FEVER workshop at EACL 2023

arXiv:2210.03971 [pdf, other]

An Ordinal Latent Variable Model of Conflict Intensity

Authors: Niklas Stoehr, Lucas Torroba Hennigen, Josef Valvoda, Robert West, Ryan Cotterell, Aaron Schein

Abstract: Measuring the intensity of events is crucial for monitoring and tracking armed conflict. Advances in automated event extraction have yielded massive data sets of "who did what to whom" micro-records that enable data-driven approaches to monitoring conflict. The Goldstein scale is a widely-used expert-based measure that scores events on a conflictual-cooperative scale. It is based only on the actio… ▽ More Measuring the intensity of events is crucial for monitoring and tracking armed conflict. Advances in automated event extraction have yielded massive data sets of "who did what to whom" micro-records that enable data-driven approaches to monitoring conflict. The Goldstein scale is a widely-used expert-based measure that scores events on a conflictual-cooperative scale. It is based only on the action category ("what") and disregards the subject ("who") and object ("to whom") of an event, as well as contextual information, like associated casualty count, that should contribute to the perception of an event's "intensity". This paper takes a latent variable-based approach to measuring conflict intensity. We introduce a probabilistic generative model that assumes each observed event is associated with a latent intensity class. A novel aspect of this model is that it imposes an ordering on the classes, such that higher-valued classes denote higher levels of intensity. The ordinal nature of the latent variable is induced from naturally ordered aspects of the data (e.g., casualty counts) where higher values naturally indicate higher intensity. We evaluate the proposed model both intrinsically and extrinsically, showing that it obtains comparatively good held-out predictive performance. △ Less

Submitted 4 June, 2023; v1 submitted 8 October, 2022; originally announced October 2022.

Comments: Long Paper at ACL 2023

arXiv:2205.03608 [pdf, other]

UniMorph 4.0: Universal Morphology

Authors: Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Benoît Sagot, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay , et al. (71 additional authors not shown)

Abstract: The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. This pa… ▽ More The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. This paper presents the expansions and improvements made on several fronts over the last couple of years (since McCarthy et al. (2020)). Collaborative efforts by numerous linguists have added 67 new languages, including 30 endangered languages. We have implemented several improvements to the extraction pipeline to tackle some issues, e.g. missing gender and macron information. We have also amended the schema to use a hierarchical structure that is needed for morphological phenomena like multiple-argument agreement and case stacking, while adding some missing morphological features to make the schema more inclusive. In light of the last UniMorph release, we also augmented the database with morpheme segmentation for 16 languages. Lastly, this new release makes a push towards inclusion of derivational morphology in UniMorph by enriching the data and annotation schema with instances representing derivational processes from MorphyNet. △ Less

Submitted 19 June, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

Comments: LREC 2022; The first two authors made equal contributions

arXiv:2109.12860 [pdf, other]

Classifying Dyads for Militarized Conflict Analysis

Authors: Niklas Stoehr, Lucas Torroba Hennigen, Samin Ahbab, Robert West, Ryan Cotterell

Abstract: Understanding the origins of militarized conflict is a complex, yet important undertaking. Existing research seeks to build this understanding by considering bi-lateral relationships between entity pairs (dyadic causes) and multi-lateral relationships among multiple entities (systemic causes). The aim of this work is to compare these two causes in terms of how they correlate with conflict between… ▽ More Understanding the origins of militarized conflict is a complex, yet important undertaking. Existing research seeks to build this understanding by considering bi-lateral relationships between entity pairs (dyadic causes) and multi-lateral relationships among multiple entities (systemic causes). The aim of this work is to compare these two causes in terms of how they correlate with conflict between two entities. We do this by devising a set of textual and graph-based features which represent each of the causes. The features are extracted from Wikipedia and modeled as a large graph. Nodes in this graph represent entities connected by labeled edges representing ally or enemy-relationships. This allows casting the problem as an edge classification task, which we term dyad classification. We propose and evaluate classifiers to determine if a particular pair of entities are allies or enemies. Our results suggest that our systemic features might be slightly better correlates of conflict. Further, we find that Wikipedia articles of allies are semantically more similar than enemies. △ Less

Submitted 27 September, 2021; originally announced September 2021.

Comments: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)

arXiv:2107.12443 [pdf, other]

SeismographAPI: Visualising Temporal-Spatial Crisis Data

Authors: Raphael Lepuschitz, Niklas Stoehr

Abstract: Effective decision-making for crisis mitigation increasingly relies on visualisation of large amounts of data. While interactive dashboards are more informative than static visualisations, their development is far more time-demanding and requires a range of technical and financial capabilities. There are few open-source libraries available, which is blocking contributions from low-resource environ… ▽ More Effective decision-making for crisis mitigation increasingly relies on visualisation of large amounts of data. While interactive dashboards are more informative than static visualisations, their development is far more time-demanding and requires a range of technical and financial capabilities. There are few open-source libraries available, which is blocking contributions from low-resource environments and impeding rapid crisis responses. To address these limitations, we present SeismographAPI, an open-source library for visualising temporal-spatial crisis data on the country- and sub-country level in two use cases: Conflict Monitoring Map and Pandemic Monitoring Map. The library provides easy-to-use data connectors, broad functionality, clear documentation and run time-efficiency. △ Less

Submitted 26 July, 2021; originally announced July 2021.

Comments: Published in the KDD Workshop on Data-driven Humanitarian Map** held with the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '21), August 14, 2021

ACM Class: H.5.2; J.0

arXiv:2105.00997 [pdf, other]

Recovering Barabási-Albert Parameters of Graphs through Disentanglement

Authors: Cristina Guzman, Daphna Keidar, Tristan Meynier, Andreas Opedal, Niklas Stoehr

Abstract: Classical graph modeling approaches such as Erdős Rényi (ER) random graphs or Barabási-Albert (BA) graphs, here referred to as stylized models, aim to reproduce properties of real-world graphs in an interpretable way. While useful, graph generation with stylized models requires domain knowledge and iterative trial and error simulation. Previous work by Stoehr et al. (2019) addresses these issues b… ▽ More Classical graph modeling approaches such as Erdős Rényi (ER) random graphs or Barabási-Albert (BA) graphs, here referred to as stylized models, aim to reproduce properties of real-world graphs in an interpretable way. While useful, graph generation with stylized models requires domain knowledge and iterative trial and error simulation. Previous work by Stoehr et al. (2019) addresses these issues by learning the generation process from graph data, using a disentanglement-focused deep autoencoding framework, more specifically, a $β$-Variational Autoencoder ($β$-VAE). While they successfully recover the generative parameters of ER graphs through the model's latent variables, their model performs badly on sequentially generated graphs such as BA graphs, due to their oversimplified decoder. We focus on recovering the generative parameters of BA graphs by replacing their $β$-VAE decoder with a sequential one. We first learn the generative BA parameters in a supervised fashion using a Graph Neural Network (GNN) and a Random Forest Regressor, by minimizing the squared loss between the true generative parameters and the latent variables. Next, we train a $β$-VAE model, combining the GNN encoder from the first stage with an LSTM-based decoder with a customized loss. △ Less

Submitted 4 May, 2021; v1 submitted 3 May, 2021; originally announced May 2021.

Comments: Accepted at the 9th International Conference on Learning Representations (ICLR 2021), Workshop on Geometrical and Topological Representation Learning

arXiv:2104.12133 [pdf, other]

What About the Precedent: An Information-Theoretic Analysis of Common Law

Authors: Josef Valvoda, Tiago Pimentel, Niklas Stoehr, Ryan Cotterell, Simone Teufel

Abstract: In common law, the outcome of a new case is determined mostly by precedent cases, rather than by existing statutes. However, how exactly does the precedent influence the outcome of a new case? Answering this question is crucial for guaranteeing fair and consistent judicial decision-making. We are the first to approach this question computationally by comparing two longstanding jurisprudential view… ▽ More In common law, the outcome of a new case is determined mostly by precedent cases, rather than by existing statutes. However, how exactly does the precedent influence the outcome of a new case? Answering this question is crucial for guaranteeing fair and consistent judicial decision-making. We are the first to approach this question computationally by comparing two longstanding jurisprudential views; Halsbury's, who believes that the arguments of the precedent are the main determinant of the outcome, and Goodhart's, who believes that what matters most is the precedent's facts. We base our study on the corpus of legal cases from the European Court of Human Rights (ECtHR), which allows us to access not only the case itself, but also cases cited in the judges' arguments (i.e. the precedent cases). Taking an information-theoretic view, and modelling the question as a case outcome classification task, we find that the precedent's arguments share 0.38 nats of information with the case's outcome, whereas precedent's facts only share 0.18 nats of information (i.e., 58% less); suggesting Halsbury's view may be more accurate in this specific court. We found however in a qualitative analysis that there are specific statues where Goodhart's view dominates, and present some evidence these are the ones where the legal concept at hand is less straightforward. △ Less

Submitted 25 April, 2021; originally announced April 2021.

arXiv:2003.12432 [pdf, other]

The CoRisk-Index: A data-mining approach to identify industry-specific risk assessments related to COVID-19 in real-time

Authors: Fabian Stephany, Niklas Stoehr, Philipp Darius, Leonie Neuhäuser, Ole Teutloff, Fabian Braesemann

Abstract: While the coronavirus spreads, governments are attempting to reduce contagion rates at the expense of negative economic effects. Market expectations plummeted, foreshadowing the risk of a global economic crisis and mass unemployment. Governments provide huge financial aid programmes to mitigate the economic shocks. To achieve higher effectiveness with such policy measures, it is key to identify th… ▽ More While the coronavirus spreads, governments are attempting to reduce contagion rates at the expense of negative economic effects. Market expectations plummeted, foreshadowing the risk of a global economic crisis and mass unemployment. Governments provide huge financial aid programmes to mitigate the economic shocks. To achieve higher effectiveness with such policy measures, it is key to identify the industries that are most in need of support. In this study, we introduce a data-mining approach to measure industry-specific risks related to COVID-19. We examine company risk reports filed to the U.S. Securities and Exchange Commission (SEC). This alternative data set can complement more traditional economic indicators in times of the fast-evolving crisis as it allows for a real-time analysis of risk assessments. Preliminary findings suggest that the companies' awareness towards corona-related business risks is ahead of the overall stock market developments. Our approach allows to distinguish the industries by their risk awareness towards COVID-19. Based on natural language processing, we identify corona-related risk topics and their perceived relevance for different industries. The preliminary findings are summarised as an up-to-date online index. The CoRisk-Index tracks the industry-specific risk assessments related to the crisis, as it spreads through the economy. The tracking tool is updated weekly. It could provide relevant empirical data to inform models on the economic effects of the crisis. Such complementary empirical information could ultimately help policymakers to effectively target financial support in order to mitigate the economic shocks of the crisis. △ Less

Submitted 27 April, 2020; v1 submitted 27 March, 2020; originally announced March 2020.

arXiv:1912.10097 [pdf, other]

Mining the Automotive Industry: A Network Analysis of Corporate Positioning and Technological Trends

Authors: Niklas Stoehr, Fabian Braesemann, Michael Frommelt, Shi Zhou

Abstract: The digital transformation is driving revolutionary innovations and new market entrants threaten established sectors of the economy such as the automotive industry. Following the need for monitoring shifting industries, we present a network-centred analysis of car manufacturer web pages. Solely exploiting publicly-available information, we construct large networks from web pages and hyperlinks. Th… ▽ More The digital transformation is driving revolutionary innovations and new market entrants threaten established sectors of the economy such as the automotive industry. Following the need for monitoring shifting industries, we present a network-centred analysis of car manufacturer web pages. Solely exploiting publicly-available information, we construct large networks from web pages and hyperlinks. The network properties disclose the internal corporate positioning of the three largest automotive manufacturers, Toyota, Volkswagen and Hyundai with respect to innovative trends and their international outlook. We tag web pages concerned with topics like e-mobility and environment or autonomous driving, and investigate their relevance in the network. Sentiment analysis on individual web pages uncovers a relationship between page linking and use of positive language, particularly with respect to innovative trends. Web pages of the same country domain form clusters of different size in the network that reveal strong correlations with sales market orientation. Our approach maintains the web content's hierarchical structure imposed by the web page networks. It, thus, presents a method to reveal hierarchical structures of unstructured text content obtained from web scra**. It is highly transparent, reproducible and data driven, and could be used to gain complementary insights into innovative strategies of firms and competitive landscapes, which would not be detectable by the analysis of web content alone. △ Less

Submitted 8 January, 2020; v1 submitted 20 December, 2019; originally announced December 2019.

Comments: Preprint version to be published in Springer Nature (presented at CompleNet 2020)

arXiv:1910.05639 [pdf, other]

Disentangling Interpretable Generative Parameters of Random and Real-World Graphs

Authors: Niklas Stoehr, Emine Yilmaz, Marc Brockschmidt, Jan Stuehmer

Abstract: While a wide range of interpretable generative procedures for graphs exist, matching observed graph topologies with such procedures and choices for its parameters remains an open problem. Devising generative models that closely reproduce real-world graphs requires domain knowledge and time-consuming simulation. While existing deep learning approaches rely on less manual modelling, they offer littl… ▽ More While a wide range of interpretable generative procedures for graphs exist, matching observed graph topologies with such procedures and choices for its parameters remains an open problem. Devising generative models that closely reproduce real-world graphs requires domain knowledge and time-consuming simulation. While existing deep learning approaches rely on less manual modelling, they offer little interpretability. This work approaches graph generation (decoding) as the inverse of graph compression (encoding). We show that in a disentanglement-focused deep autoencoding framework, specifically Beta-Variational Autoencoders (Beta-VAE), choices of generative procedures and their parameters arise naturally in the latent space. Our model is capable of learning disentangled, interpretable latent variables that represent the generative parameters of procedurally generated random graphs and real-world graphs. The degree of disentanglement is quantitatively measured using the Mutual Information Gap (MIG). When training our Beta-VAE model on ER random graphs, its latent variables have a near one-to-one map** to the ER random graph parameters n and p. We deploy the model to analyse the correlation between graph topology and node attributes measuring their mutual dependence without handpicking topological properties. △ Less

Submitted 6 November, 2019; v1 submitted 12 October, 2019; originally announced October 2019.

Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Workshop on Graph Representation Learning

Showing 1–20 of 20 results for author: Stoehr, N