Showing 1–4 of 4 results for author: Wankmüller, S

  1. arXiv:2209.06790

    cs.CL

    Drawing Causal Inferences About Performance Effects in NLP

    Authors: Sandra Wankmüller

    Abstract: This article emphasizes that NLP as a science seeks to make inferences about the performance effects that result from applying one method (compared to another method) in the processing of natural language. Yet NLP research in practice usually does not achieve this goal: In NLP research articles, typically only a few models are compared. Each model results from a specific procedural pipeline (here…

    Submitted 14 September, 2022; originally announced September 2022.

    Comments: 15 pages

  2. arXiv:2205.01600

    cs.IR cs.CL stat.AP stat.ML

    A Comparison of Approaches for Imbalanced Classification Problems in the Context of Retrieving Relevant Documents for an Analysis

    Authors: Sandra Wankmüller

    Abstract: One of the first steps in many text-based social science studies is to retrieve documents that are relevant for the analysis from large corpora of otherwise irrelevant documents. The conventional approach in social science to address this retrieval task is to apply a set of keywords and to consider those documents to be relevant that contain at least one of the keywords. But the application of inc…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: 78 pages, 17 figures, 9 tables

    ACM Class: I.2.7; H.3.3

  3. arXiv:2104.02496

    cs.CL cs.LG stat.ML

    Exploring Topic-Metadata Relationships with the STM: A Bayesian Approach

    Authors: P. Schulze, S. Wiegrebe, P. W. Thurner, C. Heumann, M. Aßenmacher, S. Wankmüller

    Abstract: Topic models such as the Structural Topic Model (STM) estimate latent topical clusters within text. An important step in many topic modeling applications is to explore relationships between the discovered topical structure and metadata associated with the text documents. Methods used to estimate such relationships must take into account that the topical structure is not directly observed, but inst…

    Submitted 6 April, 2021; originally announced April 2021.

    Comments: 8 pages, 4 figures

  4. arXiv:2102.02111

    cs.CL stat.AP

    Introduction to Neural Transfer Learning with Transformers for Social Science Text Analysis

    Authors: Sandra Wankmüller

    Abstract: Transformer-based models for transfer learning have the potential to achieve high prediction accuracies on text-based supervised learning tasks with relatively few training data instances. These models are thus likely to benefit social scientists that seek to have as accurate as possible text-based measures but only have limited resources for annotating training data. To enable social scientists t…

    Submitted 31 August, 2022; v1 submitted 3 February, 2021; originally announced February 2021.

    Comments: 80 pages, 12 figures; changed the title; more focused presentation of contents; moved contents to the appendix; created a new Figure 9; discussion of additional aspects (zero-shot learning, cross-lingual learning, interpretability, foundation models); removed old Figures 4 and 5; made non-essential changes to Figures 1, 2, 4, 6, 7, 8 and 10; changed notation. The original results are unchanged.

    ACM Class: I.2.7