Showing 1–12 of 12 results for author: Daxenberger, J

Searching in archive cs.
  1. arXiv:2303.03053  [pdf, other]

    cs.CL

    Crowdsourcing on Sensitive Data with Privacy-Preserving Text Rewriting

    Authors: Nina Mouhammad, Johannes Daxenberger, Benjamin Schiller, Ivan Habernal

    Abstract: Most tasks in NLP require labeled data. Data labeling is often done on crowdsourcing platforms due to scalability reasons. However, publishing data on public platforms can only be done if no privacy-relevant information is included. Textual data often contains sensitive information like person names or locations. In this work, we investigate how removing personally identifiable information (PII) a…

    Submitted 6 March, 2023; originally announced March 2023.

  2. arXiv:2205.11472  [pdf, other]

    cs.CL

    Diversity Over Size: On the Effect of Sample and Topic Sizes for Argument Mining Datasets

    Authors: Benjamin Schiller, Johannes Daxenberger, Iryna Gurevych

    Abstract: The task of Argument Mining, that is extracting argumentative sentences for a specific topic from large document sources, is an inherently difficult task for machine learning models and humans alike, as large Argument Mining datasets are rare and recognition of argumentative sentences requires expert knowledge. The task becomes even more difficult if it also involves stance detection of retrieved…

    Submitted 15 July, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

  3. arXiv:2010.08240  [pdf, other]

    cs.CL

    Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks

    Authors: Nandan Thakur, Nils Reimers, Johannes Daxenberger, Iryna Gurevych

    Abstract: There are two approaches for pairwise sentence scoring: Cross-encoders, which perform full-attention over the input pair, and Bi-encoders, which map each input independently to a dense vector space. While cross-encoders often achieve higher performance, they are too slow for many practical use cases. Bi-encoders, on the other hand, require substantial training data and fine-tuning over the target…

    Submitted 12 April, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

    Comments: Accepted at NAACL 2021

  4. arXiv:2006.09109  [pdf, other]

    cs.CL

    How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation

    Authors: Steffen Eger, Johannes Daxenberger, Iryna Gurevych

    Abstract: Sentence encoders map sentences to real valued vectors for use in downstream applications. To peek into these representations - e.g., to increase interpretability of their results - probing tasks have been designed which query them for linguistic knowledge. However, designing probing tasks for lesser-resourced languages is tricky, because these often lack large-scale annotated data or (high-qualit…

    Submitted 28 October, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: Accepted for publication at CoNLL 2020

  5. arXiv:2005.00084  [pdf, other]

    cs.CL

    Aspect-Controlled Neural Argument Generation

    Authors: Benjamin Schiller, Johannes Daxenberger, Iryna Gurevych

    Abstract: We rely on arguments in our daily lives to deliver our opinions and base them on evidence, making them more convincing in turn. However, finding and formulating arguments can be challenging. In this work, we train a language model for argument generation that can be controlled on a fine-grained level to generate sentence-level arguments for a given topic, stance, and aspect. We define argument asp…

    Submitted 30 April, 2020; originally announced May 2020.

  6. arXiv:2001.01565  [pdf, other]

    cs.CL

    Stance Detection Benchmark: How Robust Is Your Stance Detection?

    Authors: Benjamin Schiller, Johannes Daxenberger, Iryna Gurevych

    Abstract: Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim and has become a key component in applications like fake news detection, claim validation, and argument search. However, while stance is easily detected by humans, machine learning models are clearly falling short of this task. Given the major differences in dataset sizes and framing of StD (e.g. number of cl…

    Submitted 6 January, 2020; originally announced January 2020.

  7. arXiv:1906.09821  [pdf, ps, other]

    cs.CL

    Classification and Clustering of Arguments with Contextualized Word Embeddings

    Authors: Nils Reimers, Benjamin Schiller, Tilman Beck, Johannes Daxenberger, Christian Stab, Iryna Gurevych

    Abstract: We experiment with two recent contextualized word embedding methods (ELMo and BERT) in the context of open-domain argument search. For the first time, we show how to leverage the power of contextualized word embeddings to classify and cluster topic-dependent arguments, achieving impressive results on both tasks and across multiple datasets. For argument classification, we improve the state-of-the-…

    Submitted 24 June, 2019; originally announced June 2019.

    Comments: Conference paper at ACL 2019

  8. arXiv:1904.09688  [pdf, other]

    cs.CL

    Fine-Grained Argument Unit Recognition and Classification

    Authors: Dietrich Trautmann, Johannes Daxenberger, Christian Stab, Hinrich Schütze, Iryna Gurevych

    Abstract: Prior work has commonly defined argument retrieval from heterogeneous document collections as a sentence-level classification task. Consequently, argument retrieval suffers both from low recall and from sentence segmentation errors making it difficult for humans and machines to consume the arguments. In this work, we argue that the task should be performed on a more fine-grained level of sequence…

    Submitted 21 November, 2019; v1 submitted 21 April, 2019; originally announced April 2019.

    Comments: AAAI 2020

  9. arXiv:1807.08998  [pdf, ps, other]

    cs.CL

    Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!

    Authors: Steffen Eger, Johannes Daxenberger, Christian Stab, Iryna Gurevych

    Abstract: Argumentation mining (AM) requires the identification of complex discourse structures and has lately been applied with success monolingually. In this work, we show that the existing resources are, however, not adequate for assessing cross-lingual AM, due to their heterogeneity or lack of complexity. We therefore create suitable parallel corpora by (human and machine) translating a popular AM datas…

    Submitted 24 July, 2018; originally announced July 2018.

    Comments: Accepted at COLING 2018

  10. arXiv:1804.04083  [pdf, other]

    cs.CL

    Multi-Task Learning for Argumentation Mining in Low-Resource Settings

    Authors: Claudia Schulz, Steffen Eger, Johannes Daxenberger, Tobias Kahse, Iryna Gurevych

    Abstract: We investigate whether and where multi-task learning (MTL) can improve performance on NLP problems related to argumentation mining (AM), in particular argument component identification. Our results show that MTL performs particularly well (and better than single-task learning) when little training data is available for the main task, a common scenario in AM. Our findings challenge previous assumpt…

    Submitted 4 May, 2018; v1 submitted 11 April, 2018; originally announced April 2018.

    Comments: Accepted at NAACL 2018

  11. What is the Essence of a Claim? Cross-Domain Claim Identification

    Authors: Johannes Daxenberger, Steffen Eger, Ivan Habernal, Christian Stab, Iryna Gurevych

    Abstract: Argument mining has become a popular research area in NLP. It typically includes the identification of argumentative components, e.g. claims, as the central component of an argument. We perform a qualitative analysis across six different datasets and show that these appear to conceptualize claims quite differently. To learn about the consequences of such different conceptualizations of claim for p…

    Submitted 13 September, 2017; v1 submitted 24 April, 2017; originally announced April 2017.

    Comments: Published at EMNLP 2017: http://www.aclweb.org/anthology/D/D17/D17-1217.pdf

  12. arXiv:1704.06104  [pdf, ps, other]

    cs.CL

    Neural End-to-End Learning for Computational Argumentation Mining

    Authors: Steffen Eger, Johannes Daxenberger, Iryna Gurevych

    Abstract: We investigate neural techniques for end-to-end computational argumentation mining (AM). We frame AM both as a token-based dependency parsing and as a token-based sequence tagging problem, including a multi-task learning setup. Contrary to models that operate on the argument component level, we find that framing AM as dependency parsing leads to subpar performance results. In contrast, less comple…

    Submitted 22 April, 2017; v1 submitted 20 April, 2017; originally announced April 2017.

    Comments: To be published at ACL 2017