Showing 1–20 of 20 results for author: Strötgen, J

  1. arXiv:2406.18708

    cs.LG cs.CL

    Learn it or Leave it: Module Composition and Pruning for Continual Learning

    Authors: Mingyang Wang, Heike Adel, Lukas Lange, Jannik Strötgen, Hinrich Schütze

    Abstract: In real-world environments, continual learning is essential for machine learning models, as they need to acquire new knowledge incrementally without forgetting what they have already learned. While pretrained language models have shown impressive capabilities on various static tasks, applying them to continual learning poses significant challenges, including avoiding catastrophic forgetting, facil…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2404.07775

    cs.CL cs.AI cs.LG

    Discourse-Aware In-Context Learning for Temporal Expression Normalization

    Authors: Akash Kumar Gautam, Lukas Lange, Jannik Strötgen

    Abstract: Temporal expression (TE) normalization is a well-studied problem. However, the predominately used rule-based systems are highly restricted to specific settings, and upcoming machine learning approaches suffer from a lack of labeled data. In this work, we explore the feasibility of proprietary and open-source large language models (LLMs) for TE normalization using in-context learning to inject task…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Accepted at NAACL 2024

  3. arXiv:2404.00790

    cs.LG cs.CL

    Rehearsal-Free Modular and Compositional Continual Learning for Language Models

    Authors: Mingyang Wang, Heike Adel, Lukas Lange, Jannik Strötgen, Hinrich Schütze

    Abstract: Continual learning aims at incrementally acquiring new knowledge while not forgetting existing knowledge. To overcome catastrophic forgetting, methods are either rehearsal-based, i.e., store data examples from previous tasks for data replay, or isolate parameters dedicated to each task. However, rehearsal-based methods raise privacy and memory issues, and parameter-isolation continual learning doe…

    Submitted 31 March, 2024; originally announced April 2024.

  4. arXiv:2310.15269

    cs.LG cs.CL

    GradSim: Gradient-Based Language Grouping for Effective Multilingual Training

    Authors: Mingyang Wang, Heike Adel, Lukas Lange, Jannik Strötgen, Hinrich Schütze

    Abstract: Most languages of the world pose low-resource challenges to natural language processing models. With multilingual training, knowledge can be shared among languages. However, not all languages positively influence each other and it is an open research question how to select the most suitable set of languages for multilingual training and avoid negative interference among languages whose characteris…

    Submitted 23 October, 2023; originally announced October 2023.

  5. arXiv:2305.12717

    cs.CL cs.LG

    TADA: Efficient Task-Agnostic Domain Adaptation for Transformers

    Authors: Chia-Chien Hung, Lukas Lange, Jannik Strötgen

    Abstract: Intermediate training of pre-trained transformer-based language models on domain-specific data leads to substantial gains for downstream tasks. To increase efficiency and prevent catastrophic forgetting alleviated from full domain-adaptive pre-training, approaches such as adapters have been developed. However, these require additional parameters for each layer, and are criticized for their limited…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: ACL-Findings 2023

  6. NLNDE at SemEval-2023 Task 12: Adaptive Pretraining and Source Language Selection for Low-Resource Multilingual Sentiment Analysis

    Authors: Mingyang Wang, Heike Adel, Lukas Lange, Jannik Strötgen, Hinrich Schütze

    Abstract: This paper describes our system developed for the SemEval-2023 Task 12 "Sentiment Analysis for Low-resource African Languages using Twitter Dataset". Sentiment analysis is one of the most widely studied applications in natural language processing. However, most prior work still focuses on a small number of high-resource languages. Building reliable sentiment analysis systems for low-resource langu…

    Submitted 28 April, 2023; originally announced May 2023.

  7. arXiv:2205.10399

    cs.CL cs.LG

    Multilingual Normalization of Temporal Expressions with Masked Language Models

    Authors: Lukas Lange, Jannik Strötgen, Heike Adel, Dietrich Klakow

    Abstract: The detection and normalization of temporal expressions is an important task and preprocessing step for many applications. However, prior work on normalization is rule-based, which severely limits the applicability in real-world multilingual settings, due to the costly creation of new rules. We propose a novel neural method for normalizing temporal expressions based on masked language modeling. Ou…

    Submitted 10 February, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Accepted at EACL 2023

  8. CLIN-X: pre-trained language models and a study on cross-task transfer for concept extraction in the clinical domain

    Authors: Lukas Lange, Heike Adel, Jannik Strötgen, Dietrich Klakow

    Abstract: The field of natural language processing (NLP) has recently seen a large change towards using pre-trained language models for solving almost any task. Despite showing great improvements in benchmark datasets for various tasks, these models often perform sub-optimal in non-standard domains like the clinical domain where a large gap between pre-training documents and target documents is observed. In…

    Submitted 20 May, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: This article has been accepted for publication in Bioinformatics ©: 2022 The Author(s). Published by Oxford University Press. All rights reserved. The published manuscript can be found here: https://doi.org/10.1093/bioinformatics/btac297

  9. arXiv:2109.08597

    cs.CL cs.LG

    Boosting Transformers for Job Expression Extraction and Classification in a Low-Resource Setting

    Authors: Lukas Lange, Heike Adel, Jannik Strötgen

    Abstract: In this paper, we explore possible improvements of transformer models in a low-resource setting. In particular, we present our approaches to tackle the first two of three subtasks of the MEDDOPROF competition, i.e., the extraction and classification of job expressions in Spanish clinical texts. As neither language nor domain experts, we experiment with the multilingual XLM-R transformer model and…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: Published at IberLEF 2021. Best system of the NER and CLASS tracks of the MEDDOPROF shared task

  10. arXiv:2104.10899

    cs.CL

    Enriched Attention for Robust Relation Extraction

    Authors: Heike Adel, Jannik Strötgen

    Abstract: The performance of relation extraction models has increased considerably with the rise of neural networks. However, a key issue of neural relation extraction is robustness: the models do not scale well to long sentences with multiple entities and relations. In this work, we address this problem with an enriched attention mechanism. Attention allows the model to focus on parts of the input sentence…

    Submitted 22 April, 2021; originally announced April 2021.

  11. arXiv:2104.08078

    cs.CL cs.LG

    To Share or not to Share: Predicting Sets of Sources for Model Transfer Learning

    Authors: Lukas Lange, Jannik Strötgen, Heike Adel, Dietrich Klakow

    Abstract: In low-resource settings, model transfer can help to overcome a lack of labeled data for many tasks and domains. However, predicting useful transfer sources is a challenging problem, as even the most similar sources might lead to unexpected negative transfer results. Thus, ranking methods based on task and text similarity -- as suggested in prior work -- may not be sufficient to identify promising…

    Submitted 29 October, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: Accepted at EMNLP 2021

  12. arXiv:2010.12322

    cs.CL cs.LG

    NLNDE at CANTEMIST: Neural Sequence Labeling and Parsing Approaches for Clinical Concept Extraction

    Authors: Lukas Lange, Xiang Dai, Heike Adel, Jannik Strötgen

    Abstract: The recognition and normalization of clinical information, such as tumor morphology mentions, is an important, but complex process consisting of multiple subtasks. In this paper, we describe our system for the CANTEMIST shared task, which is able to extract, normalize and rank ICD codes from Spanish electronic health records using neural sequence labeling and parsing approaches with context-aware…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: IberLEF 2020

  13. arXiv:2010.12309

    cs.CL cs.LG

    A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios

    Authors: Michael A. Hedderich, Lukas Lange, Heike Adel, Jannik Strötgen, Dietrich Klakow

    Abstract: Deep neural networks and huge language models are becoming omnipresent in natural language applications. As they are known for requiring large amounts of training data, there is a growing body of work to improve the performance in low-resource settings. Motivated by the recent fundamental changes towards neural models and the popular pre-train and fine-tune paradigm, we survey promising approaches…

    Submitted 9 April, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: Accepted at NAACL 2021

  14. arXiv:2010.12305

    cs.CL cs.LG

    FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations

    Authors: Lukas Lange, Heike Adel, Jannik Strötgen, Dietrich Klakow

    Abstract: Combining several embeddings typically improves performance in downstream tasks as different embeddings encode different information. It has been shown that even models using embeddings from transformers still benefit from the inclusion of standard word embeddings. However, the combination of embeddings of different types and dimensions is challenging. As an alternative to attention-based meta-emb…

    Submitted 29 October, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: Accepted at EMNLP 2021

  15. arXiv:2007.01030

    cs.CL cs.LG

    NLNDE: The Neither-Language-Nor-Domain-Experts' Way of Spanish Medical Document De-Identification

    Authors: Lukas Lange, Heike Adel, Jannik Strötgen

    Abstract: Natural language processing has huge potential in the medical domain which recently led to a lot of research in this field. However, a prerequisite of secure processing of medical documents, e.g., patient notes and clinical trials, is the proper de-identification of privacy-sensitive information. In this paper, we describe our NLNDE system, with which we participated in the MEDDOCAN competition, t…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: Published at IberLEF 2019. Winning System of the MEDDOCAN shared task

  16. NLNDE: Enhancing Neural Sequence Taggers with Attention and Noisy Channel for Robust Pharmacological Entity Detection

    Authors: Lukas Lange, Heike Adel, Jannik Strötgen

    Abstract: Named entity recognition has been extensively studied on English news texts. However, the transfer to other domains and languages is still a challenging problem. In this paper, we describe the system with which we participated in the first subtrack of the PharmaCoNER competition of the BioNLP Open Shared Tasks 2019. Aiming at pharmacological entity detection in Spanish texts, the task provides a n…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: Published at BioNLP-OST@EMNLP 2019

  17. arXiv:2005.09397

    cs.CL cs.LG

    Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain

    Authors: Lukas Lange, Heike Adel, Jannik Strötgen

    Abstract: Exploiting natural language processing in the clinical domain requires de-identification, i.e., anonymization of personal information in texts. However, current research considers de-identification and downstream tasks, such as concept extraction, only in isolation and does not study the effects of de-identification on other tasks. In this paper, we close this gap by reporting concept extraction p…

    Submitted 19 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  18. arXiv:2005.09392

    cs.CL cs.LG

    Adversarial Alignment of Multilingual Models for Extracting Temporal Expressions from Text

    Authors: Lukas Lange, Anastasiia Iurshina, Heike Adel, Jannik Strötgen

    Abstract: Although temporal tagging is still dominated by rule-based systems, there have been recent attempts at neural temporal taggers. However, all of them focus on monolingual settings. In this paper, we explore multilingual methods for the extraction of temporal expressions from text and investigate adversarial training for aligning embedding spaces to one common space. With this, we create a single mu…

    Submitted 19 May, 2020; originally announced May 2020.

    Comments: RepL4NLP at ACL 2020

  19. arXiv:2005.09389

    cs.CL cs.LG

    On the Choice of Auxiliary Languages for Improved Sequence Tagging

    Authors: Lukas Lange, Heike Adel, Jannik Strötgen

    Abstract: Recent work showed that embeddings from related languages can improve the performance of sequence tagging, even for monolingual models. In this analysis paper, we investigate whether the best auxiliary language can be predicted based on language distances and show that the most related language is not always the best auxiliary language. Further, we show that attention-based meta-embeddings can eff…

    Submitted 19 May, 2020; originally announced May 2020.

    Comments: RepL4NLP at ACL 2020

  20. TEQUILA: Temporal Question Answering over Knowledge Bases

    Authors: Zhen Jia, Abdalghani Abujabal, Rishiraj Saha Roy, Jannik Stroetgen, Gerhard Weikum

    Abstract: Question answering over knowledge bases (KB-QA) poses challenges in handling complex questions that need to be decomposed into sub-questions. An important case, addressed here, is that of temporal questions, where cues for temporal relations need to be discovered and handled. We present TEQUILA, an enabler method for temporal QA that can run on top of any KB-QA engine. TEQUILA has four stages. It…

    Submitted 25 January, 2021; v1 submitted 9 August, 2019; originally announced August 2019.

    Comments: CIKM 2018 Short Paper

    Journal ref: CIKM 2018