Showing 1–6 of 6 results for author: Lell, N

  1. arXiv:2402.14579

    cs.CV cs.CL cs.LG

    Text Role Classification in Scientific Charts Using Multimodal Transformers

    Authors: Hye Jin Kim, Nicolas Lell, Ansgar Scherp

    Abstract: Text role classification involves classifying the semantic role of textual elements within scientific charts. For this task, we propose to finetune two pretrained multimodal document layout analysis models, LayoutLMv3 and UDOP, on chart datasets. The transformers utilize the three modalities of text, image, and layout as input. We further investigate whether data augmentation and balancing methods…

    Submitted 8 February, 2024; originally announced February 2024.

  2. arXiv:2306.10974

    cs.CL

    Fine-Tuning Language Models for Scientific Writing Support

    Authors: Justin Mücke, Daria Waldow, Luise Metzger, Philipp Schauz, Marcel Hoffmann, Nicolas Lell, Ansgar Scherp

    Abstract: We support scientific writers in determining whether a written sentence is scientific, to which section it belongs, and suggest paraphrasings to improve the sentence. Firstly, we propose a regression model trained on a corpus of scientific sentences extracted from peer-reviewed scientific papers and non-scientific text to assign a score that indicates the scientificness of a sentence. We investiga…

    Submitted 21 June, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

  3. The Split Matters: Flat Minima Methods for Improving the Performance of GNNs

    Authors: Nicolas Lell, Ansgar Scherp

    Abstract: When training a neural network, it is optimized using the available training data with the hope that it generalizes well to new or unseen testing data. At the same absolute value, a flat minimum in the loss landscape is presumed to generalize better than a sharp minimum. Methods for determining flat minima have been mostly researched for independent and identically distributed (i.i.d.) data such…

    Submitted 15 June, 2023; originally announced June 2023.

  4. Memorization of Named Entities in Fine-tuned BERT Models

    Authors: Andor Diera, Nicolas Lell, Aygul Garifullina, Ansgar Scherp

    Abstract: Privacy-preserving deep learning is an emerging field in machine learning that aims to mitigate the privacy risks in the use of deep neural networks. One such risk is training data extraction from language models that have been trained on datasets that contain personal and privacy-sensitive information. In our study, we investigate the extent of named entity memorization in fine-tuned BERT model…

    Submitted 10 October, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: accepted at CD-MAKE 2023

  5. arXiv:2211.13632

    cs.DL

    Reducing a Set of Regular Expressions and Analyzing Differences of Domain-specific Statistic Reporting

    Authors: Tobias Kalmbach, Marcel Hoffmann, Nicolas Lell, Ansgar Scherp

    Abstract: Due to the large number of scientific publications released daily, it is impossible to manually review each one. Therefore, an automatic extraction of key information is desirable. In this paper, we examine STEREO, a tool for extracting statistics from scientific papers using regular expressions. By adapting an existing regular expression inclusion algorithm for our use case, we decrease the number of regu…

    Submitted 25 March, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

  6. arXiv:2103.14124

    cs.DL

    STEREO: A Pipeline for Extracting Experiment Statistics, Conditions, and Topics from Scientific Papers

    Authors: Steffen Epp, Marcel Hoffmann, Nicolas Lell, Michael Mohr, Ansgar Scherp

    Abstract: A common writing style for statistical results is the set of recommendations of the American Psychological Association, known as APA style. However, in practice, writing styles vary, as reports do not fully follow APA style or omit parameters despite their being mandatory. In addition, the statistics are not reported in isolation but in the context of the experimental conditions investigated and the genera…

    Submitted 6 December, 2022; v1 submitted 25 March, 2021; originally announced March 2021.

    Comments: Paper accepted at iiWAS2021