Controlled Randomness Improves the Performance of Transformer Models

Authors: Tobias Deußer, Cong Zhao, Wolfgang Krämer, David Leonhard, Christian Bauckhage, Rafet Sifa

Abstract: During the pre-training step of natural language models, the main objective is to learn a general representation of the pre-training dataset, usually requiring large amounts of textual data to capture the complexity and diversity of natural language. Contrasting this, in most cases, the size of the data available to solve the specific downstream task is often dwarfed by the aforementioned pre-training dataset, especially in domains where data is scarce. We introduce controlled randomness, i.e. noise, into the training process to improve fine-tuning language models and explore the performance of targeted noise in addition to the parameters of these models. We find that adding such noise can improve the performance in our two downstream tasks of joint named entity recognition and relation extraction and text summarization.

Submitted 20 October, 2023; originally announced October 2023.

Comments: Accepted at ICMLA 2023, 10 pages, 2 tables

arXiv:2308.06111

Improving Zero-Shot Text Matching for Financial Auditing with Large Language Models

Authors: Lars Hillebrand, Armin Berger, Tobias Deußer, Tim Dilmaghani, Mohamed Khaled, Bernd Kliem, Rüdiger Loitz, Maren Pielka, David Leonhard, Christian Bauckhage, Rafet Sifa

Abstract: Auditing financial documents is a very tedious and time-consuming process. As of today, it can already be simplified by employing AI-based solutions to recommend relevant text passages from a report for each legal requirement of rigorous accounting standards. However, these methods need to be fine-tuned regularly, and they require abundant annotated data, which is often lacking in industrial environments. Hence, we present ZeroShotALI, a novel recommender system that leverages a state-of-the-art large language model (LLM) in conjunction with a domain-specifically optimized transformer-based text-matching solution. We find that a two-step approach of first retrieving a number of best matching document sections per legal requirement with a custom BERT-based model and second filtering these selections using an LLM yields significant performance improvements over existing approaches.

Submitted 14 August, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

Comments: Accepted at DocEng 2023, 4 pages, 1 figure, 2 tables

arXiv:2305.08711

doi 10.1145/3594536.3595131

sustain.AI: a Recommender System to analyze Sustainability Reports

Authors: Lars Hillebrand, Maren Pielka, David Leonhard, Tobias Deußer, Tim Dilmaghani, Bernd Kliem, Rüdiger Loitz, Milad Morad, Christian Temath, Thiago Bell, Robin Stenzel, Rafet Sifa

Abstract: We present sustainAI, an intelligent, context-aware recommender system that assists auditors and financial investors as well as the general public to efficiently analyze companies' sustainability reports. The tool leverages an end-to-end trainable architecture that couples a BERT-based encoding module with a multi-label classification head to match relevant text passages from sustainability reports to their respective law regulations from the Global Reporting Initiative (GRI) standards. We evaluate our model on two novel German sustainability reporting data sets and consistently achieve a significantly higher recommendation performance compared to multiple strong baselines. Furthermore, sustainAI is publicly available for everyone at https://sustain.ki.nrw/.

Submitted 26 May, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

Comments: Accepted at ICAIL 2023, 5 pages, 3 figures, 3 tables

ACM Class: H.3.3

arXiv:1408.3324

doi 10.1103/PhysRevA.91.012345

Universal entanglement decay in atmospheric turbulence

Authors: Nina D. Leonhard, Vyacheslav N. Shatokhin, Andreas Buchleitner

Abstract: We consider the propagation of two photonic qubits, initially maximally entangled in their orbital angular momenta (OAM), across a turbulent atmosphere. By introducing the phase correlation length of an OAM beam, we show that the entanglement of OAM photons exhibits a universal exponential decay under turbulence.

Submitted 14 August, 2014; originally announced August 2014.

Comments: 4 pages, 3 figures

Journal ref: Phys. Rev. A 91, 012345 (2015)

Showing 1–4 of 4 results for author: Leonhard, D