Showing 1–50 of 163 results for author: Färber, M

  1. arXiv:2406.04866  [pdf, other]

    cs.CL

    ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering

    Authors: Raphael Gruber, Abdelrahman Abdallah, Michael Färber, Adam Jatowt

    Abstract: We introduce ComplexTempQA, a large-scale dataset consisting of over 100 million question-answer pairs designed to tackle the challenges in temporal question answering. ComplexTempQA significantly surpasses existing benchmarks like HOTPOTQA, TORQUE, and TEQUILA in scale and scope. Utilizing data from Wikipedia and Wikidata, the dataset covers questions spanning over two decades and offers an unmatc…

    Submitted 7 June, 2024; originally announced June 2024.

  2. arXiv:2405.09557  [pdf, other]

    eess.SP cs.LG

    Machine Learning in Short-Reach Optical Systems: A Comprehensive Survey

    Authors: Chen Shao, Elias Giacoumidis, Syed Moktacim Billah, Shi Li, Jialei Li, Prashasti Sahu, Andre Richter, Tobias Kaefer, Michael Faerber

    Abstract: In recent years, extensive research has been conducted to explore the utilization of machine learning algorithms in various direct-detected and self-coherent short-reach communication applications. These applications encompass a wide range of tasks, including bandwidth request prediction, signal quality monitoring, fault detection, traffic prediction, and digital signal processing (DSP)-based equa…

    Submitted 29 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 23 pages, 2 figures, 3 tables. Accepted for the MDPI Photonics journal Special Issue "Machine Learning Applied to Optical Communication Systems"

  3. arXiv:2405.02609  [pdf, other]

    cs.LG

    Advanced Equalization in 112 Gb/s Upstream PON Using a Novel Fourier Convolution-based Network

    Authors: Chen Shao, Elias Giacoumidis, Patrick Matalla, Jialei Li, Shi Li, Sebastian Randel, Andre Richter, Michael Faerber, Tobias Kaefer

    Abstract: We experimentally demonstrate a novel, low-complexity Fourier Convolution-based Network (FConvNet) based equalizer for 112 Gb/s upstream PAM4-PON. At a BER of 0.005, FConvNet enhances the receiver sensitivity by 2 and 1 dB compared to a 51-tap Sato equalizer and benchmark machine learning algorithms respectively.

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 4 pages, 5 figures

  4. arXiv:2405.00720  [pdf, other]

    eess.SP cs.LG

    A Novel Machine Learning-based Equalizer for a Downstream 100G PAM-4 PON

    Authors: Chen Shao, Elias Giacoumidis, Shi Li, Jialei Li, Michael Faerber, Tobias Kaefer, Andre Richter

    Abstract: A frequency-calibrated SCINet (FC-SCINet) equalizer is proposed for down-stream 100G PON with 28.7 dB path loss. At 5 km, FC-SCINet improves the BER by 88.87% compared to FFE and a 3-layer DNN with 10.57% lower complexity.

    Submitted 25 April, 2024; originally announced May 2024.

    Comments: 3 pages, 6 figures, accepted by Optical Fiber Communications Conference and Exhibition 2024

  5. arXiv:2404.06911  [pdf, other]

    cs.CL

    GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism

    Authors: Shuzhou Yuan, Michael Färber

    Abstract: Pretrained Language Models (PLMs) benefit from external knowledge stored in graph structures for various downstream tasks. However, bridging the modality gap between graph structures and text remains a significant challenge. Traditional methods like linearizing graphs for PLMs lose vital graph connectivity, whereas Graph Neural Networks (GNNs) require cumbersome processes for integration into PLMs…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 Findings

  6. arXiv:2403.20132  [pdf]

    cs.LO cs.PL

    A formal specification of the jq language

    Authors: Michael Färber

    Abstract: jq is a widely used tool that provides a programming language to manipulate JSON data. However, the jq language is currently only specified by its implementation, making it difficult to reason about its behaviour. To this end, we provide a formal syntax and denotational semantics for a large subset of the jq language. Our most significant contribution is to provide a new way to interpret updates t…

    Submitted 29 March, 2024; originally announced March 2024.

    ACM Class: D.3.1

  7. arXiv:2403.16846  [pdf, other]

    cs.LG cs.AI

    GreeDy and CoDy: Counterfactual Explainers for Dynamic Graphs

    Authors: Zhan Qu, Daniel Gomm, Michael Färber

    Abstract: Temporal Graph Neural Networks (TGNNs), crucial for modeling dynamic graphs with time-varying interactions, face a significant challenge in explainability due to their complex model structure. Counterfactual explanations, crucial for understanding model decisions, examine how input graph changes affect outcomes. This paper introduces two novel counterfactual explanation methods for TGNNs: GreeDy (…

    Submitted 25 March, 2024; originally announced March 2024.

  8. arXiv:2403.11747  [pdf, other]

    cs.CL

    Embedded Named Entity Recognition using Probing Classifiers

    Authors: Nicholas Popovič, Michael Färber

    Abstract: Extracting semantic information from generated text is a useful tool for applications such as automated fact checking or retrieval augmented generation. Currently, this requires either separate models during inference, which increases computational cost, or destructive fine-tuning of the language model. Instead, we propose directly embedding information extraction capabilities into pre-trained lan…

    Submitted 18 March, 2024; originally announced March 2024.

  9. arXiv:2402.18397  [pdf, other]

    cs.CL

    Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models

    Authors: Ercong Nie, Shuzhou Yuan, Bolei Ma, Helmut Schmid, Michael Färber, Frauke Kreuter, Hinrich Schütze

    Abstract: Despite the predominance of English in their training data, English-centric Large Language Models (LLMs) like GPT-3 and LLaMA display a remarkable ability to perform multilingual tasks, raising questions about the depth and nature of their cross-lingual capabilities. This paper introduces the decomposed prompting approach to probe the linguistic structure understanding of these LLMs in sequence la…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 18 pages, 7 figures

  10. arXiv:2402.11709  [pdf, other]

    cs.CL cs.AI

    GNNavi: Navigating the Information Flow in Large Language Models by Graph Neural Network

    Authors: Shuzhou Yuan, Ercong Nie, Michael Färber, Helmut Schmid, Hinrich Schütze

    Abstract: Large Language Models (LLMs) exhibit strong In-Context Learning (ICL) capabilities when prompts with demonstrations are used. However, fine-tuning still remains crucial to further enhance their adaptability. Prompt-based fine-tuning proves to be an effective fine-tuning method in low-data scenarios, but high demands on computing resources limit its practicality. We address this issue by introducin…

    Submitted 7 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: ACL2024 Findings

  11. arXiv:2402.11700  [pdf, other]

    cs.CL

    Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers

    Authors: Shuzhou Yuan, Ercong Nie, Bolei Ma, Michael Färber

    Abstract: Large Language Models (LLMs) possess outstanding capabilities in addressing various natural language processing (NLP) tasks. However, the sheer size of these models poses challenges in terms of storage, training and inference due to the inclusion of billions of parameters through layer stacking. While traditional approaches such as model pruning or distillation offer ways for reducing model size,…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 6 pages, 2 figures

  12. arXiv:2401.16589  [pdf, other]

    cs.CL

    ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks

    Authors: Bolei Ma, Ercong Nie, Shuzhou Yuan, Helmut Schmid, Michael Färber, Frauke Kreuter, Hinrich Schütze

    Abstract: Prompt-based methods have been successfully applied to multilingual pretrained language models for zero-shot cross-lingual understanding. However, most previous studies primarily focused on sentence-level classification tasks, and only a few considered token-level labeling tasks such as Named Entity Recognition (NER) and Part-of-Speech (POS) tagging. In this paper, we propose Token-Level Prompt De…

    Submitted 13 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: EACL 2024

  13. HyperPIE: Hyperparameter Information Extraction from Scientific Publications

    Authors: Tarek Saier, Mayumi Ohta, Takuto Asakura, Michael Färber

    Abstract: Automatic extraction of information from publications is key to making scientific knowledge machine readable at a large scale. The extracted information can, for example, facilitate academic search, decision making, and knowledge graph construction. An important type of information not covered by existing approaches is hyperparameters. In this paper, we formalize and tackle hyperparameter informat…

    Submitted 10 January, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: accepted at ECIR2024

  14. arXiv:2312.01124  [pdf, ps, other]

    math.AT math.GT

    Sequential topological complexity of aspherical spaces and sectional categories of subgroup inclusions

    Authors: Arturo Espinosa Baro, Michael Farber, Stephan Mescher, John Oprea

    Abstract: We generalize results from topological robotics on the topological complexity (TC) of aspherical spaces to sectional categories of fibrations inducing subgroup inclusions on the level of fundamental groups. In doing so, we establish new lower bounds on sequential TCs of aspherical spaces as well as the parametrized TC of epimorphisms. Moreover, we generalize the Costa-Farber canonical class for TC…

    Submitted 8 December, 2023; v1 submitted 2 December, 2023; originally announced December 2023.

    Comments: 40 pages

    MSC Class: 55M30 (68T40; 20J05)

  15. arXiv:2310.20475  [pdf, other]

    cs.DL cs.AI

    Linked Papers With Code: The Latest in Machine Learning as an RDF Knowledge Graph

    Authors: Michael Färber, David Lamprecht

    Abstract: In this paper, we introduce Linked Papers With Code (LPWC), an RDF knowledge graph that provides comprehensive, current information about almost 400,000 machine learning publications. This includes the tasks addressed, the datasets utilized, the methods implemented, and the evaluations conducted, along with their results. Compared to its non-RDF-based counterpart Papers With Code, LPWC not only tr…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: Published at ISWC'23

  16. arXiv:2310.20444  [pdf, other]

    cs.AI cs.DL cs.IR

    Analyzing the Impact of Companies on AI Research Based on Publications

    Authors: Michael Färber, Lazaros Tampakis

    Abstract: Artificial Intelligence (AI) is one of the most momentous technologies of our time. Thus, it is of major importance to know which stakeholders influence AI research. Besides researchers at universities and colleges, researchers in companies have hardly been considered in this context. In this article, we consider how the influence of companies on AI research can be made measurable on the basis of…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: Published in Scientometrics

  17. arXiv:2309.04797  [pdf, other]

    cs.SE cs.AI cs.LG

    A Full-fledged Commit Message Quality Checker Based on Machine Learning

    Authors: David Faragó, Michael Färber, Christian Petrov

    Abstract: Commit messages (CMs) are an essential part of version control. By providing important context in regard to what has changed and why, they strongly support software maintenance and evolution. But writing good CMs is difficult and often neglected by developers. So far, there is no tool suitable for practice that automatically assesses how well a CM is written, including its meaning and context. Sin…

    Submitted 9 September, 2023; originally announced September 2023.

    Comments: published at COMPSAC'23

  18. arXiv:2308.10595  [pdf, other]

    math.AT

    Sequential parametrized topological complexity of sphere bundles

    Authors: Michael Farber, Amit Kumar Paul

    Abstract: Autonomous motion of a system (robot) is controlled by a motion planning algorithm. A sequential parametrized motion planning algorithm \cite{FP22} works under variable external conditions and generates continuous motions of the system to attain the prescribed sequence of states at prescribed moments of time. Topological complexity of such algorithms characterises their structure and discontinuiti…

    Submitted 21 August, 2023; originally announced August 2023.

    MSC Class: 55M30

  19. arXiv:2308.03671  [pdf, other]

    cs.DL cs.AI

    SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples

    Authors: Michael Färber, David Lamprecht, Johan Krause, Linn Aung, Peter Haase

    Abstract: We present SemOpenAlex, an extensive RDF knowledge graph that contains over 26 billion triples about scientific publications and their associated entities, such as authors, institutions, journals, and concepts. SemOpenAlex is licensed under CC0, providing free and open access to the data. We offer the data through multiple channels, including RDF dump files, a SPARQL endpoint, and as a data source…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: accepted at ISWC'23

  20. arXiv:2308.03531  [pdf, other]

    cs.CL cs.AI

    Measuring Variety, Balance, and Disparity: An Analysis of Media Coverage of the 2021 German Federal Election

    Authors: Michael Färber, Jannik Schwade, Adam Jatowt

    Abstract: Determining and measuring diversity in news articles is important for a number of reasons, including preventing filter bubbles and fueling public discourse, especially before elections. So far, the identification and analysis of diversity have been illuminated in a variety of ways, such as measuring the overlap of words or topics between news articles related to US elections. However, the question…

    Submitted 7 August, 2023; originally announced August 2023.

  21. arXiv:2308.03519  [pdf, other]

    cs.CL cs.AI

    Vocab-Expander: A System for Creating Domain-Specific Vocabularies Based on Word Embeddings

    Authors: Michael Färber, Nicholas Popovic

    Abstract: In this paper, we propose Vocab-Expander at https://vocab-expander.com, an online tool that enables end-users (e.g., technology scouts) to create and expand a vocabulary of their domain of interest. It utilizes an ensemble of state-of-the-art word embedding techniques based on web text and ConceptNet, a common-sense knowledge base, to suggest related terms for already given terms. The system has a…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: accepted at RANLP'23

  22. arXiv:2307.14712  [pdf, other]

    cs.CL cs.AI

    Evaluating Generative Models for Graph-to-Text Generation

    Authors: Shuzhou Yuan, Michael Färber

    Abstract: Large language models (LLMs) have been widely employed for graph-to-text generation tasks. However, the process of finetuning LLMs requires significant training resources and annotation work. In this paper, we explore the capability of generative models to generate descriptive text from graph data in a zero-shot setting. Specifically, we evaluate GPT-3 and ChatGPT on two graph-to-text datasets and…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: Accepted as short paper in RANLP2023

  23. CoCon: A Data Set on Combined Contextualized Research Artifact Use

    Authors: Tarek Saier, Youxiang Dong, Michael Färber

    Abstract: In the wake of information overload in academia, methodologies and systems for search, recommendation, and prediction to aid researchers in identifying relevant research are actively studied and developed. Existing work, however, is limited in terms of granularity, focusing only on the level of papers or a single type of artifact, such as data sets. To enable more holistic analyses and systems dea…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: submitted to JCDL2023

  24. unarXive 2022: All arXiv Publications Pre-Processed for NLP, Including Structured Full-Text and Citation Network

    Authors: Tarek Saier, Johan Krause, Michael Färber

    Abstract: Large-scale data sets on scholarly publications are the basis for a variety of bibliometric analyses and natural language processing (NLP) applications. Especially data sets derived from publications' full text have recently gained attention. While several such data sets already exist, we see key shortcomings in terms of their domain and time coverage, citation network completeness, and representa…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: submitted to JCDL2023

  25. arXiv:2302.10576  [pdf, other]

    cs.LO cs.PL

    Denotational Semantics and a Fast Interpreter for jq

    Authors: Michael Färber

    Abstract: jq is a widely used tool that provides a programming language to manipulate JSON data. However, its semantics are currently only specified by its implementation, making it difficult to reason about its behaviour. To this end, I provide a syntax and denotational semantics for a subset of the jq language. In particular, the semantics provide a new way to interpret updates. I implement an extended ve…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Submitted to OOPSLA 2023

  26. arXiv:2301.07809  [pdf, other]

    math.PR math.CO

    A Random Graph Growth Model

    Authors: Michael Farber, Alexander Gnedin, Wajid Mannan

    Abstract: A growing random graph is constructed by successively sampling without replacement an element from the pool of virtual vertices and edges. At the start of the process the pool contains $N$ virtual vertices and no edges. Each time a vertex is sampled and occupied, the edges linking the vertex to previously occupied vertices are added to the pool of virtual elements. We focus on the edge-counting at tim…

    Submitted 18 January, 2023; originally announced January 2023.

    Comments: 21 pages, 1 figure

    MSC Class: 05C80; 60B20

    Journal ref: Bulletin of the London Mathematical Society 56, Issue 2 (2024) pp. 662-680

  27. arXiv:2301.07483  [pdf, other]

    cs.IR cs.AI

    Biases in Scholarly Recommender Systems: Impact, Prevalence, and Mitigation

    Authors: Michael Färber, Melissa Coutinho, Shuzhou Yuan

    Abstract: With the remarkable increase in the number of scientific entities such as publications, researchers, and scientific topics, and the associated information overload in science, academic recommender systems have become increasingly important for millions of researchers and science enthusiasts. However, it is often overlooked that these systems are subject to various biases. In this article, we first…

    Submitted 13 February, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

    Comments: 44 pages, 6 figures. To be published in Scientometrics

  28. arXiv:2301.07404  [pdf, other]

    math.CO math.AT

    Large simplicial complexes: Universality, Randomness, and Ampleness

    Authors: Michael Farber

    Abstract: The paper surveys recent progress in understanding geometric, topological and combinatorial properties of large simplicial complexes, focusing mainly on ampleness, connectivity and universality. In the first part of the paper we concentrate on $r$-ample simplicial complexes which are high dimensional analogues of the $r$-e.c. graphs introduced originally by Erdős and Rényi. The class of $r$-amp…

    Submitted 18 January, 2023; originally announced January 2023.

    Comments: The Appendix was written by J.A. Barmak. arXiv admin note: text overlap with arXiv:2012.01483

  29. Impact, Attention, Influence: Early Assessment of Autonomous Driving Datasets

    Authors: Daniel Bogdoll, Jonas Hendl, Felix Schreyer, Nishanth Gowda, Michael Färber, J. Marius Zöllner

    Abstract: Autonomous Driving (AD), the area of robotics with the greatest potential impact on society, has gained a lot of momentum in the last decade. As a result of this, the number of datasets in AD has increased rapidly. Creators and users of datasets can benefit from a better understanding of developments in the field. While scientometric analysis has been conducted in other fields, it rarely revolves…

    Submitted 31 March, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: Daniel Bogdoll and Jonas Hendl contributed equally. Accepted for publication at ICCRE 2023

  30. arXiv:2212.11765  [pdf, other]

    q-fin.GN cs.IR cs.LG

    Predicting Companies' ESG Ratings from News Articles Using Multivariate Timeseries Analysis

    Authors: Tanja Aue, Adam Jatowt, Michael Färber

    Abstract: Environmental, social and governance (ESG) engagement of companies moved into the focus of public attention over recent years. With the requirements of compulsory reporting being implemented and investors incorporating sustainability in their investment decisions, the demand for transparent and reliable ESG ratings is increasing. However, automatic approaches for forecasting ESG ratings have been…

    Submitted 13 November, 2022; originally announced December 2022.

  31. arXiv:2212.01091  [pdf, other]

    cs.RO math.AT

    Sequential parametrized motion planning and its complexity, II

    Authors: Michael Farber, Amit Kumar Paul

    Abstract: This is a continuation of our recent paper in which we developed the theory of sequential parametrized motion planning. A sequential parametrized motion planning algorithm produces a motion of the system which is required to visit a prescribed sequence of states, in a certain order, at specified moments of time. In the previous publication we analysed the sequential parametrized topological comple…

    Submitted 2 December, 2022; originally announced December 2022.

    MSC Class: 55M30

  32. arXiv:2209.05418  [pdf, ps, other]

    math.AT math.CO

    The homology of random simplicial complexes in the multi-parameter upper model

    Authors: Michael Farber, Tahl Nowik

    Abstract: We study random simplicial complexes in the multi-parameter upper model. In this model simplices of various dimensions are taken randomly and independently, and our random simplicial complex $Y$ is then taken to be the minimal simplicial complex containing this collection of simplices. We study the asymptotic behavior of the homology of $Y$ as the number of vertices goes to $\infty$. We observe…

    Submitted 12 September, 2022; originally announced September 2022.

  33. arXiv:2209.01990  [pdf, ps, other]

    math.AT

    Sequential parametrized topological complexity and related invariants

    Authors: Michael Farber, John Oprea

    Abstract: Parametrized motion planning algorithms \cite{CFW} have a high degree of universality and flexibility; they generate the motion of a robotic system under a variety of external conditions. The latter are viewed as parameters and constitute part of the input of the algorithm. The concept of sequential parametrized topological complexity ${\sf TC}_r[p:E\to B]$ is a measure of the complexity of such a…

    Submitted 26 February, 2023; v1 submitted 5 September, 2022; originally announced September 2022.

    MSC Class: 55M30

  34. arXiv:2206.07688  [pdf, other]

    math.CO

    Spectra of infinite graphs with summable weight functions

    Authors: Michael Farber, Lewin Strauss

    Abstract: In this paper we study spectra of Laplacians of infinite weighted graphs. Instead of the assumption of local finiteness we impose the condition of summability of the weight function. Such graphs correspond to reversible Markov chains with countable state spaces. We adapt the concept of the Cheeger constant to this setting and prove an analogue of the Cheeger inequality characterising the spectral…

    Submitted 25 August, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    MSC Class: 05C50; 05C63; 05C48; 05C81

  35. arXiv:2205.08453  [pdf, ps, other]

    math.AT

    Sequential Parametrized Motion Planning and its Complexity

    Authors: Michael Farber, Amit Kumar Paul

    Abstract: In this paper we develop the theory of sequential parametrized motion planning, which generalises the approach of parametrized motion planning introduced recently in [3]. A sequential parametrized motion planning algorithm produces a motion of the system which is required to visit a prescribed sequence of states, in a certain order, at specified moments of time. The sequential parametrized alg…

    Submitted 17 September, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

    MSC Class: 55M30

  36. arXiv:2205.02048  [pdf, other]

    cs.CL cs.AI cs.LG

    Few-Shot Document-Level Relation Extraction

    Authors: Nicholas Popovic, Michael Färber

    Abstract: We present FREDo, a few-shot document-level relation extraction (FSDLRE) benchmark. As opposed to existing benchmarks which are built on sentence-level relation extraction corpora, we argue that document-level corpora provide more realism, particularly regarding none-of-the-above (NOTA) distributions. Therefore, we propose a set of FSDLRE tasks and construct a benchmark based on two existing super…

    Submitted 1 July, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: Published at NAACL 2022

  37. How Does Author Affiliation Affect Preprint Citation Count? Analyzing Citation Bias at the Institution and Country Level

    Authors: Chifumi Nishioka, Michael Färber, Tarek Saier

    Abstract: Citing is an important aspect of scientific discourse and important for quantifying the scientific impact of researchers. Previous works observed that citations are made not only based on the pure scholarly contributions but also based on non-scholarly attributes, such as the affiliation or gender of authors. In this way, citation bias is produced. Existing works, however, have not…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted at the ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2022

  38. arXiv:2203.05325  [pdf, other]

    cs.CL cs.AI cs.LG

    AIFB-WebScience at SemEval-2022 Task 12: Relation Extraction First -- Using Relation Extraction to Identify Entities

    Authors: Nicholas Popovic, Walter Laurito, Michael Färber

    Abstract: In this paper, we present an end-to-end joint entity and relation extraction approach based on transformer-based language models. We apply the model to the task of linking mathematical symbols to their descriptions in LaTeX documents. In contrast to existing approaches, which perform entity and relation extraction in sequence, our system incorporates information from relation extraction into entit…

    Submitted 4 May, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: Camera ready version

  39. arXiv:2202.05801  [pdf, other]

    cs.RO math.AT

    Parametrized motion planning and topological complexity

    Authors: Michael Farber, Shmuel Weinberger

    Abstract: In this paper we study parametrized motion planning algorithms which provide universal and flexible solutions to diverse motion planning problems. Such algorithms are intended to function under a variety of external conditions which are viewed as parameters and serve as part of the input of the algorithm. Continuing a recent paper, we study further the concept of parametrized topological complexit…

    Submitted 23 February, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

    MSC Class: 58E05

  40. arXiv:2202.05796  [pdf, ps, other]

    math.AT

    Parametrized topological complexity of sphere bundles

    Authors: Michael Farber, Shmuel Weinberger

    Abstract: Parametrized motion planning algorithms have a high degree of flexibility and universality; they can work under a variety of external conditions, which are viewed as parameters and form part of the input of the algorithm. In this paper we analyse the parametrized motion planning problem in the case of sphere bundles. Our main results provide upper and lower bounds for the parametrized topological c…

    Submitted 12 May, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

    MSC Class: 58E05

  41. arXiv:2112.00859  [pdf, other]

    cs.SI cs.IR

    Are Investors Biased Against Women? Analyzing How Gender Affects Startup Funding in Europe

    Authors: Michael Färber, Alexander Klein

    Abstract: One of the main challenges of startups is to raise capital from investors. For startup founders, it is therefore crucial to know whether investors have a bias against women as startup founders and in which way startups face disadvantages due to gender bias. Existing works on gender studies have mainly analyzed the US market. In this paper, we aim to give a more comprehensive picture of gender bias…

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: 35 pages

  42. arXiv:2112.00160  [pdf, other]

    cs.CL cs.IR

    Towards Full-Fledged Argument Search: A Framework for Extracting and Clustering Arguments from Unstructured Text

    Authors: Michael Färber, Anna Steyer

    Abstract: Argument search aims at identifying arguments in natural language texts. In the past, this task has been addressed by a combination of keyword search and argument identification on the sentence- or document-level. However, existing frameworks often address only specific components of argument search and do not address the following aspects: (1) argument-query matching: identifying arguments that f…

    Submitted 30 November, 2021; originally announced December 2021.

  43. Cross-Lingual Citations in English Papers: A Large-Scale Analysis of Prevalence, Usage, and Impact

    Authors: Tarek Saier, Michael Färber, Tornike Tsereteli

    Abstract: Citation information in scholarly data is an important source of insight into the reception of publications and the scholarly discourse. Outcomes of citation analyses and the applicability of citation based machine learning approaches heavily depend on the completeness of such data. One particular shortcoming of scholarly data nowadays is that non-English publications are often not included in dat…

    Submitted 10 November, 2021; v1 submitted 7 November, 2021; originally announced November 2021.

    Comments: to be published in the International Journal on Digital Libraries

    ACM Class: H.3.3; H.3.7; I.2.7

  44. arXiv:2109.09389  [pdf, other]

    cs.CV cs.LG

    Explaining Convolutional Neural Networks by Tagging Filters

    Authors: Anna Nguyen, Daniel Hagenmayer, Tobias Weller, Michael Färber

    Abstract: Convolutional neural networks (CNNs) have achieved astonishing performance on various image classification tasks, but it is difficult for humans to understand how a classification comes about. Recent literature proposes methods to explain the classification process to humans. These focus mostly on visualizing feature maps and filter weights, which are not very intuitive for non-experts in analyzin…

    Submitted 20 September, 2021; originally announced September 2021.

  45. arXiv:2106.13722  [pdf, other]

    cs.LO

    A Curiously Effective Backtracking Strategy for Connection Tableaux

    Authors: Michael Färber

    Abstract: Automated proof search with connection tableaux, such as implemented by Otten's leanCoP prover, depends on backtracking for completeness. Otten's restricted backtracking strategy loses completeness, yet for many problems, it significantly reduces the time required to find a proof. I introduce a new, less restricted backtracking strategy based on the notion of exclusive cuts. I implement the strate…

    Submitted 16 January, 2024; v1 submitted 25 June, 2021; originally announced June 2021.

    Comments: Accepted at AReCCa 2023

    ACM Class: F.4.1

  46. Safe, Fast, Concurrent Proof Checking for the lambda-Pi Calculus Modulo Rewriting

    Authors: Michael Färber

    Abstract: Several proof assistants, such as Isabelle or Coq, can concurrently check multiple proofs. In contrast, the vast majority of today's small proof checkers either does not support concurrency at all or supports only limited forms thereof, restricting the efficiency of proof checking on multi-core processors. This work shows the design of a small, memory- and thread-safe kernel that efficiently checks proofs…

    Submitted 3 March, 2022; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: 11th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP '22), Jan 2022, Philadelphia, PA, United States

  47. Ample simplicial complexes

    Authors: Chaim Even-Zohar, Michael Farber, Lewis Mead

    Abstract: Motivated by potential applications in network theory, engineering and computer science, we study $r$-ample simplicial complexes. These complexes can be viewed as finite approximations to the Rado complex which has a remarkable property of {\it indestructibility,} in the sense that removing any finite number of its simplexes leaves a complex isomorphic to itself. We prove that an $r$-ample simplic…

    Submitted 2 December, 2020; originally announced December 2020.

    Journal ref: European Journal of Mathematics, 2022, 8, 1-32

  48. arXiv:2010.09809  [pdf, ps, other]

    math.AT cs.RO math.GT

    Parametrized topological complexity of collision-free motion planning in the plane

    Authors: Daniel C. Cohen, Michael Farber, Shmuel Weinberger

    Abstract: Parametrized motion planning algorithms have high degrees of universality and flexibility, as they are designed to work under a variety of external conditions, which are viewed as parameters and form part of the input of the underlying motion planning problem. In this paper, we analyze the parameterized motion planning problem for the motion of many distinct points in the plane, moving without col…

    Submitted 14 October, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: revision includes an appendix on fibrations of certain mapping spaces

    MSC Class: 55S40; 55M30; 55R80; 70Q05

  49. Topology of parametrised motion planning algorithms

    Authors: Daniel C. Cohen, Michael Farber, Shmuel Weinberger

    Abstract: In this paper we introduce and study a new concept of parametrised topological complexity, a topological invariant motivated by the motion planning problem of robotics. In the parametrised setting, a motion planning algorithm has a high degree of universality and flexibility: it can function under a variety of external conditions (such as the positions of the obstacles). We explicitly compute the pa…

    Submitted 21 May, 2021; v1 submitted 13 September, 2020; originally announced September 2020.

    Comments: To appear in SIAM Journal on Applied Algebra and Geometry

    MSC Class: 55S40; 55M30; 55R80; 70Q05

    Journal ref: SIAM Journal on Applied Algebra and Geometry, 5 (2021), 229-249

  50. arXiv:2007.11924  [pdf, other]

    cs.CV cs.LG

    Right for the Right Reason: Making Image Classification Robust

    Authors: Anna Nguyen, Adrian Oberföll, Michael Färber

    Abstract: The effectiveness of Convolutional Neural Networks (CNNs) in classifying image data has been thoroughly demonstrated. In order to explain the classification to humans, methods for visualizing classification evidence have been developed in recent years. These explanations reveal that sometimes images are classified correctly, but for the wrong reasons, i.e., based on incidental evidence. Of course, i…

    Submitted 12 January, 2021; v1 submitted 23 July, 2020; originally announced July 2020.