Showing 1–31 of 31 results for author: Usbeck, R

  1. arXiv:2406.09323  [pdf, other]

    cs.IR

    Master of Disaster: A Disaster-Related Event Monitoring System From News Streams

    Authors: Junbo Huang, Ricardo Usbeck

    Abstract: The need for a disaster-related event monitoring system has arisen due to the societal and economic impact caused by the increasing number of severe disaster events. An event monitoring system should be able to extract event-related information from texts, and discriminate event instances. We demonstrate our open-source event monitoring system, namely, Master of Disaster (MoD), which receives new…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 6 pages, 2 figures

  2. arXiv:2406.07257  [pdf, other]

    cs.CL cs.AI

    Scholarly Question Answering using Large Language Models in the NFDI4DataScience Gateway

    Authors: Hamed Babaei Giglou, Tilahun Abedissa Taffa, Rana Abdullah, Aida Usmanova, Ricardo Usbeck, Jennifer D'Souza, Sören Auer

    Abstract: This paper introduces a scholarly Question Answering (QA) system on top of the NFDI4DataScience Gateway, employing a Retrieval Augmented Generation-based (RAG) approach. The NFDI4DS Gateway, as a foundational framework, offers a unified and intuitive interface for querying various scientific databases using federated search. The RAG-based scholarly QA, powered by a Large Language Model (LLM), faci…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 13 pages main content, 16 pages overall, 3 Figures, accepted for publication at NSLP 2024 workshop at ESWC 2024

  3. arXiv:2401.09553  [pdf]

    cs.CL cs.AI

    BERTologyNavigator: Advanced Question Answering with BERT-based Semantics

    Authors: Shreya Rajpal, Ricardo Usbeck

    Abstract: The development and integration of knowledge graphs and language models has significance in artificial intelligence and natural language processing. In this study, we introduce the BERTologyNavigator -- a two-phased system that combines relation extraction techniques and BERT embeddings to navigate the relationships within the DBLP Knowledge Graph (KG). Our approach focuses on extracting one-hop r…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: Accepted in Scholarly QALD Challenge @ ISWC 2023

    ACM Class: I.2.4; I.2.7

    Journal ref: Joint Proceedings of Scholarly QALD 2023 and SemREC 2023 co-located with 22nd International Semantic Web Conference ISWC 2023. Athens, Greece, November 6-10, 2023

  4. arXiv:2311.09841  [pdf, other]

    cs.CL cs.AI cs.DB cs.LG

    Leveraging LLMs in Scholarly Knowledge Graph Question Answering

    Authors: Tilahun Abedissa Taffa, Ricardo Usbeck

    Abstract: This paper presents a scholarly Knowledge Graph Question Answering (KGQA) that answers bibliographic natural language questions by leveraging a large language model (LLM) in a few-shot manner. The model initially identifies the top-n similar training questions related to a given test question via a BERT-based sentence encoder and retrieves their corresponding SPARQL. Using the top-n similar questi…

    Submitted 16 November, 2023; originally announced November 2023.

  5. arXiv:2309.07545  [pdf, other]

    cs.CL

    DBLPLink: An Entity Linker for the DBLP Scholarly Knowledge Graph

    Authors: Debayan Banerjee, Arefa, Ricardo Usbeck, Chris Biemann

    Abstract: In this work, we present a web application named DBLPLink, which performs entity linking over the DBLP scholarly knowledge graph. DBLPLink uses text-to-text pre-trained language models, such as T5, to produce entity label spans from an input text question. Entity candidates are fetched from a database based on the labels, and an entity re-ranker sorts them based on entity embeddings, such as Trans…

    Submitted 25 September, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted at International Semantic Web Conference (ISWC) 2023 Posters & Demo Track

  6. arXiv:2308.14429  [pdf, other]

    cs.CL cs.AI

    Biomedical Entity Linking with Triple-aware Pre-Training

    Authors: Xi Yan, Cedric Möller, Ricardo Usbeck

    Abstract: Linking biomedical entities is an essential aspect in biomedical natural language processing tasks, such as text mining and question answering. However, a difficulty of linking the biomedical entities using current large language models (LLM) trained on a general corpus is that biomedical entities are scarcely distributed in texts and therefore have been rarely seen during training by the LLM. At…

    Submitted 28 August, 2023; originally announced August 2023.

  7. arXiv:2305.15108  [pdf, other]

    cs.CL

    The Role of Output Vocabulary in T2T LMs for SPARQL Semantic Parsing

    Authors: Debayan Banerjee, Pranav Ajit Nair, Ricardo Usbeck, Chris Biemann

    Abstract: In this work, we analyse the role of output vocabulary for text-to-text (T2T) models on the task of SPARQL semantic parsing. We perform experiments within the context of knowledge graph question answering (KGQA), where the task is to convert questions in natural language to the SPARQL query language. We observe that the query vocabulary is distinct from human vocabulary. Language Models (LMs)…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted as a short paper to ACL 2023 findings

  8. arXiv:2303.13351  [pdf, other]

    cs.DL cs.CL

    DBLP-QuAD: A Question Answering Dataset over the DBLP Scholarly Knowledge Graph

    Authors: Debayan Banerjee, Sushil Awale, Ricardo Usbeck, Chris Biemann

    Abstract: In this work we create a question answering dataset over the DBLP scholarly knowledge graph (KG). DBLP is an on-line reference for bibliographic information on major computer science publications that indexes over 4.4 million publications published by more than 2.2 million authors. Our dataset consists of 10,000 question answer pairs with the corresponding SPARQL queries which can be executed over…

    Submitted 29 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: 12 pages, ceur-ws 1 column, accepted at International Bibliometric Information Retrieval Workshop @ ECIR 2023

  9. arXiv:2303.13284  [pdf, other]

    cs.CL cs.DB cs.IR

    GETT-QA: Graph Embedding based T2T Transformer for Knowledge Graph Question Answering

    Authors: Debayan Banerjee, Pranav Ajit Nair, Ricardo Usbeck, Chris Biemann

    Abstract: In this work, we present an end-to-end Knowledge Graph Question Answering (KGQA) system named GETT-QA. GETT-QA uses T5, a popular text-to-text pre-trained language model. The model takes a question in natural language as input and produces a simpler form of the intended SPARQL query. In the simpler form, the model does not directly produce entity and relation IDs. Instead, it produces correspondin…

    Submitted 28 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: 16 pages, single-column format, accepted at ESWC 2023 research track

  10. arXiv:2303.03290  [pdf]

    cs.CL cs.AI cs.IR

    AmQA: Amharic Question Answering Dataset

    Authors: Tilahun Abedissa, Ricardo Usbeck, Yaregal Assabie

    Abstract: Question Answering (QA) returns concise answers or answer lists from natural language text given a context document. Many resources go into curating QA datasets to advance robust models' development. There is a surge of QA datasets for languages like English, however, this is not true for Amharic. Amharic, the official language of Ethiopia, is the second most spoken Semitic language in the world.…

    Submitted 16 November, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

  11. arXiv:2210.03353  [pdf, other]

    cs.CL cs.AI cs.CY

    The Lifecycle of "Facts": A Survey of Social Bias in Knowledge Graphs

    Authors: Angelie Kraft, Ricardo Usbeck

    Abstract: Knowledge graphs are increasingly used in a plethora of downstream tasks or in the augmentation of statistical models to improve factuality. However, social biases are engraved in these representations and propagate downstream. We conducted a critical analysis of literature concerning biases at different steps of a knowledge graph lifecycle. We investigated factors introducing bias, as well as the…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted to AACL-IJCNLP 2022

    Journal ref: https://aclanthology.org/2022.aacl-main.49

  12. arXiv:2210.03352  [pdf, other]

    cs.LG cs.CY cs.SI

    The Ethical Risks of Analyzing Crisis Events on Social Media with Machine Learning

    Authors: Angelie Kraft, Ricardo Usbeck

    Abstract: Social media platforms provide a continuous stream of real-time news regarding crisis events on a global scale. Several machine learning methods utilize the crowd-sourced data for the automated detection of crises and the characterization of their precursors and aftermaths. Early detection and localization of crisis-related events can help save lives and economies. Yet, the applied automation meth…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted to D2R2'22: International Workshop on Data-driven Resilience Research

    Journal ref: D2R2 2022

  13. arXiv:2206.13354  [pdf, other]

    cs.CL cs.AI

    Transformer with Tree-order Encoding for Neural Program Generation

    Authors: Klaudia-Doris Thellmann, Bernhard Stadler, Ricardo Usbeck, Jens Lehmann

    Abstract: While a considerable amount of semantic parsing approaches have employed RNN architectures for code generation tasks, there have been only few attempts to investigate the applicability of Transformers for this task. Including hierarchical information of the underlying programming language syntax has proven to be effective for code generation. Since the positional encoding of the Transformer can on…

    Submitted 30 May, 2022; originally announced June 2022.

    Comments: This paper was authored in late 2020 and early 2021 for the most part

    MSC Class: 68T07; 68T50

    ACM Class: I.2.7

  14. arXiv:2205.06573  [pdf, other]

    cs.CL cs.AI

    Knowledge Graph Question Answering Datasets and Their Generalizability: Are They Enough for Future Research?

    Authors: Longquan Jiang, Ricardo Usbeck

    Abstract: Existing approaches on Question Answering over Knowledge Graphs (KGQA) have weak generalizability. That is often due to the standard i.i.d. assumption on the underlying dataset. Recently, three levels of generalization for KGQA were defined, namely i.i.d., compositional, zero-shot. We analyze 25 well-known KGQA datasets for 5 different Knowledge Graphs (KGs). We show that according to this definit…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: Accepted by SIGIR '22 (Resource Track)

  15. Modern Baselines for SPARQL Semantic Parsing

    Authors: Debayan Banerjee, Pranav Ajit Nair, Jivat Neet Kaur, Ricardo Usbeck, Chris Biemann

    Abstract: In this work, we focus on the task of generating SPARQL queries from natural language questions, which can then be executed on Knowledge Graphs (KGs). We assume that gold entity and relations have been provided, and the remaining task is to arrange them in the right order along with SPARQL vocabulary, and input tokens to produce the correct SPARQL query. Pre-trained Language Models (PLMs) have not…

    Submitted 14 September, 2023; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: 5 pages, short paper, SIGIR 2022

  16. arXiv:2204.09149  [pdf, other]

    cs.CL cs.AI

    DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation

    Authors: Md Rashad Al Hasan Rony, Ricardo Usbeck, Jens Lehmann

    Abstract: Task-oriented dialogue generation is challenging since the underlying knowledge is often dynamic and effectively incorporating knowledge into the learning process is hard. It is particularly challenging to generate both human-like and informative responses in this setting. Recent research primarily focused on various knowledge distillation methods where the underlying relationship between the fact…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: Accepted by the North American Chapter of the Association for Computational Linguistics (NAACL) 2022

  17. arXiv:2203.09183  [pdf, other]

    cs.CL cs.AI

    RoMe: A Robust Metric for Evaluating Natural Language Generation

    Authors: Md Rashad Al Hasan Rony, Liubov Kovriguina, Debanjan Chaudhuri, Ricardo Usbeck, Jens Lehmann

    Abstract: Evaluating Natural Language Generation (NLG) systems is a challenging task. Firstly, the metric should ensure that the generated hypothesis reflects the reference's semantics. Secondly, it should consider the grammatical quality of the generated sentence. Thirdly, it should be robust enough to handle various surface forms of the generated sentence. Thus, an effective evaluation metric has to be mu…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted by the Association for Computational Linguistics (ACL) 2022

  18. arXiv:2202.00120  [pdf, other]

    cs.CL cs.IR

    QALD-9-plus: A Multilingual Dataset for Question Answering over DBpedia and Wikidata Translated by Native Speakers

    Authors: Aleksandr Perevalov, Dennis Diefenbach, Ricardo Usbeck, Andreas Both

    Abstract: The ability to have the same experience for different user groups (i.e., accessibility) is one of the most important characteristics of Web-based systems. The same is true for Knowledge Graph Question Answering (KGQA) systems that provide the access to Semantic Web data via natural language interface. While following our research agenda on the multilingual aspect of accessibility of KGQA systems,…

    Submitted 7 February, 2022; v1 submitted 31 January, 2022; originally announced February 2022.

  19. arXiv:2201.08174  [pdf, other]

    cs.CL cs.AI cs.IR

    Knowledge Graph Question Answering Leaderboard: A Community Resource to Prevent a Replication Crisis

    Authors: Aleksandr Perevalov, Xi Yan, Liubov Kovriguina, Longquan Jiang, Andreas Both, Ricardo Usbeck

    Abstract: Data-driven systems need to be evaluated to establish trust in the scientific approach and its applicability. In particular, this is true for Knowledge Graph (KG) Question Answering (QA), where complex data structures are made accessible via natural-language interfaces. Evaluating the capabilities of these systems has been a driver for the community for more than ten years while establishing diffe…

    Submitted 20 January, 2022; originally announced January 2022.

  20. arXiv:2112.07606  [pdf, ps, other]

    cs.CL cs.AI

    Semantic Answer Type and Relation Prediction Task (SMART 2021)

    Authors: Nandana Mihindukulasooriya, Mohnish Dubey, Alfio Gliozzo, Jens Lehmann, Axel-Cyrille Ngonga Ngomo, Ricardo Usbeck, Gaetano Rossiello, Uttam Kumar

    Abstract: Each year the International Semantic Web Conference organizes a set of Semantic Web Challenges to establish competitions that will advance state-of-the-art solutions in some problem domains. The Semantic Answer Type and Relation Prediction Task (SMART) task is one of the ISWC 2021 Semantic Web challenges. This is the second year of the challenge after a successful SMART 2020 at ISWC 2020. This yea…

    Submitted 10 January, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

    ACM Class: F.4.1; I.2.4; I.2.7

  21. arXiv:2112.01989  [pdf, other]

    cs.CL cs.AI cs.LG

    Survey on English Entity Linking on Wikidata

    Authors: Cedric Möller, Jens Lehmann, Ricardo Usbeck

    Abstract: Wikidata is a frequently updated, community-driven, and multilingual knowledge graph. Hence, Wikidata is an attractive basis for Entity Linking, which is evident by the recent increase in published papers. This survey focuses on four subjects: (1) Which Wikidata Entity Linking datasets exist, how widely used are they and how are they constructed? (2) Do the characteristics of Wikidata matter for t…

    Submitted 3 December, 2021; originally announced December 2021.

    Comments: Disclaimer: Cedric Möller, Jens Lehmann, Ricardo Usbeck, 2021. The definitive, peer reviewed and edited version of this article is published in the Semantic Web Journal, Special issue: Latest Advancements in Linguistic Linked Data, 2021

  22. Knowledge Graph Question Answering using Graph-Pattern Isomorphism

    Authors: Daniel Vollmers, Rricha Jalota, Diego Moussallem, Hardik Topiwala, Axel-Cyrille Ngonga Ngomo, Ricardo Usbeck

    Abstract: Knowledge Graph Question Answering (KGQA) systems are based on machine learning algorithms, requiring thousands of question-answer pairs as training examples or natural language processing pipelines that need module fine-tuning. In this paper, we present a novel QA approach, dubbed TeBaQA. Our approach learns to answer questions based on graph isomorphisms from basic graph patterns of SPARQL queri…

    Submitted 2 February, 2022; v1 submitted 11 March, 2021; originally announced March 2021.

    Comments: Version published in the proceedings of the 17th International Conference on Semantic Systems

    Journal ref: Further with Knowledge Graphs - Proceedings of the 17th International Conference on Semantic Systems 53 (2021) 103-117

  23. arXiv:2012.00555  [pdf, ps, other]

    cs.AI cs.CL cs.IR

    SeMantic AnsweR Type prediction task (SMART) at ISWC 2020 Semantic Web Challenge

    Authors: Nandana Mihindukulasooriya, Mohnish Dubey, Alfio Gliozzo, Jens Lehmann, Axel-Cyrille Ngonga Ngomo, Ricardo Usbeck

    Abstract: Each year the International Semantic Web Conference accepts a set of Semantic Web Challenges to establish competitions that will advance the state of the art solutions in any given problem domain. The SeMantic AnsweR Type prediction task (SMART) was part of ISWC 2020 challenges. Question type and answer type prediction can play a key role in knowledge base question answering systems providing insi…

    Submitted 1 December, 2020; originally announced December 2020.

  24. arXiv:2009.10847  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Message Passing for Hyper-Relational Knowledge Graphs

    Authors: Mikhail Galkin, Priyansh Trivedi, Gaurav Maheshwari, Ricardo Usbeck, Jens Lehmann

    Abstract: Hyper-relational knowledge graphs (KGs) (e.g., Wikidata) enable associating additional key-value pairs along with the main triple to disambiguate, or restrict the validity of a fact. In this work, we propose a message passing based graph encoder - StarE capable of modeling such hyper-relational KGs. Unlike existing approaches, StarE can encode an arbitrary number of additional information (qualifi…

    Submitted 22 September, 2020; originally announced September 2020.

    Comments: Accepted to EMNLP 2020

  25. arXiv:2004.13843  [pdf, other]

    cs.CL cs.DB cs.LG

    Template-based Question Answering using Recursive Neural Networks

    Authors: Ram G Athreya, Srividya Bansal, Axel-Cyrille Ngonga Ngomo, Ricardo Usbeck

    Abstract: We propose a neural network-based approach to automatically learn and classify natural language questions into its corresponding template using recursive neural networks. An obvious advantage of using neural networks is the elimination of the need for laborious feature engineering that can be cumbersome and error-prone. The input question is encoded into a vector representation. The model is train…

    Submitted 8 June, 2020; v1 submitted 3 April, 2020; originally announced April 2020.

  26. arXiv:2004.08355  [pdf]

    cs.CL cs.AI

    Towards an Interoperable Ecosystem of AI and LT Platforms: A Roadmap for the Implementation of Different Levels of Interoperability

    Authors: Georg Rehm, Dimitrios Galanis, Penny Labropoulou, Stelios Piperidis, Martin Welß, Ricardo Usbeck, Joachim Köhler, Miltos Deligiannis, Katerina Gkirtzou, Johannes Fischer, Christian Chiarcos, Nils Feldhus, Julián Moreno-Schneider, Florian Kintzel, Elena Montiel, Víctor Rodríguez Doncel, John P. McCrae, David Laqua, Irina Patricia Theile, Christian Dittmar, Kalina Bontcheva, Ian Roberts, Andrejs Vasiljevs, Andis Lagzdiņš

    Abstract: With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows. We devise five different levels (of increasing complexity) of platform interoperability that we suggest to implement in a wider federation of AI/LT platforms. We illustrate the a…

    Submitted 17 April, 2020; originally announced April 2020.

    Comments: Proceedings of the 1st International Workshop on Language Technology Platforms (IWLTP 2020). To appear

  27. arXiv:1805.11467  [pdf, other]

    cs.CL

    Entity Linking in 40 Languages using MAG

    Authors: Diego Moussallem, Ricardo Usbeck, Michael Röder, Axel-Cyrille Ngonga Ngomo

    Abstract: A plethora of Entity Linking (EL) approaches has recently been developed. While many claim to be multilingual, the MAG (Multilingual AGDISTIS) approach has been shown recently to outperform the state of the art in multilingual EL on 7 languages. With this demo, we extend MAG to support EL in 40 different languages, including especially low-resource languages such as Ukrainian, Greek, Hungarian, C…

    Submitted 29 May, 2018; originally announced May 2018.

    Comments: Accepted at ESWC 2018

  28. arXiv:1710.08691  [pdf, other]

    cs.CL

    BENGAL: An Automatic Benchmark Generator for Entity Recognition and Linking

    Authors: Axel-Cyrille Ngonga Ngomo, Michael Röder, Diego Moussallem, Ricardo Usbeck, René Speck

    Abstract: The manual creation of gold standards for named entity recognition and entity linking is time- and resource-intensive. Moreover, recent works show that such gold standards contain a large proportion of mistakes in addition to being difficult to maintain. We hence present BENGAL, a novel automatic generation of such gold standards as a complement to manually created benchmarks. The main advantage o…

    Submitted 1 November, 2018; v1 submitted 24 October, 2017; originally announced October 2017.

    Comments: Accepted at INLG 2018

  29. arXiv:1710.08634  [pdf, other]

    cs.IR cs.CL

    Using Multi-Label Classification for Improved Question Answering

    Authors: Ricardo Usbeck, Michael Hoffmann, Michael Röder, Jens Lehmann, Axel-Cyrille Ngonga Ngomo

    Abstract: A plethora of diverse approaches for question answering over RDF data have been developed in recent years. While the accuracy of these systems has increased significantly over time, most systems still focus on particular types of questions or particular challenges in question answering. What is a curse for single systems is a blessing for the combination of these systems. We show in this paper how…

    Submitted 24 October, 2017; originally announced October 2017.

    Comments: 15 pages, 4 Tables, 3 Figures

  30. MAG: A Multilingual, Knowledge-base Agnostic and Deterministic Entity Linking Approach

    Authors: Diego Moussallem, Ricardo Usbeck, Michael Röder, Axel-Cyrille Ngonga Ngomo

    Abstract: Entity linking has recently been the subject of a significant body of research. Currently, the best performing approaches rely on trained mono-lingual models. Porting these approaches to other languages is consequently a difficult endeavor as it requires corresponding training data and retraining of the models. We address this drawback by presenting a novel multilingual, knowledge-based agnostic a…

    Submitted 17 October, 2017; v1 submitted 17 July, 2017; originally announced July 2017.

    Comments: Accepted in K-CAP 2017: Knowledge Capture Conference

    ACM Class: I.2.7

  31. arXiv:1611.01802  [pdf, other]

    cs.AI cs.CL cs.IR

    Self-Wiring Question Answering Systems

    Authors: Ricardo Usbeck, Jonathan Huthmann, Nico Duldhardt, Axel-Cyrille Ngonga Ngomo

    Abstract: Question answering (QA) has been the subject of a resurgence over the past years. The said resurgence has led to a multitude of question answering (QA) systems being developed both by companies and research facilities. While a few components of QA systems get reused across implementations, most systems do not leverage the full potential of component reuse. Hence, the development of QA systems is c…

    Submitted 8 November, 2016; v1 submitted 6 November, 2016; originally announced November 2016.

    Comments: 6 pages, 1 figure, pre-print in lncs