Showing 1–50 of 58 results for author: Paulheim, H

Searching in archive cs.
  1. arXiv:2406.12634

    cs.IR cs.AI

    News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation

    Authors: Andreea Iana, Fabian David Schmidt, Goran Glavaš, Heiko Paulheim

    Abstract: Rapidly growing numbers of multilingual news consumers pose an increasing challenge to news recommender systems in terms of providing customized recommendations. First, existing neural news recommenders, even when powered by multilingual language models (LMs), suffer substantial performance losses in zero-shot cross-lingual transfer (ZS-XLT). Second, the current paradigm of fine-tuning the backbon…

    Submitted 18 June, 2024; originally announced June 2024.

    ACM Class: I.2.7; H.3.3

  2. arXiv:2406.02650

    cs.LG cs.AI

    By Fair Means or Foul: Quantifying Collusion in a Market Simulation with Deep Reinforcement Learning

    Authors: Michael Schlechtinger, Damaris Kosack, Franz Krause, Heiko Paulheim

    Abstract: In the rapidly evolving landscape of eCommerce, Artificial Intelligence (AI) based pricing algorithms, particularly those utilizing Reinforcement Learning (RL), are becoming increasingly prevalent. This rise has led to an inextricable pricing situation with the potential for market collusion. Our research employs an experimental oligopoly model of repeated price competition, systematically varying…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Preprint for IJCAI 2024

  3. arXiv:2405.15375

    cs.AI cs.DB cs.DC

    A Planet Scale Spatial-Temporal Knowledge Graph Based On OpenStreetMap And H3 Grid

    Authors: Martin Böckling, Heiko Paulheim, Sarah Detzler

    Abstract: Geospatial data plays a central role in modeling our world, for which OpenStreetMap (OSM) provides a rich source of such data. While often spatial data is represented in a tabular format, a graph based representation provides the possibility to interconnect entities which would have been separated in a tabular representation. We propose in our paper a framework which supports a planet scale transf…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 12 pages, 2 figures, GeoLD2024: 6th Geospatial Linked Data Workshop, May 26, 2024, Hersonissos, Greece

  4. arXiv:2404.14970

    cs.LG

    Integrating Heterogeneous Gene Expression Data through Knowledge Graphs for Improving Diabetes Prediction

    Authors: Rita T. Sousa, Heiko Paulheim

    Abstract: Diabetes is a worldwide health issue affecting millions of people. Machine learning methods have shown promising results in improving diabetes prediction, particularly through the analysis of diverse data types, namely gene expression data. While gene expression data can provide valuable insights, challenges arise from the fact that the sample sizes in expression datasets are usually limited, and…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 11 pages, 4 figures, 7th Workshop on Semantic Web Solutions for Large-scale Biomedical Data Analytics at ESWC2024

    ACM Class: J.3

  5. arXiv:2403.17876

    cs.IR

    MIND Your Language: A Multilingual Dataset for Cross-lingual News Recommendation

    Authors: Andreea Iana, Goran Glavaš, Heiko Paulheim

    Abstract: Digital news platforms use news recommenders as the main instrument to cater to the individual information needs of readers. Despite an increasingly language-diverse online community, in which many Internet users consume news in multiple languages, the majority of news recommendation focuses on major, resource-rich languages, and English in particular. Moreover, nearly all news recommendation effo…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted at the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024)

    ACM Class: H.3.3

  6. arXiv:2312.10370

    cs.AI cs.IR cs.LG

    Do Similar Entities have Similar Embeddings?

    Authors: Nicolas Hubert, Heiko Paulheim, Armelle Brun, Davy Monticolo

    Abstract: Knowledge graph embedding models (KGEMs) developed for link prediction learn vector representations for entities in a knowledge graph, known as embeddings. A common tacit assumption is the KGE entity similarity assumption, which states that these KGEMs retain the graph's structure within their embedding space, i.e., position similar entities within the graph close to one another. This des…

    Submitted 28 March, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

    Comments: Accepted at ESWC 2024

  7. arXiv:2312.04997

    cs.AI cs.LG

    Beyond Transduction: A Survey on Inductive, Few Shot, and Zero Shot Link Prediction in Knowledge Graphs

    Authors: Nicolas Hubert, Pierre Monnin, Heiko Paulheim

    Abstract: Knowledge graphs (KGs) comprise entities interconnected by relations of different semantic meanings. KGs are being used in a wide range of applications. However, they inherently suffer from incompleteness, i.e. entities or facts about entities are missing. Consequently, a larger body of works focuses on the completion of missing information in KGs, which is commonly referred to as link prediction…

    Submitted 8 December, 2023; originally announced December 2023.

  8. OLaLa: Ontology Matching with Large Language Models

    Authors: Sven Hertling, Heiko Paulheim

    Abstract: Ontology (and more generally: Knowledge Graph) Matching is a challenging task where information in natural language is one of the most important signals to process. With the rise of Large Language Models, it is possible to incorporate this knowledge in a better way into the matching pipeline. A number of decisions still need to be taken, e.g., how to generate a prompt that is useful to the model,…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted at K-CAP 2023 conference

  9. arXiv:2310.01146

    cs.IR

    NewsRecLib: A PyTorch-Lightning Library for Neural News Recommendation

    Authors: Andreea Iana, Goran Glavaš, Heiko Paulheim

    Abstract: NewsRecLib is an open-source library based on PyTorch-Lightning and Hydra developed for training and evaluating neural news recommendation models. The foremost goals of NewsRecLib are to promote reproducible research and rigorous experimental evaluation by (i) providing a unified and highly configurable framework for exhaustive experimental studies and (ii) enabling a thorough analysis of the perf…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: Accepted at the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)

    ACM Class: H.3.3; D.2.13

  10. arXiv:2309.13939

    cs.AI

    The Time Traveler's Guide to Semantic Web Research: Analyzing Fictitious Research Themes in the ESWC "Next 20 Years" Track

    Authors: Irene Celino, Heiko Paulheim

    Abstract: What will Semantic Web research focus on in 20 years from now? We asked this question to the community and collected their visions in the "Next 20 years" track of ESWC 2023. We challenged the participants to submit "future" research papers, as if they were submitting to the 2043 edition of the conference. The submissions - entirely fictitious - were expected to be full scientific papers, with rese…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: 13 pages, 8 figures, 2 tables

  11. arXiv:2309.03023

    cs.AI cs.LG

    Universal Preprocessing Operators for Embedding Knowledge Graphs with Literals

    Authors: Patryk Preisner, Heiko Paulheim

    Abstract: Knowledge graph embeddings are dense numerical representations of entities in a knowledge graph (KG). While the majority of approaches concentrate only on relational information, i.e., relations between entities, fewer approaches exist which also take information about literal values (e.g., textual descriptions or numerical information) into account. Those which exist are typically tailored toward…

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: Accepted for DL4KG Workshop at ISWC 2023

  12. arXiv:2309.00550

    cs.IR

    NeMig -- A Bilingual News Collection and Knowledge Graph about Migration

    Authors: Andreea Iana, Mehwish Alam, Alexander Grote, Nevena Nikolajevic, Katharina Ludwig, Philipp Müller, Christof Weinhardt, Heiko Paulheim

    Abstract: News recommendation plays a critical role in shaping the public's worldviews through the way in which it filters and disseminates information about different topics. Given the crucial impact that media plays in opinion formation, especially for sensitive topics, understanding the effects of personalized recommendation beyond accuracy has become essential in today's digital society. In this work, w…

    Submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted at the 11th International Workshop on News Recommendation and Analytics (INRA 2023) in conjunction with ACM RecSys 2023

  13. arXiv:2308.10537

    cs.AI cs.DB cs.LG

    KGrEaT: A Framework to Evaluate Knowledge Graphs via Downstream Tasks

    Authors: Nicolas Heist, Sven Hertling, Heiko Paulheim

    Abstract: In recent years, countless research papers have addressed the topics of knowledge graph creation, extension, or completion in order to create knowledge graphs that are larger, more correct, or more diverse. This research is typically motivated by the argumentation that using such enhanced knowledge graphs to solve downstream tasks will improve performance. Nonetheless, this is hardly ever evaluate…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted for the Short Paper track of CIKM'23, October 21-25, 2023, Birmingham, United Kingdom

  14. arXiv:2308.03447

    cs.AI

    Biomedical Knowledge Graph Embeddings with Negative Statements

    Authors: Rita T. Sousa, Sara Silva, Heiko Paulheim, Catia Pesquita

    Abstract: A knowledge graph is a powerful representation of real-world entities and their relations. The vast majority of these relations are defined as positive statements, but the importance of negative statements is increasingly recognized, especially under an Open World Assumption. Explicitly considering negative statements has been shown to improve performance on tasks such as entity summarization and…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: 19 pages, 4 figures

  15. arXiv:2307.16089

    cs.IR

    Train Once, Use Flexibly: A Modular Framework for Multi-Aspect Neural News Recommendation

    Authors: Andreea Iana, Goran Glavaš, Heiko Paulheim

    Abstract: Recent neural news recommenders (NNRs) extend content-based recommendation (1) by aligning additional aspects (e.g., topic, sentiment) between candidate news and user history or (2) by diversifying recommendations w.r.t. these aspects. This customization is achieved by "hardcoding" additional constraints into the NNR's architecture and/or training objectives: any change in the desired recommendati…

    Submitted 16 April, 2024; v1 submitted 29 July, 2023; originally announced July 2023.

    ACM Class: H.3.3; I.2.7

  16. arXiv:2306.03659

    cs.AI cs.LG

    Schema First! Learn Versatile Knowledge Graph Embeddings by Capturing Semantics with MASCHInE

    Authors: Nicolas Hubert, Heiko Paulheim, Pierre Monnin, Armelle Brun, Davy Monticolo

    Abstract: Knowledge graph embedding models (KGEMs) have gained considerable traction in recent years. These models learn a vector representation of knowledge graph entities and relations, a.k.a. knowledge graph embeddings (KGEs). Learning versatile KGEs is desirable as it makes them useful for a broad range of tasks. However, KGEMs are usually trained for a specific task, which makes their embeddings task-d…

    Submitted 19 October, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  17. arXiv:2305.02966

    cs.LG cs.AI

    ExeKGLib: Knowledge Graphs-Empowered Machine Learning Analytics

    Authors: Antonis Klironomos, Baifan Zhou, Zhipeng Tan, Zhuoxun Zheng, Gad-Elrab Mohamed, Heiko Paulheim, Evgeny Kharlamov

    Abstract: Many machine learning (ML) libraries are accessible online for ML practitioners. Typical ML pipelines are complex and consist of a series of steps, each of them invoking several ML libraries. In this demo paper, we present ExeKGLib, a Python library that allows users with coding skills and minimal ML knowledge to build ML pipelines. ExeKGLib relies on knowledge graphs to improve the transparency a…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: This paper has been accepted as a Demo paper at ESWC 2023

  18. arXiv:2304.03112

    cs.IR

    Simplifying Content-Based Neural News Recommendation: On User Modeling and Training Objectives

    Authors: Andreea Iana, Goran Glavaš, Heiko Paulheim

    Abstract: The advent of personalized news recommendation has given rise to increasingly complex recommender architectures. Most neural news recommenders rely on user click behavior and typically introduce dedicated user encoders that aggregate the content of clicked news into user embeddings (early fusion). These models are predominantly trained with standard point-wise classification objectives. The existi…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: Accepted at the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023)

    ACM Class: H.3.3

  19. arXiv:2303.15113

    cs.AI

    Describing and Organizing Semantic Web and Machine Learning Systems in the SWeMLS-KG

    Authors: Fajar J. Ekaputra, Majlinda Llugiqi, Marta Sabou, Andreas Ekelhart, Heiko Paulheim, Anna Breit, Artem Revenko, Laura Waltersdorfer, Kheir Eddine Farfar, Sören Auer

    Abstract: In line with the general trend in artificial intelligence research to create intelligent systems that combine learning and symbolic components, a new sub-area has emerged that focuses on combining machine learning (ML) components with techniques developed by the Semantic Web (SW) community - Semantic Web Machine Learning (SWeML for short). Due to its rapid growth and impact on several communities…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: Preprint of a paper in the resource track of the 20th Extended Semantic Web Conference (ESWC'23)

  20. arXiv:2303.04426

    cs.CL cs.AI cs.IR

    NASTyLinker: NIL-Aware Scalable Transformer-based Entity Linker

    Authors: Nicolas Heist, Heiko Paulheim

    Abstract: Entity Linking (EL) is the task of detecting mentions of entities in text and disambiguating them to a reference knowledge base. Most prevalent EL approaches assume that the reference knowledge base is complete. In practice, however, it is necessary to deal with the case of linking to an entity that is not contained in the knowledge base (NIL entity). Recent works have shown that, instead of focus…

    Submitted 13 March, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: Preprint of a paper in the research track of the 20th Extended Semantic Web Conference (ESWC'23)

  21. arXiv:2210.02864

    cs.IR

    DBkWik++ -- Multi Source Matching of Knowledge Graphs

    Authors: Sven Hertling, Heiko Paulheim

    Abstract: Large knowledge graphs like DBpedia and YAGO are always based on the same source, i.e., Wikipedia. But there are more wikis that contain information about long-tail entities such as wiki hosting platforms like Fandom. In this paper, we present the approach and analysis of DBkWik++, a fused Knowledge Graph from thousands of wikis. A modified version of the DBpedia framework is applied to each wiki…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: Published at KGSWC 2022

  22. arXiv:2210.01482

    cs.IR

    Transformer-based Subject Entity Detection in Wikipedia Listings

    Authors: Nicolas Heist, Heiko Paulheim

    Abstract: In tasks like question answering or text summarisation, it is essential to have background knowledge about the relevant entities. The information about entities - in particular, about long-tail or emerging entities - in publicly available knowledge graphs like DBpedia or CaLiGraph is far from complete. In this paper, we present an approach that exploits the semi-structured nature of listings (like…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: Published at Deep Learning for Knowledge Graphs workshop (DL4KG) at International Semantic Web Conference 2022 (ISWC 2022)

  23. arXiv:2209.07479

    cs.AI

    Gollum: A Gold Standard for Large Scale Multi Source Knowledge Graph Matching

    Authors: Sven Hertling, Heiko Paulheim

    Abstract: The number of Knowledge Graphs (KGs) generated with automatic and manual approaches is constantly growing. For an integrated view and usage, an alignment between these KGs is necessary on the schema as well as instance level. While there are approaches that try to tackle this multi source knowledge graph matching problem, large gold standards are missing to evaluate their effectiveness and scalabi…

    Submitted 16 September, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: accepted at AKBC 2022

  24. arXiv:2207.14094

    cs.CL cs.AI

    Entity Type Prediction Leveraging Graph Walks and Entity Descriptions

    Authors: Russa Biswas, Jan Portisch, Heiko Paulheim, Harald Sack, Mehwish Alam

    Abstract: The entity type information in Knowledge Graphs (KGs) such as DBpedia, Freebase, etc. is often incomplete due to automated generation or human curation. Entity typing is the task of assigning or inferring the semantic type of an entity in a KG. This paper presents GRAND, a novel approach for entity typing leveraging different graph walk strategies in RDF2vec together with textual entity d…

    Submitted 29 July, 2022; v1 submitted 28 July, 2022; originally announced July 2022.

  25. arXiv:2207.09964

    cs.AI

    On a Generalized Framework for Time-Aware Knowledge Graphs

    Authors: Franz Krause, Tobias Weller, Heiko Paulheim

    Abstract: Knowledge graphs have emerged as an effective tool for managing and standardizing semistructured domain knowledge in a human- and machine-interpretable way. In terms of graph-based domain applications, such as embeddings and graph neural networks, current research is increasingly taking into account the time-related evolution of the information encoded within a graph. Algorithms and models for sta…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted for publication at Semantics 2022

  26. arXiv:2207.06014

    cs.AI

    The DLCC Node Classification Benchmark for Analyzing Knowledge Graph Embeddings

    Authors: Jan Portisch, Heiko Paulheim

    Abstract: Knowledge graph embedding is a representation learning technique that projects entities and relations in a knowledge graph to continuous vector spaces. Embeddings have gained a lot of uptake and have been heavily used in link prediction and other downstream prediction tasks. Most approaches are evaluated on a single task or a single group of tasks to determine their overall performance. The evalua…

    Submitted 13 July, 2022; originally announced July 2022.

    Comments: Accepted at International Semantic Web Conference (ISWC) 2022

  27. arXiv:2204.13931

    cs.CL cs.AI

    KERMIT -- A Transformer-Based Approach for Knowledge Graph Matching

    Authors: Sven Hertling, Jan Portisch, Heiko Paulheim

    Abstract: One of the strongest signals for automated matching of knowledge graphs and ontologies are textual concept descriptions. With the rise of transformer-based language models, text comparison based on meaning (rather than lexical features) is available to researchers. However, performing pairwise comparisons of all textual descriptions of concepts in two knowledge graphs is expensive and scales quadr…

    Submitted 29 April, 2022; originally announced April 2022.

    Comments: accepted at the DeepOntoNLP Workshop at the ESWC 2022

  28. arXiv:2204.13329

    cs.AI

    Refining Diagnosis Paths for Medical Diagnosis based on an Augmented Knowledge Graph

    Authors: Niclas Heilig, Jan Kirchhoff, Florian Stumpe, Joan Plepi, Lucie Flek, Heiko Paulheim

    Abstract: Medical diagnosis is the process of making a prediction of the disease a patient is likely to have, given a set of symptoms and observations. This requires extensive expert knowledge, in particular when covering a large variety of diseases. Such knowledge can be coded in a knowledge graph -- encompassing diseases, symptoms, and diagnosis paths. Since both the knowledge itself and its encoding can…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: Accepted at the 5th Workshop on Semantic Web solutions for large-scale biomedical data analytics

  29. arXiv:2204.04040

    cs.AI cs.DB cs.IR cs.LG

    Ontology Matching Through Absolute Orientation of Embedding Spaces

    Authors: Jan Portisch, Guilherme Costa, Karolin Stefani, Katharina Kreplin, Michael Hladik, Heiko Paulheim

    Abstract: Ontology matching is a core task when creating interoperable and linked open datasets. In this paper, we explore a novel structure-based mapping approach which is based on knowledge graph embeddings: The ontologies to be matched are embedded, and an approach known as absolute orientation is used to align the two embedding spaces. Next to the approach, the paper presents a first, preliminary evalua…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: accepted at the ESWC Posters and Demos Track

  30. arXiv:2204.02777

    cs.LG cs.AI

    Walk this Way! Entity Walks and Property Walks for RDF2vec

    Authors: Jan Portisch, Heiko Paulheim

    Abstract: RDF2vec is a knowledge graph embedding mechanism which first extracts sequences from knowledge graphs by performing random walks, then feeds those into the word embedding algorithm word2vec for computing vector representations for entities. In this poster, we introduce two new flavors of walk extraction coined e-walks and p-walks, which put an emphasis on the structure or the neighborhood of an en…

    Submitted 5 April, 2022; originally announced April 2022.

    Comments: accepted at the ESWC Posters and Demos Track

  31. Towards Analyzing the Bias of News Recommender Systems Using Sentiment and Stance Detection

    Authors: Mehwish Alam, Andreea Iana, Alexander Grote, Katharina Ludwig, Philipp Müller, Heiko Paulheim

    Abstract: News recommender systems are used by online news providers to alleviate information overload and to provide personalized content to users. However, algorithmic news curation has been hypothesized to create filter bubbles and to intensify users' selective exposure, potentially increasing their vulnerability to polarized opinions and fake news. In this paper, we show how information on news items' s…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: Accepted at the 2nd International Workshop on Knowledge Graphs for Online Discourse Analysis (KnOD 2022) collocated with The Web Conference 2022 (WWW'22), 25-29 April 2022, Lyon, France

  32. Order Matters: Matching Multiple Knowledge Graphs

    Authors: Sven Hertling, Heiko Paulheim

    Abstract: Knowledge graphs (KGs) provide information in machine interpretable form. In cases where multiple KGs are used in the same system, that information needs to be integrated. This is usually done by automated matching systems. Most of those systems consider only 1:1 (binary) matching tasks. Thus, matching a larger number of knowledge graphs with such systems would lead to quadratic efforts. In this p…

    Submitted 3 November, 2021; originally announced November 2021.

  33. arXiv:2110.05028

    cs.AI

    The CaLiGraph Ontology as a Challenge for OWL Reasoners

    Authors: Nicolas Heist, Heiko Paulheim

    Abstract: CaLiGraph is a large-scale cross-domain knowledge graph generated from Wikipedia by exploiting the category system, list pages, and other list structures in Wikipedia, containing more than 15 million typed entities and around 10 million relation assertions. Other than knowledge graphs such as DBpedia and YAGO, whose ontologies are comparably simplistic, CaLiGraph also has a rich ontology, comprisi…

    Submitted 4 January, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: Winner of the Dataset Track of the Semantic Reasoning Evaluation Challenge at the International Semantic Web Conference (ISWC), 2021

  34. arXiv:2109.07401

    cs.CL cs.AI cs.IR cs.LG

    Matching with Transformers in MELT

    Authors: Sven Hertling, Jan Portisch, Heiko Paulheim

    Abstract: One of the strongest signals for automated matching of ontologies and knowledge graphs are the textual descriptions of the concepts. The methods that are typically applied (such as character- or token-based comparisons) are relatively simple, and therefore do not capture the actual meaning of the texts. With the rise of transformer-based language models, text comparison based on meaning (rather th…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: accepted at the Ontology Matching Workshop at the International Semantic Web Conference (ISWC 2021)

  35. arXiv:2108.05280

    cs.LG cs.AI

    Putting RDF2vec in Order

    Authors: Jan Portisch, Heiko Paulheim

    Abstract: The RDF2vec method for creating node embeddings on knowledge graphs is based on word2vec, which, in turn, is agnostic towards the position of context words. In this paper, we argue that this might be a shortcoming when training RDF2vec, and show that using a word2vec variant which respects order yields considerable performance gains especially on tasks where entities of different classes are invol…

    Submitted 11 August, 2021; originally announced August 2021.

    Comments: Accepted at the ISWC 2021 posters and demos track

  36. arXiv:2107.01856

    cs.AI cs.LG cs.MA

    Winning at Any Cost -- Infringing the Cartel Prohibition With Reinforcement Learning

    Authors: Michael Schlechtinger, Damaris Kosack, Heiko Paulheim, Thomas Fetzer

    Abstract: Pricing decisions are increasingly made by AI. Thanks to their ability to train with live market data while making decisions on the fly, deep reinforcement learning algorithms are especially effective in taking such pricing decisions. In e-commerce scenarios, multiple reinforcement learning agents can set prices based on their competitors' prices. Therefore, research states that agents might end u…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: accepted at the 19th International Conference on Practical Applications of Agents and Multi-Agent Systems (PAAMS 2021)

  37. arXiv:2107.00873

    cs.IR cs.AI cs.DB

    On-Demand and Lightweight Knowledge Graph Generation -- a Demonstration with DBpedia

    Authors: Malte Brockmeier, Yawen Liu, Sunita Pateer, Sven Hertling, Heiko Paulheim

    Abstract: Modern large-scale knowledge graphs, such as DBpedia, are datasets which require large computational resources to serve and process. Moreover, they often have longer release cycles, which leads to outdated information in those graphs. In this paper, we present DBpedia on Demand -- a system which serves DBpedia resources on demand without the need to materialize and store the entire graph, and whic…

    Submitted 2 July, 2021; originally announced July 2021.

    Comments: Accepted at Semantics 2021

  38. arXiv:2107.00001

    cs.DB cs.AI cs.LG

    Background Knowledge in Schema Matching: Strategy vs. Data

    Authors: Jan Portisch, Michael Hladik, Heiko Paulheim

    Abstract: The use of external background knowledge can be beneficial for the task of matching schemas or ontologies automatically. In this paper, we exploit six general-purpose knowledge graphs as sources of background knowledge for the matching task. The background sources are evaluated by applying three different exploitation strategies. We find that explicit strategies still outperform latent ones and th…

    Submitted 29 June, 2021; originally announced July 2021.

    Comments: accepted at the International Semantic Web Conference '21 (ISWC 2021)

  39. GraphConfRec: A Graph Neural Network-Based Conference Recommender System

    Authors: Andreea Iana, Heiko Paulheim

    Abstract: In today's academic publishing model, especially in Computer Science, conferences commonly constitute the main platforms for releasing the latest peer-reviewed advancements in their respective fields. However, choosing a suitable academic venue for publishing one's research can represent a challenging task considering the plethora of available conferences, particularly for those at the start of th…

    Submitted 23 June, 2021; originally announced June 2021.

    Comments: Accepted at the Joint Conference on Digital Libraries (JCDL 2021)

    Journal ref: 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2021, pp. 90-99

  40. arXiv:2105.01305

    cs.AI cs.CL

    Large-scale Taxonomy Induction Using Entity and Word Embeddings

    Authors: Petar Ristoski, Stefano Faralli, Simone Paolo Ponzetto, Heiko Paulheim

    Abstract: Taxonomies are an important ingredient of knowledge organization, and serve as a backbone for more sophisticated knowledge representations in intelligent systems, such as formal ontologies. However, building taxonomies manually is a costly endeavor, and hence, automatic methods for taxonomy induction are a good alternative to build large-scale taxonomies. In this paper, we propose TIEmb, an approa…

    Submitted 4 May, 2021; originally announced May 2021.

    Comments: Published at IEEE/WIC/ACM International Conference on Web Intelligence 2017 (WI'17)

  41. arXiv:2105.00674

    cs.IR cs.AI

    Bias in Knowledge Graphs -- an Empirical Study with Movie Recommendation and Different Language Editions of DBpedia

    Authors: Michael Matthias Voit, Heiko Paulheim

    Abstract: Public knowledge graphs such as DBpedia and Wikidata have been recognized as interesting sources of background knowledge to build content-based recommender systems. They can be used to add information about the items to be recommended and links between those. While quite a few approaches for exploiting knowledge graphs have been proposed, most of them aim at optimizing the recommendation strategy…

    Submitted 3 May, 2021; originally announced May 2021.

    Comments: Accepted for publication at 3rd Conference on Language, Data and Knowledge (LDK 2021)

  42. arXiv:2103.05110

    cs.CV

    Web Table Classification based on Visual Features

    Authors: Babette Bühler, Heiko Paulheim

    Abstract: Tables on the web constitute a valuable data source for many applications, like factual search and knowledge base augmentation. However, as genuine tables containing relational knowledge only account for a small proportion of tables on the web, reliable genuine web table classification is a crucial first step of table extraction. Previous works usually rely on explicit feature construction from th…

    Submitted 25 February, 2021; originally announced March 2021.

    Comments: Accepted at International Conference of Web Engineering (ICWE 2021)

  43. arXiv:2103.01576  [pdf, other]

    cs.LG cs.CL cs.IR

    FinMatcher at FinSim-2: Hypernym Detection in the Financial Services Domain using Knowledge Graphs

    Authors: Jan Portisch, Michael Hladik, Heiko Paulheim

    Abstract: This paper presents the FinMatcher system and its results for the FinSim 2021 shared task which is co-located with the Workshop on Financial Technology on the Web (FinWeb) in conjunction with The Web Conference. The FinSim-2 shared task consists of a set of concept labels from the financial services domain. The goal is to find the most relevant top-level concept from a given set of concepts. The F…

    Submitted 2 March, 2021; originally announced March 2021.

  44. Information Extraction From Co-Occurring Similar Entities

    Authors: Nicolas Heist, Heiko Paulheim

    Abstract: Knowledge about entities and their interrelations is a crucial factor of success for tasks like question answering or text summarization. Publicly available knowledge graphs like Wikidata or DBpedia are, however, far from being complete. In this paper, we explore how information extracted from similar entities that co-occur in structures like tables or lists can help to increase the coverage of su…

    Submitted 15 February, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: Preprint of a paper accepted for the research track of the Web Conference (WWW'21), April 19-23, 2021, Ljubljana, Slovenia

  45. arXiv:2012.11936  [pdf, other]

    cs.AI

    Knowledge Graphs Evolution and Preservation -- A Technical Report from ISWS 2019

    Authors: Nacira Abbas, Kholoud Alghamdi, Mortaza Alinam, Francesca Alloatti, Glenda Amaral, Claudia d'Amato, Luigi Asprino, Martin Beno, Felix Bensmann, Russa Biswas, Ling Cai, Riley Capshaw, Valentina Anita Carriero, Irene Celino, Amine Dadoun, Stefano De Giorgis, Harm Delva, John Domingue, Michel Dumontier, Vincent Emonet, Marieke van Erp, Paola Espinoza Arias, Omaima Fallatah, Sebastián Ferrada, Marc Gallofré Ocaña, et al. (49 additional authors not shown)

    Abstract: One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a: "Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. [...] This grand challenge extends this fur…

    Submitted 22 December, 2020; originally announced December 2020.

  46. arXiv:2009.11102  [pdf, other]

    cs.AI cs.DB cs.LG

    Supervised Ontology and Instance Matching with MELT

    Authors: Sven Hertling, Jan Portisch, Heiko Paulheim

    Abstract: In this paper, we present MELT-ML, a machine learning extension to the Matching and EvaLuation Toolkit (MELT) which facilitates the application of supervised learning for ontology and instance matching. Our contributions are twofold: We present an open source machine learning extension to the matching toolkit as well as two supervised learning use cases demonstrating the capabilities of the new ex…

    Submitted 20 September, 2020; originally announced September 2020.

    Comments: Accepted at the Fifteenth International Workshop on Ontology Matching collocated with the 19th International Semantic Web Conference ISWC-2020

  47. arXiv:2009.07659  [pdf, other]

    cs.AI cs.CL

    RDF2Vec Light -- A Lightweight Approach for Knowledge Graph Embeddings

    Authors: Jan Portisch, Michael Hladik, Heiko Paulheim

    Abstract: Knowledge graph embedding approaches represent nodes and edges of graphs as mathematical vectors. Current approaches focus on embedding complete knowledge graphs, i.e. all nodes and edges. This leads to very high computational requirements on large graphs such as DBpedia or Wikidata. However, for most downstream application scenarios, only a small subset of concepts is of actual interest. In this…

    Submitted 17 September, 2020; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: Accepted at International Semantic Web Conference (ISWC) 2020, Posters and Demonstrations Track

  48. arXiv:2009.04404  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Walk Extraction Strategies for Node Embeddings with RDF2Vec in Knowledge Graphs

    Authors: Gilles Vandewiele, Bram Steenwinckel, Pieter Bonte, Michael Weyns, Heiko Paulheim, Petar Ristoski, Filip De Turck, Femke Ongenae

    Abstract: As KGs are symbolic constructs, specialized techniques have to be applied in order to make them compatible with data mining techniques. RDF2Vec is an unsupervised technique that can create task-agnostic numerical representations of the nodes in a KG by extending successful language modelling techniques. The original work proposed the Weisfeiler-Lehman (WL) kernel to improve the quality of the repr…

    Submitted 9 September, 2020; originally announced September 2020.

  49. arXiv:2009.00318  [pdf, other]

    cs.AI

    More is not Always Better: The Negative Impact of A-box Materialization on RDF2vec Knowledge Graph Embeddings

    Authors: Andreea Iana, Heiko Paulheim

    Abstract: RDF2vec is an embedding technique for representing knowledge graph entities in a continuous vector space. In this paper, we investigate the effect of materializing implicit A-box axioms induced by subproperties, as well as symmetric and transitive properties. While it might be a reasonable assumption that such a materialization before computing embeddings might lead to better embeddings, we conduc…

    Submitted 1 September, 2020; originally announced September 2020.

    Comments: Accepted at the Workshop on Combining Symbolic and Sub-symbolic methods and their Applications (CSSA 2020)

  50. arXiv:2008.05239  [pdf, other]

    cs.AI cs.DB

    A Knowledge Graph for Assessing Aggressive Tax Planning Strategies

    Authors: Niklas Lüdemann, Ageda Shiba, Nikolaos Thymianis, Nicolas Heist, Christopher Ludwig, Heiko Paulheim

    Abstract: The taxation of multi-national companies is a complex field, since it is influenced by the legislation of several states. Laws in different states may have unforeseen interaction effects, which can be exploited by allowing multinational companies to minimize taxes, a concept known as tax planning. In this paper, we present a knowledge graph of multinational companies and their relationships, compr…

    Submitted 16 October, 2020; v1 submitted 12 August, 2020; originally announced August 2020.

    Comments: Accepted to International Semantic Web Conference 2020