
Informed Pre-Training on Prior Knowledge

Authors: Laura von Rueden, Sebastian Houben, Kostadin Cvejoski, Christian Bauckhage, Nico Piatkowski

Abstract: When training data is scarce, the incorporation of additional prior knowledge can assist the learning process. While it is common to initialize neural networks with weights that have been pre-trained on other large data sets, pre-training on more concise forms of knowledge has rather been overlooked. In this paper, we propose a novel informed machine learning approach and suggest to pre-train on prior knowledge. Formal knowledge representations, e.g. graphs or equations, are first transformed into a small and condensed data set of knowledge prototypes. We show that informed pre-training on such knowledge prototypes (i) speeds up the learning processes, (ii) improves generalization capabilities in the regime where not enough training data is available, and (iii) increases model robustness. Analyzing which parts of the model are affected most by the prototypes reveals that improvements come from deeper layers that typically represent high-level features. This confirms that informed pre-training can indeed transfer semantic knowledge. This is a novel effect, which shows that knowledge-based pre-training has additional and complementary strengths to existing approaches.

Submitted 23 May, 2022; originally announced May 2022.

arXiv:2205.04712

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Authors: Julian Wörmann, Daniel Bogdoll, Christian Brunner, Etienne Bührle, Han Chen, Evaristus Fuh Chuo, Kostadin Cvejoski, Ludger van Elst, Philip Gottschall, Stefan Griesche, Christian Hellert, Christian Hesels, Sebastian Houben, Tim Joseph, Niklas Keil, Johann Kelsch, Mert Keser, Hendrik Königshof, Erwin Kraft, Leonie Kreuser, Kevin Krone, Tobias Latka, Denny Mattern, Stefan Matthes, Franz Motzkus, et al. (27 additional authors not shown)

Abstract: The availability of representative datasets is an essential prerequisite for many successful artificial intelligence and machine learning models. However, in real life applications these models often encounter scenarios that are inadequately represented in the data used for training. There are various reasons for the absence of sufficient data, ranging from time and cost constraints to ethical considerations. As a consequence, the reliable usage of these models, especially in safety-critical applications, is still a tremendous challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches. Knowledge augmented machine learning approaches offer the possibility of compensating for deficiencies, errors, or ambiguities in the data, thus increasing the generalization capability of the applied models. Even more, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-driven models with existing knowledge. The identified approaches are structured according to the categories knowledge integration, extraction and conformity. In particular, we address the application of the presented methods in the field of autonomous driving.

Submitted 20 November, 2023; v1 submitted 10 May, 2022; originally announced May 2022.

Comments: 111 pages, Added section on Run-time Network Verification

arXiv:2112.10712

Evolutionary Hierarchical Harvest Schedule Optimization for Food Waste Prevention

Authors: Maurice Günder, Nico Piatkowski, Laura von Rueden, Rafet Sifa, Christian Bauckhage

Abstract: In order to avoid disadvantages of monocropping for soil and environment, it is advisable to practice intercropping of various plant species whenever possible. However, intercropping is challenging as it requires a balanced planting schedule due to individual cultivation time frames. Maintaining a continuous harvest reduces logistical costs and related greenhouse gas emissions, and contributes to food waste prevention. In this work, we address these issues and propose an optimization method for a full harvest season of large crop ensembles that complies with given constraints. By using an approach based on an evolutionary algorithm combined with a novel hierarchical loss function and adaptive mutation rate, we transfer the multi-objective into a pseudo-single-objective optimization problem and obtain faster convergence and better solutions than for conventional approaches.

Submitted 20 December, 2021; originally announced December 2021.

Comments: 4 pages, AAAI-2022 Workshop AI for Agriculture and Food Systems (AIAFS)

arXiv:2105.10172

Explainable Machine Learning with Prior Knowledge: An Overview

Authors: Katharina Beckh, Sebastian Müller, Matthias Jakobs, Vanessa Toborek, Hanxiao Tan, Raphael Fischer, Pascal Welke, Sebastian Houben, Laura von Rueden

Abstract: This survey presents an overview of integrating prior knowledge into machine learning systems in order to improve explainability. The complexity of machine learning models has elicited research to make them more explainable. However, most explainability methods cannot provide insight beyond the given data, requiring additional information about the context. We propose to harness prior knowledge to improve upon the explanation capabilities of machine learning models. In this paper, we present a categorization of current research into three main categories which either integrate knowledge into the machine learning pipeline, into the explainability method or derive knowledge from explanations. To classify the papers, we build upon the existing taxonomy of informed machine learning and extend it from the perspective of explainability. We conclude with open challenges and research directions.

Submitted 21 May, 2021; originally announced May 2021.

arXiv:2104.07538

Street-Map Based Validation of Semantic Segmentation in Autonomous Driving

Authors: Laura von Rueden, Tim Wirtz, Fabian Hueger, Jan David Schneider, Nico Piatkowski, Christian Bauckhage

Abstract: Artificial intelligence for autonomous driving must meet strict requirements on safety and robustness, which motivates the thorough validation of learned models. However, current validation approaches mostly require ground truth data and are thus both cost-intensive and limited in their applicability. We propose to overcome these limitations by a model agnostic validation using a-priori knowledge from street maps. In particular, we show how to validate semantic segmentation masks and demonstrate the potential of our approach using OpenStreetMap. We introduce validation metrics that indicate false positive or negative road segments. Besides the validation approach, we present a method to correct the vehicle's GPS position so that a more accurate localization can be used for the street-map based validation. Lastly, we present quantitative results on the Cityscapes dataset indicating that our validation approach can indeed uncover errors in semantic segmentation masks.

Submitted 15 April, 2021; originally announced April 2021.

Comments: Final version accepted at the International Conference on Pattern Recognition (ICPR). arXiv admin note: substantial text overlap with arXiv:2011.08008

arXiv:2011.08008

Towards Map-Based Validation of Semantic Segmentation Masks

Authors: Laura von Rueden, Tim Wirtz, Fabian Hueger, Jan David Schneider, Christian Bauckhage

Abstract: Artificial intelligence for autonomous driving must meet strict requirements on safety and robustness. We propose to validate machine learning models for self-driving vehicles not only with given ground truth labels, but also with additional a-priori knowledge. In particular, we suggest to validate the drivable area in semantic segmentation masks using given street map data. We present first results, which indicate that prediction errors can be uncovered by map-based validation.

Submitted 26 November, 2020; v1 submitted 3 November, 2020; originally announced November 2020.

arXiv:1903.12394

doi 10.1109/TKDE.2021.3079836

Informed Machine Learning -- A Taxonomy and Survey of Integrating Knowledge into Learning Systems

Authors: Laura von Rueden, Sebastian Mayer, Katharina Beckh, Bogdan Georgiev, Sven Giesselbach, Raoul Heese, Birgit Kirsch, Julius Pfrommer, Annika Pick, Rajkumar Ramamurthy, Michal Walczak, Jochen Garcke, Christian Bauckhage, Jannis Schuecker

Abstract: Despite its great success, machine learning can have its limits when dealing with insufficient training data. A potential solution is the additional integration of prior knowledge into the training process which leads to the notion of informed machine learning. In this paper, we present a structured overview of various approaches in this field. We provide a definition and propose a concept for informed machine learning which illustrates its building blocks and distinguishes it from conventional machine learning. We introduce a taxonomy that serves as a classification framework for informed machine learning approaches. It considers the source of knowledge, its representation, and its integration into the machine learning pipeline. Based on this taxonomy, we survey related research and describe how different knowledge representations such as algebraic equations, logic rules, or simulation results can be used in learning systems. This evaluation of numerous papers on the basis of our taxonomy uncovers key methods in the field of informed machine learning.

Submitted 28 May, 2021; v1 submitted 29 March, 2019; originally announced March 2019.

Comments: Accepted at IEEE Transactions on Knowledge and Data Engineering: https://ieeexplore.ieee.org/document/9429985

Showing 1–7 of 7 results for author: von Rueden, L