-
A Data-Centric Perspective on Evaluating Machine Learning Models for Tabular Data
Authors:
Andrej Tschalzev,
Sascha Marton,
Stefan Lüdtke,
Christian Bartelt,
Heiner Stuckenschmidt
Abstract:
Tabular data is prevalent in real-world machine learning applications, and new models for supervised learning of tabular data are frequently proposed. Comparative studies assessing the performance of models typically consist of model-centric evaluation setups with overly standardized data preprocessing. This paper demonstrates that such model-centric evaluations are biased, as real-world modeling pipelines often require dataset-specific preprocessing and feature engineering. Therefore, we propose a data-centric evaluation framework. We select 10 relevant datasets from Kaggle competitions and implement expert-level preprocessing pipelines for each dataset. We conduct experiments with different preprocessing pipelines and hyperparameter optimization (HPO) regimes to quantify the impact of model selection, HPO, feature engineering, and test-time adaptation. Our main findings are: 1. After dataset-specific feature engineering, model rankings change considerably, performance differences decrease, and the importance of model selection reduces. 2. Recent models, despite their measurable progress, still significantly benefit from manual feature engineering. This holds true for both tree-based models and neural networks. 3. While tabular data is typically considered static, samples are often collected over time, and adapting to distribution shifts can be important even in supposedly static data. These insights suggest that research efforts should be directed toward a data-centric perspective, acknowledging that tabular data requires feature engineering and often exhibits temporal characteristics.
Submitted 2 July, 2024;
originally announced July 2024.
-
Enabling Mixed Effects Neural Networks for Diverse, Clustered Data Using Monte Carlo Methods
Authors:
Andrej Tschalzev,
Paul Nitschke,
Lukas Kirchdorfer,
Stefan Lüdtke,
Christian Bartelt,
Heiner Stuckenschmidt
Abstract:
Neural networks often assume independence among input data samples, disregarding correlations arising from inherent clustering patterns in real-world datasets (e.g., due to different sites or repeated measurements). Recently, mixed effects neural networks (MENNs) which separate cluster-specific 'random effects' from cluster-invariant 'fixed effects' have been proposed to improve generalization and interpretability for clustered data. However, existing methods only allow for approximate quantification of cluster effects and are limited to regression and binary targets with only one clustering feature. We present MC-GMENN, a novel approach employing Monte Carlo methods to train Generalized Mixed Effects Neural Networks. We empirically demonstrate that MC-GMENN outperforms existing mixed effects deep learning models in terms of generalization performance, time complexity, and quantification of inter-cluster variance. Additionally, MC-GMENN is applicable to a wide range of datasets, including multi-class classification tasks with multiple high-cardinality categorical features. For these datasets, we show that MC-GMENN outperforms conventional encoding and embedding methods, simultaneously offering a principled methodology for interpreting the effects of clustering patterns.
Submitted 1 July, 2024;
originally announced July 2024.
-
Interlinking User Stories and GUI Prototyping: A Semi-Automatic LLM-based Approach
Authors:
Kristian Kolthoff,
Felix Kretzer,
Christian Bartelt,
Alexander Maedche,
Simone Paolo Ponzetto
Abstract:
Interactive systems are omnipresent today and the need to create graphical user interfaces (GUIs) is just as ubiquitous. For the elicitation and validation of requirements, GUI prototyping is a well-known and effective technique, typically employed after gathering initial user requirements represented in natural language (NL) (e.g., in the form of user stories). Unfortunately, GUI prototyping often requires extensive resources, resulting in a costly and time-consuming process. Despite various easy-to-use prototyping tools in practice, there is often a lack of adequate resources for developing GUI prototypes based on given user requirements. In this work, we present a novel Large Language Model (LLM)-based approach providing assistance for validating the implementation of functional NL-based requirements in a GUI prototype embedded in a prototyping tool. In particular, our approach aims to detect functional user stories that are not implemented in a GUI prototype and provides recommendations for suitable GUI components directly implementing the requirements. We collected requirements for existing GUIs in the form of user stories and evaluated our proposed validation and recommendation approach with this dataset. The obtained results are promising for user story validation and we demonstrate feasibility for the GUI component recommendations.
Submitted 12 June, 2024;
originally announced June 2024.
-
AbstractBeam: Enhancing Bottom-Up Program Synthesis using Library Learning
Authors:
Janis Zenkner,
Lukas Dierkes,
Tobias Sesterhenn,
Christian Bartelt
Abstract:
LambdaBeam is a state-of-the-art execution-guided algorithm for program synthesis that incorporates higher-order functions, lambda functions, and iterative loops into the Domain-Specific Language (DSL). LambdaBeam generates every program from the start. Yet, many program blocks or subprograms occur frequently in a given domain, e.g., loops to traverse a list. Thus, repeating programs can be used to enhance the synthesis algorithm. However, LambdaBeam fails to leverage this potential. For this purpose, we introduce AbstractBeam: A novel program synthesis framework that employs Library Learning to identify such program repetitions, integrates them into the DSL, and thus utilizes their potential to boost LambdaBeam's synthesis algorithm. Our experimental evaluations demonstrate that AbstractBeam significantly improves LambdaBeam's performance in the LambdaBeam integer list manipulation domain. Additionally, AbstractBeam's program generation is more efficient compared to LambdaBeam's synthesis. Finally, our findings indicate that Library Learning is effective in domains not specifically crafted to highlight its benefits.
Submitted 27 May, 2024;
originally announced May 2024.
-
DSEG-LIME: Improving Image Explanation by Hierarchical Data-Driven Segmentation
Authors:
Patrick Knab,
Sascha Marton,
Christian Bartelt
Abstract:
Explainable Artificial Intelligence is critical in unraveling decision-making processes in complex machine learning models. LIME (Local Interpretable Model-agnostic Explanations) is a well-known XAI framework for image analysis. It utilizes image segmentation to create features to identify relevant areas for classification. Consequently, poor segmentation can compromise the consistency of the explanation and undermine the importance of the segments, affecting the overall interpretability. Addressing these challenges, we introduce DSEG-LIME (Data-Driven Segmentation LIME), featuring: i) a data-driven segmentation for human-recognized feature generation, and ii) a hierarchical segmentation procedure through composition. We benchmark DSEG-LIME on pre-trained models with images from the ImageNet dataset - scenarios without domain-specific knowledge. The analysis includes a quantitative evaluation using established XAI metrics, complemented by a qualitative assessment through a user study. Our findings demonstrate that DSEG outperforms in most of the XAI metrics and enhances the alignment of explanations with human-recognized concepts, significantly improving interpretability. The code is available under: https://github.com/patrick-knab/DSEG-LIME
Submitted 27 May, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task
Authors:
Jannik Brinkmann,
Abhay Sheshadri,
Victor Levoso,
Paul Swoboda,
Christian Bartelt
Abstract:
Transformers demonstrate impressive performance on a range of reasoning benchmarks. To evaluate the degree to which these abilities are a result of actual reasoning, existing work has focused on developing sophisticated benchmarks for behavioral studies. However, these studies do not provide insights into the internal mechanisms driving the observed capabilities. To improve our understanding of the internal mechanisms of transformers, we present a comprehensive mechanistic analysis of a transformer trained on a synthetic reasoning task. We identify a set of interpretable mechanisms the model uses to solve the task, and validate our findings using correlational and causal evidence. Our results suggest that it implements a depth-bounded recurrent mechanism that operates in parallel and stores intermediate results in selected token positions. We anticipate that the motifs we identified in our synthetic setting can provide valuable insights into the broader operating principles of transformers and thus provide a basis for understanding more complex models.
Submitted 29 June, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data
Authors:
Sascha Marton,
Stefan Lüdtke,
Christian Bartelt,
Heiner Stuckenschmidt
Abstract:
Despite the success of deep learning for text and image data, tree-based ensemble models are still state-of-the-art for machine learning with heterogeneous tabular data. However, there is a significant need for tabular-specific gradient-based methods due to their high flexibility. In this paper, we propose GRANDE, GRAdieNt-Based Decision Tree Ensembles, a novel approach for learning hard, axis-aligned decision tree ensembles using end-to-end gradient descent. GRANDE is based on a dense representation of tree ensembles, which makes it possible to use backpropagation with a straight-through operator to jointly optimize all model parameters. Our method combines axis-aligned splits, which are a useful inductive bias for tabular data, with the flexibility of gradient-based optimization. Furthermore, we introduce an advanced instance-wise weighting that facilitates learning representations for both simple and complex relations within a single model. We conducted an extensive evaluation on a predefined benchmark with 19 classification datasets and demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets. The method is available under: https://github.com/s-marton/GRANDE
Submitted 12 March, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Emergent Software Service Platform and its Application in a Smart Mobility Setting
Authors:
Nils Wilken,
Christoph Knieke,
Eric Nyakam,
Andreas Rausch,
Christian Schindler,
Christian Bartelt,
Nikolaus Ziebura
Abstract:
The development dynamics of digital innovations for industry, business, and society are producing complex system conglomerates that can no longer be designed centrally and hierarchically in classic development processes. Instead, systems are evolving in DevOps processes in which heterogeneous actors act together on an open platform. Influencing and controlling such dynamically and autonomously changing system landscapes is currently a major challenge and a fundamental interest of service users and providers, as well as operators of the platform infrastructures. In this paper, we propose an architecture for such an emergent software service platform. A software platform that implements this architecture with the underlying engineering methodology is demonstrated by a smart parking lot scenario.
Submitted 5 February, 2024; v1 submitted 16 August, 2023;
originally announced August 2023.
-
A Multidimensional Analysis of Social Biases in Vision Transformers
Authors:
Jannik Brinkmann,
Paul Swoboda,
Christian Bartelt
Abstract:
The embedding spaces of image models have been shown to encode a range of social biases such as racism and sexism. Here, we investigate specific factors that contribute to the emergence of these biases in Vision Transformers (ViT). Therefore, we measure the impact of training data, model architecture, and training objectives on social biases in the learned representations of ViTs. Our findings indicate that counterfactual augmentation training using diffusion-based image editing can mitigate biases, but does not eliminate them. Moreover, we find that larger models are less biased than smaller models, and that models trained using discriminative objectives are less biased than those trained using generative objectives. In addition, we observe inconsistencies in the learned social biases. To our surprise, ViTs can exhibit opposite biases when trained on the same data set using different self-supervised objectives. Our findings give insights into the factors that contribute to the emergence of social biases and suggest that we could achieve substantial fairness improvements based on model design choices.
Submitted 3 August, 2023;
originally announced August 2023.
-
Planning Landmark Based Goal Recognition Revisited: Does Using Initial State Landmarks Make Sense?
Authors:
Nils Wilken,
Lea Cohausz,
Christian Bartelt,
Heiner Stuckenschmidt
Abstract:
Goal recognition is an important problem in many application domains (e.g., pervasive computing, intrusion detection, computer games, etc.). In many application scenarios, it is important that goal recognition algorithms can recognize goals of an observed agent as fast as possible. However, many early approaches in the area of Plan Recognition As Planning require quite large amounts of computation time to calculate a solution. Mainly to address this issue, Pereira et al. recently developed an approach that is based on planning landmarks and is much more computationally efficient than previous approaches. However, the approach, as proposed by Pereira et al., also uses trivial landmarks (i.e., facts that are part of the initial state and goal description are landmarks by definition). In this paper, we show that it does not provide any benefit to use landmarks that are part of the initial state in a planning landmark based goal recognition approach. The empirical results show that omitting initial state landmarks for goal recognition improves goal recognition performance.
Submitted 10 November, 2023; v1 submitted 27 June, 2023;
originally announced June 2023.
-
GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent
Authors:
Sascha Marton,
Stefan Lüdtke,
Christian Bartelt,
Heiner Stuckenschmidt
Abstract:
Decision Trees (DTs) are commonly used for many machine learning tasks due to their high degree of interpretability. However, learning a DT from data is a difficult optimization problem, as it is non-convex and non-differentiable. Therefore, common approaches learn DTs using a greedy growth algorithm that minimizes the impurity locally at each internal node. Unfortunately, this greedy procedure can lead to inaccurate trees. In this paper, we present a novel approach for learning hard, axis-aligned DTs with gradient descent. The proposed method uses backpropagation with a straight-through operator on a dense DT representation, to jointly optimize all tree parameters. Our approach outperforms existing methods on binary classification benchmarks and achieves competitive results for multi-class tasks. The method is available under: https://github.com/s-marton/GradTree
Submitted 12 March, 2024; v1 submitted 5 May, 2023;
originally announced May 2023.
-
Leveraging Planning Landmarks for Hybrid Online Goal Recognition
Authors:
Nils Wilken,
Lea Cohausz,
Johannes Schaum,
Stefan Lüdtke,
Christian Bartelt,
Heiner Stuckenschmidt
Abstract:
Goal recognition is an important problem in many application domains (e.g., pervasive computing, intrusion detection, computer games, etc.). In many application scenarios, it is important that goal recognition algorithms can recognize goals of an observed agent as fast as possible and with minimal domain knowledge. Hence, in this paper, we propose a hybrid method for online goal recognition that combines a symbolic planning landmark based approach and a data-driven goal recognition approach and evaluate it in a real-world cooking scenario. The empirical results show that the proposed method is not only significantly more efficient in terms of computation time than the state-of-the-art but also improves goal recognition performance. Furthermore, we show that the utilized planning landmark based approach, which was so far only evaluated on artificial benchmark domains, also achieves good recognition performance when applied to a real-world cooking scenario.
Submitted 25 January, 2023;
originally announced January 2023.
-
Outlier Explanation via Sum-Product Networks
Authors:
Stefan Lüdtke,
Christian Bartelt,
Heiner Stuckenschmidt
Abstract:
Outlier explanation is the task of identifying a set of features that distinguish a sample from normal data, which is important for downstream (human) decision-making. Existing methods are based on beam search in the space of feature subsets. They quickly become computationally expensive, as they require running an outlier detection algorithm from scratch for each feature subset. To alleviate this problem, we propose a novel outlier explanation algorithm based on Sum-Product Networks (SPNs), a class of probabilistic circuits. Our approach leverages the tractability of marginal inference in SPNs to compute outlier scores in feature subsets. By using SPNs, it becomes feasible to perform backward elimination instead of the usual forward beam search, which is less susceptible to missing relevant features in an explanation, especially when the number of features is large. We empirically show that our approach achieves state-of-the-art results for outlier explanation, outperforming recent search-based as well as deep learning-based explanation methods.
Submitted 18 July, 2022;
originally announced July 2022.
-
Explaining Neural Networks without Access to Training Data
Authors:
Sascha Marton,
Stefan Lüdtke,
Christian Bartelt,
Andrej Tschalzev,
Heiner Stuckenschmidt
Abstract:
We consider generating explanations for neural networks in cases where the network's training data is not accessible, for instance due to privacy or safety issues. Recently, $\mathcal{I}$-Nets have been proposed as a sample-free approach to post-hoc, global model interpretability that does not require access to training data. They formulate interpretation as a machine learning task that maps network representations (parameters) to a representation of an interpretable function. In this paper, we extend the $\mathcal{I}$-Net framework to the cases of standard and soft decision trees as surrogate models. We propose a suitable decision tree representation and design of the corresponding $\mathcal{I}$-Net output layers. Furthermore, we make $\mathcal{I}$-Nets applicable to real-world tasks by considering more realistic distributions when generating the $\mathcal{I}$-Net's training data. We empirically evaluate our approach against traditional global, post-hoc interpretability approaches and show that it achieves superior results when the training data is not accessible.
Submitted 10 June, 2022;
originally announced June 2022.
-
Exchangeability-Aware Sum-Product Networks
Authors:
Stefan Lüdtke,
Christian Bartelt,
Heiner Stuckenschmidt
Abstract:
Sum-Product Networks (SPNs) are expressive probabilistic models that provide exact, tractable inference. They achieve this efficiency by making use of local independence. On the other hand, mixtures of exchangeable variable models (MEVMs) are a class of tractable probabilistic models that make use of exchangeability of discrete random variables to render inference tractable. Exchangeability, which arises naturally in relational domains, has not been considered for efficient representation and inference in SPNs yet. The contribution of this paper is a novel probabilistic model which we call Exchangeability-Aware Sum-Product Networks (XSPNs). It contains both SPNs and MEVMs as special cases, and combines the ability of SPNs to efficiently learn deep probabilistic models with the ability of MEVMs to efficiently handle exchangeable random variables. We introduce a structure learning algorithm for XSPNs and empirically show that they can be more accurate than conventional SPNs when the data contains repeated, interchangeable parts.
Submitted 28 April, 2022; v1 submitted 11 October, 2021;
originally announced October 2021.
-
xRAI: Explainable Representations through AI
Authors:
Christian Bartelt,
Sascha Marton,
Heiner Stuckenschmidt
Abstract:
We present xRAI, an approach for extracting symbolic representations of the mathematical function a neural network was supposed to learn from the trained network. The approach is based on the idea of training a so-called interpretation network that receives the weights and biases of the trained network as input and outputs a numerical representation of the function the network was supposed to learn, which can be directly translated into a symbolic representation. We show that interpretation nets for different classes of functions can be trained on synthetic data offline, using Boolean functions and low-order polynomials as examples. We show that the training is rather efficient and the quality of the results is promising. Our work aims to provide a contribution to the problem of better understanding neural decision making by making the target function explicit.
Submitted 10 December, 2020;
originally announced December 2020.
-
Orchestration of Global Software Engineering Projects
Authors:
Christian Bartelt,
Manfred Broy,
Christoph Herrmann,
Eric Knauss,
Marco Kuhrmann,
Andreas Rausch,
Bernhard Rumpe,
Kurt Schneider
Abstract:
Global software engineering has become a fact in many companies due to real necessity in practice. In contrast to co-located projects global projects face a number of additional software engineering challenges. Among them quality management has become much more difficult and schedule and budget overruns can be observed more often. Compared to co-located projects global software engineering is even more challenging due to the need for integration of different cultures, different languages, and different time zones - across companies, and across countries. The diversity of development locations on several levels seriously endangers an effective and goal-oriented progress of projects. In this position paper we discuss reasons for global development, sketch settings for distribution and views of orchestration of dislocated companies in a global project that can be seen as a "virtual project environment". We also present a collection of questions, which we consider relevant for global software engineering. The questions motivate further discussion to derive a research agenda in global software engineering.
Submitted 22 September, 2014;
originally announced September 2014.