Search | arXiv e-print repository

GradCheck: Analyzing classifier guidance gradients for conditional diffusion sampling

Authors: Philipp Vaeth, Alexander M. Fruehwald, Benjamin Paassen, Magda Gregorova

Abstract: To sample from an unconditionally trained Denoising Diffusion Probabilistic Model (DDPM), classifier guidance adds conditional information during sampling, but the gradients from classifiers, especially those not trained on noisy images, are often unstable. This study conducts a gradient analysis comparing robust and non-robust classifiers, as well as multiple gradient stabilization techniques. Experimental results demonstrate that these techniques significantly improve the quality of class-conditional samples for non-robust classifiers by providing more stable and informative classifier guidance gradients. The findings highlight the importance of gradient stability in enhancing the performance of classifier guidance, especially on non-robust classifiers.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2405.01114 [pdf, other]

Continual Imitation Learning for Prosthetic Limbs

Authors: Sharmita Dey, Benjamin Paassen, Sarath Ravindran Nair, Sabri Boughorbel, Arndt F. Schilling

Abstract: Lower limb amputations and neuromuscular impairments severely restrict mobility, necessitating advancements beyond conventional prosthetics. Motorized bionic limbs offer promise, but their utility depends on mimicking the evolving synergy of human movement in various settings. In this context, we present a novel model for bionic prostheses' application that leverages camera-based motion capture and wearable sensor data, to learn the synergistic coupling of the lower limbs during human locomotion, empowering it to infer the kinematic behavior of a missing lower limb across varied tasks, such as climbing inclines and stairs. We propose a model that can multitask, adapt continually, anticipate movements, and refine. The core of our method lies in an approach which we call -- multitask prospective rehearsal -- that anticipates and synthesizes future movements based on the previous prediction and employs a corrective mechanism for subsequent predictions. We design an evolving architecture that merges lightweight, task-specific modules on a shared backbone, ensuring both specificity and scalability. We empirically validate our model against various baselines using real-world human gait datasets, including experiments with transtibial amputees, which encompass a broad spectrum of locomotion tasks. The results show that our approach consistently outperforms baseline models, particularly under scenarios affected by distributional shifts, adversarial perturbations, and noise.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2308.06100 [pdf, other]

Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation

Authors: Philipp Vaeth, Alexander M. Fruehwald, Benjamin Paassen, Magda Gregorova

Abstract: Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality. However, it is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small-scale user studies. In this work, we propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used. We use this framework to explore the effects of certain crucial design choices in the latest diffusion-based generative models for VCEs of natural image classification (ImageNet). We conduct a battery of ablation-like experiments, generating thousands of VCEs for a suite of classifiers of various complexity, accuracy and robustness. Our findings suggest multiple directions for future advancements and improvements of VCE methods. By sharing our methodology and our approach to tackle the computational challenges of such a study on a limited hardware setup (including the complete code base), we offer valuable guidance for researchers in the field, fostering consistency and transparency in the assessment of counterfactual explanations.

Submitted 11 August, 2023; originally announced August 2023.

Comments: Accepted at the 5th International Workshop on eXplainable Knowledge Discovery in Data Mining @ ECML 2023

arXiv:2307.08486 [pdf, other]

Fairness in KI-Systemen

Authors: Janine Strotherm, Alissa Müller, Barbara Hammer, Benjamin Paaßen

Abstract: The more AI-assisted decisions affect people's lives, the more important the fairness of such decisions becomes. In this chapter, we provide an introduction to research on fairness in machine learning. We explain the main fairness definitions and strategies for achieving fairness using concrete examples and place fairness research in the European context. Our contribution is aimed at an interdisciplinary audience and therefore avoids mathematical formulation but emphasizes visualizations and examples.

Submitted 17 July, 2023; originally announced July 2023.

Comments: in German language

arXiv:2211.05227 [pdf, other]

doi 10.1109/TLT.2022.3144442

Automatic Creativity Measurement in Scratch Programs Across Modalities

Authors: Anastasia Kovalkov, Benjamin Paaßen, Avi Segal, Niels Pinkwart, Kobi Gal

Abstract: Promoting creativity is considered an important goal of education, but creativity is notoriously hard to measure. In this paper, we make the journey from defining a formal measure of creativity that is efficiently computable to applying the measure in a practical domain. The measure is general and relies on core theoretical concepts in creativity theory, namely fluency, flexibility, and originality, integrating with prior cognitive science literature. We adapted the general measure for projects in the popular visual programming language Scratch. We designed a machine learning model for predicting the creativity of Scratch projects, trained and evaluated on human expert creativity assessments in an extensive user study. Our results show that opinions about creativity in Scratch varied widely across experts. The automatic creativity assessment aligned with the assessment of the human experts more than the experts agreed with each other. This is a first step in providing computational models for measuring creativity that can be applied to educational technologies, and to scale up the benefit of creativity education in schools.

Submitted 7 November, 2022; originally announced November 2022.

Journal ref: IEEE Transactions on Learning Technologies 14(6) (2021) 740-753

arXiv:2210.04359 [pdf, other]

Fine-Grained Detection of Solidarity for Women and Migrants in 155 Years of German Parliamentary Debates

Authors: Aida Kostikova, Benjamin Paassen, Dominik Beese, Ole Pütz, Gregor Wiedemann, Steffen Eger

Abstract: Solidarity is a crucial concept to understand social relations in societies. In this paper, we explore fine-grained solidarity frames to study solidarity towards women and migrants in German parliamentary debates between 1867 and 2022. Using 2,864 manually annotated text snippets (with a cost exceeding 18k Euro), we evaluate large language models (LLMs) like Llama 3, GPT-3.5, and GPT-4. We find that GPT-4 outperforms other LLMs, approaching human annotation quality. Using GPT-4, we automatically annotate more than 18k further instances (with a cost of around 500 Euro) across 155 years and find that solidarity with migrants outweighs anti-solidarity but that frequencies and solidarity types shift over time. Most importantly, group-based notions of (anti-)solidarity fade in favor of compassionate solidarity, focusing on the vulnerability of migrant groups, and exchange-based anti-solidarity, focusing on the lack of (economic) contribution. Our study highlights the interplay of historical events, socio-economic needs, and political ideologies in shaping migration discourse and social cohesion. We also show that powerful LLMs, if carefully prompted, can be cost-effective alternatives to human annotation for hard social scientific tasks.

Submitted 24 June, 2024; v1 submitted 9 October, 2022; originally announced October 2022.

Comments: Note title and author changes

arXiv:2108.00953 [pdf, ps, other]

doi 10.1007/978-3-030-89657-7_27

An A*-algorithm for the Unordered Tree Edit Distance with Custom Costs

Authors: Benjamin Paaßen

Abstract: The unordered tree edit distance is a natural metric to compute distances between trees without intrinsic child order, such as representations of chemical molecules. While the unordered tree edit distance is MAX SNP-hard in principle, it is feasible for small cases, e.g. via an A* algorithm. Unfortunately, current heuristics for the A* algorithm assume unit costs for deletions, insertions, and replacements, which limits our ability to inject domain knowledge. In this paper, we present three novel heuristics for the A* algorithm that work with custom cost functions. In experiments on two chemical data sets, we show that custom costs make the A* computation faster and improve the error of a 5-nearest neighbor regressor, predicting chemical properties. We also show that, on these data, polynomial edit distances can achieve similar results as the unordered tree edit distance.

Submitted 26 July, 2021; originally announced August 2021.

Comments: Accepted at the 14th International Conference on Similarity Search and Applications (SISAP 2021)

arXiv:2105.01616 [pdf, ps, other]

doi 10.1016/j.neucom.2021.05.106

Reservoir Stack Machines

Authors: Benjamin Paaßen, Alexander Schulz, Barbara Hammer

Abstract: Memory-augmented neural networks equip a recurrent neural network with an explicit memory to support tasks that require information storage without interference over long times. A key motivation for such research is to perform classic computation tasks, such as parsing. However, memory-augmented neural networks are notoriously hard to train, requiring many backpropagation epochs and a lot of data. In this paper, we introduce the reservoir stack machine, a model which can provably recognize all deterministic context-free languages and circumvents the training problem by training only the output layer of a recurrent net and employing auxiliary information during training about the desired interaction with a stack. In our experiments, we validate the reservoir stack machine against deep and shallow networks from the literature on three benchmark tasks for Neural Turing machines and six deterministic context-free languages. Our results show that the reservoir stack machine achieves zero error, even on test sequences longer than the training data, requiring only a few seconds of training time and 100 training sequences.

Submitted 26 July, 2021; v1 submitted 4 May, 2021; originally announced May 2021.

Comments: in print at the Journal Neurocomputing

arXiv:2103.11614 [pdf, ps, other]

doi 10.5281/zenodo.5634224

ast2vec: Utilizing Recursive Neural Encodings of Python Programs

Authors: Benjamin Paaßen, Jessica McBroom, Bryn Jeffries, Irena Koprinska, Kalina Yacef

Abstract: Educational datamining involves the application of datamining techniques to student activity. However, in the context of computer programming, many datamining techniques can not be applied because they expect vector-shaped input whereas computer programs have the form of syntax trees. In this paper, we present ast2vec, a neural network that maps Python syntax trees to vectors and back, thereby facilitating datamining on computer programs as well as the interpretation of datamining results. Ast2vec has been trained on almost half a million programs of novice programmers and is designed to be applied across learning tasks without re-training, meaning that users can apply it without any need for (additional) deep learning. We demonstrate the generality of ast2vec in three settings: First, we provide example analyses using ast2vec on a classroom-sized dataset, involving visualization, student motion analysis, clustering, and outlier detection, including two novel analyses, namely a progress-variance-projection and a dynamical systems analysis. Second, we consider the ability of ast2vec to recover the original syntax tree from its vector representation on the training data and two further large-scale programming datasets. Finally, we evaluate the predictive capability of a simple linear regression on top of ast2vec, obtaining similar results to techniques that work directly on syntax trees. We hope ast2vec can augment the educational datamining toolbelt by making analyses of computer programs easier, richer, and more efficient.

Submitted 22 March, 2021; originally announced March 2021.

Comments: Under consideration at the Journal of Educational Datamining

Journal ref: Journal of Educational Data Mining, 13(3) (2021) 1-35

arXiv:2012.02097 [pdf, other]

Recursive Tree Grammar Autoencoders

Authors: Benjamin Paassen, Irena Koprinska, Kalina Yacef

Abstract: Machine learning on trees has been mostly focused on trees as input to algorithms. Much less research has investigated trees as output, which has many applications, such as molecule optimization for drug discovery, or hint generation for intelligent tutoring systems. In this work, we propose a novel autoencoder approach, called recursive tree grammar autoencoder (RTG-AE), which encodes trees via a bottom-up parser and decodes trees via a tree grammar, both learned via recursive neural networks that minimize the variational autoencoder loss. The resulting encoder and decoder can then be utilized in subsequent tasks, such as optimization and time series prediction. RTG-AEs are the first model to combine variational autoencoders, grammatical knowledge, and recursive processing. Our key message is that this unique combination of all three elements outperforms models which combine any two of the three. In particular, we perform an ablation study to show that our proposed method improves the autoencoding error, training time, and optimization score on synthetic as well as real datasets compared to four baselines. We further prove that RTG-AEs parse and generate trees in linear time and are expressive enough to handle all regular tree grammars.

Submitted 10 February, 2022; v1 submitted 3 December, 2020; originally announced December 2020.

Comments: Submitted to the ECML/PKDD Journal Track

arXiv:2009.06342 [pdf, ps, other]

doi 10.1109/TNNLS.2021.3094139

Reservoir Memory Machines as Neural Computers

Authors: Benjamin Paaßen, Alexander Schulz, Terrence C. Stewart, Barbara Hammer

Abstract: Differentiable neural computers extend artificial neural networks with an explicit memory without interference, thus enabling the model to perform classic computation tasks such as graph traversal. However, such models are difficult to train, requiring long training times and large datasets. In this work, we achieve some of the computational capabilities of differentiable neural computers with a model that can be trained very efficiently, namely an echo state network with an explicit memory without interference. This extension enables echo state networks to recognize all regular languages, including those that contractive echo state networks provably can not recognize. Further, we demonstrate experimentally that our model performs comparably to its fully-trained deep version on several typical benchmark tasks for differentiable neural computers.

Submitted 19 July, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

Comments: In print at the special issue 'New Frontiers in Extremely Efficient Reservoir Computing' of IEEE TNNLS

Journal ref: IEEE Transactions on Neural Networks and Learning Systems 33 (2022) 2575-2585

arXiv:2004.08925 [pdf, ps, other]

Tree Echo State Autoencoders with Grammars

Authors: Benjamin Paassen, Irena Koprinska, Kalina Yacef

Abstract: Tree data occurs in many forms, such as computer programs, chemical molecules, or natural language. Unfortunately, the non-vectorial and discrete nature of trees makes it challenging to construct functions with tree-formed output, complicating tasks such as optimization or time series prediction. Autoencoders address this challenge by mapping trees to a vectorial latent space, where tasks are easier to solve, and then mapping the solution back to a tree structure. However, existing autoencoding approaches for tree data fail to take the specific grammatical structure of tree domains into account and rely on deep learning, thus requiring large training datasets and long training times. In this paper, we propose tree echo state autoencoders (TES-AE), which are guided by a tree grammar and can be trained within seconds by virtue of reservoir computing. In our evaluation on three datasets, we demonstrate that our proposed approach is not only much faster than a state-of-the-art deep learning autoencoding approach (D-VAE) but also has less autoencoding error if little data and time is given.

Submitted 19 April, 2020; originally announced April 2020.

Comments: accepted at the 2020 International Joint Conference on Neural Networks (IJCNN 2020)

arXiv:2003.04793 [pdf, ps, other]

Reservoir memory machines

Authors: Benjamin Paassen, Alexander Schulz

Abstract: In recent years, Neural Turing Machines have gathered attention by joining the flexibility of neural networks with the computational capabilities of Turing machines. However, Neural Turing Machines are notoriously hard to train, which limits their applicability. We propose reservoir memory machines, which are still able to solve some of the benchmark tests for Neural Turing Machines, but are much faster to train, requiring only an alignment algorithm and linear regression. Our model can also be seen as an extension of echo state networks with an external memory, enabling arbitrarily long storage without interference.

Submitted 11 February, 2020; originally announced March 2020.

arXiv:1908.09364 [pdf, ps, other]

doi 10.1007/978-3-030-33607-3_39

Adversarial Edit Attacks for Tree Data

Authors: Benjamin Paaßen

Abstract: Many machine learning models can be attacked with adversarial examples, i.e. inputs close to correctly classified examples that are classified incorrectly. However, most research on adversarial attacks to date is limited to vectorial data, in particular image data. In this contribution, we extend the field by introducing adversarial edit attacks for tree-structured data with potential applications in medicine and automated program analysis. Our approach solely relies on the tree edit distance and a logarithmic number of black-box queries to the attacked classifier without any need for gradient information. We evaluate our approach on two programming and two biomedical data sets and show that many established tree classifiers, like tree-kernel-SVMs and recursive neural networks, can be attacked effectively.

Submitted 27 August, 2019; v1 submitted 25 August, 2019; originally announced August 2019.

Comments: accepted at the 20th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL)

Journal ref: Proc. IDEAL 20 (2019) 359-366

arXiv:1905.06147 [pdf, ps, other]

Embeddings and Representation Learning for Structured Data

Authors: Benjamin Paaßen, Claudio Gallicchio, Alessio Micheli, Alessandro Sperduti

Abstract: Performing machine learning on structured data is complicated by the fact that such data does not have vectorial form. Therefore, multiple approaches have emerged to construct vectorial representations of structured data, from kernel and distance approaches to recurrent, recursive, and convolutional neural networks. Recent years have seen heightened attention in this demanding field of research and several new approaches have emerged, such as metric learning on structured data, graph convolutional neural networks, and recurrent decoder networks for structured data. In this contribution, we provide a high-level overview of the state-of-the-art in representation learning and embeddings for structured data across a wide range of machine learning fields.

Submitted 15 May, 2019; originally announced May 2019.

Comments: Oral presentation at the 27th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2019) in Bruges, Belgium, on April 24th, 2019

Journal ref: Proc. ESANN (2019), 85-94

arXiv:1902.00375 [pdf, ps, other]

Dynamic fairness - Breaking vicious cycles in automatic decision making

Authors: Benjamin Paaßen, Astrid Bunge, Carolin Hainke, Leon Sindelar, Matthias Vogelsang

Abstract: In recent years, machine learning techniques have been increasingly applied in sensitive decision making processes, raising fairness concerns. Past research has shown that machine learning may reproduce and even exacerbate human bias due to biased training data or flawed model assumptions, and thus may lead to discriminatory actions. To counteract such biased models, researchers have proposed multiple mathematical definitions of fairness according to which classifiers can be optimized. However, it has also been shown that the outcomes generated by some fairness notions may be unsatisfactory. In this contribution, we add to this research by considering decision making processes in time. We establish a theoretic model in which even perfectly accurate classifiers which adhere to almost all common fairness definitions lead to stable long-term inequalities due to vicious cycles. Only demographic parity, which enforces equal rates of positive decisions across groups, avoids these effects and establishes a virtuous cycle, which leads to perfectly accurate and fair classification in the long term.

Submitted 10 February, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

Comments: preprint of a paper accepted for oral presentation at the 27th European Symposium on Artificial Neural Networks (ESANN 2019)

Journal ref: Proc. ESANN (2019), 477-482

arXiv:1806.05009 [pdf, ps, other]

Tree Edit Distance Learning via Adaptive Symbol Embeddings

Authors: Benjamin Paaßen, Claudio Gallicchio, Alessio Micheli, Barbara Hammer

Abstract: Metric learning has the aim to improve classification accuracy by learning a distance measure which brings data points from the same class closer together and pushes data points from different classes further apart. Recent research has demonstrated that metric learning approaches can also be applied to trees, such as molecular structures, abstract syntax trees of computer programs, or syntax trees of natural language, by learning the cost function of an edit distance, i.e. the costs of replacing, deleting, or inserting nodes in a tree. However, learning such costs directly may yield an edit distance which violates metric axioms, is challenging to interpret, and may not generalize well. In this contribution, we propose a novel metric learning approach for trees which we call embedding edit distance learning (BEDL) and which learns an edit distance indirectly by embedding the tree nodes as vectors, such that the Euclidean distance between those vectors supports class discrimination. We learn such embeddings by reducing the distance to prototypical trees from the same class and increasing the distance to prototypical trees from different classes. In our experiments, we show that BEDL improves upon the state-of-the-art in metric learning for trees on six benchmark data sets, ranging from computer science over biomedical data to a natural-language processing data set containing over 300,000 nodes.

Submitted 16 July, 2018; v1 submitted 13 June, 2018; originally announced June 2018.

Comments: Paper at the International Conference of Machine Learning (2018), 2018-07-10 to 2018-07-15 in Stockholm, Sweden

Journal ref: Proceedings of Machine Learning Research 80 (2018) 3973-3982

arXiv:1805.07123 [pdf, ps, other]

Tree Edit Distance Learning via Adaptive Symbol Embeddings: Supplementary Materials and Results

Authors: Benjamin Paaßen

Abstract: Metric learning has the aim to improve classification accuracy by learning a distance measure which brings data points from the same class closer together and pushes data points from different classes further apart. Recent research has demonstrated that metric learning approaches can also be applied to trees, such as molecular structures, abstract syntax trees of computer programs, or syntax trees of natural language, by learning the cost function of an edit distance, i.e. the costs of replacing, deleting, or inserting nodes in a tree. However, learning such costs directly may yield an edit distance which violates metric axioms, is challenging to interpret, and may not generalize well. In this contribution, we propose a novel metric learning approach for trees which learns an edit distance indirectly by embedding the tree nodes as vectors, such that the Euclidean distance between those vectors supports class discrimination. We learn such embeddings by reducing the distance to prototypical trees from the same class and increasing the distance to prototypical trees from different classes. In our experiments, we show that our proposed metric learning approach improves upon the state-of-the-art in metric learning for trees on six benchmark data sets, ranging from computer science over biomedical data to a natural-language processing data set containing over 300,000 nodes.

Submitted 18 May, 2018; originally announced May 2018.

Comments: Supplementary Materials and additional Results for the ICML 2018 paper Tree Edit Distance Learning via Adaptive Symbol Embeddings
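
As a rough illustration of the idea in the abstract above, edit costs can be derived from Euclidean distances between symbol embeddings, which makes the costs symmetric and subject to the triangle inequality by construction. The sketch below is not the paper's implementation: the embedding values are made up, the names EMBEDDINGS, GAP, embed and edit_cost are hypothetical, and treating deletion and insertion as the distance to a zero vector is an assumption made here for illustration.

```python
import math

# Hypothetical, hand-picked embeddings; in a metric learning setting these
# vectors would be learned from data rather than fixed by hand.
EMBEDDINGS = {
    "a": (1.0, 0.0),
    "b": (0.0, 1.0),
    "c": (1.0, 1.0),
}

# Assumption for this sketch: the "gap" used for deletions and insertions
# is embedded as the zero vector.
GAP = (0.0, 0.0)


def embed(symbol):
    """Return the embedding of a symbol; None stands for the gap."""
    return GAP if symbol is None else EMBEDDINGS[symbol]


def edit_cost(x, y):
    """Cost of replacing x with y (or deleting/inserting if one is None),
    defined as the Euclidean distance between the two embeddings."""
    return math.dist(embed(x), embed(y))


print(edit_cost("a", "b"))   # replacement cost between a and b
print(edit_cost("a", None))  # deletion cost of a
```

Because every cost is a Euclidean distance, no separate check is needed to ensure that the resulting cost function is symmetric and satisfies the triangle inequality.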

arXiv:1805.06869 [pdf, ps, other]

Revisiting the tree edit distance and its backtracing: A tutorial

Authors: Benjamin Paaßen

Abstract: Almost 30 years ago, Zhang and Shasha (1989) published a seminal paper describing an efficient dynamic programming algorithm computing the tree edit distance, that is, the minimum number of node deletions, insertions, and replacements that are necessary to transform one tree into another. Since then, the tree edit distance has been widely applied, for example in biology and intelligent tutoring systems. However, the original paper of Zhang and Shasha can be challenging to read for newcomers and it does not describe how to efficiently infer the optimal edit script. In this contribution, we provide a comprehensive tutorial to the tree edit distance algorithm of Zhang and Shasha. We further prove metric properties of the tree edit distance, and describe efficient algorithms to infer the cheapest edit script, as well as a summary of all cheapest edit scripts between two trees.

Submitted 14 September, 2022; v1 submitted 17 May, 2018; originally announced May 2018.

Comments: Supplementary material for the ICML 2018 paper: Tree Edit Distance Learning via Adaptive Symbol Embeddings
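
To make the definition in the abstract above concrete, the following sketch computes the unit-cost edit distance between two ordered trees with a memoized recursion over forests. It is deliberately not the Zhang and Shasha algorithm itself, which organizes the same recurrence into a more efficient dynamic program, and it does not recover the edit script. The tree encoding as nested tuples (label, children) and the function names are assumptions chosen for this example.

```python
from functools import lru_cache


def size(forest):
    """Number of nodes in a forest given as a tuple of (label, children) trees."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f and not g:
        return 0
    if not g:
        return size(f)  # delete every remaining node of f
    if not f:
        return size(g)  # insert every remaining node of g
    (v_label, v_children), (w_label, w_children) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + v_children, g) + 1
    # Insert the rightmost root of g; its children take its place.
    insert = forest_distance(f, g[:-1] + w_children) + 1
    # Map the two rightmost roots onto each other and solve the
    # children and the remaining forests independently.
    replace = (
        forest_distance(v_children, w_children)
        + forest_distance(f[:-1], g[:-1])
        + (0 if v_label == w_label else 1)
    )
    return min(delete, insert, replace)


def tree_edit_distance(x, y):
    """Unit-cost edit distance between two trees."""
    return forest_distance((x,), (y,))


# Usage: a(b, c) versus a(b) differ by one deletion.
x = ("a", (("b", ()), ("c", ())))
y = ("a", (("b", ()),))
print(tree_edit_distance(x, y))  # 1
```

Nested tuples are used so that forests are hashable and can serve as cache keys; a production implementation would index nodes instead of hashing whole subtrees.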

arXiv:1711.09256 [pdf, other]

doi 10.1016/j.neucom.2017.11.072

Expectation maximization transfer learning and its application for bionic hand prostheses

Authors: Benjamin Paaßen, Alexander Schulz, Janne Hahne, Barbara Hammer

Abstract: Machine learning models in practical settings are typically confronted with changes to the distribution of the incoming data. Such changes can severely affect the model performance, leading for example to misclassifications of data. This is particularly apparent in the domain of bionic hand prostheses, where machine learning models promise faster and more intuitive user interfaces, but are hindered by their lack of robustness to everyday disturbances, such as electrode shifts. One way to address changes in the data distribution is transfer learning, that is, to transfer the disturbed data to a space where the original model is applicable again. In this contribution, we propose a novel expectation maximization algorithm to learn linear transformations that maximize the likelihood of disturbed data after the transformation. We also show that this approach generalizes to discriminative models, in particular learning vector quantization models. In our evaluation on data from the bionic prostheses domain we demonstrate that our approach can learn a transformation which improves classification accuracy significantly and outperforms all tested baselines, if few data or few classes are available in the target domain.

Submitted 25 November, 2017; originally announced November 2017.

Comments: accepted for publication in a special issue of the Journal 'Neurocomputing' for extended contributions of the 25th European Symposium on Artificial Neural Networks (ESANN 2017)

Journal ref: Neurocomputing 298 (2018) 122-133

arXiv:1708.06564 [pdf, other]

The Continuous Hint Factory - Providing Hints in Vast and Sparsely Populated Edit Distance Spaces

Authors: Benjamin Paaßen, Barbara Hammer, Thomas William Price, Tiffany Barnes, Sebastian Gross, Niels Pinkwart

Abstract: Intelligent tutoring systems can support students in solving multi-step tasks by providing hints regarding what to do next. However, engineering such next-step hints manually or via an expert model becomes infeasible if the space of possible states is too large. Therefore, several approaches have emerged to infer next-step hints automatically, relying on past students' data. In particular, the Hint Factory (Barnes & Stamper, 2008) recommends edits that are most likely to guide students from their current state towards a correct solution, based on what successful students in the past have done in the same situation. Still, the Hint Factory relies on student data being available for any state a student might visit while solving the task, which is not the case for some learning tasks, such as open-ended programming tasks. In this contribution we provide a mathematical framework for edit-based hint policies and, based on this theory, propose a novel hint policy to provide edit hints in vast and sparsely populated state spaces. In particular, we extend the Hint Factory by considering data of past students in all states which are similar to the student's current state and creating hints approximating the weighted average of all these reference states. Because the space of possible weighted averages is continuous, we call this approach the Continuous Hint Factory.
In our experimental evaluation, we demonstrate that the Continuous Hint Factory can predict more accurately what capable students would do compared to existing prediction schemes on two learning tasks, especially in an open-ended programming task, and that the Continuous Hint Factory is comparable to existing hint policies at reproducing tutor hints on a simple UML diagram task.

Submitted 30 June, 2018; v1 submitted 22 August, 2017; originally announced August 2017.

Journal ref: Journal of Educational Data Mining, 10 (2018) 1-35. Retrieved from https://jedm.educationaldatamining.org/index.php/JEDM/article/view/158

arXiv:1704.06498 [pdf, other]

doi 10.1007/s11063-017-9684-5

Time Series Prediction for Graphs in Kernel and Dissimilarity Spaces

Authors: Benjamin Paaßen, Christina Göpfert, Barbara Hammer

Abstract: Graph models are relevant in many fields, such as distributed computing, intelligent tutoring systems or social network analysis. In many cases, such models need to take changes in the graph structure into account, i.e. a varying number of nodes or edges. Predicting such changes within graphs can be expected to yield important insight with respect to the underlying dynamics, e.g. with respect to user behaviour. However, predictive techniques in the past have almost exclusively focused on single edges or nodes. In this contribution, we attempt to predict the future state of a graph as a whole. We propose to phrase time series prediction as a regression problem and apply dissimilarity- or kernel-based regression techniques, such as 1-nearest neighbor, kernel regression and Gaussian process regression, which can be applied to graphs via graph kernels. The output of the regression is a point embedded in a pseudo-Euclidean space, which can be analyzed using subsequent dissimilarity- or kernel-based processing methods. We discuss strategies to speed up Gaussian Processes regression from cubic to linear time and evaluate our approach on two well-established theoretical models of graph evolution as well as two real data sets from the domain of intelligent tutoring systems. We find that simple regression methods, such as kernel regression, are sufficient to capture the dynamics in the theoretical models, but that Gaussian process regression significantly improves the prediction error for real-world data.

Submitted 11 August, 2017; v1 submitted 21 April, 2017; originally announced April 2017.

Comments: preprint of a submission to 'Neural Processing Letters' (Special issue 'Off the mainstream')

Journal ref: Neural Processing Letters 48 (2018) 669-689
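
The abstract above phrases time series prediction as regression from each state to its successor. A minimal sketch of that framing with kernel regression (a Nadaraya-Watson weighted average) follows. It uses plain numeric vectors and an RBF kernel as stand-ins for graphs and graph kernels, so the prediction is an explicit vector rather than a point in a pseudo-Euclidean space; the function names and the bandwidth parameter are assumptions made for this example, not the paper's code.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Radial basis function kernel on numeric vectors (stand-in for a graph kernel)."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared / (2.0 * bandwidth ** 2))


def regression_weights(query, inputs, kernel=rbf_kernel):
    """Normalized kernel similarities between the query and each training input."""
    sims = [kernel(query, x) for x in inputs]
    total = sum(sims)
    if total == 0.0:
        # All similarities underflowed; fall back to uniform weights.
        return [1.0 / len(inputs)] * len(inputs)
    return [s / total for s in sims]


def predict_next(query, series, kernel=rbf_kernel):
    """Predict the successor of `query` as a kernel-weighted average of the
    successors observed in `series` (state at t maps to state at t + 1)."""
    inputs, targets = series[:-1], series[1:]
    weights = regression_weights(query, inputs, kernel)
    dim = len(targets[0])
    return [sum(w * y[d] for w, y in zip(weights, targets)) for d in range(dim)]


# Usage: a series that grows by one per step; the query lies in the middle,
# so the weighted average of the successors is close to 2.0.
series = [(0.0,), (1.0,), (2.0,), (3.0,)]
print(predict_next((1.0,), series))
```

Only the weights depend on the kernel, which is why the same scheme carries over to structured data: for graphs, the weights alone describe the predicted point implicitly.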

Showing 1–22 of 22 results for author: Paaßen, B