Showing 1–50 of 121 results for author: Bischl, B

Searching in archive cs.

  1. arXiv:2406.18334  [pdf, other]

    cs.LG stat.ML

    Efficient and Accurate Explanation Estimation with Distribution Compression

    Authors: Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl, Przemyslaw Biecek

    Abstract: Exact computation of various machine learning explanations requires numerous model evaluations and in extreme cases becomes impractical. The computational cost of approximation increases with an ever-increasing size of data and model parameters. Many heuristics have been proposed to approximate post-hoc explanations efficiently. This paper shows that the standard i.i.d. sampling used in a broad sp…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: To be presented at the ICML 2024 Workshop on DMLR

  2. arXiv:2406.09069  [pdf, other]

    cs.LG stat.ML

    On the Robustness of Global Feature Effect Explanations

    Authors: Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl, Przemyslaw Biecek

    Abstract: We study the robustness of global post-hoc explanations for predictive models trained on tabular data. Effects of predictor features in black-box supervised learning are an essential diagnostic tool for model debugging and scientific discovery in applied sciences. However, how vulnerable they are to data and model perturbations remains an open research question. We introduce several theoretical bo…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted at ECML PKDD 2024

  3. arXiv:2406.04098  [pdf, other]

    stat.ML cs.LG

    A Large-Scale Neutral Comparison Study of Survival Models on Low-Dimensional Data

    Authors: Lukas Burk, John Zobolas, Bernd Bischl, Andreas Bender, Marvin N. Wright, Raphael Sonabend

    Abstract: This work presents the first large-scale neutral benchmark experiment focused on single-event, right-censored, low-dimensional survival data. Benchmark experiments are essential in methodological research to scientifically compare new and existing model classes through proper empirical evaluation. Existing benchmarks in the survival literature are often narrow in scope, focusing, for example, on h…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 42 pages, 28 figures

  4. arXiv:2406.03348  [pdf, other]

    cs.LG

    Position: A Call to Action for a Human-Centered AutoML Paradigm

    Authors: Marius Lindauer, Florian Karl, Anne Klier, Julia Moosbauer, Alexander Tornede, Andreas Mueller, Frank Hutter, Matthias Feurer, Bernd Bischl

    Abstract: Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML) workflows, aiding the research of new ML algorithms, and contributing to the democratization of ML by making it accessible to a broader audience. Over the past decade, commendable achievements in AutoML have primarily focused on optimizing predictive p…

    Submitted 5 June, 2024; originally announced June 2024.

  5. arXiv:2405.18218  [pdf, other]

    cs.LG

    FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models

    Authors: Yang Zhang, Yawei Li, Xinpeng Wang, Qianli Shen, Barbara Plank, Bernd Bischl, Mina Rezaei, Kenji Kawaguchi

    Abstract: Overparametrized transformer networks are the state-of-the-art architecture for Large Language Models (LLMs). However, such models contain billions of parameters making large compute a necessity, while raising environmental concerns. To address these issues, we propose FinerCut, a new form of fine-grained layer pruning, which in contrast to prior work at the transformer block level, considers all…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 22 pages

  6. arXiv:2405.15393  [pdf, other]

    stat.ML cs.LG

    Reshuffling Resampling Splits Can Improve Generalization of Hyperparameter Optimization

    Authors: Thomas Nagler, Lennart Schneider, Bernd Bischl, Matthias Feurer

    Abstract: Hyperparameter optimization is crucial for obtaining peak performance of machine learning models. The standard protocol evaluates various hyperparameter configurations using a resampling estimate of the generalization error to guide optimization and select a final hyperparameter configuration. Without much evidence, paired resampling splits, i.e., either a fixed train-validation split or a fixed c…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 39 pages, 4 tables, 29 figures

  7. arXiv:2405.02200  [pdf, other]

    cs.LG stat.ML

    Position: Why We Must Rethink Empirical Research in Machine Learning

    Authors: Moritz Herrmann, F. Julian D. Lange, Katharina Eggensperger, Giuseppe Casalicchio, Marcel Wever, Matthias Feurer, David Rügamer, Eyke Hüllermeier, Anne-Laure Boulesteix, Bernd Bischl

    Abstract: We warn against a common but incomplete understanding of empirical research in machine learning that leads to non-replicable results, makes findings unreliable, and threatens to undermine progress in the field. To overcome this alarming situation, we call for more awareness of the plurality of ways of gaining knowledge experimentally but also of some epistemic limitations. In particular, we argue…

    Submitted 25 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: 20 pages, accepted for publication at ICML 2024, camera-ready version

  8. arXiv:2404.16899  [pdf, other]

    cs.LG

    mlr3summary: Concise and interpretable summaries for machine learning models

    Authors: Susanne Dandl, Marc Becker, Bernd Bischl, Giuseppe Casalicchio, Ludwig Bothmann

    Abstract: This work introduces a novel R package for concise, informative summaries of machine learning models. We take inspiration from the summary function for (generalized) linear models in R, but extend it in several directions: First, our summary function is model-agnostic and provides a unified summary output also for non-parametric machine learning models; Second, the summary output is more ext…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 9 pages

  9. arXiv:2404.12862  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    A Guide to Feature Importance Methods for Scientific Inference

    Authors: Fiona Katharina Ewald, Ludwig Bothmann, Marvin N. Wright, Bernd Bischl, Giuseppe Casalicchio, Gunnar König

    Abstract: While machine learning (ML) models are increasingly used due to their high predictive power, their use in understanding the data-generating process (DGP) is limited. Understanding the DGP requires insights into feature-target associations, which many ML models cannot directly provide, due to their opaque internal mechanisms. Feature importance (FI) methods provide useful insights into the DGP unde…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted at the 2nd World Conference on eXplainable Artificial Intelligence, xAI-2024

  10. arXiv:2404.03506  [pdf, other]

    stat.ML cs.LG

    CountARFactuals -- Generating plausible model-agnostic counterfactual explanations with adversarial random forests

    Authors: Susanne Dandl, Kristin Blesch, Timo Freiesleben, Gunnar König, Jan Kapar, Bernd Bischl, Marvin Wright

    Abstract: Counterfactual explanations elucidate algorithmic decisions by pointing to scenarios that would have led to an alternative, desired outcome. Giving insight into the model's behavior, they hint users towards possible actions and give grounds for contesting decisions. As a crucial factor in achieving these goals, counterfactuals must be plausible, i.e., describing realistic alternative scenarios wit…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: SD, KB, TB, and GK contributed equally as first authors

  11. arXiv:2404.02629  [pdf, other]

    cs.LG

    Effector: A Python package for regional explanations

    Authors: Vasilis Gkolemis, Christos Diou, Eirini Ntoutsi, Theodore Dalamagas, Bernd Bischl, Julia Herbinger, Giuseppe Casalicchio

    Abstract: Global feature effect methods explain a model outputting one plot per feature. The plot shows the average effect of the feature on the output, like the effect of age on the annual income. However, average effects may be misleading when derived from local effects that are heterogeneous, i.e., they significantly deviate from the average. To decrease the heterogeneity, regional effects provide multip…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 33 pages, 17 figures

  12. arXiv:2403.13150  [pdf, other]

    cs.LG cs.AI stat.CO stat.ML

    Training Survival Models using Scoring Rules

    Authors: Philipp Kopper, David Rügamer, Raphael Sonabend, Bernd Bischl, Andreas Bender

    Abstract: Survival Analysis provides critical insights for partially incomplete time-to-event data in various domains. It is also an important example of probabilistic machine learning. The probabilistic nature of the predictions can be exploited by using (proper) scoring rules in the model fitting process instead of likelihood-based optimization. Our proposal does so in a generic manner and can be used for…

    Submitted 19 March, 2024; originally announced March 2024.

  13. arXiv:2403.04629  [pdf, other]

    cs.LG cs.AI cs.HC cs.RO stat.ML

    Explaining Bayesian Optimization by Shapley Values Facilitates Human-AI Collaboration

    Authors: Julian Rodemann, Federico Croppi, Philipp Arens, Yusuf Sale, Julia Herbinger, Bernd Bischl, Eyke Hüllermeier, Thomas Augustin, Conor J. Walsh, Giuseppe Casalicchio

    Abstract: Bayesian optimization (BO) with Gaussian processes (GP) has become an indispensable algorithm for black box optimization problems. Not without a dash of irony, BO is often considered a black box itself, lacking ways to provide reasons as to why certain parameters are proposed to be evaluated. This is particularly relevant in human-in-the-loop applications of BO, such as in robotics. We address thi…

    Submitted 8 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Preprint. Copyright by the authors. 19 pages, 24 figures

    ACM Class: I.2.6; I.2.9; F.2.2; J.6

  14. arXiv:2402.01484  [pdf, other]

    cs.LG stat.CO stat.ML

    Connecting the Dots: Is Mode-Connectedness the Key to Feasible Sample-Based Inference in Bayesian Neural Networks?

    Authors: Emanuel Sommer, Lisa Wimmer, Theodore Papamarkou, Ludwig Bothmann, Bernd Bischl, David Rügamer

    Abstract: A major challenge in sample-based inference (SBI) for Bayesian neural networks is the size and structure of the networks' parameter space. Our work shows that successful SBI is possible by embracing the characteristic relationship between weight and function space, uncovering a systematic link between overparameterization and the difficulty of the sampling problem. Through extensive experiments, w…

    Submitted 27 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  15. arXiv:2312.13234  [pdf, other]

    cs.LG

    Position Paper: Bridging the Gap Between Machine Learning and Sensitivity Analysis

    Authors: Christian A. Scholbeck, Julia Moosbauer, Giuseppe Casalicchio, Hoshin Gupta, Bernd Bischl, Christian Heumann

    Abstract: We argue that interpretations of machine learning (ML) models or the model-building process can be seen as a form of sensitivity analysis (SA), a general methodology used to explain complex systems in many fields such as environmental modeling, engineering, or economics. We address both researchers and practitioners, calling attention to the benefits of a unified SA-based view of explanations in…

    Submitted 20 December, 2023; originally announced December 2023.

  16. arXiv:2311.15395  [pdf, other]

    cs.LG cs.CV stat.ML

    ConstraintMatch for Semi-constrained Clustering

    Authors: Jann Goschenhofer, Bernd Bischl, Zsolt Kira

    Abstract: Constrained clustering allows the training of classification models using pairwise constraints only, which are weak and relatively easy to mine, while still yielding full-supervision-level model performance. While they perform well even in the absence of the true underlying class labels, constrained clustering models still require large amounts of binary constraint annotations for training. In thi…

    Submitted 26 November, 2023; originally announced November 2023.

    Journal ref: 2023 International Joint Conference on Neural Networks (IJCNN)

  17. arXiv:2311.01349  [pdf, other]

    cs.LG cs.CY stat.ML

    Post-hoc Orthogonalization for Mitigation of Protected Feature Bias in CXR Embeddings

    Authors: Tobias Weber, Michael Ingrisch, Bernd Bischl, David Rügamer

    Abstract: Purpose: To analyze and remove protected feature effects in chest radiograph embeddings of deep learning models. Methods: An orthogonalization is utilized to remove the influence of protected features (e.g., age, sex, race) in CXR embeddings, ensuring feature-independent results. To validate the efficacy of the approach, we retrospectively study the MIMIC and CheXpert datasets using three pre-trai…

    Submitted 11 June, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

  18. arXiv:2310.15108  [pdf, other]

    stat.ML cs.LG stat.AP stat.CO stat.ME

    Evaluating machine learning models in non-standard settings: An overview and new findings

    Authors: Roman Hornung, Malte Nalenz, Lennart Schneider, Andreas Bender, Ludwig Bothmann, Bernd Bischl, Thomas Augustin, Anne-Laure Boulesteix

    Abstract: Estimating the generalization error (GE) of machine learning models is fundamental, with resampling methods being the most common approach. However, in non-standard settings, particularly those where observations are not independently and identically distributed, resampling using simple random data divisions may lead to biased GE estimates. This paper strives to present well-grounded guidelines fo…

    Submitted 23 October, 2023; originally announced October 2023.

  19. arXiv:2310.06514  [pdf, other]

    cs.LG

    AttributionLab: Faithfulness of Feature Attribution Under Controllable Environments

    Authors: Yang Zhang, Yawei Li, Hannah Brown, Mina Rezaei, Bernd Bischl, Philip Torr, Ashkan Khakzar, Kenji Kawaguchi

    Abstract: Feature attribution explains neural network outputs by identifying relevant input features. The attribution has to be faithful, meaning that the attributed features must mirror the input features that influence the output. One recent trend to test faithfulness is to fit a model on designed data with known relevant features and then compare attributions with ground truth input features. This idea as…

    Submitted 14 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Appear at NeurIPS 2023 Workshop XAIA

  20. arXiv:2310.02008  [pdf, other]

    cs.LG econ.EM stat.ML

    fmeffects: An R Package for Forward Marginal Effects

    Authors: Holger Löwe, Christian A. Scholbeck, Christian Heumann, Bernd Bischl, Giuseppe Casalicchio

    Abstract: Forward marginal effects (FMEs) have recently been introduced as a versatile and effective model-agnostic interpretation method. They provide comprehensible and actionable model explanations in the form of: If we change $x$ by an amount $h$, what is the change in predicted outcome $\widehat{y}$? We present the R package fmeffects, the first software implementation of FMEs. The relevant theoretical…

    Submitted 3 October, 2023; originally announced October 2023.

  21. arXiv:2309.02048  [pdf, other]

    cs.LG stat.ML

    Probabilistic Self-supervised Learning via Scoring Rules Minimization

    Authors: Amirhossein Vahidi, Simon Schoßer, Lisa Wimmer, Yawei Li, Bernd Bischl, Eyke Hüllermeier, Mina Rezaei

    Abstract: In this paper, we propose a novel probabilistic self-supervised learning via Scoring Rule Minimization (ProSMIN), which leverages the power of probabilistic models to enhance representation quality and mitigate collapsing representations. Our proposed approach involves two neural networks; the online network and the target network, which collaborate and learn the diverse distribution of representa…

    Submitted 5 September, 2023; originally announced September 2023.

  22. arXiv:2308.14705  [pdf, other]

    stat.ML cs.LG

    Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning

    Authors: Amirhossein Vahidi, Lisa Wimmer, Hüseyin Anil Gündüz, Bernd Bischl, Eyke Hüllermeier, Mina Rezaei

    Abstract: Ensembling a neural network is a widely recognized approach to enhance model performance, estimate uncertainty, and improve robustness in deep supervised learning. However, deep ensembles often come with high computational costs and memory demands. In addition, the efficiency of a deep ensemble is related to diversity among the ensemble members which is challenging for large, over-parameterized de…

    Submitted 1 September, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

  23. arXiv:2308.08949  [pdf, other]

    cs.LG cs.AI

    A Dual-Perspective Approach to Evaluating Feature Attribution Methods

    Authors: Yawei Li, Yang Zhang, Kenji Kawaguchi, Ashkan Khakzar, Bernd Bischl, Mina Rezaei

    Abstract: Feature attribution methods attempt to explain neural network predictions by identifying relevant features. However, establishing a cohesive framework for assessing feature attribution remains a challenge. There are several views through which we can evaluate attributions. One principal lens is to observe the effect of perturbing attributed features on the model's behavior (i.e., faithfulness). Wh…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: 16 pages, 14 figures

  24. arXiv:2307.08364  [pdf, other]

    cs.LG cs.NE

    Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML

    Authors: Lennart Purucker, Lennart Schneider, Marie Anastacio, Joeran Beel, Bernd Bischl, Holger Hoos

    Abstract: Automated machine learning (AutoML) systems commonly ensemble models post hoc to improve predictive performance, typically via greedy ensemble selection (GES). However, we believe that GES may not always be optimal, as it performs a simple deterministic greedy search. In this work, we introduce two novel population-based ensemble selection methods, QO-ES and QDO-ES, and compare them to GES. While…

    Submitted 2 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: 10 pages main paper, 24 pages references and appendix, 4 figures, 16 subfigures, 13 tables, to be published in: International Conference on Automated Machine Learning 2023; affiliations corrected. arXiv admin note: text overlap with arXiv:2307.00286

    ACM Class: I.2.6; I.5.1

  25. arXiv:2307.08175  [pdf, other]

    cs.LG cs.NE stat.ML

    Multi-Objective Optimization of Performance and Interpretability of Tabular Supervised Machine Learning Models

    Authors: Lennart Schneider, Bernd Bischl, Janek Thomas

    Abstract: We present a model-agnostic framework for jointly optimizing the predictive performance and interpretability of supervised machine learning models for tabular data. Interpretability is quantified via three measures: feature sparsity, interaction sparsity of features, and sparsity of non-monotone feature effects. By treating hyperparameter optimization of a machine learning algorithm as a multi-obj…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: Extended version of the paper accepted at GECCO 2023. 16 pages, 7 tables, 7 figures

  26. arXiv:2307.07331  [pdf, other]

    cs.CL cs.CY cs.LG stat.ML

    How Different Is Stereotypical Bias Across Languages?

    Authors: Ibrahim Tolga Öztürk, Rostislav Nedelchev, Christian Heumann, Esteban Garces Arias, Marius Roger, Bernd Bischl, Matthias Aßenmacher

    Abstract: Recent studies have demonstrated how to assess the stereotypical bias in pre-trained English language models. In this work, we extend this branch of research in multiple different dimensions by systematically investigating (a) mono- and multilingual models of (b) different underlying architectures with respect to their bias in (c) multiple different languages. To that end, we make use of the Engli…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Accepted @ "3rd Workshop on Bias and Fairness in AI" (co-located with ECML PKDD 2023). This is the author's version of the work. The definite version of record will be published in the proceedings

  27. arXiv:2307.03571  [pdf, other]

    cs.LG math.OC stat.ML

    Smoothing the Edges: Smooth Optimization for Sparse Regularization using Hadamard Overparametrization

    Authors: Chris Kolb, Christian L. Müller, Bernd Bischl, David Rügamer

    Abstract: We present a framework for smooth optimization of explicitly regularized objectives for (structured) sparsity. These non-smooth and possibly non-convex problems typically rely on solvers tailored to specific models and regularizers. In contrast, our method enables fully differentiable and approximation-free optimization and is thus compatible with the ubiquitous gradient descent paradigm in deep l…

    Submitted 26 April, 2024; v1 submitted 7 July, 2023; originally announced July 2023.

  28. arXiv:2306.10087  [pdf, other]

    cs.LG cs.AI

    ActiveGLAE: A Benchmark for Deep Active Learning with Transformers

    Authors: Lukas Rauch, Matthias Aßenmacher, Denis Huseljic, Moritz Wirth, Bernd Bischl, Bernhard Sick

    Abstract: Deep active learning (DAL) seeks to reduce annotation costs by enabling the model to actively query instance annotations from which it expects to learn the most. Despite extensive research, there is currently no standardized evaluation protocol for transformer-based language models in the field of DAL. Diverse experimental settings lead to difficulties in comparing research and deriving recommenda…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted @ ECML PKDD 2023. This is the author's version of the work. The definitive Version of Record will be published in the Proceedings of ECML PKDD 2023

  29. arXiv:2306.00541  [pdf, other]

    stat.ML cs.LG

    Decomposing Global Feature Effects Based on Feature Interactions

    Authors: Julia Herbinger, Marvin N. Wright, Thomas Nagler, Bernd Bischl, Giuseppe Casalicchio

    Abstract: Global feature effect methods, such as partial dependence plots, provide an intelligible visualization of the expected marginal feature effect. However, such global feature effect methods can be misleading, as they do not represent local feature effects of single observations well when feature interactions are present. We formally introduce generalized additive decomposition of global effects (GAD…

    Submitted 1 July, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

  30. arXiv:2305.16376  [pdf, other]

    eess.IV cs.CV cs.LG

    Constrained Probabilistic Mask Learning for Task-specific Undersampled MRI Reconstruction

    Authors: Tobias Weber, Michael Ingrisch, Bernd Bischl, David Rügamer

    Abstract: Undersampling is a common method in Magnetic Resonance Imaging (MRI) to subsample the number of data points in k-space, reducing acquisition times at the cost of decreased image quality. A popular approach is to employ undersampling patterns following various strategies, e.g., variable density sampling or radial trajectories. In this work, we propose a method that directly learns the undersampling…

    Submitted 22 August, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: accepted at WACV 2024

  31. arXiv:2305.16031  [pdf, other]

    cs.CL

    Efficient Document Embeddings via Self-Contrastive Bregman Divergence Learning

    Authors: Daniel Saggau, Mina Rezaei, Bernd Bischl, Ilias Chalkidis

    Abstract: Learning quality document embeddings is a fundamental problem in natural language processing (NLP), information retrieval (IR), recommendation systems, and search engines. Despite recent advances in the development of transformer-based models that produce sentence embeddings with self-contrastive learning, the encoding of long documents (Ks of words) is still challenging with respect to both effic…

    Submitted 26 March, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 5 pages, short paper at Findings of ACL 2023

  32. Deep Learning for Survival Analysis: A Review

    Authors: Simon Wiegrebe, Philipp Kopper, Raphael Sonabend, Bernd Bischl, Andreas Bender

    Abstract: The influx of deep learning (DL) techniques into the field of survival analysis in recent years has led to substantial methodological progress; for instance, learning from unstructured or high-dimensional data such as images, text or omics data. In this work, we conduct a comprehensive systematic review of DL-based methods for time-to-event analysis, characterizing them according to both survival-…

    Submitted 22 February, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: 29 pages, 7 figures, 2 tables, 1 interactive table

    Journal ref: Artif Intell Rev 57, 65 (2024)

  33. Interpretable Regional Descriptors: Hyperbox-Based Local Explanations

    Authors: Susanne Dandl, Giuseppe Casalicchio, Bernd Bischl, Ludwig Bothmann

    Abstract: This work introduces interpretable regional descriptors, or IRDs, for local, model-agnostic interpretations. IRDs are hyperboxes that describe how an observation's feature values can be changed without affecting its prediction. They justify a prediction by providing a set of "even if" arguments (semi-factual explanations), and they indicate which features affect a prediction and whether pointwise…

    Submitted 4 May, 2023; originally announced May 2023.

    Journal ref: Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol. 14171, p. 479-495

  34. arXiv:2304.07250  [pdf, other]

    cs.CV cs.AI

    Fusing Structure from Motion and Simulation-Augmented Pose Regression from Optical Flow for Challenging Indoor Environments

    Authors: Felix Ott, Lucas Heublein, David Rügamer, Bernd Bischl, Christopher Mutschler

    Abstract: The localization of objects is a crucial task in various applications such as robotics, virtual and augmented reality, and the transportation of goods in warehouses. Recent advances in deep learning have enabled the localization using monocular visual cameras. While structure from motion (SfM) predicts the absolute pose from a point cloud, absolute pose regression (APR) methods learn a semantic un…

    Submitted 9 June, 2024; v1 submitted 14 April, 2023; originally announced April 2023.

    MSC Class: 68U01 ACM Class: I.2.9; I.2.10; I.4.1; I.4.10; I.5.4

  35. arXiv:2304.06569  [pdf, other]

    stat.ML cs.LG stat.CO

    counterfactuals: An R Package for Counterfactual Explanation Methods

    Authors: Susanne Dandl, Andreas Hofheinz, Martin Binder, Bernd Bischl, Giuseppe Casalicchio

    Abstract: Counterfactual explanation methods provide information on how feature values of individual observations must be changed to obtain a desired prediction. Despite the increasing amount of proposed methods in research, only a few implementations exist whose interfaces and requirements vary widely. In this work, we introduce the counterfactuals R package, which provides a modular and unified R6-based i…

    Submitted 15 September, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: 49 pages LaTeX, updated benchmark results

  36. arXiv:2304.02902  [pdf, other]

    stat.ML cs.LG

    Towards Efficient MCMC Sampling in Bayesian Neural Networks by Exploiting Symmetry

    Authors: Jonas Gregor Wiese, Lisa Wimmer, Theodore Papamarkou, Bernd Bischl, Stephan Günnemann, David Rügamer

    Abstract: Bayesian inference in deep neural networks is challenging due to the high-dimensional, strongly multi-modal parameter posterior density landscape. Markov chain Monte Carlo approaches asymptotically recover the true posterior but are considered prohibitively expensive for large modern architectures. Local methods, which have emerged as a popular alternative, focus on specific parameter regions that…

    Submitted 6 April, 2023; originally announced April 2023.

  37. arXiv:2303.11224  [pdf, other]

    eess.IV cs.CV cs.LG

    Cascaded Latent Diffusion Models for High-Resolution Chest X-ray Synthesis

    Authors: Tobias Weber, Michael Ingrisch, Bernd Bischl, David Rügamer

    Abstract: While recent advances in large-scale foundational models show promising results, their application to the medical domain has not yet been explored in detail. In this paper, we progress into the realms of large-scale modeling in medical synthesis by proposing Cheff - a foundational cascaded latent diffusion model, which generates highly-realistic chest radiographs providing state-of-the-art quality…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: accepted at PAKDD 2023

  38. Can Fairness be Automated? Guidelines and Opportunities for Fairness-aware AutoML

    Authors: Hilde Weerts, Florian Pfisterer, Matthias Feurer, Katharina Eggensperger, Edward Bergman, Noor Awad, Joaquin Vanschoren, Mykola Pechenizkiy, Bernd Bischl, Frank Hutter

    Abstract: The field of automated machine learning (AutoML) introduces techniques that automate parts of the development of machine learning (ML) systems, accelerating the process and reducing barriers for novices. However, decisions derived from ML models can reproduce, amplify, or even introduce unfairness in our societies, causing harm to (groups of) individuals. In response, researchers have started to p…

    Submitted 20 February, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Journal ref: Journal of Artificial Intelligence Research 79 (2024) 639-677

  39. arXiv:2301.06293  [pdf, other]

    cs.CV

    Representation Learning for Tablet and Paper Domain Adaptation in Favor of Online Handwriting Recognition

    Authors: Felix Ott, David Rügamer, Lucas Heublein, Bernd Bischl, Christopher Mutschler

    Abstract: The performance of a machine learning model degrades when it is applied to data from a similar but different domain than the data it has initially been trained on. The goal of domain adaptation (DA) is to mitigate this domain shift problem by searching for an optimal feature transformation to learn a domain-invariant representation. Such a domain shift can appear in handwriting recognition (HWR) a…

    Submitted 16 January, 2023; originally announced January 2023.

    Comments: Accepted at IAPR Intl. Workshop on Multimodal Pattern Recognition of Social Signals in Human Computer Interaction (MPRSS), Montreal, Canada, August 2022

    MSC Class: 49Q22; 62M10 ACM Class: I.2.4

  40. Mind the Gap: Measuring Generalization Performance Across Multiple Objectives

    Authors: Matthias Feurer, Katharina Eggensperger, Edward Bergman, Florian Pfisterer, Bernd Bischl, Frank Hutter

    Abstract: Modern machine learning models are often constructed taking into account multiple objectives, e.g., minimizing inference time while also maximizing accuracy. Multi-objective hyperparameter optimization (MHPO) algorithms return such candidate models, and the approximation of the Pareto front is used to assess their performance. In practice, we also want to measure generalization when moving from th…

    Submitted 9 February, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

  41. arXiv:2210.07723  [pdf, other]

    stat.ML cs.CR cs.LG

    Privacy-Preserving and Lossless Distributed Estimation of High-Dimensional Generalized Additive Mixed Models

    Authors: Daniel Schalk, Bernd Bischl, David Rügamer

    Abstract: Various privacy-preserving frameworks that respect the individual's privacy in the analysis of data have been developed in recent years. However, available model classes such as simple statistics or generalized linear models lack the flexibility required for a good approximation of the underlying data-generating process in practice. In this paper, we propose an algorithm for a distributed, privacy…

    Submitted 10 March, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

  42. arXiv:2209.07527  [pdf, other]

    q-bio.QM cs.LG

    Improved proteasomal cleavage prediction with positive-unlabeled learning

    Authors: Emilio Dorigatti, Bernd Bischl, Benjamin Schubert

    Abstract: Accurate in silico modeling of the antigen processing pathway is crucial to enable personalized epitope vaccine design for cancer. An important step of such pathway is the degradation of the vaccine into smaller peptides by the proteasome, some of which are going to be presented to T cells by the MHC complex. While predicting MHC-peptide presentation has received a lot of attention recently, prote…

    Submitted 28 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 8 pages

  43. arXiv:2209.06941  [pdf, other]

    cs.CV cs.LG

    Joint Debiased Representation and Image Clustering Learning with Self-Supervision

    Authors: Shunjie-Fabian Zheng, JaeEun Nam, Emilio Dorigatti, Bernd Bischl, Shekoofeh Azizi, Mina Rezaei

    Abstract: Contrastive learning is among the most successful methods for visual representation learning, and its performance can be further improved by jointly performing clustering on the learned representations. However, existing methods for joint clustering and contrastive learning do not perform well on long-tailed data distributions, as majority classes overwhelm and distort the loss of minority classes…

    Submitted 14 September, 2022; originally announced September 2022.

  44. arXiv:2209.03302  [pdf, other]

    cs.LG

    Quantifying Aleatoric and Epistemic Uncertainty in Machine Learning: Are Conditional Entropy and Mutual Information Appropriate Measures?

    Authors: Lisa Wimmer, Yusuf Sale, Paul Hofman, Bernd Bischl, Eyke Hüllermeier

    Abstract: The quantification of aleatoric and epistemic uncertainty in terms of conditional entropy and mutual information, respectively, has recently become quite common in machine learning. While the properties of these measures, which are rooted in information theory, seem appealing at first glance, we identify various incoherencies that call their appropriateness into question. In addition to the measur…

    Submitted 25 June, 2023; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: To appear in: Proc. UAI, 39th Conference on Uncertainty in Artificial Intelligence, Pittsburgh, PA, USA, 2023

  45. arXiv:2209.02459  [pdf, other]

    cs.LG

    Robust and Efficient Imbalanced Positive-Unlabeled Learning with Self-supervision

    Authors: Emilio Dorigatti, Jonas Schweisthal, Bernd Bischl, Mina Rezaei

    Abstract: Learning from positive and unlabeled (PU) data is a setting where the learner only has access to positive and unlabeled samples while having no information on negative examples. Such PU setting is of great importance in various tasks such as medical diagnosis, social network analysis, financial markets analysis, and knowledge base completion, which also tend to be intrinsically imbalanced, i.e., w…

    Submitted 6 September, 2022; originally announced September 2022.

  46. arXiv:2208.00919  [pdf, other]

    cs.CV

    Benchmarking Visual-Inertial Deep Multimodal Fusion for Relative Pose Regression and Odometry-aided Absolute Pose Regression

    Authors: Felix Ott, Nisha Lakshmana Raichur, David Rügamer, Tobias Feigl, Heiko Neumann, Bernd Bischl, Christopher Mutschler

    Abstract: Visual-inertial localization is a key problem in computer vision and robotics applications such as virtual reality, self-driving cars, and aerial vehicles. The goal is to estimate an accurate pose of an object when either the environment or the dynamics are known. Absolute pose regression (APR) techniques directly regress the absolute pose from an image input in a known scene using convolutional a…

    Submitted 4 August, 2023; v1 submitted 1 August, 2022; originally announced August 2022.

    Comments: Under review

    MSC Class: 68T40; 65D19 ACM Class: I.4; I.5.1

  47. arXiv:2208.00220  [pdf, other]

    cs.LG

    HPO X ELA: Investigating Hyperparameter Optimization Landscapes by Means of Exploratory Landscape Analysis

    Authors: Lennart Schneider, Lennart Schäpermeier, Raphael Patrick Prager, Bernd Bischl, Heike Trautmann, Pascal Kerschke

    Abstract: Hyperparameter optimization (HPO) is a key component of machine learning models for achieving peak predictive performance. While numerous methods and algorithms for HPO have been proposed over the last years, little progress has been made in illuminating and examining the actual structure of these black-box optimization problems. Exploratory landscape analysis (ELA) subsumes a set of techniques th…

    Submitted 30 July, 2022; originally announced August 2022.

    Comments: Accepted at PPSN 2022. 15 pages, 2 tables, 7 figures

  48. arXiv:2208.00204  [pdf, other]

    cs.LG cs.NE stat.ML

    Tackling Neural Architecture Search With Quality Diversity Optimization

    Authors: Lennart Schneider, Florian Pfisterer, Paul Kent, Juergen Branke, Bernd Bischl, Janek Thomas

    Abstract: Neural architecture search (NAS) has been studied extensively and has grown to become a research field with substantial impact. While classical single-objective NAS searches for the architecture with the best performance, multi-objective NAS considers multiple objectives that should be optimized simultaneously, e.g., minimizing resource usage along the validation error. Although considerable progr…

    Submitted 30 July, 2022; originally announced August 2022.

    Comments: Accepted at the First Conference on Automated Machine Learning (Main Track). 30 pages, 8 tables, 13 figures

  49. arXiv:2207.12560  [pdf, other]

    cs.LG stat.ML

    AMLB: an AutoML Benchmark

    Authors: Pieter Gijsbers, Marcos L. P. Bueno, Stefan Coors, Erin LeDell, Sébastien Poirier, Janek Thomas, Bernd Bischl, Joaquin Vanschoren

    Abstract: Comparing different AutoML frameworks is notoriously challenging and often done incorrectly. We introduce an open and extensible benchmark that follows best practices and avoids common mistakes when comparing AutoML frameworks. We conduct a thorough comparison of 9 well-known AutoML frameworks across 71 classification and 33 regression tasks. The differences between the AutoML frameworks are explo…

    Submitted 16 November, 2023; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: UNDER REVIEW: Revised submission to JMLR, with updated results from June 2023

  50. arXiv:2206.08640  [pdf, other]

    cs.CV cs.AI

    Uncertainty-aware Evaluation of Time-Series Classification for Online Handwriting Recognition with Domain Shift

    Authors: Andreas Klaß, Sven M. Lorenz, Martin W. Lauer-Schmaltz, David Rügamer, Bernd Bischl, Christopher Mutschler, Felix Ott

    Abstract: For many applications, analyzing the uncertainty of a machine learning model is indispensable. While research of uncertainty quantification (UQ) techniques is very advanced for computer vision applications, UQ methods for spatio-temporal data are less studied. In this paper, we focus on models for online handwriting recognition, one particular type of spatio-temporal data. The data is observed fro…

    Submitted 17 June, 2022; originally announced June 2022.

    MSC Class: 62F15 ACM Class: H.1.1