-
Multimodal Deep Learning
Authors:
Cem Akkus,
Luyang Chu,
Vladana Djakovic,
Steffen Jauch-Walser,
Philipp Koch,
Giacomo Loss,
Christopher Marquardt,
Marco Moldovan,
Nadja Sauter,
Maximilian Schneider,
Rickmer Schulte,
Karol Urbanczyk,
Jann Goschenhofer,
Christian Heumann,
Rasmus Hvingelby,
Daniel Schalk,
Matthias Aßenmacher
Abstract:
This book is the result of a seminar in which we reviewed multimodal approaches and attempted to create a solid overview of the field, starting with the current state-of-the-art approaches in the two subfields of Deep Learning individually. Further, modeling frameworks are discussed where one modality is transformed into the other, as well as models in which one modality is utilized to enhance representation learning for the other. To conclude the second part, architectures with a focus on handling both modalities simultaneously are introduced. Finally, we also cover other modalities as well as general-purpose multi-modal models, which are able to handle different tasks on different modalities within one unified architecture. One interesting application (Generative Art) eventually caps off this booklet.
Submitted 12 January, 2023;
originally announced January 2023.
-
Privacy-Preserving and Lossless Distributed Estimation of High-Dimensional Generalized Additive Mixed Models
Authors:
Daniel Schalk,
Bernd Bischl,
David Rügamer
Abstract:
Various privacy-preserving frameworks that respect the individual's privacy in the analysis of data have been developed in recent years. However, available model classes such as simple statistics or generalized linear models lack the flexibility required for a good approximation of the underlying data-generating process in practice. In this paper, we propose an algorithm for a distributed, privacy-preserving, and lossless estimation of generalized additive mixed models (GAMM) using component-wise gradient boosting (CWB). Making use of CWB allows us to reframe the GAMM estimation as a distributed fitting of base learners using the $L_2$-loss. In order to account for the heterogeneity of different data location sites, we propose a distributed version of a row-wise tensor product that allows the computation of site-specific (smooth) effects. Our adaptation of CWB preserves all the important properties of the original algorithm, such as unbiased feature selection and the feasibility of fitting models in high-dimensional feature spaces, and yields model estimates equivalent to those of CWB on pooled data. In addition to deriving the equivalence of both algorithms, we showcase the efficacy of our algorithm on a distributed heart disease data set and compare it with state-of-the-art methods.
Submitted 10 March, 2023; v1 submitted 14 October, 2022;
originally announced October 2022.
-
Distributed non-disclosive validation of predictive models by a modified ROC-GLM
Authors:
Daniel Schalk,
Verena S. Hoffmann,
Bernd Bischl,
Ulrich Mansmann
Abstract:
Distributed statistical analyses provide a promising approach for privacy protection when analysing data distributed over several databases. This approach brings the analysis to the data rather than the data to the analysis. The analyst receives anonymous summary statistics, which are combined into an aggregated result. We are interested in calculating the AUC of a prediction score using a distributed approach, without access to the data of the individual subjects distributed over different databases. We use DataSHIELD as the technology to carry out distributed analyses and use newly developed algorithms to perform the validation of the prediction score. Calibration can easily be implemented in the distributed setting, but discrimination, represented by the respective ROC curve and its AUC, is challenging. We base our approach on the ROC-GLM algorithm as well as on ideas of differential privacy. The proposed algorithms are evaluated in a simulation study. A real-world application is described: the audit use case of DIFUTURE (Medical Informatics Initiative), with the goal of validating a treatment prediction rule for patients with newly diagnosed multiple sclerosis.
Submitted 14 March, 2023; v1 submitted 21 March, 2022;
originally announced March 2022.
-
Accelerated Componentwise Gradient Boosting using Efficient Data Representation and Momentum-based Optimization
Authors:
Daniel Schalk,
Bernd Bischl,
David Rügamer
Abstract:
Componentwise boosting (CWB), also known as model-based boosting, is a variant of gradient boosting that builds on additive models as base learners to ensure interpretability. CWB is thus often used in research areas where models are employed as tools to explain relationships in data. One downside of CWB is its computational complexity in terms of memory and runtime. In this paper, we propose two techniques to overcome these issues without losing the properties of CWB: feature discretization of numerical features and incorporating Nesterov momentum into functional gradient descent. As the latter can be prone to early overfitting, we also propose a hybrid approach that prevents a possibly diverging gradient descent routine while ensuring faster convergence. We perform extensive benchmarks on multiple simulated and real-world data sets to demonstrate the improvements in runtime and memory consumption while maintaining state-of-the-art estimation and prediction performance.
Submitted 29 October, 2021; v1 submitted 7 October, 2021;
originally announced October 2021.
-
Automatic Componentwise Boosting: An Interpretable AutoML System
Authors:
Stefan Coors,
Daniel Schalk,
Bernd Bischl,
David Rügamer
Abstract:
In practice, machine learning (ML) workflows require many different steps, from data preprocessing, missing value imputation, and model selection to model tuning and model evaluation. Many of these steps rely on human ML experts. AutoML, the field of automating these ML pipelines, tries to help practitioners apply ML off-the-shelf without any expert knowledge. Most modern AutoML systems like auto-sklearn, H2O-AutoML or TPOT aim for high predictive performance, thereby generating ensembles that consist almost exclusively of black-box models. This, in turn, makes interpretation more difficult for the layperson and adds another layer of opacity for users. We propose an AutoML system that constructs an interpretable additive model that can be fitted using a highly scalable componentwise boosting algorithm. Our system provides tools for easy model interpretation such as visualizing partial effects and pairwise interactions, allows for a straightforward calculation of feature importance, and gives insights into the required model complexity to fit the given task. We introduce the general framework and outline its implementation, autocompboost. To demonstrate the framework's efficacy, we compare autocompboost to other existing systems based on the OpenML AutoML-Benchmark. Despite its restriction to an interpretable model space, our system is competitive in terms of predictive performance on most data sets while being more user-friendly and transparent.
Submitted 16 October, 2021; v1 submitted 12 September, 2021;
originally announced September 2021.
-
Component-Wise Boosting of Targets for Multi-Output Prediction
Authors:
Quay Au,
Daniel Schalk,
Giuseppe Casalicchio,
Ramona Schoedel,
Clemens Stachl,
Bernd Bischl
Abstract:
Multi-output prediction deals with the prediction of several targets of possibly diverse types. One way to address this problem is the so-called problem transformation method. This method is often used in multi-label learning, but can also be used for multi-output prediction due to its generality and simplicity. In this paper, we introduce an algorithm that uses the problem transformation method for multi-output prediction, while simultaneously learning the dependencies between target variables in a sparse and interpretable manner. In a first step, predictions are obtained for each target individually. Target dependencies are then learned via a component-wise boosting approach. We compare our new method with similar approaches in a benchmark using multi-label, multivariate regression and mixed-type datasets.
Submitted 8 April, 2019;
originally announced April 2019.
-
Granular Shear Flow Dynamics and Forces: Experiment and Continuum Theory
Authors:
L. Bocquet,
W. Losert,
D. Schalk,
T. C. Lubensky,
J. P. Gollub
Abstract:
We analyze the main features of granular shear flow through experimental measurements in a Couette geometry and a comparison to a locally Newtonian, continuum model of granular flow. The model is based on earlier hydrodynamic models, adjusted to take into account the experimentally observed coupling between fluctuations in particle motion and mean-flow properties. Experimentally, the local velocity fluctuations are found to vary as a power of the local velocity gradient. This can be explained by an effective viscosity that diverges more rapidly as the random-close-packing density is approached than is predicted by Enskog theory for dense hard sphere systems. Experiment and theory are in good agreement, especially for the following key features of granular flow: The flow is confined to a small shear band, fluctuations decay approximately exponentially away from the sheared wall, and the shear stress is approximately independent of shear rate. The functional forms of the velocity and fluctuation profiles predicted by the model agree with the experimental results.
Submitted 19 December, 2000;
originally announced December 2000.