-
How Inverse Conditional Flows Can Serve as a Substitute for Distributional Regression
Authors:
Lucas Kook,
Chris Kolb,
Philipp Schiele,
Daniel Dold,
Marcel Arpogaus,
Cornelius Fritz,
Philipp F. Baumann,
Philipp Kopper,
Tobias Pielok,
Emilio Dorigatti,
David Rügamer
Abstract:
Neural network representations of simple models, such as linear regression, are increasingly being studied to better understand the underlying principles of deep learning algorithms. However, neural representations of distributional regression models, such as the Cox model, have received little attention so far. We close this gap by proposing a framework for distributional regression using inverse flow transformations (DRIFT), which includes neural representations of the aforementioned models. We empirically demonstrate that the neural representations of models in DRIFT can serve as a substitute for their classical statistical counterparts in several applications involving continuous, ordered, time-series, and survival outcomes. We confirm that models in DRIFT empirically match the performance of several statistical methods in terms of estimation of partial effects, prediction, and aleatoric uncertainty quantification. DRIFT covers both interpretable statistical models and flexible neural networks, opening up new avenues in both statistical modeling and deep learning.
Submitted 13 May, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
Generalizing Orthogonalization for Models with Non-Linearities
Authors:
David Rügamer,
Chris Kolb,
Tobias Weber,
Lucas Kook,
Thomas Nagler
Abstract:
The complexity of black-box algorithms can lead to various challenges, including the introduction of biases. These biases present immediate risks in the algorithms' application. It was, for instance, shown that neural networks can deduce racial information solely from a patient's X-ray scan, a task beyond the capability of medical experts. If this fact is not known to the medical expert, automatic decision-making based on this algorithm could lead to prescribing a treatment (purely) based on racial information. While current methodologies allow for the "orthogonalization" or "normalization" of neural networks with respect to such information, existing approaches are grounded in linear models. Our paper advances the discourse by introducing corrections for non-linearities such as ReLU activations. Our approach also encompasses scalar and tensor-valued predictions, facilitating its integration into neural network architectures. Through extensive experiments, we validate our method's effectiveness in safeguarding sensitive data in generalized linear models, normalizing convolutional neural networks for metadata, and rectifying pre-existing embeddings for undesired attributes.
Submitted 2 June, 2024; v1 submitted 3 May, 2024;
originally announced May 2024.
-
Deep conditional transformation models for survival analysis
Authors:
Gabriele Campanella,
Lucas Kook,
Ida Häggström,
Torsten Hothorn,
Thomas J. Fuchs
Abstract:
An ever-increasing number of clinical trials feature a time-to-event outcome and record non-tabular patient data, such as magnetic resonance imaging or text data in the form of electronic health records. Recently, several neural-network-based solutions have been proposed, some of which are binary classifiers. Parametric, distribution-free approaches that make full use of survival time and censoring status have not received much attention. We present deep conditional transformation models (DCTMs) for survival outcomes as a unifying approach to parametric and semiparametric survival analysis. DCTMs allow the specification of non-linear and non-proportional hazards for both tabular and non-tabular data and extend to all types of censoring and truncation. On real and semi-synthetic data, we show that DCTMs compete with state-of-the-art DL approaches to survival analysis.
Submitted 21 October, 2022; v1 submitted 20 October, 2022;
originally announced October 2022.
-
Deep interpretable ensembles
Authors:
Lucas Kook,
Andrea Götschi,
Philipp FM Baumann,
Torsten Hothorn,
Beate Sick
Abstract:
Ensembles improve prediction performance and allow uncertainty quantification by aggregating predictions from multiple models. In deep ensembling, the individual models are usually black-box neural networks or, more recently, partially interpretable semi-structured deep transformation models. However, interpretability of the ensemble members is generally lost upon aggregation. This is a crucial drawback of deep ensembles in high-stakes decision fields, in which interpretable models are desired. We propose a novel transformation ensemble which aggregates probabilistic predictions with the guarantee of preserving interpretability and yielding uniformly better predictions than the ensemble members on average. Transformation ensembles are tailored towards interpretable deep transformation models but are applicable to a wider range of probabilistic neural networks. In experiments on several publicly available data sets, we demonstrate that transformation ensembles perform on par with classical deep ensembles in terms of prediction performance, discrimination, and calibration. In addition, we demonstrate how transformation ensembles quantify both aleatoric and epistemic uncertainty, and produce minimax optimal predictions under certain conditions.
Submitted 25 May, 2022;
originally announced May 2022.
-
deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression
Authors:
David Rügamer,
Chris Kolb,
Cornelius Fritz,
Florian Pfisterer,
Philipp Kopper,
Bernd Bischl,
Ruolin Shen,
Christina Bukas,
Lisa Barros de Andrade e Sousa,
Dominik Thalmeier,
Philipp Baumann,
Lucas Kook,
Nadja Klein,
Christian L. Müller
Abstract:
In this paper, we describe the implementation of semi-structured deep distributional regression, a flexible framework to learn conditional distributions based on the combination of additive regression models and deep networks. Our implementation encompasses (1) a modular neural network building system based on the deep learning library TensorFlow for the fusion of various statistical and deep learning approaches, (2) an orthogonalization cell to allow for an interpretable combination of different subnetworks, as well as (3) pre-processing steps necessary to set up such models. The software package allows users to define models in a user-friendly manner via a formula interface that is inspired by classical statistical model frameworks such as mgcv. The package's modular design and functionality provide a unique resource for both scalable estimation of complex statistical models and the combination of approaches from deep learning and statistics. This allows for state-of-the-art predictive performance while simultaneously retaining the indispensable interpretability of classical statistical models.
Submitted 10 March, 2022; v1 submitted 6 April, 2021;
originally announced April 2021.
-
Deep and interpretable regression models for ordinal outcomes
Authors:
Lucas Kook,
Lisa Herzog,
Torsten Hothorn,
Oliver Dürr,
Beate Sick
Abstract:
Outcomes with a natural order commonly occur in prediction tasks, and the available input data are often a mixture of complex data, such as images, and tabular predictors. Deep learning (DL) models are state-of-the-art for image classification tasks but frequently treat ordinal outcomes as unordered and lack interpretability. In contrast, classical ordinal regression models consider the outcome's order and yield interpretable predictor effects but are limited to tabular data. We present ordinal neural network transformation models (ONTRAMs), which unite DL with classical ordinal regression approaches. ONTRAMs are a special case of transformation models and trade off flexibility and interpretability by additively decomposing the transformation function into terms for image and tabular data using jointly trained neural networks. The performance of the most flexible ONTRAM is by definition equivalent to a standard multi-class DL model trained with cross-entropy while being faster in training when facing ordinal outcomes. Lastly, we discuss how to interpret model components for both tabular and image data on two publicly available datasets.
Submitted 20 April, 2021; v1 submitted 16 October, 2020;
originally announced October 2020.