Search | arXiv e-print repository

Conformal Prediction for Causal Effects of Continuous Treatments

Authors: Maresa Schröder, Dennis Frauen, Jonas Schweisthal, Konstantin Heß, Valentyn Melnychuk, Stefan Feuerriegel

Abstract: Uncertainty quantification of causal effects is crucial for safety-critical applications such as personalized medicine. A powerful approach for this is conformal prediction, which has several practical benefits due to model-agnostic finite-sample guarantees. Yet, existing methods for conformal prediction of causal effects are limited to binary/discrete treatments and make highly restrictive assumptions such as known propensity scores. In this work, we provide a novel conformal prediction method for potential outcomes of continuous treatments. We account for the additional uncertainty introduced through propensity estimation so that our conformal prediction intervals are valid even if the propensity score is unknown. Our contributions are three-fold: (1) We derive finite-sample prediction intervals for potential outcomes of continuous treatments. (2) We provide an algorithm for calculating the derived intervals. (3) We demonstrate the effectiveness of the conformal prediction intervals in experiments on synthetic and real-world datasets. To the best of our knowledge, we are the first to propose conformal prediction for continuous treatments when the propensity score is unknown and must be estimated from data.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2405.21012

G-Transformer for Conditional Average Potential Outcome Estimation over Time

Authors: Konstantin Hess, Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

Abstract: Estimating potential outcomes for treatments over time based on observational data is important for personalized decision-making in medicine. Yet, existing neural methods for this task suffer from either (a) bias or (b) large variance. In order to address both limitations, we introduce the G-transformer (GT). Our GT is a novel, neural end-to-end model designed for unbiased, low-variance estimation of conditional average potential outcomes (CAPOs) over time. Specifically, our GT is the first neural model to perform regression-based iterative G-computation for CAPOs in the time-varying setting. We evaluate the effectiveness of our GT across various experiments. In sum, this work represents a significant step towards personalized decision-making from electronic health records.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2311.16026

A Neural Framework for Generalized Causal Sensitivity Analysis

Authors: Dennis Frauen, Fergus Imrie, Alicia Curth, Valentyn Melnychuk, Stefan Feuerriegel, Mihaela van der Schaar

Abstract: Unobserved confounding is common in many applications, making causal inference from observational data challenging. As a remedy, causal sensitivity analysis is an important tool to draw causal conclusions under unobserved confounding with mathematical guarantees. In this paper, we propose NeuralCSA, a neural framework for generalized causal sensitivity analysis. Unlike previous work, our framework is compatible with (i) a large class of sensitivity models, including the marginal sensitivity model, f-sensitivity models, and Rosenbaum's sensitivity model; (ii) different treatment types (i.e., binary and continuous); and (iii) different causal queries, including (conditional) average treatment effects and simultaneous effects on multiple outcomes. The generality of NeuralCSA is achieved by learning a latent distribution shift that corresponds to a treatment intervention using two conditional normalizing flows. We provide theoretical guarantees that NeuralCSA is able to infer valid bounds on the causal query of interest and also demonstrate this empirically using both simulated and real-world data.

Submitted 9 April, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: Accepted at ICLR 2024

arXiv:2311.11321

Bounds on Representation-Induced Confounding Bias for Treatment Effect Estimation

Authors: Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel

Abstract: State-of-the-art methods for conditional average treatment effect (CATE) estimation make widespread use of representation learning. Here, the idea is to reduce the variance of the low-sample CATE estimation by a (potentially constrained) low-dimensional representation. However, low-dimensional representations can lose information about the observed confounders and thus lead to bias, because of which the validity of representation learning for CATE estimation is typically violated. In this paper, we propose a new, representation-agnostic refutation framework for estimating bounds on the representation-induced confounding bias that comes from dimensionality reduction (or other constraints on the representations) in CATE estimation. First, we establish theoretically under which conditions CATE is non-identifiable given low-dimensional (constrained) representations. Second, as our remedy, we propose a neural refutation framework which performs partial identification of CATE or, equivalently, aims at estimating lower and upper bounds of the representation-induced confounding bias. We demonstrate the effectiveness of our bounds in a series of experiments. In sum, our refutation framework is of direct relevance in practice where the validity of CATE estimation is of importance.

Submitted 12 April, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

Journal ref: Proceedings of the Twelfth International Conference on Learning Representations (ICLR 2024), Vienna, Austria

arXiv:2311.04529

Propagation of acoustic-gravity waves in non-uniform wind flows of the polar atmosphere

Authors: A. K. Fedorenko, E. I. Kryuchkov, O. K. Cheremnykh, Yu. O. Klymenko, S. V. Melnychuk

Abstract: Satellite observations of acoustic-gravity waves in the polar regions of the atmosphere indicate a close connection of these waves with wind flows. The paper investigates the peculiarities of the propagation of AGWs in spatially inhomogeneous flows, where the wind speed slowly changes in the horizontal direction. A system of hydrodynamic equations is used for the analysis, which takes into account the wind flow with spatial inhomogeneity. Unlike the system of equations written for a stationary medium (or a medium moving at a uniform speed), the resulting system contains components that describe the interaction of the waves with the medium. It is shown that the influence of inhomogeneous background parameters of the medium can be separated from the effects of inertial forces by means of the special variable substitution. The analytical expression is obtained that describes the changes in the wave amplitude in a medium moving with a non-uniform speed. This expression contains two functional dependencies: 1) the linear part caused by changes in the background parameters of the medium, it does not depend on the direction of the wave propagation relative to the flow; 2) the exponential part associated with the inertia forces, which determines the dependence of the amplitudes of AGWs on the direction of their propagation. The exponential part shows an increase in the amplitudes of the waves in the headwind and a decrease in their amplitudes in the downwind.
The obtained theoretical dependence of the amplitudes of AGWs on the wind speed is in good agreement with the data of satellite observations of these waves in the polar thermosphere.

Submitted 8 November, 2023; originally announced November 2023.

Comments: 9 pages, 2 figures

arXiv:2310.17687

Counterfactual Fairness for Predictions using Generative Adversarial Networks

Authors: Yuchen Ma, Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

Abstract: Fairness in predictions is of direct importance in practice due to legal, ethical, and societal reasons. It is often achieved through counterfactual fairness, which ensures that the prediction for an individual is the same as that in a counterfactual world under a different sensitive attribute. However, achieving counterfactual fairness is challenging as counterfactuals are unobservable. In this paper, we develop a novel deep neural network called Generative Counterfactual Fairness Network (GCFN) for making predictions under counterfactual fairness. Specifically, we leverage a tailored generative adversarial network to directly learn the counterfactual distribution of the descendants of the sensitive attribute, which we then use to enforce fair predictions through a novel counterfactual mediator regularization. If the counterfactual distribution is learned sufficiently well, our method is mathematically guaranteed to ensure the notion of counterfactual fairness. Thereby, our GCFN addresses key shortcomings of existing baselines that are based on inferring latent variables, yet which (a) are potentially correlated with the sensitive attributes and thus lead to bias, and (b) have weak capability in constructing latent representations and thus low prediction performance. Across various experiments, our method achieves state-of-the-art performance. Using a real-world case study from recidivism prediction, we further demonstrate that our method makes meaningful predictions in practice.

Submitted 26 October, 2023; originally announced October 2023.

arXiv:2310.17463

Bayesian Neural Controlled Differential Equations for Treatment Effect Estimation

Authors: Konstantin Hess, Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel

Abstract: Treatment effect estimation in continuous time is crucial for personalized medicine. However, existing methods for this task are limited to point estimates of the potential outcomes, whereas uncertainty estimates have been ignored. Needless to say, uncertainty quantification is crucial for reliable decision-making in medical applications. To fill this gap, we propose a novel Bayesian neural controlled differential equation (BNCDE) for treatment effect estimation in continuous time. In our BNCDE, the time dimension is modeled through a coupled system of neural controlled differential equations and neural stochastic differential equations, where the neural stochastic differential equations allow for tractable variational Bayesian inference. Thereby, for an assigned sequence of treatments, our BNCDE provides meaningful posterior predictive distributions of the potential outcomes. To the best of our knowledge, ours is the first tailored neural method to provide uncertainty estimates of treatment effects in continuous time. As such, our method is of direct practical value for promoting reliable decision-making in medicine.

Submitted 3 April, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

arXiv:2306.01424

Partial Counterfactual Identification of Continuous Outcomes with a Curvature Sensitivity Model

Authors: Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel

Abstract: Counterfactual inference aims to answer retrospective "what if" questions and thus belongs to the most fine-grained type of inference in Pearl's causality ladder. Existing methods for counterfactual inference with continuous outcomes aim at point identification and thus make strong and unnatural assumptions about the underlying structural causal model. In this paper, we relax these assumptions and aim at partial counterfactual identification of continuous outcomes, i.e., when the counterfactual query resides in an ignorance interval with informative bounds. We prove that, in general, the ignorance interval of the counterfactual queries has non-informative bounds, already when functions of structural causal models are continuously differentiable. As a remedy, we propose a novel sensitivity model called Curvature Sensitivity Model. This allows us to obtain informative bounds by bounding the curvature of level sets of the functions. We further show that existing point counterfactual identification methods are special cases of our Curvature Sensitivity Model when the bound of the curvature is set to zero. We then propose an implementation of our Curvature Sensitivity Model in the form of a novel deep generative model, which we call Augmented Pseudo-Invertible Decoder. Our implementation employs (i) residual normalizing flows with (ii) variational augmentations. We empirically demonstrate the effectiveness of our Augmented Pseudo-Invertible Decoder.
To the best of our knowledge, ours is the first partial identification model for Markovian structural causal models with continuous outcomes.

Submitted 11 January, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

Journal ref: Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), New Orleans, Louisiana, USA, 2023

arXiv:2305.19742

Reliable Off-Policy Learning for Dosage Combinations

Authors: Jonas Schweisthal, Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

Abstract: Decision-making in personalized medicine such as cancer therapy or critical care must often make choices for dosage combinations, i.e., multiple continuous treatments. Existing work for this task has modeled the effect of multiple treatments independently, while estimating the joint effect has received little attention but comes with non-trivial challenges. In this paper, we propose a novel method for reliable off-policy learning for dosage combinations. Our method proceeds along three steps: (1) We develop a tailored neural network that estimates the individualized dose-response function while accounting for the joint effect of multiple dependent dosages. (2) We estimate the generalized propensity score using conditional normalizing flows in order to detect regions with limited overlap in the shared covariate-treatment space. (3) We present a gradient-based learning algorithm to find the optimal, individualized dosage combinations. Here, we ensure reliable estimation of the policy value by avoiding regions with limited overlap. We finally perform an extensive evaluation of our method to show its effectiveness. To the best of our knowledge, ours is the first work to provide a method for reliable off-policy learning for optimal dosage combinations.

Submitted 27 October, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2305.16988

Sharp Bounds for Generalized Causal Sensitivity Analysis

Authors: Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

Abstract: Causal inference from observational data is crucial for many disciplines such as medicine and economics. However, sharp bounds for causal effects under relaxations of the unconfoundedness assumption (causal sensitivity analysis) are subject to ongoing research. So far, works with sharp bounds are restricted to fairly simple settings (e.g., a single binary treatment). In this paper, we propose a unified framework for causal sensitivity analysis under unobserved confounding in various settings. For this, we propose a flexible generalization of the marginal sensitivity model (MSM) and then derive sharp bounds for a large class of causal effects. This includes (conditional) average treatment effects, effects for mediation analysis and path analysis, and distributional effects. Furthermore, our sensitivity model is applicable to discrete, continuous, and time-varying treatments. It allows us to interpret the partial identification problem under unobserved confounding as a distribution shift in the latent confounders while evaluating the causal effect of interest. In the special case of a single binary treatment, our bounds for (conditional) average treatment effects coincide with recent optimality results for causal sensitivity analysis. Finally, we propose a scalable algorithm to estimate our sharp bounds from observational data.

Submitted 16 October, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2303.08516

Fair Off-Policy Learning from Observational Data

Authors: Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

Abstract: Algorithmic decision-making in practice must be fair for legal, ethical, and societal reasons. To achieve this, prior research has contributed various approaches that ensure fairness in machine learning predictions, while comparatively little effort has focused on fairness in decision-making, specifically off-policy learning. In this paper, we propose a novel framework for fair off-policy learning: we learn decision rules from observational data under different notions of fairness, where we explicitly assume that observational data were collected under a different potentially discriminatory behavioral policy. For this, we first formalize different fairness notions for off-policy learning. We then propose a neural network-based framework to learn optimal policies under different fairness notions. We further provide theoretical guarantees in the form of generalization bounds for the finite-sample version of our framework. We demonstrate the effectiveness of our framework through extensive numerical experiments using both simulated and real-world data. Altogether, our work enables algorithmic decision-making in a wide array of practical applications where fairness must be ensured.

Submitted 9 October, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: Revised version

arXiv:2209.08656

Probabilistic Population Protocol Models

Authors: Vladyslav Melnychuk

Abstract: Population protocols are a relatively novel computational model in which very resource-limited anonymous agents interact in pairs with the goal of computing predicates. We consider the probabilistic version of this model, which naturally allows to consider the setup in which a small probability of an incorrect output is tolerated. The main focus of this thesis is the question of confident leader election, which is an extension of the regular leader election problem with an extra requirement for the eventual leader to detect its uniqueness. Having a confident leader allows the population protocols to determine the convergence of its computations. This behaviour of the model is highly beneficial and was shown to be feasible when the original model is extended in various ways. We show that it takes a linear in terms of the population size number of interactions for a probabilistic population protocol to have a non-zero fraction of agents in all reachable states, starting from a configuration with all agents in the same state. This leads us to a conclusion that confident leader election is out of reach even with the probabilistic version of the model.

Submitted 18 September, 2022; originally announced September 2022.

arXiv:2209.06203

Normalizing Flows for Interventional Density Estimation

Authors: Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel

Abstract: Existing machine learning methods for causal inference usually estimate quantities expressed via the mean of potential outcomes (e.g., average treatment effect). However, such quantities do not capture the full information about the distribution of potential outcomes. In this work, we estimate the density of potential outcomes after interventions from observational data. For this, we propose a novel, fully-parametric deep learning method called Interventional Normalizing Flows. Specifically, we combine two normalizing flows, namely (i) a nuisance flow for estimating nuisance parameters and (ii) a target flow for parametric estimation of the density of potential outcomes. We further develop a tractable optimization objective based on a one-step bias correction for efficient and doubly robust estimation of the target flow parameters. As a result, our Interventional Normalizing Flows offer a properly normalized density estimator. Across various experiments, we demonstrate that our Interventional Normalizing Flows are expressive and highly effective, and scale well with both sample size and high-dimensional confounding. To the best of our knowledge, our Interventional Normalizing Flows are the first proper fully-parametric, deep learning method for density estimation of potential outcomes.

Submitted 20 June, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

Journal ref: Proceedings of the 40-th International Conference on Machine Learning, Honolulu, Hawaii, USA, PMLR 202, 2023

arXiv:2204.07258

Causal Transformer for Estimating Counterfactual Outcomes

Authors: Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel

Abstract: Estimating counterfactual outcomes over time from observational data is relevant for many applications (e.g., personalized medicine). Yet, state-of-the-art methods build upon simple long short-term memory (LSTM) networks, thus rendering inferences for complex, long-range dependencies challenging. In this paper, we develop a novel Causal Transformer for estimating counterfactual outcomes over time. Our model is specifically designed to capture complex, long-range dependencies among time-varying confounders. For this, we combine three transformer subnetworks with separate inputs for time-varying covariates, previous treatments, and previous outcomes into a joint network with in-between cross-attentions. We further develop a custom, end-to-end training procedure for our Causal Transformer. Specifically, we propose a novel counterfactual domain confusion loss to address confounding bias: it aims to learn adversarial balanced representations, so that they are predictive of the next outcome but non-predictive of the current treatment assignment. We evaluate our Causal Transformer based on synthetic and real-world datasets, where it achieves superior performance over current baselines. To the best of our knowledge, this is the first work proposing transformer-based architecture for estimating counterfactual outcomes from longitudinal data.

Submitted 3 June, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

Journal ref: Proceedings of the 39-th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022

arXiv:2203.01228

Estimating average causal effects from patient trajectories

Authors: Dennis Frauen, Tobias Hatt, Valentyn Melnychuk, Stefan Feuerriegel

Abstract: In medical practice, treatments are selected based on the expected causal effects on patient outcomes. Here, the gold standard for estimating causal effects are randomized controlled trials; however, such trials are costly and sometimes even unethical. Instead, medical practice is increasingly interested in estimating causal effects among patient (sub)groups from electronic health records, that is, observational data. In this paper, we aim at estimating the average causal effect (ACE) from observational data (patient trajectories) that are collected over time. For this, we propose DeepACE: an end-to-end deep learning model. DeepACE leverages the iterative G-computation formula to adjust for the bias induced by time-varying confounders. Moreover, we develop a novel sequential targeting procedure which ensures that DeepACE has favorable theoretical properties, i.e., is doubly robust and asymptotically efficient. To the best of our knowledge, this is the first work that proposes an end-to-end deep learning model tailored for estimating time-varying ACEs. We compare DeepACE in an extensive number of experiments, confirming that it achieves state-of-the-art performance. We further provide a case study for patients suffering from low back pain to demonstrate that DeepACE generates important and meaningful findings for clinical practice. Our work enables practitioners to develop effective treatment recommendations based on population effects.

Submitted 23 January, 2023; v1 submitted 2 March, 2022; originally announced March 2022.

Comments: Accepted at AAAI 2023

arXiv:2010.12316

Matching the Clinical Reality: Accurate OCT-Based Diagnosis From Few Labels

Authors: Valentyn Melnychuk, Evgeniy Faerman, Ilja Manakov, Thomas Seidl

Abstract: Unlabeled data is often abundant in the clinic, making machine learning methods based on semi-supervised learning a good match for this setting. Despite this, they are currently receiving relatively little attention in medical image analysis literature. Instead, most practitioners and researchers focus on supervised or transfer learning approaches. The recently proposed MixMatch and FixMatch algorithms have demonstrated promising results in extracting useful representations while requiring very few labels. Motivated by these recent successes, we apply MixMatch and FixMatch in an ophthalmological diagnostic setting and investigate how they fare against standard transfer learning. We find that both algorithms outperform the transfer learning baseline on all fractions of labelled data. Furthermore, our experiments show that exponential moving average (EMA) of model parameters, which is a component of both algorithms, is not needed for our classification problem, as disabling it leaves the outcome unchanged. Our code is available online: https://github.com/Valentyn1997/oct-diagn-semi-supervised

Submitted 23 October, 2020; originally announced October 2020.

Comments: KDAH-CIKM-2020

arXiv:2001.10883 [pdf, other]

Unsupervised Anomaly Detection for X-Ray Images

Authors: Diana Davletshina, Valentyn Melnychuk, Viet Tran, Hitansh Singla, Max Berrendorf, Evgeniy Faerman, Michael Fromm, Matthias Schubert

Abstract: Obtaining labels for medical (image) data requires scarce and expensive experts. Moreover, due to ambiguous symptoms, single images rarely suffice to correctly diagnose a medical condition. Instead, it often requires to take additional background information such as the patient's medical history or test results into account. Hence, instead of focusing on uninterpretable black-box systems delivering an uncertain final diagnosis in an end-to-end-fashion, we investigate how unsupervised methods trained on images without anomalies can be used to assist doctors in evaluating X-ray images of hands. Our method increases the efficiency of making a diagnosis and reduces the risk of missing important regions. Therefore, we adopt state-of-the-art approaches for unsupervised learning to detect anomalies and show how the outputs of these methods can be explained. To reduce the effect of noise, which often can be mistaken for an anomaly, we introduce a powerful preprocessing pipeline. We provide an extensive evaluation of different approaches and demonstrate empirically that even without labels it is possible to achieve satisfying results on a real-world dataset of X-ray images of hands. We also evaluate the importance of preprocessing and one of our main findings is that without it, most of our approaches perform not better than random. To foster reproducibility and accelerate research we make our code publicly available at https://github.com/Valentyn1997/xray

Submitted 4 November, 2020; v1 submitted 29 January, 2020; originally announced January 2020.

arXiv:1911.08342 [pdf, ps, other]

doi 10.1007/978-3-030-45442-5_1

Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned

Authors: Max Berrendorf, Evgeniy Faerman, Valentyn Melnychuk, Volker Tresp, Thomas Seidl

Abstract: In this work, we focus on the problem of entity alignment in Knowledge Graphs (KG) and we report on our experiences when applying a Graph Convolutional Network (GCN) based model for this task. Variants of GCN are used in multiple state-of-the-art approaches and therefore it is important to understand the specifics and limitations of GCN-based models. Despite serious efforts, we were not able to fully reproduce the results from the original paper and after a thorough audit of the code provided by authors, we concluded that their implementation is different from the architecture described in the paper. In addition, several tricks are required to make the model work and some of them are not very intuitive. We provide an extensive ablation study to quantify the effects these tricks and changes of architecture have on final performance. Furthermore, we examine current evaluation approaches and systematize available benchmark datasets. We believe that people interested in KG matching might profit from our work, as well as novices entering the field.

Submitted 23 January, 2020; v1 submitted 19 November, 2019; originally announced November 2019.
