-
Clustering of Disease Trajectories with Explainable Machine Learning: A Case Study on Postoperative Delirium Phenotypes
Authors:
Xiaochen Zheng,
Manuel Schürch,
Xingyu Chen,
Maria Angeliki Komninou,
Reto Schüpbach,
Ahmed Allam,
Jan Bartussek,
Michael Krauthammer
Abstract:
The identification of phenotypes within complex diseases or syndromes is a fundamental component of precision medicine, which aims to adapt healthcare to individual patient characteristics. Postoperative delirium (POD) is a complex neuropsychiatric condition with significant heterogeneity in its clinical manifestations and underlying pathophysiology. We hypothesize that POD comprises several distinct phenotypes, which cannot be directly observed in clinical practice. Identifying these phenotypes could enhance our understanding of POD pathogenesis and facilitate the development of targeted prevention and treatment strategies. In this paper, we propose an approach that combines supervised machine learning for personalized POD risk prediction with unsupervised clustering techniques to uncover potential POD phenotypes. We first demonstrate our approach using synthetic data, where we simulate patient cohorts with predefined phenotypes based on distinct sets of informative features. We aim to mimic any clinical disease with our synthetic data generation method. By training a predictive model and applying SHAP, we show that clustering patients in the SHAP feature importance space successfully recovers the true underlying phenotypes, outperforming clustering in the raw feature space. We then present a case study using real-world data from a cohort of elderly surgical patients. The results showcase the utility of our approach in uncovering clinically relevant subtypes of complex disorders like POD, paving the way for more precise and personalized treatment strategies.
Submitted 6 May, 2024;
originally announced May 2024.
-
Towards AI-Based Precision Oncology: A Machine Learning Framework for Personalized Counterfactual Treatment Suggestions based on Multi-Omics Data
Authors:
Manuel Schürch,
Laura Boos,
Viola Heinzelmann-Schwarz,
Gabriele Gut,
Michael Krauthammer,
Andreas Wicki,
Tumor Profiler Consortium
Abstract:
AI-driven precision oncology has the transformative potential to reshape cancer treatment by leveraging the power of AI models to analyze the interaction between complex patient characteristics and their corresponding treatment outcomes. New technological platforms have facilitated the timely acquisition of multimodal data on tumor biology at an unprecedented resolution, such as single-cell multi-omics data, making this quality and quantity of data available for data-driven improved clinical decision-making. In this work, we propose a modular machine learning framework designed for personalized counterfactual cancer treatment suggestions based on an ensemble of machine learning experts trained on diverse multi-omics technologies. These specialized counterfactual experts per technology are consistently aggregated into a more powerful expert with superior performance and can provide both confidence and an explanation of its decision. The framework is tailored to address critical challenges inherent in data-driven cancer research, including the high-dimensional nature of the data, and the presence of treatment assignment bias in the retrospective observational data. The framework is showcased through comprehensive demonstrations using data from in-vitro and in-vivo treatment responses from a cohort of patients with ovarian cancer. Our method aims to empower clinicians with a reality-centric decision-support tool including probabilistic treatment suggestions with calibrated confidence and personalized explanations for tailoring treatment strategies to multi-omics characteristics of individual cancer patients.
Submitted 20 March, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
Modeling Complex Disease Trajectories using Deep Generative Models with Semi-Supervised Latent Processes
Authors:
Cécile Trottet,
Manuel Schürch,
Ahmed Allam,
Imon Barua,
Liubov Petelytska,
Oliver Distler,
Anna-Maria Hoffmann-Vold,
Michael Krauthammer,
the EUSTAR collaborators
Abstract:
In this paper, we propose a deep generative time series approach using latent temporal processes for modeling and holistically analyzing complex disease trajectories. We aim to find meaningful temporal latent representations of an underlying generative process that explain the observed disease trajectories in an interpretable and comprehensive way. To enhance the interpretability of these latent temporal processes, we develop a semi-supervised approach for disentangling the latent space using established medical concepts. By combining the generative approach with medical knowledge, we leverage the ability to discover novel aspects of the disease while integrating medical concepts into the model. We show that the learned temporal latent processes can be utilized for further data analysis and clinical hypothesis testing, including finding similar patients and clustering the disease into new sub-types. Moreover, our method enables personalized online monitoring and prediction of multivariate time series including uncertainty quantification. We demonstrate the effectiveness of our approach in modeling systemic sclerosis, showcasing the potential of our machine learning model to capture complex disease trajectories and acquire new medical knowledge.
Submitted 29 January, 2024; v1 submitted 14 November, 2023;
originally announced November 2023.
-
Two-Stage Aggregation with Dynamic Local Attention for Irregular Time Series
Authors:
Xingyu Chen,
Xiaochen Zheng,
Amina Mollaysa,
Manuel Schürch,
Ahmed Allam,
Michael Krauthammer
Abstract:
Irregular multivariate time series data is characterized by varying time intervals between consecutive observations of measured variables/signals (i.e., features) and varying sampling rates (i.e., recordings/measurements) across these features. Modeling time series while taking into account these irregularities is still a challenging task for machine learning methods. Here, we introduce TADA, a Two-stage Aggregation process with Dynamic local Attention to harmonize time-wise and feature-wise irregularities in multivariate time series. In the first stage, the irregular time series undergoes temporal embedding (TE) using all available features at each time step. This process preserves the contribution of each available feature and generates a fixed-dimensional representation per time step. The second stage introduces a dynamic local attention (DLA) mechanism with adaptive window sizes. DLA aggregates time recordings using feature-specific windows to harmonize irregular time intervals, capturing feature-specific sampling rates. Then hierarchical MLP mixer layers process the output of DLA through multiscale patching to leverage information at various scales for the downstream tasks. TADA outperforms state-of-the-art methods on three real-world datasets, including the latest MIMIC IV dataset, and highlights its effectiveness in handling irregular multivariate time series and its potential for various real-world applications.
Submitted 25 April, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Generating Personalized Insulin Treatments Strategies with Deep Conditional Generative Time Series Models
Authors:
Manuel Schürch,
Xiang Li,
Ahmed Allam,
Giulia Rathmes,
Amina Mollaysa,
Claudia Cavelti-Weder,
Michael Krauthammer
Abstract:
We propose a novel framework that combines deep generative time series models with decision theory for generating personalized treatment strategies. It leverages historical patient trajectory data to jointly learn the generation of realistic personalized treatment and future outcome trajectories through deep generative time series models. In particular, our framework enables the generation of novel multivariate treatment strategies tailored to the personalized patient history and trained for optimal expected future outcomes based on conditional expected utility maximization. We demonstrate our framework by generating personalized insulin treatment strategies and blood glucose predictions for hospitalized diabetes patients, showcasing the potential of our approach for generating improved personalized treatment strategies. Keywords: deep generative model, probabilistic decision support, personalized treatment generation, insulin and blood glucose prediction
Submitted 13 November, 2023; v1 submitted 28 September, 2023;
originally announced September 2023.
-
SimTS: Rethinking Contrastive Representation Learning for Time Series Forecasting
Authors:
Xiaochen Zheng,
Xingyu Chen,
Manuel Schürch,
Amina Mollaysa,
Ahmed Allam,
Michael Krauthammer
Abstract:
Contrastive learning methods have shown an impressive ability to learn meaningful representations for image or time series classification. However, these methods are less effective for time series forecasting, as optimization of instance discrimination is not directly applicable to predicting the future state from the history context. Moreover, the construction of positive and negative pairs in current technologies strongly relies on specific time series characteristics, restricting their generalization across diverse types of time series data. To address these limitations, we propose SimTS, a simple representation learning approach for improving time series forecasting by learning to predict the future from the past in the latent space. SimTS does not rely on negative pairs or specific assumptions about the characteristics of the particular time series. Our extensive experiments on several benchmark time series forecasting datasets show that SimTS achieves competitive performance compared to existing contrastive learning methods. Furthermore, we show the shortcomings of the current contrastive learning framework used for time series forecasting through a detailed ablation study. Overall, our work suggests that SimTS is a promising alternative to other contrastive learning approaches for time series forecasting.
Submitted 31 March, 2023;
originally announced March 2023.
-
Deep Hypothesis Tests Detect Clinically Relevant Subgroup Shifts in Medical Images
Authors:
Lisa M. Koch,
Christian M. Schürch,
Christian F. Baumgartner,
Arthur Gretton,
Philipp Berens
Abstract:
Distribution shifts remain a fundamental problem for the safe application of machine learning systems. If undetected, they may impact the real-world performance of such systems or will at least render original performance claims invalid. In this paper, we focus on the detection of subgroup shifts, a type of distribution shift that can occur when subgroups have a different prevalence during validation compared to the deployment setting. For example, algorithms developed on data from various acquisition settings may be predominantly applied in hospitals with lower quality data acquisition, leading to an inadvertent performance drop. We formulate subgroup shift detection in the framework of statistical hypothesis testing and show that recent state-of-the-art statistical tests can be effectively applied to subgroup shift detection on medical imaging data. We provide synthetic experiments as well as extensive evaluation on clinically meaningful subgroup shifts on histopathology as well as retinal fundus images. We conclude that classifier-based subgroup shift detection tests could be a particularly useful tool for post-market surveillance of deployed ML systems.
Submitted 8 March, 2023;
originally announced March 2023.
-
Studying Therapy Effects and Disease Outcomes in Silico using Artificial Counterfactual Tissue Samples
Authors:
Martin Paulikat,
Christian M. Schürch,
Christian F. Baumgartner
Abstract:
Understanding the interactions of different cell types inside the immune tumor microenvironment (iTME) is crucial for the development of immunotherapy treatments as well as for predicting their outcomes. Highly multiplexed tissue imaging (HMTI) technologies offer a tool which can capture cell properties of tissue samples by measuring expression of various proteins and storing them in separate image channels. HMTI technologies can be used to gain insights into the iTME and in particular how the iTME differs for different patient outcome groups of interest (e.g., treatment responders vs. non-responders). Understanding the systematic differences in the iTME of different patient outcome groups is crucial for developing better treatments and personalising existing treatments. However, such analyses are inherently limited by the fact that any two tissue samples vary due to a large number of factors unrelated to the outcome. Here, we present CF-HistoGAN, a machine learning framework that employs generative adversarial networks (GANs) to create artificial counterfactual tissue samples that resemble the original tissue samples as closely as possible but capture the characteristics of a different patient outcome group. Specifically, we learn to "translate" HMTI samples from one patient group to create artificial paired samples. We show that this approach allows one to directly study the effects of different patient outcomes on the iTMEs of individual tissue samples. We demonstrate that CF-HistoGAN can be employed as an explorative tool for understanding iTME effects on the pixel level. Moreover, we show that our method can be used to identify statistically significant differences in the expression of different proteins between patient groups with greater sensitivity compared to conventional approaches.
Submitted 6 February, 2023;
originally announced February 2023.
-
Correlated Product of Experts for Sparse Gaussian Process Regression
Authors:
Manuel Schürch,
Dario Azzimonti,
Alessio Benavoli,
Marco Zaffalon
Abstract:
Gaussian processes (GPs) are an important tool in machine learning and statistics with applications ranging from social and natural science through engineering. They constitute a powerful kernelized non-parametric method with well-calibrated uncertainty estimates; however, off-the-shelf GP inference procedures are limited to datasets with several thousand data points because of their cubic computational complexity. For this reason, many sparse GP techniques have been developed over the past years. In this paper, we focus on GP regression tasks and propose a new approach based on aggregating predictions from several local and correlated experts. The degree of correlation between the experts can range from independent to fully correlated. The individual predictions of the experts are aggregated taking into account their correlation, resulting in consistent uncertainty estimates. Our method recovers independent Product of Experts, sparse GP and full GP in the limiting cases. The presented framework can deal with a general kernel function and multiple variables, and has a time and space complexity which is linear in the number of experts and data samples, which makes our approach highly scalable. We demonstrate superior performance, in a time vs. accuracy sense, of our proposed method against state-of-the-art GP approximation methods for synthetic as well as several real-world datasets with deterministic and stochastic optimization.
Submitted 17 December, 2021;
originally announced December 2021.
-
Orthogonally Decoupled Variational Fourier Features
Authors:
Dario Azzimonti,
Manuel Schürch,
Alessio Benavoli,
Marco Zaffalon
Abstract:
Sparse inducing points have long been a standard method to fit Gaussian processes to big data. In the last few years, spectral methods that exploit approximations of the covariance kernel have been shown to be competitive. In this work we exploit a recently introduced orthogonally decoupled variational basis to combine spectral methods and sparse inducing points methods. We show that the method is competitive with the state-of-the-art on synthetic and on real-world data.
Submitted 13 July, 2020;
originally announced July 2020.
-
Recursive Estimation for Sparse Gaussian Process Regression
Authors:
Manuel Schürch,
Dario Azzimonti,
Alessio Benavoli,
Marco Zaffalon
Abstract:
Gaussian Processes (GPs) are powerful kernelized methods for non-parametric regression used in many applications. However, their use is limited to a few thousand training samples due to their cubic time complexity. In order to scale GPs to larger datasets, several sparse approximations based on so-called inducing points have been proposed in the literature. In this work we investigate the connection between a general class of sparse inducing point GP regression methods and Bayesian recursive estimation, which enables Kalman-filter-like updating for online learning. The majority of previous work has focused on the batch setting, in particular for learning the model parameters and the position of the inducing points; here, instead, we focus on training with mini-batches. By exploiting the Kalman filter formulation, we propose a novel approach that estimates such parameters by recursively propagating the analytical gradients of the posterior over mini-batches of the data. Compared to state-of-the-art methods, our method keeps analytic updates for the mean and covariance of the posterior, thus drastically reducing the size of the optimization problem. We show that our method achieves faster convergence and superior performance compared to state-of-the-art sequential Gaussian Process regression on synthetic GP as well as real-world data with up to a million data samples.
Submitted 22 June, 2020; v1 submitted 28 May, 2019;
originally announced May 2019.
-
Monomer-dimer tatami tilings of square regions
Authors:
Alejandro Erickson,
Mark Schurch
Abstract:
We prove that the number of monomer-dimer tilings of an $n\times n$ square grid, with $m<n$ monomers in which no four tiles meet at any point is $m2^m+(m+1)2^{m+1}$, when $m$ and $n$ have the same parity. In addition, we present a new proof of the result that there are $n2^{n-1}$ such tilings with $n$ monomers, which divides the tilings into $n$ classes of size $2^{n-1}$. The sum of these tilings over all monomer counts has the closed form $2^{n-1}(3n-4)+2$ and, curiously, this is equal to the sum of the squares of all parts in all compositions of $n$. We also describe two algorithms and a Gray code ordering for generating the $n2^{n-1}$ tilings with $n$ monomers, which are both based on our new proof.
Submitted 23 October, 2011;
originally announced October 2011.
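The counting identities quoted in the tatami tilings abstract can be checked numerically for small $n$. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes that "all monomer counts" means $m = n$ together with every $0 \le m < n$ of the same parity as $n$. It does not enumerate tilings; it only compares the stated formulas with a brute-force sum over compositions of $n$.

```python
def compositions(n):
    """Yield every composition of n as a tuple of positive parts (order matters)."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def sum_of_squared_parts(n):
    """Sum of the squares of all parts over all compositions of n."""
    return sum(part * part for comp in compositions(n) for part in comp)


def closed_form(n):
    """Closed form 2^(n-1) * (3n - 4) + 2 quoted in the abstract."""
    return 2 ** (n - 1) * (3 * n - 4) + 2


def total_from_counts(n):
    """Add the per-monomer-count formulas from the abstract.

    Assumption: m runs over n itself plus all 0 <= m < n with the
    same parity as n.
    """
    total = n * 2 ** (n - 1)  # tilings with exactly n monomers
    for m in range(n % 2, n, 2):  # m < n, same parity as n
        total += m * 2 ** m + (m + 1) * 2 ** (m + 1)
    return total


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, closed_form(n), sum_of_squared_parts(n), total_from_counts(n))
```

For each small $n$ the three columns printed by the script should agree if the reading of the abstract above is right.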