Search | arXiv e-print repository

Clustering of Disease Trajectories with Explainable Machine Learning: A Case Study on Postoperative Delirium Phenotypes

Authors: Xiaochen Zheng, Manuel Schürch, Xingyu Chen, Maria Angeliki Komninou, Reto Schüpbach, Ahmed Allam, Jan Bartussek, Michael Krauthammer

Abstract: The identification of phenotypes within complex diseases or syndromes is a fundamental component of precision medicine, which aims to adapt healthcare to individual patient characteristics. Postoperative delirium (POD) is a complex neuropsychiatric condition with significant heterogeneity in its clinical manifestations and underlying pathophysiology. We hypothesize that POD comprises several disti… ▽ More The identification of phenotypes within complex diseases or syndromes is a fundamental component of precision medicine, which aims to adapt healthcare to individual patient characteristics. Postoperative delirium (POD) is a complex neuropsychiatric condition with significant heterogeneity in its clinical manifestations and underlying pathophysiology. We hypothesize that POD comprises several distinct phenotypes, which cannot be directly observed in clinical practice. Identifying these phenotypes could enhance our understanding of POD pathogenesis and facilitate the development of targeted prevention and treatment strategies. In this paper, we propose an approach that combines supervised machine learning for personalized POD risk prediction with unsupervised clustering techniques to uncover potential POD phenotypes. We first demonstrate our approach using synthetic data, where we simulate patient cohorts with predefined phenotypes based on distinct sets of informative features. We aim to mimic any clinical disease with our synthetic data generation method. By training a predictive model and applying SHAP, we show that clustering patients in the SHAP feature importance space successfully recovers the true underlying phenotypes, outperforming clustering in the raw feature space. We then present a case study using real-world data from a cohort of elderly surgical patients. The results showcase the utility of our approach in uncovering clinically relevant subtypes of complex disorders like POD, paving the way for more precise and personalized treatment strategies. △ Less

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2402.12190

Towards AI-Based Precision Oncology: A Machine Learning Framework for Personalized Counterfactual Treatment Suggestions based on Multi-Omics Data

Authors: Manuel Schürch, Laura Boos, Viola Heinzelmann-Schwarz, Gabriele Gut, Michael Krauthammer, Andreas Wicki, Tumor Profiler Consortium

Abstract: AI-driven precision oncology has the transformative potential to reshape cancer treatment by leveraging the power of AI models to analyze the interaction between complex patient characteristics and their corresponding treatment outcomes. New technological platforms have facilitated the timely acquisition of multimodal data on tumor biology at an unprecedented resolution, such as single-cell multi-… ▽ More AI-driven precision oncology has the transformative potential to reshape cancer treatment by leveraging the power of AI models to analyze the interaction between complex patient characteristics and their corresponding treatment outcomes. New technological platforms have facilitated the timely acquisition of multimodal data on tumor biology at an unprecedented resolution, such as single-cell multi-omics data, making this quality and quantity of data available for data-driven improved clinical decision-making. In this work, we propose a modular machine learning framework designed for personalized counterfactual cancer treatment suggestions based on an ensemble of machine learning experts trained on diverse multi-omics technologies. These specialized counterfactual experts per technology are consistently aggregated into a more powerful expert with superior performance and can provide both confidence and an explanation of its decision. The framework is tailored to address critical challenges inherent in data-driven cancer research, including the high-dimensional nature of the data, and the presence of treatment assignment bias in the retrospective observational data. The framework is showcased through comprehensive demonstrations using data from in-vitro and in-vivo treatment responses from a cohort of patients with ovarian cancer. Our method aims to empower clinicians with a reality-centric decision-support tool including probabilistic treatment suggestions with calibrated confidence and personalized explanations for tailoring treatment strategies to multi-omics characteristics of individual cancer patients. △ Less

Submitted 20 March, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

arXiv:2311.08149 [pdf, other]

Modeling Complex Disease Trajectories using Deep Generative Models with Semi-Supervised Latent Processes

Authors: Cécile Trottet, Manuel Schürch, Ahmed Allam, Imon Barua, Liubov Petelytska, Oliver Distler, Anna-Maria Hoffmann-Vold, Michael Krauthammer, the EUSTAR collaborators

Abstract: In this paper, we propose a deep generative time series approach using latent temporal processes for modeling and holistically analyzing complex disease trajectories. We aim to find meaningful temporal latent representations of an underlying generative process that explain the observed disease trajectories in an interpretable and comprehensive way. To enhance the interpretability of these latent t… ▽ More In this paper, we propose a deep generative time series approach using latent temporal processes for modeling and holistically analyzing complex disease trajectories. We aim to find meaningful temporal latent representations of an underlying generative process that explain the observed disease trajectories in an interpretable and comprehensive way. To enhance the interpretability of these latent temporal processes, we develop a semi-supervised approach for disentangling the latent space using established medical concepts. By combining the generative approach with medical knowledge, we leverage the ability to discover novel aspects of the disease while integrating medical concepts into the model. We show that the learned temporal latent processes can be utilized for further data analysis and clinical hypothesis testing, including finding similar patients and clustering the disease into new sub-types. Moreover, our method enables personalized online monitoring and prediction of multivariate time series including uncertainty quantification. We demonstrate the effectiveness of our approach in modeling systemic sclerosis, showcasing the potential of our machine learning model to capture complex disease trajectories and acquire new medical knowledge. △ Less

Submitted 29 January, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 23 pages

arXiv:2311.07744 [pdf, other]

Two-Stage Aggregation with Dynamic Local Attention for Irregular Time Series

Authors: Xingyu Chen, Xiaochen Zheng, Amina Mollaysa, Manuel Schürch, Ahmed Allam, Michael Krauthammer

Abstract: Irregular multivariate time series data is characterized by varying time intervals between consecutive observations of measured variables/signals (i.e., features) and varying sampling rates (i.e., recordings/measurement) across these features. Modeling time series while taking into account these irregularities is still a challenging task for machine learning methods. Here, we introduce TADA, a Two… ▽ More Irregular multivariate time series data is characterized by varying time intervals between consecutive observations of measured variables/signals (i.e., features) and varying sampling rates (i.e., recordings/measurement) across these features. Modeling time series while taking into account these irregularities is still a challenging task for machine learning methods. Here, we introduce TADA, a Two-stageAggregation process with Dynamic local Attention to harmonize time-wise and feature-wise irregularities in multivariate time series. In the first stage, the irregular time series undergoes temporal embedding (TE) using all available features at each time step. This process preserves the contribution of each available feature and generates a fixed-dimensional representation per time step. The second stage introduces a dynamic local attention (DLA) mechanism with adaptive window sizes. DLA aggregates time recordings using feature-specific windows to harmonize irregular time intervals capturing feature-specific sampling rates. Then hierarchical MLP mixer layers process the output of DLA through multiscale patching to leverage information at various scales for the downstream tasks. TADA outperforms state-of-the-art methods on three real-world datasets, including the latest MIMIC IV dataset, and highlights its effectiveness in handling irregular multivariate time series and its potential for various real-world applications. △ Less

Submitted 25 April, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: A short version of this paper has been accepted for presentation at the Findings of Machine Learning for Health (ML4H) 2023 conference

arXiv:2309.16521 [pdf, other]

Generating Personalized Insulin Treatments Strategies with Deep Conditional Generative Time Series Models

Authors: Manuel Schürch, Xiang Li, Ahmed Allam, Giulia Rathmes, Amina Mollaysa, Claudia Cavelti-Weder, Michael Krauthammer

Abstract: We propose a novel framework that combines deep generative time series models with decision theory for generating personalized treatment strategies. It leverages historical patient trajectory data to jointly learn the generation of realistic personalized treatment and future outcome trajectories through deep generative time series models. In particular, our framework enables the generation of nove… ▽ More We propose a novel framework that combines deep generative time series models with decision theory for generating personalized treatment strategies. It leverages historical patient trajectory data to jointly learn the generation of realistic personalized treatment and future outcome trajectories through deep generative time series models. In particular, our framework enables the generation of novel multivariate treatment strategies tailored to the personalized patient history and trained for optimal expected future outcomes based on conditional expected utility maximization. We demonstrate our framework by generating personalized insulin treatment strategies and blood glucose predictions for hospitalized diabetes patients, showcasing the potential of our approach for generating improved personalized treatment strategies. Keywords: deep generative model, probabilistic decision support, personalized treatment generation, insulin and blood glucose prediction △ Less

Submitted 13 November, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 17 pages

Journal ref: Machine Learning for Health (ML4H) 2023

arXiv:2303.18205 [pdf, other]

SimTS: Rethinking Contrastive Representation Learning for Time Series Forecasting

Authors: Xiaochen Zheng, Xingyu Chen, Manuel Schürch, Amina Mollaysa, Ahmed Allam, Michael Krauthammer

Abstract: Contrastive learning methods have shown an impressive ability to learn meaningful representations for image or time series classification. However, these methods are less effective for time series forecasting, as optimization of instance discrimination is not directly applicable to predicting the future state from the history context. Moreover, the construction of positive and negative pairs in cu… ▽ More Contrastive learning methods have shown an impressive ability to learn meaningful representations for image or time series classification. However, these methods are less effective for time series forecasting, as optimization of instance discrimination is not directly applicable to predicting the future state from the history context. Moreover, the construction of positive and negative pairs in current technologies strongly relies on specific time series characteristics, restricting their generalization across diverse types of time series data. To address these limitations, we propose SimTS, a simple representation learning approach for improving time series forecasting by learning to predict the future from the past in the latent space. SimTS does not rely on negative pairs or specific assumptions about the characteristics of the particular time series. Our extensive experiments on several benchmark time series forecasting datasets show that SimTS achieves competitive performance compared to existing contrastive learning methods. Furthermore, we show the shortcomings of the current contrastive learning framework used for time series forecasting through a detailed ablation study. Overall, our work suggests that SimTS is a promising alternative to other contrastive learning approaches for time series forecasting. △ Less

Submitted 31 March, 2023; originally announced March 2023.

Comments: 13 pages, 6 figures

arXiv:2303.04862 [pdf, other]

Deep Hypothesis Tests Detect Clinically Relevant Subgroup Shifts in Medical Images

Authors: Lisa M. Koch, Christian M. Schürch, Christian F. Baumgartner, Arthur Gretton, Philipp Berens

Abstract: Distribution shifts remain a fundamental problem for the safe application of machine learning systems. If undetected, they may impact the real-world performance of such systems or will at least render original performance claims invalid. In this paper, we focus on the detection of subgroup shifts, a type of distribution shift that can occur when subgroups have a different prevalence during validat… ▽ More Distribution shifts remain a fundamental problem for the safe application of machine learning systems. If undetected, they may impact the real-world performance of such systems or will at least render original performance claims invalid. In this paper, we focus on the detection of subgroup shifts, a type of distribution shift that can occur when subgroups have a different prevalence during validation compared to the deployment setting. For example, algorithms developed on data from various acquisition settings may be predominantly applied in hospitals with lower quality data acquisition, leading to an inadvertent performance drop. We formulate subgroup shift detection in the framework of statistical hypothesis testing and show that recent state-of-the-art statistical tests can be effectively applied to subgroup shift detection on medical imaging data. We provide synthetic experiments as well as extensive evaluation on clinically meaningful subgroup shifts on histopathology as well as retinal fundus images. We conclude that classifier-based subgroup shift detection tests could be a particularly useful tool for post-market surveillance of deployed ML systems. △ Less

Submitted 8 March, 2023; originally announced March 2023.

Comments: Under review

arXiv:2302.03120 [pdf, other]

Studying Therapy Effects and Disease Outcomes in Silico using Artificial Counterfactual Tissue Samples

Authors: Martin Paulikat, Christian M. Schürch, Christian F. Baumgartner

Abstract: Understanding the interactions of different cell types inside the immune tumor microenvironment (iTME) is crucial for the development of immunotherapy treatments as well as for predicting their outcomes. Highly multiplexed tissue imaging (HMTI) technologies offer a tool which can capture cell properties of tissue samples by measuring expression of various proteins and storing them in separate imag… ▽ More Understanding the interactions of different cell types inside the immune tumor microenvironment (iTME) is crucial for the development of immunotherapy treatments as well as for predicting their outcomes. Highly multiplexed tissue imaging (HMTI) technologies offer a tool which can capture cell properties of tissue samples by measuring expression of various proteins and storing them in separate image channels. HMTI technologies can be used to gain insights into the iTME and in particular how the iTME differs for different patient outcome groups of interest (e.g., treatment responders vs. non-responders). Understanding the systematic differences in the iTME of different patient outcome groups is crucial for develo** better treatments and personalising existing treatments. However, such analyses are inherently limited by the fact that any two tissue samples vary due to a large number of factors unrelated to the outcome. Here, we present CF-HistoGAN, a machine learning framework that employs generative adversarial networks (GANs) to create artificial counterfactual tissue samples that resemble the original tissue samples as closely as possible but capture the characteristics of a different patient outcome group. Specifically, we learn to "translate" HMTI samples from one patient group to create artificial paired samples. We show that this approach allows to directly study the effects of different patient outcomes on the iTMEs of individual tissue samples. We demonstrate that CF-HistoGAN can be employed as an explorative tool for understanding iTME effects on the pixel level. Moreover, we show that our method can be used to identify statistically significant differences in the expression of different proteins between patient groups with greater sensitivity compared to conventional approaches. △ Less

Submitted 6 February, 2023; originally announced February 2023.

arXiv:2112.09519 [pdf, other]

Correlated Product of Experts for Sparse Gaussian Process Regression

Authors: Manuel Schürch, Dario Azzimonti, Alessio Benavoli, Marco Zaffalon

Abstract: Gaussian processes (GPs) are an important tool in machine learning and statistics with applications ranging from social and natural science through engineering. They constitute a powerful kernelized non-parametric method with well-calibrated uncertainty estimates, however, off-the-shelf GP inference procedures are limited to datasets with several thousand data points because of their cubic computa… ▽ More Gaussian processes (GPs) are an important tool in machine learning and statistics with applications ranging from social and natural science through engineering. They constitute a powerful kernelized non-parametric method with well-calibrated uncertainty estimates, however, off-the-shelf GP inference procedures are limited to datasets with several thousand data points because of their cubic computational complexity. For this reason, many sparse GPs techniques have been developed over the past years. In this paper, we focus on GP regression tasks and propose a new approach based on aggregating predictions from several local and correlated experts. Thereby, the degree of correlation between the experts can vary between independent up to fully correlated experts. The individual predictions of the experts are aggregated taking into account their correlation resulting in consistent uncertainty estimates. Our method recovers independent Product of Experts, sparse GP and full GP in the limiting cases. The presented framework can deal with a general kernel function and multiple variables, and has a time and space complexity which is linear in the number of experts and data samples, which makes our approach highly scalable. We demonstrate superior performance, in a time vs. accuracy sense, of our proposed method against state-of-the-art GP approximation methods for synthetic as well as several real-world datasets with deterministic and stochastic optimization. △ Less

Submitted 17 December, 2021; originally announced December 2021.

arXiv:2007.06363 [pdf, other]

Orthogonally Decoupled Variational Fourier Features

Authors: Dario Azzimonti, Manuel Schürch, Alessio Benavoli, Marco Zaffalon

Abstract: Sparse inducing points have long been a standard method to fit Gaussian processes to big data. In the last few years, spectral methods that exploit approximations of the covariance kernel have shown to be competitive. In this work we exploit a recently introduced orthogonally decoupled variational basis to combine spectral methods and sparse inducing points methods. We show that the method is comp… ▽ More Sparse inducing points have long been a standard method to fit Gaussian processes to big data. In the last few years, spectral methods that exploit approximations of the covariance kernel have shown to be competitive. In this work we exploit a recently introduced orthogonally decoupled variational basis to combine spectral methods and sparse inducing points methods. We show that the method is competitive with the state-of-the-art on synthetic and on real-world data. △ Less

Submitted 13 July, 2020; originally announced July 2020.

arXiv:1905.11711 [pdf, other]

doi 10.1016/j.automatica.2020.109127

Recursive Estimation for Sparse Gaussian Process Regression

Authors: Manuel Schürch, Dario Azzimonti, Alessio Benavoli, Marco Zaffalon

Abstract: Gaussian Processes (GPs) are powerful kernelized methods for non-parameteric regression used in many applications. However, their use is limited to a few thousand of training samples due to their cubic time complexity. In order to scale GPs to larger datasets, several sparse approximations based on so-called inducing points have been proposed in the literature. In this work we investigate the conn… ▽ More Gaussian Processes (GPs) are powerful kernelized methods for non-parameteric regression used in many applications. However, their use is limited to a few thousand of training samples due to their cubic time complexity. In order to scale GPs to larger datasets, several sparse approximations based on so-called inducing points have been proposed in the literature. In this work we investigate the connection between a general class of sparse inducing point GP regression methods and Bayesian recursive estimation which enables Kalman Filter like updating for online learning. The majority of previous work has focused on the batch setting, in particular for learning the model parameters and the position of the inducing points, here instead we focus on training with mini-batches. By exploiting the Kalman filter formulation, we propose a novel approach that estimates such parameters by recursively propagating the analytical gradients of the posterior over mini-batches of the data. Compared to state of the art methods, our method keeps analytic updates for the mean and covariance of the posterior, thus reducing drastically the size of the optimization problem. We show that our method achieves faster convergence and superior performance compared to state of the art sequential Gaussian Process regression on synthetic GP as well as real-world data with up to a million of data samples. △ Less

Submitted 22 June, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

arXiv:1110.5103 [pdf, ps, other]

Monomer-dimer tatami tilings of square regions

Authors: Alejandro Erickson, Mark Schurch

Abstract: We prove that the number of monomer-dimer tilings of an $n\times n$ square grid, with $m<n$ monomers in which no four tiles meet at any point is $m2^m+(m+1)2^{m+1}$, when $m$ and $n$ have the same parity. In addition, we present a new proof of the result that there are $n2^{n-1}$ such tilings with $n$ monomers, which divides the tilings into $n$ classes of size $2^{n-1}$. The sum of these tilings… ▽ More We prove that the number of monomer-dimer tilings of an $n\times n$ square grid, with $m<n$ monomers in which no four tiles meet at any point is $m2^m+(m+1)2^{m+1}$, when $m$ and $n$ have the same parity. In addition, we present a new proof of the result that there are $n2^{n-1}$ such tilings with $n$ monomers, which divides the tilings into $n$ classes of size $2^{n-1}$. The sum of these tilings over all monomer counts has the closed form $2^{n-1}(3n-4)+2$ and, curiously, this is equal to the sum of the squares of all parts in all compositions of $n$. We also describe two algorithms and a Gray code ordering for generating the $n2^{n-1}$ tilings with $n$ monomers, which are both based on our new proof. △ Less

Submitted 23 October, 2011; originally announced October 2011.

Comments: Expanded conference proceedings: A. Erickson, M. Schurch, Enumerating tatami mat arrangements of square grids, in: 22nd International Workshop on Combinatorial Al- gorithms (IWOCA), volume 7056 of Lecture Notes in Computer Science (LNCS), Springer Berlin / Heidelberg, 2011, p. 12 pages. More on Tatami tilings at http://alejandroerickson.com/joomla/tatami-blog/collected-resources

MSC Class: 05B45; 05B50

Showing 1–12 of 12 results for author: Schürch, M