-
The PetShop Dataset -- Finding Causes of Performance Issues across Microservices
Authors:
Michaela Hardt,
William R. Orchard,
Patrick Blöbaum,
Shiva Kasiviswanathan,
Elke Kirschbaum
Abstract:
Identifying root causes for unexpected or undesirable behavior in complex systems is a prevalent challenge. This issue becomes especially crucial in modern cloud applications that employ numerous microservices. Although the machine learning and systems research communities have proposed various techniques to tackle this problem, there is currently a lack of standardized datasets for quantitative benchmarking. Consequently, research groups are compelled to create their own datasets for experimentation. This paper introduces a dataset specifically designed for evaluating root cause analyses in microservice-based applications. The dataset encompasses latency, requests, and availability metrics emitted in 5-minute intervals from a distributed application. In addition to normal operation metrics, the dataset includes 68 injected performance issues, which increase latency and reduce availability throughout the system. We showcase how this dataset can be used to evaluate the accuracy of a variety of methods spanning different causal and non-causal characterisations of the root cause analysis problem. We hope the new dataset, available at https://github.com/amazon-science/petshop-root-cause-analysis/ , enables further development of techniques in this important area.
Submitted 8 April, 2024; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Beyond Single-Feature Importance with ICECREAM
Authors:
Michael Oesterle,
Patrick Blöbaum,
Atalanti A. Mastakouri,
Elke Kirschbaum
Abstract:
Which set of features was responsible for a certain output of a machine learning model? Which components caused the failure of a cloud computing application? These are just two examples of questions we are addressing in this work by Identifying Coalition-based Explanations for Common and Rare Events in Any Model (ICECREAM). Specifically, we propose an information-theoretic quantitative measure for the influence of a coalition of variables on the distribution of a target variable. This allows us to identify which set of factors is essential to obtain a certain outcome, as opposed to well-established explainability and causal contribution analysis methods which can assign contributions only to individual factors and rank them by their importance. In experiments with synthetic and real-world data, we show that ICECREAM outperforms state-of-the-art methods for explainability and root cause analysis, and achieves impressive accuracy in both tasks.
Submitted 19 July, 2023;
originally announced July 2023.
-
Causal Inference Through the Structural Causal Marginal Problem
Authors:
Luigi Gresele,
Julius von Kügelgen,
Jonas M. Kübler,
Elke Kirschbaum,
Bernhard Schölkopf,
Dominik Janzing
Abstract:
We introduce an approach to counterfactual inference based on merging information from multiple datasets. We consider a causal reformulation of the statistical marginal problem: given a collection of marginal structural causal models (SCMs) over distinct but overlapping sets of variables, determine the set of joint SCMs that are counterfactually consistent with the marginal ones. We formalise this approach for categorical SCMs using the response function formulation and show that it reduces the space of allowed marginal and joint SCMs. Our work thus highlights a new mode of falsifiability through additional variables, in contrast to the statistical one via additional data.
Submitted 14 July, 2022; v1 submitted 2 February, 2022;
originally announced February 2022.
-
DISCo: Deep learning, Instance Segmentation, and Correlations for cell segmentation in calcium imaging
Authors:
Elke Kirschbaum,
Alberto Bailoni,
Fred A. Hamprecht
Abstract:
Calcium imaging is one of the most important tools in neurophysiology as it enables the observation of neuronal activity for hundreds of cells in parallel and at single-cell resolution. In order to use the data gained with calcium imaging, it is necessary to extract individual cells and their activity from the recordings. We present DISCo, a novel approach for the cell segmentation in calcium imaging videos. We use temporal information from the recordings in a computationally efficient way by computing correlations between pixels and combine it with shape-based information to identify active as well as non-active cells. We first learn to predict whether two pixels belong to the same cell; this information is summarized in an undirected, edge-weighted grid graph which we then partition. In so doing, we approximately solve the NP-hard correlation clustering problem with a recently proposed greedy algorithm. Evaluating our method on the Neurofinder public benchmark shows that DISCo outperforms all existing models trained on these datasets.
Submitted 4 April, 2020; v1 submitted 21 August, 2019;
originally announced August 2019.