-
Sensitivity analysis for matching on high-dimensional predictors: A case study of racial disparity in US mortality
Authors:
Marina Hernandez,
Ciprian Crainiceanu
Abstract:
Matching on a low dimensional vector of scalar covariates consists of constructing groups of individuals in which each individual in a group is within a pre-specified distance from an individual in another group. However, matching in high dimensional spaces is more challenging because the distance can be sensitive to implementation details, caliper width, and measurement error of observations. To partially address these problems, we propose to use extensive sensitivity analyses and identify the main sources of variation and bias. We illustrate these concepts by examining the racial disparity in all-cause mortality in the US using the National Health and Nutrition Examination Survey (NHANES 2003-2006). In particular, we match African Americans to Caucasian Americans on age, gender, BMI and objectively measured physical activity (PA). PA is measured every minute using accelerometers for up to seven days and then transformed into an empirical distribution of all of the minute-level observations. The Wasserstein metric is used as the measure of distance between these participant-specific distributions.
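As an illustration of the distance described in this abstract, the following is a minimal sketch of a one-dimensional Wasserstein distance between two empirical distributions of minute-level activity counts. The abstract does not state the order of the metric or how it is computed, so the order-1 choice, the function name `wasserstein_1d`, and the sample values are assumptions made for illustration only.

```python
from bisect import bisect_right


def wasserstein_1d(xs, ys):
    """Order-1 Wasserstein distance between two empirical 1-D distributions.

    Assumption: order 1 is used here for simplicity; the abstract does not
    say which order the authors use. The distance equals the area between
    the two empirical CDFs.
    """
    xs, ys = sorted(xs), sorted(ys)
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    # Every point where either empirical CDF can jump.
    grid = sorted(set(xs) | set(ys))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        # Both CDFs are constant on [left, right).
        cdf_x = bisect_right(xs, left) / len(xs)
        cdf_y = bisect_right(ys, left) / len(ys)
        total += abs(cdf_x - cdf_y) * (right - left)
    return total


# Hypothetical minute-level activity counts for two participants.
participant_a = [0, 1, 3]
participant_b = [5, 6, 8]
distance = wasserstein_1d(participant_a, participant_b)
```

The two samples may have different lengths, which matters when participants contribute different numbers of valid minutes.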
Submitted 2 May, 2024;
originally announced May 2024.
-
Automatic quality control framework for more reliable integration of machine learning-based image segmentation into medical workflows
Authors:
Elena Williams,
Sebastian Niehaus,
Janis Reinelt,
Alberto Merola,
Paul Glad Mihai,
Kersten Villringer,
Konstantin Thierbach,
Evelyn Medawar,
Daniel Lichterfeld,
Ingo Roeder,
Nico Scherf,
Maria del C. Valdés Hernández
Abstract:
Machine learning algorithms underpin modern diagnostic-aiding software, which has proved valuable in clinical practice, particularly in radiology. However, inaccuracies, mainly due to the limited availability of clinical samples for training these algorithms, hamper their wider applicability, acceptance, and recognition amongst clinicians. We present an analysis of state-of-the-art automatic quality control (QC) approaches that can be implemented within these algorithms to estimate the certainty of their outputs. We validated the most promising approaches on a brain image segmentation task identifying white matter hyperintensities (WMH) in magnetic resonance imaging data. WMH are a correlate of small vessel disease common in mid-to-late adulthood and are particularly challenging to segment due to their varied size and distributional patterns. Our results show that the aggregation of uncertainty and Dice prediction were most effective in failure detection for this task. Both methods independently improved mean Dice from 0.82 to 0.84. Our work reveals how QC methods can help to detect failed segmentation cases and therefore make automatic segmentation more reliable and suitable for clinical practice.
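For readers unfamiliar with the Dice score reported in this abstract, the following is a minimal sketch of the standard Dice coefficient for two binary masks given as flat sequences. The function name `dice_coefficient`, the example masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration; the abstract does not describe the authors' implementation.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat sequences.

    Dice = 2 * |pred AND truth| / (|pred| + |truth|).
    """
    pred = [bool(v) for v in pred]
    truth = [bool(v) for v in truth]
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p and t for p, t in zip(pred, truth))
    size = sum(pred) + sum(truth)
    if size == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / size


# Hypothetical 4-voxel masks: one voxel is shared by both segmentations.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```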
Submitted 19 December, 2022; v1 submitted 6 December, 2021;
originally announced December 2021.
-
Graph Neural Networks Including Sparse Interpretability
Authors:
Chris Lin,
Gerald J. Sun,
Krishna C. Bulusu,
Jonathan R. Dry,
Marylens Hernandez
Abstract:
Graph Neural Networks (GNNs) are versatile, powerful machine learning methods that enable graph structure and feature representation learning, and have applications across many domains. For applications critically requiring interpretation, attention-based GNNs have been leveraged. However, these approaches either rely on specific model architectures or lack a joint consideration of graph structure and node features in their interpretation. Here we present a model-agnostic framework for interpreting important graph structure and node features, Graph neural networks Including SparSe inTerpretability (GISST). With any GNN model, GISST combines an attention mechanism and sparsity regularization to yield an important subgraph and node feature subset related to any graph-based task. Through a single self-attention layer, a GISST model learns an importance probability for each node feature and edge in the input graph. By including these importance probabilities in the model loss function, the probabilities are optimized end-to-end and tied to the task-specific performance. Furthermore, GISST sparsifies these importance probabilities with entropy and L1 regularization to reduce noise in the input graph topology and node features. Our GISST models achieve superior node feature and edge explanation precision in synthetic datasets, as compared to alternative interpretation approaches. Moreover, our GISST models are able to identify important graph structure in real-world datasets. We demonstrate in theory that edge feature importance and multiple edge types can be considered by incorporating them into the GISST edge probability computation. By jointly accounting for topology, node features, and edge features, GISST inherently provides simple and relevant interpretations for any GNN models and tasks.
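The sparsification step mentioned in this abstract can be sketched as a penalty added to a task loss. The exact form used by GISST is not given here, so everything below is an assumption for illustration: the function name `sparsity_penalty`, the use of the mean, the element-wise binary entropy, and the two weights.

```python
import math


def sparsity_penalty(probs, l1_weight=1.0, entropy_weight=1.0, eps=1e-12):
    """Illustrative L1 + entropy penalty on importance probabilities.

    Assumed form (not taken from the paper): the mean absolute value pushes
    probabilities towards zero, and the mean binary entropy pushes each
    probability towards 0 or 1.
    """
    if not probs:
        raise ValueError("probs must be non-empty")
    l1 = sum(abs(p) for p in probs) / len(probs)
    entropy = 0.0
    for p in probs:
        # Clip to avoid log(0) at the boundaries.
        q = min(max(p, eps), 1.0 - eps)
        entropy += -q * math.log(q) - (1.0 - q) * math.log(1.0 - q)
    entropy /= len(probs)
    return l1_weight * l1 + entropy_weight * entropy


# Hypothetical edge importance probabilities: confident versus undecided.
confident = sparsity_penalty([0.0, 1.0])
undecided = sparsity_penalty([0.5, 0.5])
```

Under these assumptions the undecided probabilities incur the larger penalty, which is the behaviour a sparsifying regularizer is meant to have.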
Submitted 30 June, 2020;
originally announced July 2020.
-
Multi-rater delta: extending the delta nominal measure of agreement between two raters to many raters
Authors:
A. Martín Andrés,
M. Álvarez Hernández
Abstract:
The need to measure the degree of agreement among R raters who independently classify n subjects within K nominal categories is frequent in many scientific areas. The most popular measures are Cohen's kappa (R = 2), Fleiss' kappa, Conger's kappa and Hubert's kappa (R $\geq$ 2) coefficients, which have several defects. In 2004, the delta coefficient was defined for the case of R = 2, which did not have the defects of Cohen's kappa coefficient. This article extends the delta coefficient from R = 2 raters to R $\geq$ 2. The multi-rater delta coefficient has the same advantages as the delta coefficient with regard to the kappa-type coefficients: i) it is intuitive and easy to interpret, because it refers to the proportion of replies that are concordant and non-random; ii) the summands which give its value allow the degree of agreement in each category to be measured accurately, with no need to be collapsed; and iii) it is not affected by the marginal imbalance.
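As background for the comparison drawn in this abstract, the following is a minimal sketch of Cohen's kappa for R = 2 raters, the baseline that the delta coefficient is contrasted with. It does not implement delta or multi-rater delta, whose formulas are not given here. The function name `cohens_kappa` and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters classifying the same subjects.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from the raters' marginals.
    """
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = set(counts1) | set(counts2)
    expected = sum(counts1[k] * counts2[k] for k in categories) / n**2
    if expected == 1.0:
        # Both raters used a single identical category: kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of four subjects into categories "a" and "b".
kappa = cohens_kappa(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Because the chance term depends on the raters' marginal frequencies, kappa changes when the marginals are imbalanced, which is the sensitivity the abstract says delta avoids.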
Submitted 29 January, 2022; v1 submitted 12 September, 2019;
originally announced September 2019.
-
Efficient Gauss-Newton-Krylov momentum conservation constrained PDE-LDDMM using the band-limited vector field parameterization
Authors:
Monica Hernandez
Abstract:
The class of non-rigid registration methods proposed in the framework of PDE-constrained Large Deformation Diffeomorphic Metric Mapping is a particularly interesting family of physically meaningful diffeomorphic registration methods. PDE-constrained LDDMM methods are formulated as constrained variational problems, where the different physical models are imposed using the associated partial differential equations as hard constraints. Inexact Newton-Krylov optimization has shown an excellent numerical accuracy and an extraordinarily fast convergence rate in this framework. However, the Galerkin representation of the non-stationary velocity fields does not provide proper geodesic paths. In a previous work, we proposed a method for PDE-constrained LDDMM parameterized in the space of initial velocity fields under the EPDiff equation. The proposed method provided geodesics in the framework of PDE-constrained LDDMM, and it showed performance competitive to benchmark PDE-constrained LDDMM and EPDiff-LDDMM methods. However, the major drawback of this method was the large memory load inherent to PDE-constrained LDDMM methods and the increased computational time with respect to the benchmark methods. In this work we optimize the computational complexity of the method using the band-limited vector field parameterization, closing the loop with our previous works.
Submitted 27 July, 2018;
originally announced July 2018.