-
Data Science Principles for Interpretable and Explainable AI
Authors:
Kris Sankaran
Abstract:
Society's capacity for algorithmic problem-solving has never been greater. Artificial Intelligence is now applied across more domains than ever, a consequence of powerful abstractions, abundant data, and accessible software. As capabilities have expanded, so have risks, with models often deployed without fully understanding their potential impacts. Interpretable and interactive machine learning aims to make complex models more transparent and controllable, enhancing user agency. This review synthesizes key principles from the growing literature in this field.
We first introduce precise vocabulary for discussing interpretability, like the distinction between glass box and explainable algorithms. We then explore connections to classical statistical and design principles, like parsimony and the gulfs of interaction. Basic explainability techniques -- including learned embeddings, integrated gradients, and concept bottlenecks -- are illustrated with a simple case study. We also review criteria for objectively evaluating interpretability approaches. Throughout, we underscore the importance of considering audience goals when designing interactive algorithmic systems. Finally, we outline open challenges and discuss the potential role of data science in addressing them. Code to reproduce all examples can be found at https://go.wisc.edu/3k1ewe.
Submitted 17 May, 2024;
originally announced May 2024.
-
Lazard-style CAD and Equational Constraints
Authors:
James H. Davenport,
Akshar S. Nair,
Gregory K. Sankaran,
Ali K. Uncu
Abstract:
McCallum-style Cylindrical Algebraic Decomposition (CAD) is a major improvement on the original Collins version, and has had many subsequent advances, notably for total or partial equational constraints. But it suffers from a problem with nullification. The recently-justified Lazard-style CAD does not have this problem. However, transporting the equational constraints work to Lazard-style does reintroduce nullification issues. This paper explains the problem, and the solutions to it, based on the second author's Ph.D. thesis and the Brown--McCallum improvement to Lazard.
With a single equational constraint, we can gain the same improvements in Lazard-style as in McCallum-style CAD. Moreover, our approach does not fail where McCallum would due to nullification. Unsurprisingly, it does not achieve the same level of improvement as it does in the non-nullified cases. We also consider the case of multiple equational constraints.
Submitted 7 December, 2023; v1 submitted 11 February, 2023;
originally announced February 2023.
-
Spatial Transcriptomics Dimensionality Reduction using Wavelet Bases
Authors:
Zhuoyan Xu,
Kris Sankaran
Abstract:
Spatially resolved transcriptomics (ST) measures gene expression along with the spatial coordinates of the measurements. The analysis of ST data involves significant computational complexity. In this work, we propose a gene expression dimensionality reduction algorithm that retains spatial structure. We combine the wavelet transformation with matrix factorization to select spatially-varying genes. We extract a low-dimensional representation of these genes. We consider an Empirical Bayes setting, imposing regularization through the prior distribution of factor genes. Additionally, we provide visualizations of the extracted representation genes that capture the global spatial pattern. We illustrate the performance of our methods through spatial structure recovery and gene expression reconstruction in simulation. In real data experiments, our method identifies the spatial structure of gene factors and outperforms regular decomposition in terms of reconstruction error. We find a connection between the fluctuation of gene patterns and the wavelet technique, which provides smoother visualization. We develop a package and share the workflow for generating reproducible quantitative results and gene visualizations. The package is available at https://github.com/OliverXUZY/waveST.
Submitted 19 May, 2022;
originally announced May 2022.
-
Sampling Strategy for Fine-Tuning Segmentation Models to Crisis Area under Scarcity of Data
Authors:
Adrianna Janik,
Kris Sankaran
Abstract:
The use of remote sensing in humanitarian crisis response missions is well-established and has proven relevant repeatedly. One of the problems is obtaining gold annotations, which is costly and time-consuming and makes it almost impossible to fine-tune models to new regions affected by a crisis. Where time is critical, resources are limited, and the environment is constantly changing, models have to evolve and provide flexible ways to adapt to a new situation. The question we want to answer is whether prioritization of samples provides better results in fine-tuning than other classical sampling methods under annotated data scarcity. We propose a method to guide data collection during fine-tuning, based on estimated model and sample properties, like predicted IOU score. We propose two formulas for calculating sample priority. Our approach blends techniques from interpretability, representation learning and active learning. We have applied our method to a deep learning model for semantic segmentation, U-Net, in a remote sensing application of building detection, one of the core use cases of remote sensing in humanitarian applications. Preliminary results show the utility of prioritizing samples when tuning semantic segmentation models under data scarcity.
Submitted 9 February, 2022;
originally announced February 2022.
-
Discovering Concepts in Learned Representations using Statistical Inference and Interactive Visualization
Authors:
Adrianna Janik,
Kris Sankaran
Abstract:
Concept discovery is one of the open problems in the interpretability literature that is important for bridging the gap between non-deep learning experts and model end-users. Among current formulations, one defines a concept as a direction in a learned representation space. This definition makes it possible to evaluate whether a particular concept significantly influences classification decisions for classes of interest. However, finding relevant concepts is tedious, as representation spaces are high-dimensional and hard to navigate. Current approaches include hand-crafting concept datasets and then converting them to latent space directions; alternatively, the process can be automated by clustering the latent space. In this study, we offer two further approaches to guide user discovery of meaningful concepts, one based on multiple hypothesis testing, and another on interactive visualization. We explore the potential value and limitations of these approaches through simulation experiments and a demo visual interface to real data. Overall, we find that these techniques offer a promising strategy for discovering relevant concepts in settings where users do not have predefined descriptions of them, but without completely automating the process.
Submitted 9 February, 2022;
originally announced February 2022.
-
Source data selection for out-of-domain generalization
Authors:
Xinran Miao,
Kris Sankaran
Abstract:
Models that perform out-of-domain generalization borrow knowledge from heterogeneous source data and apply it to a related but distinct target task. Transfer learning has proven effective for accomplishing this generalization in many applications. However, poor selection of a source dataset can lead to poor performance on the target, a phenomenon called negative transfer. In order to take full advantage of available source data, this work studies source data selection with respect to a target task. We propose two source selection methods that are based on the multi-bandit theory and random search, respectively. We conduct a thorough empirical evaluation on both simulated and real data. Our proposals can also be viewed as diagnostics for the existence of reweighted source subsamples that perform better than a random selection of available samples.
Submitted 4 February, 2022;
originally announced February 2022.
-
Interactive Visualization and Representation Analysis Applied to Glacier Segmentation
Authors:
Minxing Zheng,
Xinran Miao,
Kris Sankaran
Abstract:
Interpretability has attracted increasing attention in earth observation problems. We apply interactive visualization and representation analysis to guide interpretation of glacier segmentation models. We visualize the activations from a U-Net to understand and evaluate the model performance. We build an online interface using the Shiny R package to provide comprehensive error analysis of the predictions. Users can interact with the panels and discover model failure modes. Further, we discuss how visualization can provide sanity checks during data preprocessing and model training.
Submitted 11 March, 2022; v1 submitted 11 December, 2021;
originally announced December 2021.
-
A System-Level Voltage/Frequency Scaling Characterization Framework for Multicore CPUs
Authors:
George Papadimitriou,
Manolis Kaliorakis,
Athanasios Chatzidimitriou,
Dimitris Gizopoulos,
Greg Favor,
Kumar Sankaran,
Shidhartha Das
Abstract:
Supply voltage scaling is one of the most effective techniques to reduce the power consumption of microprocessors. However, technology limitations such as aging and process variability force microprocessor designers to apply pessimistic voltage guardbands to guarantee correct operation in the field for any foreseeable workload. This worst-case design practice makes energy efficiency hard to scale with technology evolution. Improving energy efficiency requires the identification of the chip design margins through time-consuming and comprehensive characterization of its operational limits. Such a characterization of state-of-the-art multi-core CPUs fabricated in aggressive technologies is a multi-parameter process, which requires statistically significant information. In this paper, we present an automated framework to support system-level voltage and frequency scaling characterization of Applied Micro's state-of-the-art ARMv8-based multicore CPUs used in the X-Gene 2 micro-server family. The fully automated framework can provide fine-grained information about the system's state by monitoring any abnormal behavior that may occur during reduced supply voltage conditions. We also propose a new metric to quantify the behavior of a microprocessor when it operates beyond nominal conditions. Our experimental results demonstrate potential uses of the characterization framework to identify the limits of operation for improved energy efficiency.
Submitted 18 June, 2021;
originally announced June 2021.
-
Interpretability of a Deep Learning Model in the Application of Cardiac MRI Segmentation with an ACDC Challenge Dataset
Authors:
Adrianna Janik,
Jonathan Dodd,
Georgiana Ifrim,
Kris Sankaran,
Kathleen Curran
Abstract:
Cardiac Magnetic Resonance (CMR) is the most effective tool for the assessment and diagnosis of the condition of the heart, whose malfunction is the world's leading cause of death. Software tools leveraging Artificial Intelligence already assist radiologists and cardiologists in heart condition assessment, but their lack of transparency is a problem. This project investigates whether it is possible to discover concepts representative of different cardiac conditions from a deep network trained to segment cardiac structures: Left Ventricle (LV), Right Ventricle (RV) and Myocardium (MYO), using explainability methods that enhance the classification system by providing the score-based values of qualitative concepts, along with the key performance metrics. With the introduction of a requirement for explanations in GDPR, explainability of AI systems is necessary. This study applies Discovering and Testing with Concept Activation Vectors (D-TCAV), an interpretability method, to extract underlying features important for cardiac disease diagnosis from MRI data. The method provides a quantitative notion of concept importance for the classified disease. In previous studies, the base method is applied to the classification of cardiac disease and provides clinically meaningful explanations for the predictions of a black-box deep learning classifier. This study applies a method extending TCAV with a Discovering phase (D-TCAV) to cardiac MRI analysis. The advantage of the D-TCAV method over the base method is that it is user-independent. The contribution of this study is a novel application of the explainability method D-TCAV for cardiac MRI analysis. D-TCAV provides a shorter pre-processing time for clinicians than the base method.
Submitted 15 March, 2021;
originally announced March 2021.
-
Machine Learning for Glacier Monitoring in the Hindu Kush Himalaya
Authors:
Shimaa Baraka,
Benjamin Akera,
Bibek Aryal,
Tenzing Sherpa,
Finu Shresta,
Anthony Ortiz,
Kris Sankaran,
Juan Lavista Ferres,
Mir Matin,
Yoshua Bengio
Abstract:
Glacier mapping is key to ecological monitoring in the HKH region. Climate change poses a risk to individuals whose livelihoods depend on the health of glacier ecosystems. In this work, we present a machine learning based approach to support ecological monitoring, with a focus on glaciers. Our approach is based on semi-automated mapping from satellite images. We utilize readily available remote sensing data to create a model to identify and outline both clean ice and debris-covered glaciers from satellite imagery. We also release data and develop a web tool that allows experts to visualize and correct model predictions, with the ultimate aim of accelerating the glacier mapping process.
Submitted 9 December, 2020;
originally announced December 2020.
-
HighRes-net: Recursive Fusion for Multi-Frame Super-Resolution of Satellite Imagery
Authors:
Michel Deudon,
Alfredo Kalaitzis,
Israel Goytom,
Md Rifat Arefin,
Zhichao Lin,
Kris Sankaran,
Vincent Michalski,
Samira E. Kahou,
Julien Cornebise,
Yoshua Bengio
Abstract:
Generative deep learning has sparked a new wave of Super-Resolution (SR) algorithms that enhance single images with impressive aesthetic results, albeit with imaginary details. Multi-frame Super-Resolution (MFSR) offers a more grounded approach to the ill-posed problem, by conditioning on multiple low-resolution views. This is important for satellite monitoring of human impact on the planet -- from deforestation, to human rights violations -- that depends on reliable imagery. To this end, we present HighRes-net, the first deep learning approach to MFSR that learns its sub-tasks in an end-to-end fashion: (i) co-registration, (ii) fusion, (iii) up-sampling, and (iv) registration-at-the-loss. Co-registration of low-resolution views is learned implicitly through a reference-frame channel, with no explicit registration mechanism. We learn a global fusion operator that is applied recursively on an arbitrary number of low-resolution pairs. We introduce a registered loss, by learning to align the SR output to a ground truth through ShiftNet. We show that by learning deep representations of multiple views, we can super-resolve low-resolution signals and enhance Earth Observation data at scale. Our approach recently topped the European Space Agency's MFSR competition on real-world satellite imagery.
Submitted 15 February, 2020;
originally announced February 2020.
-
Nanoscale Microscopy Images Colorization Using Neural Networks
Authors:
Israel Goytom,
Qin Wang,
Tianxiang Yu,
Kunjie Dai,
Kris Sankaran,
Xinfei Zhou,
Dongdong Lin
Abstract:
Microscopy images are powerful tools, widely used in the majority of research areas, such as biology, chemistry, physics and materials science, and produced by various microscopes (scanning electron microscope (SEM), atomic force microscope (AFM), the optical microscope, etc.). However, most microscopy images are colorless due to the unique imaging mechanism. After investigating some popular recently proposed solutions for colorizing images, we notice that these methods are usually tedious, complicated, and time-consuming. In this paper, inspired by the achievements of machine learning algorithms in different scientific fields, we introduce two artificial neural networks for gray microscopy image colorization: an end-to-end convolutional neural network (CNN) with a pre-trained model for feature extraction, and a pixel-to-pixel neural style transfer convolutional neural network (NST-CNN), which can colorize gray microscopy images with semantic information learned from a user-provided colorful image at inference time. The results demonstrate that our algorithm not only can colorize microscopy images precisely under complex circumstances but also renders colors naturally, with proper hue and saturation, as a result of training on a massive number of natural images.
Submitted 22 February, 2020; v1 submitted 17 December, 2019;
originally announced December 2019.
-
Applying Knowledge Transfer for Water Body Segmentation in Peru
Authors:
Jessenia Gonzalez,
Debjani Bhowmick,
Cesar Beltran,
Kris Sankaran,
Yoshua Bengio
Abstract:
In this work, we present the application of convolutional neural networks for segmenting water bodies in satellite images. We first use a variant of the U-Net model to segment rivers and lakes from very high-resolution images from Peru. To circumvent the issue of scarce labelled data, we investigate the applicability of a knowledge transfer-based model that learns the mapping from high-resolution labelled images and combines it with the very high-resolution mapping so that better segmentation can be achieved. We train this model in a single process, end-to-end. Our preliminary results show that adding the information from the available high-resolution images does not help out-of-the-box, and in fact worsens results. This leads us to infer that the high-resolution data could be from a different distribution, and its addition leads to increased variance in our results.
Submitted 2 December, 2019;
originally announced December 2019.
-
Tackling Climate Change with Machine Learning
Authors:
David Rolnick,
Priya L. Donti,
Lynn H. Kaack,
Kelly Kochanski,
Alexandre Lacoste,
Kris Sankaran,
Andrew Slavin Ross,
Nikola Milojevic-Dupont,
Natasha Jaques,
Anna Waldman-Brown,
Alexandra Luccioni,
Tegan Maharaj,
Evan D. Sherwin,
S. Karthik Mukkavilli,
Konrad P. Kording,
Carla Gomes,
Andrew Y. Ng,
Demis Hassabis,
John C. Platt,
Felix Creutzig,
Jennifer Chayes,
Yoshua Bengio
Abstract:
Climate change is one of the greatest challenges facing humanity, and we, as machine learning experts, may wonder how we can help. Here we describe how machine learning can be a powerful tool in reducing greenhouse gas emissions and helping society adapt to a changing climate. From smart grids to disaster management, we identify high impact problems where existing gaps can be filled by machine learning, in collaboration with other fields. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the machine learning community to join the global effort against climate change.
Submitted 5 November, 2019; v1 submitted 10 June, 2019;
originally announced June 2019.
-
Hierarchical Importance Weighted Autoencoders
Authors:
Chin-Wei Huang,
Kris Sankaran,
Eeshan Dhekane,
Alexandre Lacoste,
Aaron Courville
Abstract:
Importance weighted variational inference (Burda et al., 2015) uses multiple i.i.d. samples to have a tighter variational lower bound. We believe a joint proposal has the potential of reducing the number of redundant samples, and introduce a hierarchical structure to induce correlation. The hope is that the proposals would coordinate to make up for the error made by one another to reduce the variance of the importance estimator. Theoretically, we analyze the condition under which convergence of the estimator variance can be connected to convergence of the lower bound. Empirically, we confirm that maximization of the lower bound does implicitly minimize variance. Further analysis shows that this is a result of negative correlation induced by the proposed hierarchical meta sampling scheme, and performance of inference also improves when the number of samples increases.
Submitted 13 May, 2019;
originally announced May 2019.
-
Visualizing the Consequences of Climate Change Using Cycle-Consistent Adversarial Networks
Authors:
Victor Schmidt,
Alexandra Luccioni,
S. Karthik Mukkavilli,
Narmada Balasooriya,
Kris Sankaran,
Jennifer Chayes,
Yoshua Bengio
Abstract:
We present a project that aims to generate images that depict accurate, vivid, and personalized outcomes of climate change using Cycle-Consistent Adversarial Networks (CycleGANs). By training our CycleGAN model on street-view images of houses before and after extreme weather events (e.g. floods, forest fires, etc.), we learn a mapping that can then be applied to images of locations that have not yet experienced these events. This visual transformation is paired with climate model predictions to assess likelihood and type of climate-related events in the long term (50 years) in order to bring the future closer in the viewer's mind. The eventual goal of our project is to enable individuals to make more informed choices about their climate future by creating a more visceral understanding of the effects of climate change, while maintaining scientific credibility by drawing on climate model projections.
Submitted 2 May, 2019;
originally announced May 2019.
-
Regular cylindrical algebraic decomposition
Authors:
J. H. Davenport,
A. F. Locatelli,
G. K. Sankaran
Abstract:
We show that a strong well-based cylindrical algebraic decomposition P of a bounded semi-algebraic set is a regular cell decomposition, in any dimension and independently of the method by which P is constructed. Being well-based is a global condition on P that holds for the output of many widely used algorithms. We also show the same for S of dimension at most 3 and P a strong cylindrical algebraic decomposition that is locally boundary simply connected: this is a purely local extra condition.
Submitted 11 March, 2018;
originally announced March 2018.