-
Representation Bayesian Risk Decompositions and Multi-Source Domain Adaptation
Authors:
Xi Wu,
Yang Guo,
Jiefeng Chen,
Yingyu Liang,
Somesh Jha,
Prasad Chalasani
Abstract:
We consider representation learning (hypothesis class $\mathcal{H} = \mathcal{F}\circ\mathcal{G}$) where training and test distributions can be different. Recent studies provide hints and failure examples for domain invariant representation learning, a common approach for this problem, but the explanations provided are somewhat different and do not provide a unified picture. In this paper, we provide new decompositions of risk which give finer-grained explanations and clarify potential generalization issues. For Single-Source Domain Adaptation, we give an exact decomposition (an equality) of the target risk, via a natural hybrid argument, as a sum of three factors: (1) source risk, (2) representation conditional label divergence, and (3) representation covariate shift. We derive a similar decomposition for the Multi-Source case. These decompositions reveal factors (2) and (3) as the precise reasons for failure to generalize. For example, we demonstrate that domain adversarial neural networks (DANN) attempt to regularize for (3) but miss (2), while a recent technique, Invariant Risk Minimization (IRM), attempts to account for (2) but does not consider (3). We also verify our observations experimentally.
Submitted 3 June, 2020; v1 submitted 22 April, 2020;
originally announced April 2020.
-
CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods
Authors:
Wei Zhang,
Thomas Kobber Panum,
Somesh Jha,
Prasad Chalasani,
David Page
Abstract:
We study the problem of learning Granger causality between event types from asynchronous, interdependent, multi-type event sequences. Existing work suffers from either limited model flexibility or poor model explainability and thus fails to uncover Granger causality across a wide variety of event sequences with diverse event interdependency. To address these weaknesses, we propose CAUSE (Causality from AttribUtions on Sequence of Events), a novel framework for the studied task. The key idea of CAUSE is to first implicitly capture the underlying event interdependency by fitting a neural point process, and then extract from the process a Granger causality statistic using an axiomatic attribution method. Across multiple datasets riddled with diverse event interdependency, we demonstrate that CAUSE achieves superior performance on correctly inferring the inter-type Granger causality over a range of state-of-the-art methods.
Submitted 18 February, 2020;
originally announced February 2020.
-
Hybrid Robot-assisted Frameworks for Endomicroscopy Scanning in Retinal Surgeries
Authors:
Zhaoshuo Li,
Mahya Shahbazi,
Niravkumar Patel,
Eimear O' Sullivan,
Haojie Zhang,
Khushi Vyas,
Preetham Chalasani,
Anton Deguet,
Peter L. Gehlbach,
Iulian Iordachita,
Guang-Zhong Yang,
Russell H. Taylor
Abstract:
High-resolution real-time intraocular imaging of the retina at the cellular level is very challenging due to the vulnerable and confined space within the eyeball as well as the limited availability of appropriate modalities. A probe-based confocal laser endomicroscopy (pCLE) system can be a potential imaging modality for improved diagnosis. The ability to visualize the retina at the cellular level could provide information that may predict surgical outcomes. The adoption of intraocular pCLE scanning is currently limited due to the narrow field of view and the micron-scale range of focus. In the absence of motion compensation, physiological tremors of the surgeon's hand and patient movements also contribute to the deterioration of the image quality.
Therefore, an image-based hybrid control strategy is proposed to mitigate the above challenges. The proposed hybrid control strategy enables shared control of the pCLE probe between surgeons and robots to scan the retina precisely, without hand tremors and with the advantages of an image-based auto-focus algorithm that optimizes the quality of pCLE images. The hybrid control strategy is deployed on two frameworks: cooperative and teleoperated. Better image quality, smoother motion, and reduced workload are all achieved in a statistically significant manner with the hybrid control frameworks.
Submitted 8 April, 2020; v1 submitted 15 September, 2019;
originally announced September 2019.
-
Concise Explanations of Neural Networks using Adversarial Training
Authors:
Prasad Chalasani,
Jiefeng Chen,
Amrita Roy Chowdhury,
Somesh Jha,
Xi Wu
Abstract:
We show new connections between adversarial learning and explainability for deep neural networks (DNNs). One form of explanation of the output of a neural network model in terms of its input features is a vector of feature-attributions. Two desirable characteristics of an attribution-based explanation are: (1) $\textit{sparseness}$: the attributions of irrelevant or weakly relevant features should be negligible, thus resulting in $\textit{concise}$ explanations in terms of the significant features, and (2) $\textit{stability}$: it should not vary significantly within a small local neighborhood of the input. Our first contribution is a theoretical exploration of how these two properties (when using attributions based on Integrated Gradients, or IG) are related to adversarial training, for a class of 1-layer networks (which includes logistic regression models for binary and multi-class classification); for these networks we show that (a) adversarial training using an $\ell_\infty$-bounded adversary produces models with sparse attribution vectors, and (b) natural model-training while encouraging stable explanations (via an extra term in the loss function) is equivalent to adversarial training. Our second contribution is an empirical verification of phenomenon (a), which we show, somewhat surprisingly, occurs $\textit{not only in 1-layer networks, but also DNNs trained on standard image datasets}$, and extends beyond IG-based attributions to those based on DeepSHAP: adversarial training with $\ell_\infty$-bounded perturbations yields significantly sparser attribution vectors, with little degradation in performance on natural test data, compared to natural training. Moreover, the sparseness of the attribution vectors is significantly better than that achievable via $\ell_1$-regularized natural training.
Submitted 4 July, 2020; v1 submitted 15 October, 2018;
originally announced October 2018.
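As a rough illustration of the Integrated Gradients (IG) attributions mentioned in the abstract above, the following sketch approximates IG for a binary logistic regression model with a midpoint Riemann sum along the straight path from a baseline to the input. It is not the authors' code: the function names, weights, and inputs are hypothetical, and only the standard IG definition is assumed.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Output of a 1-layer logistic model f(x) = sigmoid(w . x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def integrated_gradients(w, b, x, baseline, steps=1000):
    """Approximate IG attributions of f at x relative to `baseline`.

    IG_i = (x_i - baseline_i) * integral over alpha in [0, 1] of
           df/dx_i evaluated at baseline + alpha * (x - baseline).
    The integral is approximated with the midpoint rule.
    """
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    grad_sum = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * di for bi, di in zip(baseline, diff)]
        s = predict(w, b, point)
        # Gradient of sigmoid(w . x + b) with respect to x_i is s * (1 - s) * w_i.
        for i, wi in enumerate(w):
            grad_sum[i] += s * (1.0 - s) * wi
    return [di * g / steps for di, g in zip(diff, grad_sum)]


# Hypothetical model and input; the second feature has zero weight.
w = [2.0, 0.0, -1.0]
b = 0.1
x = [1.0, 5.0, 0.5]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(w, b, x, baseline)
```

In this toy setup the attributions sum (approximately) to `predict(w, b, x) - predict(w, b, baseline)`, and the feature with zero weight receives zero attribution, which is the kind of sparseness the abstract refers to.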
-
An on-line algorithm for improving performance in navigation
Authors:
Avrim Blum,
Prasad Chalasani
Abstract:
Recent papers have shown optimally-competitive on-line strategies for a robot traveling from a point $s$ to a point $t$ in certain unknown geometric environments. We consider the question: Having gained some partial information about the scene on its first trip from $s$ to $t$, can the robot improve its performance on subsequent trips it might make? This is a type of on-line problem where a strategy must exploit $\textit{partial information}$ about the future (e.g., about obstacles that lie ahead). For scenes with axis-parallel rectangular obstacles where the Euclidean distance between $s$ and $t$ is $n$, we present a deterministic algorithm whose $\textit{average}$ trip length after $k$ trips, $k \leq n$, is $O(\sqrt{n/k})$ times the length of the shortest $s$-$t$ path in the scene. We also show that this is the best a deterministic strategy can do. This algorithm can be thought of as performing an optimal tradeoff between search effort and the goodness of the path found. We improve this algorithm so that for $\textit{every}$ $i \leq n$, the robot's $i$th trip length is $O(\sqrt{n/i})$ times the shortest $s$-$t$ path length. A key idea of the paper is that a $\textit{tree}$ structure can be defined in the scene, where the nodes are portions of certain obstacles and the edges are "short" paths from a node to its children. The core of our algorithms is an on-line strategy for traversing this tree optimally.
Submitted 20 September, 1994;
originally announced September 1994.
-
On the minimum latency problem
Authors:
Avrim Blum,
Prasad Chalasani,
Don Coppersmith,
Bill Pulleyblank,
Prabhakar Raghavan,
Madhu Sudan
Abstract:
We are given a set of points $p_1,\ldots , p_n$ and a symmetric distance matrix $(d_{ij})$ giving the distance between $p_i$ and $p_j$. We wish to construct a tour that minimizes $\sum_{i=1}^n \ell(i)$, where $\ell(i)$ is the $\textit{latency}$ of $p_i$, defined to be the distance traveled before first visiting $p_i$. This problem is also known in the literature as the $\textit{deliveryman problem}$ or the $\textit{traveling repairman problem}$. It arises in a number of applications including disk-head scheduling, and turns out to be surprisingly different from the traveling salesman problem in character. We give exact and approximate solutions to a number of cases, including a constant-factor approximation algorithm whenever the distance matrix satisfies the triangle inequality.
Submitted 20 September, 1994;
originally announced September 1994.
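The latency objective defined in the abstract above can be made concrete with a small sketch. The code below computes $\sum_i \ell(i)$ for a given visiting order and finds the best order by brute force on a tiny instance. It is only an illustration of the objective, not the paper's exact or approximation algorithms; the function names and the example points (positions 0, 1, -2, 3 on a line, starting from the first) are hypothetical.

```python
from itertools import permutations


def total_latency(tour, d):
    """Sum of latencies when points are visited in `tour` order.

    The latency of a point is the distance traveled before first
    visiting it, so the starting point has latency 0.
    """
    travelled = 0
    total = 0
    for prev, cur in zip(tour, tour[1:]):
        travelled += d[prev][cur]
        total += travelled
    return total


def path_length(tour, d):
    """Total distance traveled along `tour` (no return to the start)."""
    return sum(d[a][b] for a, b in zip(tour, tour[1:]))


def brute_force_min_latency(d, start=0):
    """Try every visiting order from `start`; feasible only for tiny n."""
    others = [i for i in range(len(d)) if i != start]
    candidates = ((start,) + p for p in permutations(others))
    best = min(candidates, key=lambda t: total_latency(t, d))
    return best, total_latency(best, d)


# Hypothetical instance: four points on a line.
positions = [0, 1, -2, 3]
d = [[abs(a - b) for b in positions] for a in positions]

best_tour, best_total = brute_force_min_latency(d)
# best_tour == (0, 1, 3, 2), best_total == 12
```

On this instance the minimum-latency order `(0, 1, 3, 2)` travels a total distance of 8, whereas the shortest path visiting all points, `(0, 2, 1, 3)`, has length 7 but total latency 14, a small example of how the objective differs from minimizing tour length.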