-
Capturing and Interpreting Unique Information
Authors:
Praveen Venkatesh,
Keerthana Gurushankar,
Gabriel Schamberg
Abstract:
Partial information decompositions (PIDs), which quantify information interactions between three or more variables in terms of uniqueness, redundancy and synergy, are gaining traction in many application domains. However, our understanding of the operational interpretations of PIDs is still incomplete for many popular PID definitions. In this paper, we discuss the operational interpretations of unique information through the lens of two well-known PID definitions. We reexamine an interpretation from statistical decision theory showing how unique information upper bounds the risk in a decision problem. We then explore a new connection between the two PIDs, which allows us to develop an informal but appealing interpretation, and generalize the PID definitions using a common Lagrangian formulation. Finally, we provide a new PID definition that is able to capture the information that is unique. We also show that it has a straightforward interpretation and examine its properties.
Submitted 23 February, 2023;
originally announced February 2023.
-
Unscented Kalman Filter for Long-Distance Vessel Tracking in Geodetic Coordinates
Authors:
Blake Cole,
Gabriel Schamberg
Abstract:
This paper describes a novel tracking filter, designed primarily for use in collision avoidance systems on autonomous surface vehicles (ASVs). The proposed methodology leverages real-time kinematic information broadcast via the Automatic Identification System (AIS) messaging protocol, in order to estimate the position, speed, and heading of nearby cooperative targets. The state of each target is recursively estimated in geodetic coordinates using an unscented Kalman filter (UKF) with kinematic equations derived from the spherical law of cosines. This improves upon previous approaches, many of which employ the extended Kalman filter (EKF), and thus require the specification of a local planar coordinate frame, in order to describe the state kinematics in an easily differentiable form. The proposed geodetic UKF obviates the need for this local plane. This feature is particularly advantageous for long-range ASVs, which must otherwise periodically redefine a new local plane to curtail linearization error. In real-world operations, this recurring redefinition can introduce error and complicate mission planning. It is shown through both simulation and field testing that the proposed geodetic UKF performs as well as, or better than, the traditional plane-Cartesian EKF, both in terms of estimation error and stability.
Submitted 25 November, 2021;
originally announced November 2021.
-
Partial Information Decomposition via Deficiency for Multivariate Gaussians
Authors:
Praveen Venkatesh,
Gabriel Schamberg
Abstract:
Bivariate partial information decompositions (PIDs) characterize how the information in a "message" random variable is decomposed between two "constituent" random variables in terms of unique, redundant and synergistic information components. These components are a function of the joint distribution of the three variables, and are typically defined using an optimization over the space of all possible joint distributions. This makes it computationally challenging to compute PIDs in practice and restricts their use to low-dimensional random vectors. To ease this burden, we consider the case of jointly Gaussian random vectors in this paper. This case was previously examined by Barrett (2015), who showed that certain operationally well-motivated PIDs reduce to a closed form expression for scalar messages. Here, we show that Barrett's result does not extend to vector messages in general, and characterize the set of multivariate Gaussian distributions that reduce to closed-form. Then, for all other multivariate Gaussian distributions, we propose a convex optimization framework for approximately computing a specific PID definition based on the statistical concept of deficiency. Using simplifying assumptions specific to the Gaussian case, we provide an efficient algorithm to approximately compute the bivariate PID for multivariate Gaussian variables with tens or even hundreds of dimensions. We also theoretically and empirically justify the goodness of this approximation.
Submitted 28 November, 2022; v1 submitted 3 May, 2021;
originally announced May 2021.
-
Inferring neural dynamics during burst suppression using a neurophysiology-inspired switching state-space model
Authors:
Gabriel Schamberg,
Sourish Chakravarty,
Taylor E. Baum,
Emery N. Brown
Abstract:
Burst suppression is an electroencephalography (EEG) pattern associated with profoundly inactivated brain states characterized by cerebral metabolic depression. Its distinctive feature is alternation between short temporal segments of near-isoelectric inactivity (suppressions) and relatively high-voltage activity (bursts). Prior modeling studies suggest that burst-suppression EEG is a manifestation of two alternating brain states associated with consumption (during a burst) and production (during a suppression) of adenosine triphosphate (ATP). This finding motivates us to infer latent states characterizing alternating brain states and underlying ATP kinetics from instantaneous power of multichannel EEG using a switching state-space model. Our model assumes Gaussian distributed data as a broadcast network manifestation of one of two global brain states. The two brain states are allowed to stochastically alternate with transition probabilities that depend on the instantaneous ATP level, which evolves according to first-order kinetics. The rate constants governing the ATP kinetics are allowed to vary as first-order autoregressive processes. Our latent state estimates are determined from data using a sequential Monte Carlo algorithm. Our neurophysiology-informed model not only provides unsupervised segmentation of multi-channel burst-suppression EEG but can also generate additional insights on the level of brain inactivation during anesthesia.
Submitted 3 December, 2020;
originally announced December 2020.
-
Controlling Level of Unconsciousness by Titrating Propofol with Deep Reinforcement Learning
Authors:
Gabe Schamberg,
Marcus Badgeley,
Emery N. Brown
Abstract:
Reinforcement Learning (RL) can be used to fit a mapping from patient state to a medication regimen. Prior studies have used deterministic and value-based tabular learning to learn a propofol dose from an observed anesthetic state. Deep RL replaces the table with a deep neural network and has been used to learn medication regimens from registry databases. Here we perform the first application of deep RL to closed-loop control of anesthetic dosing in a simulated environment. We use the cross-entropy method to train a deep neural network to map an observed anesthetic state to a probability of infusing a fixed propofol dosage. During testing, we implement a deterministic policy that transforms the probability of infusion to a continuous infusion rate. The model is trained and tested on simulated pharmacokinetic/pharmacodynamic models with randomized parameters to ensure robustness to patient variability. The deep RL agent significantly outperformed a proportional-integral-derivative controller (median absolute performance error 1.7% +/- 0.6 and 3.4% +/- 1.2). Modeling continuous input variables instead of a table affords more robust pattern recognition and utilizes our prior domain knowledge. Deep RL learned a smooth policy with a natural interpretation to data scientists and anesthesia care providers alike.
Submitted 9 September, 2020; v1 submitted 27 August, 2020;
originally announced August 2020.
-
Direct and Indirect Effects -- An Information Theoretic Perspective
Authors:
Gabriel Schamberg,
William Chapman,
Shang-Ping Xie,
Todd P. Coleman
Abstract:
Information theoretic (IT) approaches to quantifying causal influences have experienced some popularity in the literature, in both theoretical and applied (e.g. neuroscience and climate science) domains. While these causal measures are desirable in that they are model agnostic and can capture non-linear interactions, they are fundamentally different from common statistical notions of causal influence in that they (1) compare distributions over the effect rather than values of the effect and (2) are defined with respect to random variables representing a cause rather than specific values of a cause. We here present IT measures of direct, indirect, and total causal effects. The proposed measures are unlike existing IT techniques in that they enable measuring causal effects that are defined with respect to specific values of a cause while still offering the flexibility and general applicability of IT techniques. We provide an identifiability result and demonstrate application of the proposed measures in estimating the causal effect of the El Niño-Southern Oscillation on temperature anomalies in the North American Pacific Northwest.
Submitted 28 July, 2020; v1 submitted 22 December, 2019;
originally announced December 2019.
-
On the Bias of Directed Information Estimators
Authors:
Gabriel Schamberg,
Todd P. Coleman
Abstract:
When estimating the directed information between two jointly stationary Markov processes, it is typically assumed that the recipient of the directed information is itself Markov of the same order as the joint process. While this assumption is often made explicit in the presentation of such estimators, a characterization of when we can expect the assumption to hold is lacking. Using the concept of d-separation from Bayesian networks, we present sufficient conditions for which this assumption holds. We further show that the set of parameters for which the condition is not also necessary has Lebesgue measure zero. Given the strictness of these conditions, we introduce a notion of partial directed information, which can be used to bound the bias of directed information estimates when the directed information recipient is not itself Markov. Lastly we estimate this bound on simulations in a variety of settings to assess the extent to which the bias should be cause for concern.
Submitted 30 April, 2019; v1 submitted 1 February, 2019;
originally announced February 2019.
-
Measuring Sample Path Causal Influences with Relative Entropy
Authors:
Gabriel Schamberg,
Todd P. Coleman
Abstract:
We present a sample path dependent measure of causal influence between time series. The proposed causal measure is a random sequence, a realization of which enables identification of specific patterns that give rise to high levels of causal influence. We show that these patterns cannot be identified by existing measures such as directed information (DI). We demonstrate how sequential prediction theory may be leveraged to estimate the proposed causal measure and introduce a notion of regret for assessing the performance of such estimators. We prove a finite sample bound on this regret that is determined by the worst case regret of the sequential predictors used in the estimator. Justification for the proposed measure is provided through a series of examples, simulations, and application to stock market data. Within the context of estimating DI, we show that, because joint Markovicity of a pair of processes does not imply the marginal Markovicity of individual processes, commonly used plug-in estimators of DI will be biased for a large subset of jointly Markov processes. We introduce a notion of DI with "stale history", which can be combined with a plug-in estimator to upper and lower bound the DI when marginal Markovicity does not hold.
Submitted 30 July, 2019; v1 submitted 11 October, 2018;
originally announced October 2018.
-
A Sample Path Measure of Causal Influence
Authors:
Gabriel Schamberg,
Todd P. Coleman
Abstract:
We present a sample path dependent measure of causal influence between two time series. The proposed measure is a random variable whose expected sum is the directed information. A realization of the proposed measure may be used to identify the specific patterns in the data that yield a greater flow of information from one process to another, even in stationary processes. We demonstrate how sequential prediction theory may be leveraged to obtain accurate estimates of the causal measure at each point in time and introduce a notion of regret for assessing the performance of estimators of the measure. We prove a finite sample bound on this regret that is determined by the regret of the sequential predictors used in obtaining the estimate. We estimate the causal measure for a simulated collection of binary Markov processes using a Bayesian updating approach. Finally, given that the measure is a function of time, we demonstrate how estimators of the causal measure may be extended to effectively capture causality in time-varying scenarios.
Submitted 8 May, 2018;
originally announced May 2018.
-
A Modularized Efficient Framework for Non-Markov Time Series Estimation
Authors:
Gabriel Schamberg,
Demba Ba,
Todd P. Coleman
Abstract:
We present a compartmentalized approach to finding the maximum a-posteriori (MAP) estimate of a latent time series that obeys a dynamic stochastic model and is observed through noisy measurements. We specifically consider modern signal processing problems with non-Markov signal dynamics (e.g. group sparsity) and/or non-Gaussian measurement models (e.g. point process observation models used in neuroscience). Through the use of auxiliary variables in the MAP estimation problem, we show that a consensus formulation of the alternating direction method of multipliers (ADMM) enables iteratively computing separate estimates based on the likelihood and prior and subsequently "averaging" them in an appropriate sense using a Kalman smoother. As such, this can be applied to a broad class of problem settings and only requires modular adjustments when interchanging various aspects of the statistical model. Under broad log-concavity assumptions, we show that the separate estimation problems are convex optimization problems and that the iterative algorithm converges to the MAP estimate. As such, this framework can capture non-Markov latent time series models and non-Gaussian measurement models. We provide example applications involving (i) group-sparsity priors, within the context of electrophysiologic spectrotemporal estimation, and (ii) non-Gaussian measurement models, within the context of dynamic analyses of learning with neural spiking and behavioral observations.
Submitted 7 May, 2018; v1 submitted 14 June, 2017;
originally announced June 2017.