Search | arXiv e-print repository

arXiv:2311.06373

Partial Information Decomposition for Continuous Variables based on Shared Exclusions: Analytical Formulation and Estimation

Authors: David A. Ehrlich, Kyle Schick-Poland, Abdullah Makkeh, Felix Lanfermann, Patricia Wollstadt, Michael Wibral

Abstract: Describing statistical dependencies is foundational to empirical scientific research. For uncovering intricate and possibly non-linear dependencies between a single target variable and several source variables within a system, a principled and versatile framework can be found in the theory of Partial Information Decomposition (PID). Nevertheless, the majority of existing PID measures are restricted to categorical variables, while many systems of interest in science are continuous. In this paper, we present a novel analytic formulation for continuous redundancy--a generalization of mutual information--drawing inspiration from the concept of shared exclusions in probability space as in the discrete PID definition of $I^\mathrm{sx}_\cap$. Furthermore, we introduce a nearest-neighbor based estimator for continuous PID, and showcase its effectiveness by applying it to a simulated energy management system provided by the Honda Research Institute Europe GmbH. This work bridges the gap between the measure-theoretically postulated existence proofs for a continuous $I^\mathrm{sx}_\cap$ and its practical application to real-world scientific problems.

Submitted 27 March, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

Comments: 32 pages, 15 figures

MSC Class: 94A15

arXiv:2306.02149

A General Framework for Interpretable Neural Learning based on Local Information-Theoretic Goal Functions

Authors: Abdullah Makkeh, Marcel Graetz, Andreas C. Schneider, David A. Ehrlich, Viola Priesemann, Michael Wibral

Abstract: Despite the impressive performance of biological and artificial networks, an intuitive understanding of how their local learning dynamics contribute to network-level task solutions remains a challenge to this date. Efforts to bring learning to a more local scale indeed lead to valuable insights; however, a general constructive approach to describe local learning goals that is both interpretable and adaptable across diverse tasks is still missing. We have previously formulated a local information processing goal that is highly adaptable and interpretable for a model neuron with compartmental structure. Building on recent advances in Partial Information Decomposition (PID), we here derive a corresponding parametric local learning rule, which allows us to introduce 'infomorphic' neural networks. We demonstrate the versatility of these networks to perform tasks from supervised, unsupervised and memory learning. By leveraging the interpretable nature of the PID framework, infomorphic networks represent a valuable tool to advance our understanding of the intricate structure of local learning.

Submitted 30 April, 2024; v1 submitted 3 June, 2023; originally announced June 2023.

Comments: 26 pages, 12 figures

arXiv:2306.00734

From Babel to Boole: The Logical Organization of Information Decompositions

Authors: Aaron J. Gutknecht, Abdullah Makkeh, Michael Wibral

Abstract: The conventional approach to the general Partial Information Decomposition (PID) problem has been redundancy-based: specifying a measure of redundant information between collections of source variables induces a PID via Möbius inversion over the so-called redundancy lattice. Despite the prevalence of this method, there has been ongoing interest in examining the problem through the lens of different base-concepts of information, such as synergy, unique information, or union information. Yet, a comprehensive understanding of the logical organization of these different base-concepts and their associated PIDs remains elusive. In this work, we apply the mereological formulation of PID that we introduced in a recent paper to shed light on this problem. Within the mereological approach, base-concepts can be expressed in terms of conditions phrased in formal logic on the specific parthood relations between the PID components and the different mutual information terms. We set forth a general pattern of these logical conditions of which all PID base-concepts in the literature are special cases and that also reveals novel base-concepts, in particular a concept we call "vulnerable information".

Submitted 25 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: 20 pages, 8 figures

arXiv:2209.10438

A Measure of the Complexity of Neural Representations based on Partial Information Decomposition

Authors: David A. Ehrlich, Andreas C. Schneider, Viola Priesemann, Michael Wibral, Abdullah Makkeh

Abstract: In neural networks, task-relevant information is represented jointly by groups of neurons. However, the specific way in which this mutual information about the classification label is distributed among the individual neurons is not well understood: While parts of it may only be obtainable from specific single neurons, other parts are carried redundantly or synergistically by multiple neurons. We show how Partial Information Decomposition (PID), a recent extension of information theory, can disentangle these different contributions. From this, we introduce the measure of "Representational Complexity", which quantifies the difficulty of accessing information spread across multiple neurons. We show how this complexity is directly computable for smaller layers. For larger layers, we propose subsampling and coarse-graining procedures and prove corresponding bounds on the latter. Empirically, for quantized deep neural networks solving the MNIST and CIFAR10 tasks, we observe that representational complexity decreases both through successive hidden layers and over training, and compare the results to related measures. Overall, we propose representational complexity as a principled and interpretable summary statistic for analyzing the structure and evolution of neural representations and complex systems in general.

Submitted 17 May, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

Comments: 31 pages, 12 figures

MSC Class: 94A15; 68T07

Journal ref: Transactions on Machine Learning Research (TMLR), 05/2023

arXiv:2205.10308

Intrinsic timescales of spiking activity in humans during wakefulness and sleep

Authors: Annika Hagemann, Marcel Stephan Kehl, Jonas Dehning, F. Paul Spitzner, Johannes Niediek, Michael Wibral, Florian Mormann, Viola Priesemann

Abstract: Information processing in the brain requires integration of information over time. Such an integration can be achieved if signals are maintained in the network activity for the required period, as quantified by the intrinsic timescale. While short timescales are considered beneficial for fast responses to stimuli, long timescales facilitate information storage and integration. We quantified intrinsic timescales from spiking activity in the medial temporal lobe of humans. We found extended and highly diverse timescales ranging from tens to hundreds of milliseconds, though with no evidence for differences between subareas. Notably, however, timescales differed between sleep stages and were longest during slow wave sleep. This supports the hypothesis that intrinsic timescales are a central mechanism to tune networks to the requirements of different tasks and cognitive states.

Submitted 20 May, 2022; originally announced May 2022.

Comments: preprint

arXiv:2205.05303

doi 10.1016/j.tins.2022.09.007

Dendritic predictive coding: A theory of cortical computation with spiking neurons

Authors: Fabian A. Mikulasch, Lucas Rudelt, Michael Wibral, Viola Priesemann

Abstract: Top-down feedback in cortex is critical for guiding sensory processing, which has prominently been formalized in the theory of hierarchical predictive coding (hPC). However, experimental evidence for error units, which are central to the theory, is inconclusive, and it remains unclear how hPC can be implemented with spiking neurons. To address this, we connect hPC to existing work on efficient coding in balanced networks with lateral inhibition, and predictive computation at apical dendrites. Together, this work points to an efficient implementation of hPC with spiking neurons, where prediction errors are computed not in separate units, but locally in dendritic compartments. The implied model shows a remarkable correspondence to experimentally observed cortical connectivity patterns, plasticity and dynamics, and at the same time can explain hallmarks of predictive processing, such as mismatch responses, in cortex. We thus propose dendritic predictive coding as one of the main organizational principles of cortex.

Submitted 11 May, 2022; originally announced May 2022.

Comments: 16 pages, 4 figures, 4 boxes

arXiv:2203.10810

Information-theoretic analyses of neural data to minimize the effect of researchers' assumptions in predictive coding studies

Authors: Patricia Wollstadt, Daniel L. Rathbun, W. Martin Usrey, André Moraes Bastos, Michael Lindner, Viola Priesemann, Michael Wibral

Abstract: Studies investigating neural information processing often implicitly ask both which processing strategy out of several alternatives is used and how this strategy is implemented in neural dynamics. Prime examples are studies on predictive coding. These often ask if confirmed predictions about inputs or prediction errors between internal predictions and inputs are passed on in a hierarchical neural system--while at the same time looking for the neural correlates of coding for errors and predictions. If we do not know exactly what a neural system predicts at any given moment, this results in a circular analysis--as has been criticized correctly. To circumvent such circular analysis, we propose to express information processing strategies (such as predictive coding) by local information-theoretic quantities, such that they can be estimated directly from neural data. We demonstrate our approach by investigating two opposing accounts of predictive coding-like processing strategies, where we quantify the building blocks of predictive coding, namely predictability of inputs and transfer of information, by local active information storage and local transfer entropy. We define testable hypotheses on the relationship of both quantities to identify which of the assumed strategies was used. We demonstrate our approach on spiking data from the retinogeniculate synapse of the cat. Applying our local information dynamics framework, we are able to show that the synapse codes for predictable rather than surprising input.
To support our findings, we apply measures from partial information decomposition, which allow us to differentiate whether the transferred information is primarily bottom-up sensory input or information transferred conditionally on the current state of the synapse. Supporting our local information-theoretic results, we find that the synapse preferentially transfers bottom-up information.

Submitted 22 May, 2023; v1 submitted 21 March, 2022; originally announced March 2022.

Comments: 36 pages, 9 figures, 3 tables; add link to analysis code

arXiv:2106.12393

A partial information decomposition for discrete and continuous variables

Authors: Kyle Schick-Poland, Abdullah Makkeh, Aaron J. Gutknecht, Patricia Wollstadt, Anja Sturm, Michael Wibral

Abstract: Conceptually, partial information decomposition (PID) is concerned with separating the information contributions several sources hold about a certain target by decomposing the corresponding joint mutual information into contributions such as synergistic, redundant, or unique information. Despite PID conceptually being defined for any type of random variables, so far, PID could only be quantified for the joint mutual information of discrete systems. Recently, a quantification for PID in continuous settings for two or three source variables was introduced. Nonetheless, no ansatz has managed to both quantify PID for more than three variables and cover general measure-theoretic random variables, such as mixed discrete-continuous, or continuous random variables yet. In this work we will propose an information quantity, defining the terms of a PID, which is well-defined for any number or type of source or target random variable. This proposed quantity is tightly related to a recently developed local shared information quantity for discrete random variables based on the idea of shared exclusions. Further, we prove that this newly proposed information-measure fulfills various desirable properties, such as satisfying a set of local PID axioms, invariance under invertible transformations, differentiability with respect to the underlying probability density, and admitting a target chain rule.

Submitted 24 June, 2021; v1 submitted 23 June, 2021; originally announced June 2021.

Comments: 24 pages, 1 figure, Reason for replacement: Updated funding information

MSC Class: 94A15 (Primary) 28C15; 60B05; 28A50 (Secondary)

arXiv:2105.04187

A Rigorous Information-Theoretic Definition of Redundancy and Relevancy in Feature Selection Based on (Partial) Information Decomposition

Authors: Patricia Wollstadt, Sebastian Schmitt, Michael Wibral

Abstract: Selecting a minimal feature set that is maximally informative about a target variable is a central task in machine learning and statistics. Information theory provides a powerful framework for formulating feature selection algorithms -- yet, a rigorous, information-theoretic definition of feature relevancy, which accounts for feature interactions such as redundant and synergistic contributions, is still missing. We argue that this lack is inherent to classical information theory which does not provide measures to decompose the information a set of variables provides about a target into unique, redundant, and synergistic contributions. Such a decomposition has been introduced only recently by the partial information decomposition (PID) framework. Using PID, we clarify why feature selection is a conceptually difficult problem when approached using information theory and provide a novel definition of feature relevancy and redundancy in PID terms. From this definition, we show that the conditional mutual information (CMI) maximizes relevancy while minimizing redundancy and propose an iterative, CMI-based algorithm for practical feature selection. We demonstrate the power of our CMI-based algorithm in comparison to the unconditional mutual information on benchmark examples and provide corresponding PID estimates to highlight how PID allows quantifying the information contribution of features and their interactions in feature-selection problems.

Submitted 4 May, 2023; v1 submitted 10 May, 2021; originally announced May 2021.

Comments: 44 pages, 12 figures. Reorganization and shortening of manuscript, added Appendix with theoretical guarantees, background information on the algorithm used, and an additional example application on a larger problem. Minor text editing
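
The abstract above describes an iterative selection driven by conditional mutual information. As a rough illustration only, the following is a generic greedy forward selection on discrete data with plug-in entropy estimates. It is not the authors' algorithm, and the names `entropy`, `cmi`, `greedy_cmi_selection`, the stopping threshold `eps`, and the toy data are all assumptions made for this sketch.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(rows, cols):
    """Plug-in Shannon entropy in bits of the joint variable given by `cols`."""
    if not cols:
        return 0.0
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    n = len(rows)
    return -sum(k / n * log2(k / n) for k in counts.values())


def cmi(rows, x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z); `z` is a list of columns."""
    return (entropy(rows, [x] + z) + entropy(rows, [y] + z)
            - entropy(rows, [x, y] + z) - entropy(rows, z))


def greedy_cmi_selection(rows, features, target, eps=1e-12):
    """Repeatedly add the feature with the largest CMI given those already chosen.

    Hypothetical stopping rule: stop once no remaining feature adds more
    than `eps` bits about the target.
    """
    selected, remaining = [], list(features)
    while remaining:
        scores = {f: cmi(rows, f, target, selected) for f in remaining}
        best = max(remaining, key=lambda f: scores[f])
        if scores[best] <= eps:
            break
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed): f0 and f2 are independent fair bits, f1 is a copy of f0,
# and the target in column 3 is 2*f0 + f2. All four outcomes are equiprobable.
rows = [(f0, f0, f2, 2 * f0 + f2) for f0, f2 in product((0, 1), repeat=2)]
selected = greedy_cmi_selection(rows, features=[0, 1, 2], target=3)
print(selected)
```

In this toy case every feature has the same unconditional mutual information with the target, so a ranking by unconditional mutual information cannot tell the redundant copy `f1` apart from the complementary feature `f2`. Conditioning on the already selected `f0` drives the score of `f1` to zero, while `f2` keeps its full bit.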

arXiv:2102.00218

Estimating the Unique Information of Continuous Variables

Authors: Ari Pakman, Amin Nejatbakhsh, Dar Gilboa, Abdullah Makkeh, Luca Mazzucato, Michael Wibral, Elad Schneidman

Abstract: The integration and transfer of information from multiple sources to multiple targets is a core motive of neural systems. The emerging field of partial information decomposition (PID) provides a novel information-theoretic lens into these mechanisms by identifying synergistic, redundant, and unique contributions to the mutual information between one and several variables. While many works have studied aspects of PID for Gaussian and discrete distributions, the case of general continuous distributions is still uncharted territory. In this work we present a method for estimating the unique information in continuous distributions, for the case of one versus two variables. Our method solves the associated optimization problem over the space of distributions with fixed bivariate marginals by combining copula decompositions and techniques developed to optimize variational autoencoders. We obtain excellent agreement with known analytic results for Gaussians, and illustrate the power of our new approach in several brain-inspired neural models. Our method is capable of recovering the effective connectivity of a chaotic network of rate neurons, and uncovers a complex trade-off between redundancy, synergy and unique information in recurrent networks trained to solve a generalized XOR task.

Submitted 26 October, 2021; v1 submitted 30 January, 2021; originally announced February 2021.

Journal ref: NeurIPS 2021

arXiv:2009.05732

doi 10.1038/s41467-020-20699-8

The challenges of containing SARS-CoV-2 via test-trace-and-isolate

Authors: Sebastian Contreras, Jonas Dehning, Matthias Loidolt, F. Paul Spitzner, Jorge H. Urrea-Quintero, Sebastian B. Mohr, Michael Wilczek, Johannes Zierenberg, Michael Wibral, Viola Priesemann

Abstract: Without a cure, vaccine, or proven long-term immunity against SARS-CoV-2, test-trace-and-isolate (TTI) strategies present a promising tool to contain its spread. For any TTI strategy, however, mitigation is challenged by pre- and asymptomatic transmission, TTI-avoiders, and undetected spreaders, who strongly contribute to hidden infection chains. Here, we studied a semi-analytical model and identified two tipping points between controlled and uncontrolled spread: (1) the behavior-driven reproduction number of the hidden chains becomes too large to be compensated by the TTI capabilities, and (2) the number of new infections exceeds the tracing capacity. Both trigger a self-accelerating spread. We investigated how these tipping points depend on challenges like limited cooperation, missing contacts, and imperfect isolation. Our model results suggest that TTI alone is insufficient to contain an otherwise unhindered spread of SARS-CoV-2, implying that complementary measures like social distancing and improved hygiene remain necessary.

Submitted 10 November, 2020; v1 submitted 12 September, 2020; originally announced September 2020.

Journal ref: Nat. Commun 12 (2021) 378

arXiv:2008.09535

doi 10.1098/rspa.2021.0110

Bits and Pieces: Understanding Information Decomposition from Part-whole Relationships and Formal Logic

Authors: Aaron J. Gutknecht, Michael Wibral, Abdullah Makkeh

Abstract: Partial information decomposition (PID) seeks to decompose the multivariate mutual information that a set of source variables contains about a target variable into basic pieces, the so-called "atoms of information". Each atom describes a distinct way in which the sources may contain information about the target. In this paper we show, first, that the entire theory of partial information decomposition can be derived from considerations of elementary parthood relationships between information contributions. This way of approaching the problem has the advantage of directly characterizing the atoms of information, instead of taking an indirect approach via the concept of redundancy. Secondly, we describe several intriguing links between PID and formal logic. In particular, we show how to define a measure of PID based on the information provided by certain statements about source realizations. Furthermore, we show how the mathematical lattice structure underlying PID theory can be translated into an isomorphic structure of logical statements with a particularly simple ordering relation: logical implication. The conclusion to be drawn from these considerations is that there are three isomorphic "worlds" of partial information decomposition, i.e. three equivalent ways to mathematically describe the decomposition of the information carried by a set of sources about a target: the world of parthood relationships, the world of logical statements, and the world of antichains that was utilized by Williams and Beer in their original exposition of PID theory.
We additionally show how the parthood perspective provides a systematic way to answer a type of question that has been much discussed in the PID field: whether a partial information decomposition can be uniquely determined based on concepts other than redundant information.

Submitted 7 March, 2022; v1 submitted 21 August, 2020; originally announced August 2020.

Comments: 25 pages, 16 figures
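
To make concrete what "decomposing the multivariate mutual information" starts from, here is a small sketch that computes the classical mutual information terms for an XOR target with two independent fair source bits. It does not compute any PID atoms and does not implement any measure from the papers listed here. The helper names `entropy` and `mutual_information` and the column layout are assumptions for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(rows, cols):
    """Shannon entropy in bits of the columns `cols`, treating rows as equiprobable."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    n = len(rows)
    return -sum(k / n * log2(k / n) for k in counts.values())


def mutual_information(rows, xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for column lists `xs` and `ys`."""
    return entropy(rows, xs) + entropy(rows, ys) - entropy(rows, xs + ys)


# Columns: 0 = source S1, 1 = source S2, 2 = target T = S1 XOR S2.
rows = [(s1, s2, s1 ^ s2) for s1, s2 in product((0, 1), repeat=2)]

i_s1 = mutual_information(rows, [0], [2])        # I(T;S1)
i_s2 = mutual_information(rows, [1], [2])        # I(T;S2)
i_joint = mutual_information(rows, [0, 1], [2])  # I(T;S1,S2)
print(i_s1, i_s2, i_joint)
```

Each source alone carries no information about the target, while both together carry one full bit. In the two-source setting of Williams and Beer, the atoms must satisfy I(T;S1) = redundant + unique1, I(T;S2) = redundant + unique2, and I(T;S1,S2) = redundant + unique1 + unique2 + synergistic. These are three equations for four unknowns, which is why an additional definition, such as a redundancy measure, is needed to fix the atoms.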

arXiv:2004.01105

doi 10.1126/science.abb9789

Inferring change points in the COVID-19 spreading reveals the effectiveness of interventions

Authors: Jonas Dehning, Johannes Zierenberg, F. Paul Spitzner, Michael Wibral, Joao Pinheiro Neto, Michael Wilczek, Viola Priesemann

Abstract: As COVID-19 is rapidly spreading across the globe, short-term modeling forecasts provide time-critical information for decisions on containment and mitigation strategies. A main challenge for short-term forecasts is the assessment of key epidemiological parameters and how they change when first interventions show an effect. By combining an established epidemiological model with Bayesian inference, we analyze the time dependence of the effective growth rate of new infections. Focusing on the COVID-19 spread in Germany, we detect change points in the effective growth rate that correlate well with the times of publicly announced interventions. Thereby, we can quantify the effect of interventions, and we can incorporate the corresponding change points into forecasts of future scenarios and case numbers. Our code is freely available and can be readily adapted to any country or region.

Submitted 4 May, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

Comments: 23 pages, 11 figures. Our code is freely available and can be readily adapted to any country or region ( https://github.com/Priesemann-Group/covid19_inference_forecast/ )

Journal ref: Science 369, 160 (2020)

arXiv:2002.03356

doi 10.1103/PhysRevE.103.032149

Introducing a differentiable measure of pointwise shared information

Authors: Abdullah Makkeh, Aaron J. Gutknecht, Michael Wibral

Abstract: Partial information decomposition (PID) of the multivariate mutual information describes the distinct ways in which a set of source variables contains information about a target variable. The groundbreaking work of Williams and Beer has shown that this decomposition cannot be determined from classic information theory without making additional assumptions, and several candidate measures have been proposed, often drawing on principles from related fields such as decision theory. None of these measures is differentiable with respect to the underlying probability mass function. We here present a novel measure that satisfies this property, emerges solely from information-theoretic principles, and has the form of a local mutual information. We show how the measure can be understood from the perspective of exclusions of probability mass, a principle that is foundational to the original definition of the mutual information by Fano. Since our measure is well-defined for individual realizations of the random variables it lends itself for example to local learning in artificial neural networks. We also show that it has a meaningful Möbius inversion on a redundancy lattice and obeys a target chain rule. We give an operational interpretation of the measure based on the decisions that an agent should take if given only the shared information.

Submitted 30 March, 2021; v1 submitted 9 February, 2020; originally announced February 2020.

Comments: 19 pages, 6 figures; title modified, text modified, typos corrected, manuscript published

Journal ref: Phys. Rev. E 103, 032149 (2021)

arXiv:1909.08418

doi 10.1038/s41467-020-16548-3

Control of criticality and computation in spiking neuromorphic networks with plasticity

Authors: Benjamin Cramer, David Stöckel, Markus Kreft, Michael Wibral, Johannes Schemmel, Karlheinz Meier, Viola Priesemann

Abstract: The critical state is assumed to be optimal for any computation in recurrent neural networks, because criticality maximizes a number of abstract computational properties. We challenge this assumption by evaluating the performance of a spiking recurrent neural network on a set of tasks of varying complexity at, and away from, critical network dynamics. To that end, we developed a spiking network with synaptic plasticity on a neuromorphic chip. We show that the distance to criticality can be easily adapted by changing the input strength, and then demonstrate a clear relation between criticality, task-performance and information-theoretic fingerprint. Whereas the information-theoretic measures all show that network capacity is maximal at criticality, this is not the case for performance on specific tasks: Only the complex, memory-intensive tasks profit from criticality, whereas the simple tasks suffer from it. Thereby, we challenge the general assumption that criticality would be beneficial for any task, and provide instead an understanding of how the collective network state should be tuned to task requirement to achieve optimal performance.

Submitted 11 February, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

arXiv:1902.06828

doi 10.1162/netn_a_00092

Large-scale directed network inference with multivariate transfer entropy and hierarchical statistical testing

Authors: Leonardo Novelli, Patricia Wollstadt, Pedro Mediano, Michael Wibral, Joseph T. Lizier

Abstract: Network inference algorithms are valuable tools for the study of large-scale neuroimaging datasets. Multivariate transfer entropy is well suited for this task, being a model-free measure that captures nonlinear and lagged dependencies between time series to infer a minimal directed network model. Greedy algorithms have been proposed to efficiently deal with high-dimensional datasets while avoiding redundant inferences and capturing synergistic effects. However, multiple statistical comparisons may inflate the false positive rate and are computationally demanding, which limited the size of previous validation studies. The algorithm we present, as implemented in the IDTxl open-source software, addresses these challenges by employing hierarchical statistical tests to control the family-wise error rate and to allow for efficient parallelisation. The method was validated on synthetic datasets involving random networks of increasing size (up to 100 nodes), for both linear and nonlinear dynamics. The performance increased with the length of the time series, reaching consistently high precision, recall, and specificity (>98% on average) for 10000 time samples. Varying the statistical significance threshold showed a more favourable precision-recall trade-off for longer time series. Both the network size and the sample size are one order of magnitude larger than previously demonstrated, showing feasibility for typical EEG and MEG experiments.

Submitted 30 July, 2019; v1 submitted 18 February, 2019; originally announced February 2019.

Journal ref: Network Neuroscience 2019 3:3, 827-847

arXiv:1809.07550 [pdf, other]

doi 10.3389/fnsys.2018.00055

Dynamic Adaptive Computation: Tuning network states to task requirements

Authors: Jens Wilting, Jonas Dehning, Joao Pinheiro Neto, Lucas Rudelt, Michael Wibral, Johannes Zierenberg, Viola Priesemann

Abstract: Neural circuits are able to perform computations under very diverse conditions and requirements. The required computations impose clear constraints on their fine-tuning: a rapid and maximally informative response to stimuli in general requires decorrelated baseline neural activity. Such network dynamics is known as asynchronous-irregular. In contrast, spatio-temporal integration of information requires maintenance and transfer of stimulus information over extended time periods. This can be realized at criticality, a phase transition where correlations, sensitivity and integration time diverge. Being able to flexibly switch, or even combine the above properties in a task-dependent manner would present a clear functional advantage. We propose that cortex operates in a "reverberating regime" because it is particularly favorable for ready adaptation of computational properties to context and task. This reverberating regime enables cortical networks to interpolate between the asynchronous-irregular and the critical state by small changes in effective synaptic strength or excitation-inhibition ratio. These changes directly adapt computational properties, including sensitivity, amplification, integration time and correlation length within the local network. We review recent converging evidence that cortex in vivo operates in the reverberating regime, and that various cortical areas have adapted their integration times to processing requirements. In addition, we propose that neuromodulation enables a fine-tuning of the network, so that local circuits can either decorrelate or integrate, and quench or maintain their input depending on task. We argue that this task-dependent tuning, which we call "dynamic adaptive computation", presents a central organization principle of cortical networks and discuss first experimental evidence.

Submitted 20 September, 2018; originally announced September 2018.

Comments: 6 pages + references, 2 figures

Journal ref: Frontiers in systems neuroscience 12 (2018)

arXiv:1807.10459 [pdf]

doi 10.21105/joss.01081

IDTxl: The Information Dynamics Toolkit xl: a Python package for the efficient analysis of multivariate information dynamics in networks

Authors: Patricia Wollstadt, Joseph T. Lizier, Raul Vicente, Conor Finn, Mario Martínez-Zarzuela, Pedro Mediano, Leonardo Novelli, Michael Wibral

Abstract: The Information Dynamics Toolkit xl (IDTxl) is a comprehensive software package for efficient inference of networks and their node dynamics from multivariate time series data using information theory. IDTxl provides functionality to estimate the following measures: 1) for network inference: multivariate transfer entropy (TE)/Granger causality (GC), multivariate mutual information (MI), bivariate TE/GC, bivariate MI; 2) for analysis of node dynamics: active information storage (AIS), partial information decomposition (PID). IDTxl implements estimators for discrete and continuous data with parallel computing engines for both GPU and CPU platforms. Written for Python 3.4.3+.

Submitted 19 February, 2019; v1 submitted 27 July, 2018; originally announced July 2018.

Comments: 4 pages

Journal ref: Journal of Open Source Software, 4(34), 1081

arXiv:1608.08387 [pdf, other]

doi 10.1371/journal.pcbi.1005511

Breakdown of local information processing may underlie isoflurane anesthesia effects

Authors: Patricia Wollstadt, Kristin K. Sellers, Lucas Rudelt, Viola Priesemann, Axel Hutt, Flavio Fröhlich, Michael Wibral

Abstract: The disruption of coupling between brain areas has been suggested as the mechanism underlying loss of consciousness in anesthesia. This hypothesis has been tested previously by measuring the information transfer between brain areas, and by taking reduced information transfer as a proxy for decoupling. Yet, information transfer is a function of the amount of information available in the information source, such that transfer decreases even for unchanged coupling when less source information is available. Therefore, we asked whether impaired local information processing leads to a loss of information transfer. An important prediction of this alternative hypothesis is that changes in locally available information (signal entropy) should be at least as pronounced as changes in information transfer. We tested this prediction by recording local field potentials in two ferrets after administration of isoflurane in concentrations of 0.0 %, 0.5 %, and 1.0 %. We found strong decreases in the source entropy under isoflurane in area V1 and the prefrontal cortex (PFC), as predicted by our alternative hypothesis. The decrease in source entropy was stronger in PFC. Information transfer between V1 and PFC was reduced bidirectionally, but with a stronger decrease from PFC to V1. This links the stronger decrease in information transfer to the stronger decrease in source entropy, suggesting that reduced source entropy reduces information transfer. This conclusion fits the observation that the synaptic targets of isoflurane are located in local cortical circuits rather than on the synapses formed by interareal axonal projections. Thus, changes in information transfer under isoflurane seem to be a consequence of changes in local processing more than of decoupling between brain areas. We suggest that source entropy changes must be considered whenever interpreting changes in information transfer as decoupling.

Submitted 30 March, 2017; v1 submitted 30 August, 2016; originally announced August 2016.

Comments: 48 pages, 11 Figures

arXiv:1510.00831 [pdf, other]

doi 10.1016/j.bandc.2015.09.004

Partial Information Decomposition as a Unified Approach to the Specification of Neural Goal Functions

Authors: Michael Wibral, Viola Priesemann, Jim W. Kay, Joseph T. Lizier, William A. Phillips

Abstract: In many neural systems anatomical motifs are present repeatedly, but despite their structural similarity they can serve very different tasks. A prime example for such a motif is the canonical microcircuit of six-layered neo-cortex, which is repeated across cortical areas, and is involved in a number of different tasks (e.g. sensory, cognitive, or motor tasks). This observation has spawned interest in finding a common underlying principle, a 'goal function', of information processing implemented in this structure. By definition such a goal function, if universal, cannot be cast in processing-domain specific language (e.g. 'edge filtering', 'working memory'). Thus, to formulate such a principle, we have to use a domain-independent framework. Information theory offers such a framework. However, while the classical framework of information theory focuses on the relation between one input and one output (Shannon's mutual information), we argue that neural information processing crucially depends on the combination of multiple inputs to create the output of a processor. To account for this, we use a very recent extension of Shannon information theory, called partial information decomposition (PID). PID allows one to quantify the information that several inputs provide individually (unique information), redundantly (shared information) or only jointly (synergistic information) about the output. First, we review the framework of PID. Then we apply it to reevaluate and analyze several earlier proposals of information theoretic neural goal functions (predictive coding, infomax, coherent infomax, efficient coding). We find that PID allows one to compare these goal functions in a common framework, and also provides a versatile approach to design new goal functions from first principles. Building on this, we design and analyze a novel goal function, called 'coding with synergy'. [...]

Submitted 3 October, 2015; originally announced October 2015.

Comments: 21 pages, 4 figures, appendix

arXiv:1504.00156 [pdf]

doi 10.1371/journal.pone.0140530

A Graph Algorithmic Approach to Separate Direct from Indirect Neural Interactions

Authors: Patricia Wollstadt, Ulrich Meyer, Michael Wibral

Abstract: Network graphs have become a popular tool to represent complex systems composed of many interacting subunits; especially in neuroscience, network graphs are increasingly used to represent and analyze functional interactions between neural sources. Interactions are often reconstructed using pairwise bivariate analyses, overlooking their multivariate nature: it is neglected that investigating the effect of one source on a target requires taking all other sources into account as potential nuisance variables; also, combinations of sources may act jointly on a given target. Bivariate analyses produce networks that may contain spurious interactions, which reduce the interpretability of the network and its graph metrics. A truly multivariate reconstruction, however, is computationally intractable due to combinatorial explosion in the number of potential interactions. Thus, we have to resort to approximative methods to handle the intractability of multivariate interaction reconstruction, and thereby enable the use of networks in neuroscience. Here, we suggest such an approximative approach in the form of an algorithm that extends fast bivariate interaction reconstruction by identifying potentially spurious interactions post-hoc: the algorithm flags potentially spurious edges, which may then be pruned from the network. This produces a statistically conservative network approximation that is guaranteed to contain non-spurious interactions only. We describe the algorithm and present a reference implementation to test its performance. We discuss the algorithm in relation to other approximative multivariate methods and highlight suitable application scenarios. Our approach is a tractable and data-efficient way of reconstructing approximative networks of multivariate interactions. It is preferable if available data are limited or if fully multivariate approaches are computationally infeasible.

Submitted 23 November, 2015; v1 submitted 1 April, 2015; originally announced April 2015.

Comments: 24 pages, 8 figures, published in PLOS One

ACM Class: F.2.2; G.2.2; G.4; H.1.1

Journal ref: PLoS ONE 10(10): e0140530 (2015)

arXiv:1412.0291 [pdf, other]

doi 10.3389/frobt.2015.00005

Bits from Biology for Computational Intelligence

Authors: Michael Wibral, Joseph T. Lizier, Viola Priesemann

Abstract: Computational intelligence is broadly defined as biologically-inspired computing. Usually, inspiration is drawn from neural systems. This article shows how to analyze neural systems using information theory to obtain constraints that help identify the algorithms run by such systems and the information they represent. Algorithms and representations identified information-theoretically may then guide the design of biologically inspired computing systems (BICS). The material covered includes the necessary introduction to information theory and the estimation of information theoretic quantities from neural data. We then show how to analyze the information encoded in a system about its environment, and also discuss recent methodological developments on the question of how much information each agent carries about the environment either uniquely, or redundantly or synergistically together with others. Last, we introduce the framework of local information dynamics, where information processing is decomposed into component processes of information storage, transfer, and modification, locally in space and time. We close by discussing example applications of these measures to neural data and other complex systems.

Submitted 30 November, 2014; originally announced December 2014.

Journal ref: Frontiers in Robotics and AI, 2:5 (2015)

arXiv:1405.7965 [pdf]

doi 10.1016/j.conb.2014.08.002

Untangling cross-frequency coupling in neuroscience

Authors: Juhan Aru, Jaan Aru, Viola Priesemann, Michael Wibral, Luiz Lana, Gordon Pipa, Wolf Singer, Raul Vicente

Abstract: Cross-frequency coupling (CFC) has been proposed to coordinate neural dynamics across spatial and temporal scales. Despite its potential relevance for understanding healthy and pathological brain function, the standard CFC analysis and physiological interpretation come with fundamental problems. For example, apparent CFC can appear because of spectral correlations due to common non-stationarities that may arise in the total absence of interactions between neural frequency components. To provide a road map towards an improved mechanistic understanding of CFC, we organize the available and potential novel statistical/modeling approaches according to their biophysical interpretability. While we do not provide solutions for all the problems described, we provide a list of practical recommendations to avoid common errors and to enhance the interpretability of CFC analysis.

Submitted 25 August, 2014; v1 submitted 30 May, 2014; originally announced May 2014.

Comments: 47 pages, 12 figures, including supplementary material

arXiv:1401.4068 [pdf]

doi 10.1371/journal.pone.0102833

Efficient transfer entropy analysis of non-stationary neural time series

Authors: Patricia Wollstadt, Mario Martínez-Zarzuela, Raul Vicente, Francisco J. Díaz-Pernas, Michael Wibral

Abstract: Information theory allows us to investigate information processing in neural systems in terms of information transfer, storage and modification. Especially the measure of information transfer, transfer entropy, has seen a dramatic surge of interest in neuroscience. Estimating transfer entropy from two processes requires the observation of multiple realizations of these processes to estimate associated probability density functions. To obtain these observations, available estimators assume stationarity of processes to allow pooling of observations over time. This assumption, however, is a major obstacle to the application of these estimators in neuroscience as observed processes are often non-stationary. As a solution, Gomez-Herrero and colleagues theoretically showed that the stationarity assumption may be avoided by estimating transfer entropy from an ensemble of realizations. Such an ensemble is often readily available in neuroscience experiments in the form of experimental trials. Thus, in this work we combine the ensemble method with a recently proposed transfer entropy estimator to make transfer entropy estimation applicable to non-stationary time series. We present an efficient implementation of the approach that deals with the increased computational demand of the ensemble method's practical application. In particular, we use a massively parallel implementation for a graphics processing unit to handle the most computationally demanding aspects of the ensemble method. We test the performance and robustness of our implementation on data from simulated stochastic processes and demonstrate the method's applicability to magnetoencephalographic data. While we mainly evaluate the proposed method for neuroscientific data, we expect it to be applicable in a variety of fields that are concerned with the analysis of information transfer in complex biological, social, and artificial systems.

Submitted 23 November, 2015; v1 submitted 16 January, 2014; originally announced January 2014.

Comments: 27 pages, 7 figures, submitted to PLOS ONE

Journal ref: PLoS ONE 9(7): e102833 (2014)

Showing 1–24 of 24 results for author: Wibral, M