-
Hidden Traveling Waves bind Working Memory Variables in Recurrent Neural Networks
Authors:
Arjun Karuvally,
Terrence J. Sejnowski,
Hava T. Siegelmann
Abstract:
Traveling waves are a fundamental phenomenon in the brain, playing a crucial role in short-term information storage. In this study, we leverage the concept of traveling wave dynamics within a neural lattice to formulate a theoretical model of neural working memory, and we study its properties and its real-world implications in AI. The proposed model diverges from traditional approaches, which assume information storage in static, register-like locations updated by interference. Instead, the model stores data as waves that are updated by the wave's boundary conditions. We rigorously examine the model's capabilities in representing and learning state histories, which are vital for learning history-dependent dynamical systems. The findings reveal that the model reliably stores external information and enhances the learning process by addressing the diminishing gradient problem. To understand the model's real-world applicability, we explore two cases: linear boundary condition (LBC) and non-linear, self-attention-driven boundary condition (SBC). The model with the linear boundary condition results in a shift matrix plus low-rank matrix currently used in H3 state space RNN. Further, our experiments with LBC reveal that this matrix is effectively learned by Recurrent Neural Networks (RNNs) through backpropagation when modeling history-dependent dynamical systems. Conversely, the SBC parallels the autoregressive loop of an attention-only transformer with the context vector representing the wave substrate. Collectively, our findings suggest the broader relevance of traveling waves in AI and their potential in advancing neural network architectures.
Submitted 7 April, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
Transformers and Cortical Waves: Encoders for Pulling In Context Across Time
Authors:
Lyle Muller,
Patricia S. Churchland,
Terrence J. Sejnowski
Abstract:
The capabilities of transformer networks such as ChatGPT and other Large Language Models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long "encoding vector" that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, "self-attention" applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas or multiple regions at the whole-brain scale could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable temporal context to be extracted from sequences of sensory inputs, the same computational principle used in transformers.
Submitted 2 July, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
Energy-based General Sequential Episodic Memory Networks at the Adiabatic Limit
Authors:
Arjun Karuvally,
Terry J. Sejnowski,
Hava T. Siegelmann
Abstract:
The General Associative Memory Model (GAMM) has a constant state-dependent energy surface that leads the output dynamics to fixed points, retrieving single memories from a collection of memories that can be asynchronously preloaded. We introduce a new class of General Sequential Episodic Memory Models (GSEMM) that, in the adiabatic limit, exhibit a temporally changing energy surface, leading to a series of meta-stable states that are sequential episodic memories. The dynamic energy surface is enabled by newly introduced asymmetric synapses with signal propagation delays in the network's hidden layer. We study the theoretical and empirical properties of two memory models from the GSEMM class, differing in their activation functions. LISEM has non-linearities in the feature layer, whereas DSEM has non-linearity in the hidden layer. In principle, DSEM has a storage capacity that grows exponentially with the number of neurons in the network. We introduce a learning rule for the synapses based on the energy minimization principle and show it can learn single memories and their sequential relationships online. This rule is similar to the Hebbian learning algorithm and Spike-Timing Dependent Plasticity (STDP), which describe conditions under which synapses between neurons change strength. Thus, GSEMM combines the static and dynamic properties of episodic memory under a single theoretical framework and bridges neuroscience, machine learning, and artificial intelligence.
Submitted 11 December, 2022;
originally announced December 2022.
-
Internal feedback in the cortical perception-action loop enables fast and accurate behavior
Authors:
Jing Shuang Li,
Anish A. Sarma,
Terrence J. Sejnowski,
John C. Doyle
Abstract:
Animals move smoothly and reliably in unpredictable environments. Models of sensorimotor control have assumed that sensory information from the environment leads to actions, which then act back on the environment, creating a single, unidirectional perception-action loop. This loop contains internal delays in sensory and motor pathways, which can lead to unstable control. We show here that these delays can be compensated by internal feedback signals that flow backwards, from motor towards sensory areas. Internal feedback is ubiquitous in neural sensorimotor systems and recent advances in control theory show how internal feedback compensates internal delays. This is accomplished by filtering out self-generated and other predictable changes in early sensory areas so that unpredicted, actionable information can be rapidly transmitted toward action by the fastest components. For example, fast, giant neurons are necessarily less accurate than smaller neurons, but they are crucial for fast and accurate behavior. We use a mathematically tractable control model to show that internal feedback has an indispensable role in achieving state estimation, localization of function -- how different parts of cortex control different parts of the body -- and attention, all of which are crucial for effective sensorimotor control. This control model can explain anatomical, physiological and behavioral observations, including motor signals in visual cortex, heterogeneous kinetics of sensory receptors and the presence of giant Betz cells in motor cortex, Meynert cells in visual cortex and giant von Economo cells in the prefrontal cortex of humans as well as internal feedback patterns and unexplained heterogeneity in other neural systems.
Submitted 10 January, 2023; v1 submitted 10 November, 2022;
originally announced November 2022.
-
Analytical prediction of specific spatiotemporal patterns in nonlinear oscillator networks with distance-dependent time delays
Authors:
Roberto C. Budzinski,
Tung T. Nguyen,
Gabriel B. Benigno,
Jacqueline Doàn,
Ján Mináč,
Terrence J. Sejnowski,
Lyle E. Muller
Abstract:
We introduce an analytical approach that allows predictions and mechanistic insights into the dynamics of nonlinear oscillator networks with heterogeneous time delays. We demonstrate that time delays shape the spectrum of a matrix associated to the system, leading to the emergence of waves with a preferred direction. We then create analytical predictions for the specific spatiotemporal patterns observed in individual simulations of time-delayed Kuramoto networks. This approach generalizes to systems with heterogeneous time delays at finite scales, which permits the study of spatiotemporal dynamics in a broad range of applications.
Submitted 16 January, 2023; v1 submitted 27 July, 2022;
originally announced July 2022.
-
Geometry unites synchrony, chimeras, and waves in nonlinear oscillator networks
Authors:
Roberto C. Budzinski,
Tung T. Nguyen,
Jacqueline Doàn,
Ján Mináč,
Terrence J. Sejnowski,
Lyle E. Muller
Abstract:
One of the simplest mathematical models in the study of nonlinear systems is the Kuramoto model, which describes synchronization in systems from swarms of insects to superconductors. We have recently found a connection between the original, real-valued nonlinear Kuramoto model and a corresponding complex-valued system that permits describing the system in terms of a linear operator and iterative update rule. We now use this description to investigate three major synchronization phenomena in Kuramoto networks (phase synchronization, chimera states, and traveling waves), not only in terms of steady state solutions but also in terms of transient dynamics and individual simulations. These results provide new mathematical insight into how sophisticated behaviors arise from connection patterns in nonlinear networked systems.
Submitted 30 March, 2022; v1 submitted 3 November, 2021;
originally announced November 2021.
-
Replay in Deep Learning: Current Approaches and Missing Biological Elements
Authors:
Tyler L. Hayes,
Giri P. Krishnan,
Maxim Bazhenov,
Hava T. Siegelmann,
Terrence J. Sejnowski,
Christopher Kanan
Abstract:
Replay is the reactivation of one or more neural patterns, which are similar to the activation patterns experienced during past waking experiences. Replay was first observed in biological neural networks during sleep, and it is now thought to play a critical role in memory formation, retrieval, and consolidation. Replay-like mechanisms have been incorporated into deep artificial neural networks that learn over time to avoid catastrophic forgetting of previous knowledge. Replay algorithms have been successfully used in a wide range of deep learning methods within supervised, unsupervised, and reinforcement learning paradigms. In this paper, we provide the first comprehensive comparison between replay in the mammalian brain and replay in artificial neural networks. We identify multiple aspects of biological replay that are missing in deep learning systems and hypothesize how they could be utilized to improve artificial neural networks.
Submitted 28 May, 2021; v1 submitted 1 April, 2021;
originally announced April 2021.
-
Assessing observability of chaotic systems using Delay Differential Analysis
Authors:
Christopher E. Gonzalez,
Claudia Lainscsek,
Terrence J. Sejnowski,
Christophe Letellier
Abstract:
Observability can determine which recorded variables of a given system are optimal for discriminating its different states. Quantifying observability requires knowledge of the equations governing the dynamics. These equations are often unknown when experimental data are considered. Consequently, we propose an approach for numerically assessing observability using Delay Differential Analysis (DDA). Given a time series, DDA uses a delay differential equation for approximating the measured data. The lower the least squares error between the predicted and recorded data, the higher the observability. We thus rank the variables of several chaotic systems according to their corresponding least square error to assess observability. The performance of our approach is evaluated by comparison with the ranking provided by the symbolic observability coefficients as well as with two other data-based approaches using reservoir computing and singular value decomposition of the reconstructed space. We investigate the robustness of our approach against noise contamination.
Submitted 10 September, 2020;
originally announced September 2020.
-
The Unreasonable Effectiveness of Deep Learning in Artificial Intelligence
Authors:
Terrence J. Sejnowski
Abstract:
Deep learning networks have been trained to recognize speech, caption photographs and translate text between languages at high levels of performance. Although applications of deep learning networks to real world problems have become ubiquitous, our understanding of why they are so effective is lacking. These empirical results should not be possible according to sample complexity in statistics and non-convex optimization theory. However, paradoxes in the training and effectiveness of deep learning networks are being investigated and insights are being found in the geometry of high-dimensional spaces. A mathematical theory of deep learning would illuminate how they function, allow us to assess the strengths and weaknesses of different network architectures and lead to major improvements. Deep learning has provided natural ways for humans to communicate with digital devices and is foundational for building artificial general intelligence. Deep learning was inspired by the architecture of the cerebral cortex and insights into autonomy and general intelligence may be found in other brain regions that are essential for planning and survival, but major breakthroughs will be needed to achieve these goals.
Submitted 12 February, 2020;
originally announced February 2020.
-
Diversity-enabled sweet spots in layered architectures and speed-accuracy trade-offs in sensorimotor control
Authors:
Yorie Nakahira,
Quanying Liu,
Terrence J. Sejnowski,
John C. Doyle
Abstract:
Nervous systems sense, communicate, compute and actuate movement using distributed components with severe trade-offs in speed, accuracy, sparsity, noise and saturation. Nevertheless, brains achieve remarkably fast, accurate, and robust control performance due to a highly effective layered control architecture. Here we introduce a driving task to study how a mountain biker mitigates the immediate disturbance of trail bumps and responds to changes in trail direction. We manipulated the time delays and accuracy of the control input from the wheel as a surrogate for manipulating the characteristics of neurons in the control loop. The observed speed-accuracy trade-offs (SATs) motivated a theoretical framework consisting of layers of control loops with components having diverse speeds and accuracies within each physical level, such as nerve bundles containing axons with a wide range of sizes. Our model explains why the errors from two control loops -- one fast but inaccurate reflexive layer that corrects for bumps, and a planning layer that is slow but accurate -- are additive, and shows how the errors in each control loop can be decomposed into the errors caused by the limited speeds and accuracies of the components. These results demonstrate that an appropriate diversity in the properties of neurons across layers helps to create "diversity-enabled sweet spots" (DESSs) so that both fast and accurate control is achieved using slow or inaccurate components.
Submitted 2 May, 2021; v1 submitted 18 September, 2019;
originally announced September 2019.
-
Fitts' Law for speed-accuracy trade-off describes a diversity-enabled sweet spot in sensorimotor control
Authors:
Yorie Nakahira,
Quanying Liu,
Terrence J. Sejnowski,
John C. Doyle
Abstract:
Human sensorimotor control exhibits remarkable speed and accuracy, and the tradeoff between them is encapsulated in Fitts' Law for reaching and pointing. While Fitts related this to Shannon's channel capacity theorem, despite widespread study of Fitts' Law, a theory that connects implementation of sensorimotor control at the system and hardware level has not emerged. Here we describe a theory that connects hardware (neurons and muscles with inherent severe speed-accuracy tradeoffs) with system level control to explain Fitts' Law for reaching and related laws. The results supporting the theory show that diversity between hardware components is exploited to achieve both fast and accurate control performance despite slow or inaccurate hardware. Such "diversity-enabled sweet spots" (DESSs) are ubiquitous in biology and technology, and explain why large heterogeneities exist in biological and technical components and how both engineers and natural selection routinely evolve fast and accurate systems using imperfect hardware.
Submitted 18 September, 2019; v1 submitted 3 June, 2019;
originally announced June 2019.
-
Utilizing Deep Learning Towards Multi-modal Bio-sensing and Vision-based Affective Computing
Authors:
Siddharth Siddharth,
Tzyy-Ping Jung,
Terrence J. Sejnowski
Abstract:
In recent years, the use of bio-sensing signals such as electroencephalogram (EEG), electrocardiogram (ECG), etc. has garnered interest towards applications in affective computing. The parallel trend of deep-learning has led to a huge leap in performance towards solving various vision-based research problems such as object detection. Yet, these advances in deep-learning have not adequately translated into bio-sensing research. This work applies novel deep-learning-based methods to various bio-sensing and video data of four publicly available multi-modal emotion datasets. For each dataset, we first individually evaluate the emotion-classification performance obtained by each modality. We then evaluate the performance obtained by fusing the features from these modalities. We show that our algorithms outperform the results reported by other studies for emotion/valence/arousal/liking classification on DEAP and MAHNOB-HCI datasets and set up benchmarks for the newer AMIGOS and DREAMER datasets. We also evaluate the performance of our algorithms by combining the datasets and by using transfer learning to show that the proposed method overcomes the inconsistencies between the datasets. Hence, we do a thorough analysis of multi-modal affective data from more than 120 subjects and 2,800 trials. Finally, utilizing a convolution-deconvolution network, we propose a new technique towards identifying salient brain regions corresponding to various affective states.
Submitted 16 May, 2019;
originally announced May 2019.
-
MCell-R: A particle-resolution network-free spatial modeling framework
Authors:
Jose-Juan Tapia,
Ali Sinan Saglam,
Jacob Czech,
Robert Kuczewski,
Thomas M. Bartol,
Terrence J. Sejnowski,
James R. Faeder
Abstract:
Spatial heterogeneity can have dramatic effects on the biochemical networks that drive cell regulation and decision-making. For this reason, a number of methods have been developed to model spatial heterogeneity and incorporated into widely used modeling platforms. Unfortunately, the standard approaches for specifying and simulating chemical reaction networks become untenable when dealing with multi-state, multi-component systems that are characterized by combinatorial complexity. To address this issue, we developed MCell-R, a framework that extends the particle-based spatial Monte Carlo simulator, MCell, with the rule-based model specification and simulation capabilities provided by BioNetGen and NFsim. The BioNetGen syntax enables the specification of biomolecules as structured objects whose components can have different internal states that represent such features as covalent modification and conformation and which can bind components of other molecules to form molecular complexes. The network-free simulation algorithm used by NFsim enables efficient simulation of rule-based models even when the size of the network implied by the biochemical rules is too large to enumerate explicitly, which frequently occurs in detailed models of biochemical signaling. The result is a framework that can efficiently simulate systems characterized by combinatorial complexity at the level of spatially-resolved individual molecules over biologically relevant time and length scales.
Submitted 29 October, 2018;
originally announced October 2018.
-
Spatial Stochastic Modeling with MCell and CellBlender
Authors:
Sanjana Gupta,
Jacob Czech,
Robert Kuczewski,
Thomas M. Bartol,
Terrence J. Sejnowski,
Robin E. C. Lee,
James R. Faeder
Abstract:
This chapter provides a brief introduction to the theory and practice of spatial stochastic simulations. It begins with an overview of different methods available for biochemical simulations, highlighting their strengths and limitations. Spatial stochastic modeling approaches are indicated when diffusion is relatively slow and spatial inhomogeneities involve relatively small numbers of particles. The popular software package MCell allows particle-based stochastic simulations of biochemical systems in complex three-dimensional (3D) geometries, which are important for many cell biology applications. Here, we provide an overview of the simulation algorithms used by MCell and the underlying theory. We then give a tutorial on building and simulating MCell models using the CellBlender graphical user interface, which is built as a plug-in to Blender, a widely-used and freely available software platform for 3D modeling. The tutorial starts with simple models that demonstrate basic MCell functionality and then advances to a number of more complex examples that demonstrate a range of features and provide examples of important biophysical effects that require spatially-resolved stochastic dynamics to capture.
Submitted 30 September, 2018;
originally announced October 2018.
-
Multi-modal Approach for Affective Computing
Authors:
Siddharth Siddharth,
Tzyy-Ping Jung,
Terrence J. Sejnowski
Abstract:
Throughout the past decade, many studies have classified human emotions using only a single sensing modality such as face video, electroencephalogram (EEG), electrocardiogram (ECG), galvanic skin response (GSR), etc. The results of these studies are constrained by the limitations of these modalities such as the absence of physiological biomarkers in the face-video analysis, poor spatial resolution in EEG, poor temporal resolution of the GSR, etc. Scant research has been conducted to compare the merits of these modalities and understand how to best use them individually and jointly. Using the multi-modal AMIGOS dataset, this study compares the performance of human emotion classification using multiple computational approaches applied to face videos and various bio-sensing modalities. Using a novel method for compensating for the physiological baseline, we show an increase in the classification accuracy of various approaches that we use. Finally, we present a multi-modal emotion-classification approach in the domain of affective computing research.
Submitted 20 June, 2018; v1 submitted 25 April, 2018;
originally announced April 2018.
-
An Affordable Bio-Sensing and Activity Tagging Platform for HCI Research
Authors:
Siddharth,
Aashish Patel,
Tzyy-Ping Jung,
Terrence J. Sejnowski
Abstract:
We present a novel multi-modal bio-sensing platform capable of integrating multiple data streams for use in real-time applications. The system is composed of a central compute module and a companion headset. The compute node collects, time-stamps and transmits the data while also providing an interface for a wide range of sensors including electroencephalogram, photoplethysmogram, electrocardiogram, and eye gaze among others. The companion headset contains the gaze tracking cameras. By integrating many of the measurement systems into an accessible package, we are able to explore previously unanswerable questions ranging from open-environment interactions to emotional response studies. Though some of the integrated sensors are designed from the ground up to fit into a compact form factor, we validate the accuracy of the sensors and find that they perform similarly to, and in some cases better than, alternatives.
Submitted 21 February, 2018;
originally announced February 2018.
-
Differential covariance: A new method to estimate functional connectivity in fMRI
Authors:
Tiger W. Lin,
Giri P. Krishnan,
Maxim Bazhenov,
Terrence J. Sejnowski
Abstract:
Measuring functional connectivity from fMRI is important in understanding processing in cortical networks. However, because the brain's connection pattern is complex, currently used methods are prone to produce false connections. We introduce here a new method that uses derivatives for estimating functional connectivity. Using simulations, we benchmarked our method against other commonly used methods. Our method achieves better results in complex network simulations. This new method provides an alternative way to estimate functional connectivity.
Submitted 7 November, 2017;
originally announced November 2017.
-
Gradient Descent for Spiking Neural Networks
Authors:
Dongsung Huh,
Terrence J. Sejnowski
Abstract:
Much of studies on neural computation are based on network models of static neurons that produce analog output, despite the fact that information processing in the brain is predominantly carried out by dynamic neurons that produce discrete pulses called spikes. Research in spike-based computation has been impeded by the lack of efficient supervised learning algorithm for spiking networks. Here, we…
▽ More
Much of studies on neural computation are based on network models of static neurons that produce analog output, despite the fact that information processing in the brain is predominantly carried out by dynamic neurons that produce discrete pulses called spikes. Research in spike-based computation has been impeded by the lack of efficient supervised learning algorithm for spiking networks. Here, we present a gradient descent method for optimizing spiking network models by introducing a differentiable formulation of spiking networks and deriving the exact gradient calculation. For demonstration, we trained recurrent spiking networks on two dynamic tasks: one that requires optimizing fast (~millisecond) spike-based interactions for efficient encoding of information, and a delayed memory XOR task over extended duration (~second). The results show that our method indeed optimizes the spiking network dynamics on the time scale of individual spikes as well as behavioral time scales. In conclusion, our result offers a general purpose supervised learning algorithm for spiking neural networks, thus advancing further investigations on spike-based computation.
Submitted 19 June, 2017; v1 submitted 14 June, 2017;
originally announced June 2017.
-
Differential Covariance: A New Class of Methods to Estimate Sparse Connectivity from Neural Recordings
Authors:
Tiger W. Lin,
Anup Das,
Giri P. Krishnan,
Maxim Bazhenov,
Terrence J. Sejnowski
Abstract:
With our ability to record more neurons simultaneously, making sense of these data is a challenge. Functional connectivity is one popular way to study the relationship between multiple neural signals. Correlation-based methods are a set of widely used techniques for functional connectivity estimation. However, due to explaining away and unobserved common inputs (Stevenson et al., 2008), they produce spurious connections. The general linear model (GLM), which models spike trains as Poisson processes (Okatan et al., 2005; Truccolo et al., 2005; Pillow et al., 2008), avoids these confounds. We develop here a new class of methods by using differential signals based on simulated intracellular voltage recordings. It is equivalent to a regularized AR(2) model. We also expand the method to simulated local field potential (LFP) recordings and calcium imaging. In all of our simulated data, the differential covariance-based methods achieved better or similar performance to the GLM method and required fewer data samples. This new class of methods provides alternative ways to analyze neural signals.
Submitted 8 June, 2017;
originally announced June 2017.
-
The Wilson Machine for Image Modeling
Authors:
Saeed Saremi,
Terrence J. Sejnowski
Abstract:
Learning the distribution of natural images is one of the hardest and most important problems in machine learning. The problem remains open, because the enormous complexity of the structures in natural images spans all length scales. We break down the complexity of the problem and show that the hierarchy of structures in natural images fuels a new class of learning algorithms based on the theory of critical phenomena and stochastic processes. We approach this problem from the perspective of the theory of critical phenomena, which was developed in condensed matter physics to address problems with infinite length-scale fluctuations, and build a framework to integrate the criticality of natural images into a learning algorithm. The problem is broken down by mapping images into a hierarchy of binary images, called bitplanes. In this representation, the top bitplane is critical, having fluctuations in structures over a vast range of scales. The bitplanes below go through a gradual stochastic heating process to disorder. We turn this representation into a directed probabilistic graphical model, transforming the learning problem into the unsupervised learning of the distribution of the critical bitplane and the supervised learning of the conditional distributions for the remaining bitplanes. We learnt the conditional distributions by logistic regression in a convolutional architecture. Conditioned on the critical binary image, this simple architecture can generate large, natural-looking images, with many shades of gray, without the use of hidden units, unprecedented in the studies of natural images. The framework presented here is a major step in bringing criticality and stochastic processes to machine learning and in studying natural image statistics.
Submitted 11 November, 2015; v1 submitted 26 October, 2015;
originally announced October 2015.
-
Short-term synaptic plasticity in the deterministic Tsodyks-Markram model leads to unpredictable network dynamics
Authors:
Jesus M Cortes,
Mathieu Desroches,
Serafim Rodrigues,
Romain Veltz,
Miguel A. Munoz,
Terrence J. Sejnowski
Abstract:
Short-Term Synaptic Plasticity (STSP) strongly affects the neural dynamics of cortical networks. The Tsodyks and Markram (TM) model for STSP accurately accounts for a wide range of physiological responses at different types of cortical synapses. Here, we report for the first time a route to chaotic behavior via a Shilnikov homoclinic bifurcation that dynamically organises some of the responses in the TM model. In particular, the presence of such a homoclinic bifurcation strongly affects the shape of the trajectories in the phase space and induces highly irregular transient dynamics; indeed, in the vicinity of the Shilnikov homoclinic bifurcation, the number of population spikes and their precise timing are unpredictable and highly sensitive to the initial conditions. Such irregular deterministic dynamics have their counterpart in stochastic/network versions of the TM model: the existence of the Shilnikov homoclinic bifurcation generates complex and irregular spiking patterns and, acting as a sort of springboard, facilitates transitions between the down-state and unstable periodic orbits. The interplay between the (deterministic) homoclinic bifurcation and stochastic effects may give rise to some of the complex dynamics observed in neural systems.
Submitted 30 September, 2013;
originally announced September 2013.
-
The effect of neural adaptation of population coding accuracy
Authors:
J. M. Cortes,
D. Marinazzo,
P. Series,
M. W. Oram,
T. J. Sejnowski,
M. C. W. van Rossum
Abstract:
Most neurons in the primary visual cortex initially respond vigorously when a preferred stimulus is presented, but adapt as stimulation continues. The functional consequences of adaptation are unclear. Typically a reduction of firing rate would reduce single neuron accuracy as fewer spikes are available for decoding, but it has been suggested that on the population level, adaptation increases coding accuracy. This question requires careful analysis as adaptation not only changes the firing rates of neurons, but also the neural variability and correlations between neurons, which affect coding accuracy as well. We calculate the coding accuracy using a computational model that implements two forms of adaptation: spike frequency adaptation and synaptic adaptation in the form of short-term synaptic plasticity. We find that the net effect of adaptation is subtle and heterogeneous. Depending on adaptation mechanism and test stimulus, adaptation can either increase or decrease coding accuracy. We discuss the neurophysiological and psychophysical implications of the findings and relate them to published experimental data.
Submitted 14 March, 2011;
originally announced March 2011.
-
Inhibitory synchrony as a mechanism for attentional gain modulation
Authors:
Paul H. E. Tiesinga,
Jean-Marc Fellous,
Emilio Salinas,
Jorge V. Jose,
Terrence J. Sejnowski
Abstract:
Recordings from area V4 of monkeys have revealed that when the focus of attention is on a visual stimulus within the receptive field of a cortical neuron, two distinct changes can occur: The firing rate of the neuron can change and there can be an increase in the coherence between spikes and the local field potential in the gamma-frequency range (30-50 Hz). The hypothesis explored here is that these observed effects of attention could be a consequence of changes in the synchrony of local interneuron networks. We performed computer simulations of a Hodgkin-Huxley type neuron driven by a constant depolarizing current, I, representing visual stimulation and a modulatory inhibitory input representing the effects of attention via local interneuron networks. We observed that the neuron's firing rate and the coherence of its output spike train with the synaptic inputs were modulated by the degree of synchrony of the inhibitory inputs. The model suggests that the observed changes in firing rate and coherence of neurons in the visual cortex could be controlled by top-down inputs that regulate the coherence in the activity of a local inhibitory network discharging at gamma frequencies.
Submitted 14 March, 2005;
originally announced March 2005.
-
Complex Independent Component Analysis of Frequency-Domain Electroencephalographic Data
Authors:
Jorn Anemuller,
Terrence J. Sejnowski,
Scott Makeig
Abstract:
Independent component analysis (ICA) has proven useful for modeling brain and electroencephalographic (EEG) data. Here, we present a new, generalized method that captures the dynamics of brain signals better than previous ICA algorithms. We regard EEG sources as eliciting spatio-temporal activity patterns, corresponding to, e.g., trajectories of activation propagating across cortex. This leads to a model of convolutive signal superposition, in contrast with the commonly used instantaneous mixing model. In the frequency domain, convolutive mixing is equivalent to multiplicative mixing of complex signal sources within distinct spectral bands. We decompose the recorded spectral-domain signals into independent components by a complex infomax ICA algorithm. First results from a visual attention EEG experiment exhibit (1) sources of spatio-temporal dynamics in the data, (2) links to subject behavior, (3) sources with a limited spectral extent, and (4) a higher degree of independence compared to sources derived by standard ICA.
Submitted 25 November, 2003; v1 submitted 10 October, 2003;
originally announced October 2003.