How connectivity structure shapes rich and lazy learning in neural circuits

Authors: Yuhan Helena Liu, Aristide Baratin, Jonathan Cornford, Stefan Mihalas, Eric Shea-Brown, Guillaume Lajoie

Abstract: In theoretical neuroscience, recent work leverages deep learning tools to explore how some network attributes critically influence its learning dynamics. Notably, initial weight distributions with small (resp. large) variance may yield a rich (resp. lazy) regime, where significant (resp. minor) changes to network states and representation are observed over the course of learning. However, in biology, neural circuit connectivity could exhibit a low-rank structure and therefore differs markedly from the random initializations generally used for these studies. As such, here we investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime. Through both empirical and theoretical analyses, we discover that high-rank initializations typically yield smaller network changes indicative of lazier learning, a finding we also confirm with experimentally-driven initial connectivity in recurrent neural networks. Conversely, low-rank initialization biases learning towards richer learning. Importantly, however, as an exception to this rule, we find lazier learning can still occur with a low-rank initialization that aligns with task and data statistics. Our research highlights the pivotal role of initial weight structures in shaping learning regimes, with implications for metabolic costs of plasticity and risks of catastrophic forgetting.
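The abstract above refers to the "effective rank" of the initial weights without defining it. As an illustration only, the sketch below uses one common definition (the exponential of the Shannon entropy of the normalized singular values); this is an assumption, and the paper may use a different measure. The function name `effective_rank` and the example spectra are hypothetical.

```python
import math


def effective_rank(singular_values):
    """Entropy-based effective rank of a matrix, given its singular values.

    Assumed definition (not taken from the abstract): exp(H(p)), where
    p_i = s_i / sum(s) and H is the Shannon entropy in nats.
    """
    positive = [s for s in singular_values if s > 0]
    total = sum(positive)
    if total == 0:
        return 0.0  # all-zero matrix: no directions carry weight
    probs = [s / total for s in positive]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


# Hypothetical spectra for an 8-dimensional weight matrix.
full_rank = effective_rank([1.0] * 8)               # flat spectrum
low_rank = effective_rank([1.0, 1.0] + [0.0] * 6)   # two dominant directions
```

Under this definition a flat spectrum gives an effective rank equal to the matrix dimension, while a spectrum concentrated on a few directions gives a correspondingly small value.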

Submitted 19 February, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

Comments: Published at ICLR 2024

arXiv:2307.10515 [pdf, other]

Gaussian Partial Information Decomposition: Bias Correction and Application to High-dimensional Data

Authors: Praveen Venkatesh, Corbett Bennett, Sam Gale, Tamina K. Ramirez, Greggory Heller, Severine Durand, Shawn Olsen, Stefan Mihalas

Abstract: Recent advances in neuroscientific experimental techniques have enabled us to simultaneously record the activity of thousands of neurons across multiple brain regions. This has led to a growing need for computational tools capable of analyzing how task-relevant information is represented and communicated between several brain regions. Partial information decompositions (PIDs) have emerged as one such tool, quantifying how much unique, redundant and synergistic information two or more brain regions carry about a task-relevant message. However, computing PIDs is computationally challenging in practice, and statistical issues such as the bias and variance of estimates remain largely unexplored. In this paper, we propose a new method for efficiently computing and estimating a PID definition on multivariate Gaussian distributions. We show empirically that our method satisfies an intuitive additivity property, and recovers the ground truth in a battery of canonical examples, even at high dimensionality. We also propose and evaluate, for the first time, a method to correct the bias in PID estimates at finite sample sizes. Finally, we demonstrate that our Gaussian PID effectively characterizes inter-areal interactions in the mouse brain, revealing higher redundancy between visual areas when a stimulus is behaviorally relevant.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2206.01338 [pdf, other]

Biologically-plausible backpropagation through arbitrary timespans via local neuromodulators

Authors: Yuhan Helena Liu, Stephen Smith, Stefan Mihalas, Eric Shea-Brown, Uygar Sümbül

Abstract: The spectacular successes of recurrent neural network models where key parameters are adjusted via backpropagation-based gradient descent have inspired much thought as to how biological neuronal networks might solve the corresponding synaptic credit assignment problem. There is so far little agreement, however, as to how biological networks could implement the necessary backpropagation through time, given widely recognized constraints of biological synaptic network signaling architectures. Here, we propose that extra-synaptic diffusion of local neuromodulators such as neuropeptides may afford an effective mode of backpropagation lying within the bounds of biological plausibility. Going beyond existing temporal truncation-based gradient approximations, our approximate gradient-based update rule, ModProp, propagates credit information through arbitrary time steps. ModProp suggests that modulatory signals can act on receiving cells by convolving their eligibility traces via causal, time-invariant and synapse-type-specific filter taps. Our mathematical analysis of ModProp learning, together with simulation results on benchmark temporal tasks, demonstrate the advantage of ModProp over existing biologically-plausible temporal credit assignment rules. These results suggest a potential neuronal mechanism for signaling credit information related to recurrent interactions over a longer time horizon. Finally, we derive an in-silico implementation of ModProp that could serve as a low-complexity and causal alternative to backpropagation through time.

Submitted 13 January, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

Comments: NeurIPS 2022 Camera Ready

arXiv:1810.11594 [pdf, other]

Convolutional neural networks with extra-classical receptive fields

Authors: Brian Hu, Stefan Mihalas

Abstract: Convolutional neural networks (CNNs) have had great success in many real-world applications and have also been used to model visual processing in the brain. However, these networks are quite brittle - small changes in the input image can dramatically change a network's output prediction. In contrast to what is known from biology, these networks largely rely on feedforward connections, ignoring the influence of recurrent connections. They also focus on supervised rather than unsupervised learning. To address these issues, we combine traditional supervised learning via backpropagation with a specialized unsupervised learning rule to learn lateral connections between neurons within a convolutional neural network. These connections have been shown to optimally integrate information from the surround, generating extra-classical receptive fields for the neurons in our new proposed model (CNNEx). Models with optimal lateral connections are more robust to noise and achieve better performance on noisy versions of the MNIST and CIFAR-10 datasets. Resistance to noise can be further improved by combining our model with additional regularization techniques such as dropout and weight decay. Although the image statistics of MNIST and CIFAR-10 differ greatly, the same unsupervised learning rule generalized to both datasets. Our results demonstrate the potential usefulness of combining supervised and unsupervised learning techniques and suggest that the integration of lateral connections into convolutional neural networks is an important area of future research.

Submitted 27 October, 2018; originally announced October 2018.

arXiv:1609.03622 [pdf]

doi 10.3389/fncom.2017.00028

A Computational Analysis of the Function of Three Inhibitory Cell Types in Contextual Visual Processing

Authors: Jung H. Lee, Christof Koch, Stefan Mihalas

Abstract: Most cortical inhibitory cell types exclusively express one of three genes, parvalbumin, somatostatin and 5HT3a. The visual responses of cortical neurons are affected not only by local cues, but also by visual context. As the inhibitory neuron types have distinctive synaptic sources and targets over different spatial extents and from different areas, we conjecture that they possess distinct roles in contextual processing. We use modeling to relate structural information to function in primary visual cortex (V1) of the mouse, and investigate their role in contextual visual processing. Our findings are threefold. First, the inhibition mediated by parvalbumin positive (PV) cells mediates local processing and could underlie their role in boundary detection. Second, the inhibition mediated by somatostatin-positive (SST) cells facilitates longer range spatial competition among receptive fields. Third, non-specific top-down modulation to interneurons expressing vasoactive intestinal polypeptide (VIP), a subclass of 5HT3a neurons, can selectively enhance V1 responses.

Submitted 12 September, 2016; originally announced September 2016.

Comments: 39 pages, 5 figures, 4 supplemental figures, 2 tables

arXiv:1605.09073 [pdf, other]

doi 10.1103/PhysRevE.98.062312

Feedback through graph motifs relates structure and function in complex networks

Authors: Yu Hu, Steven L. Brunton, Nicholas Cain, Stefan Mihalas, J. Nathan Kutz, Eric Shea-Brown

Abstract: In physics, biology and engineering, network systems abound. How does the connectivity of a network system combine with the behavior of its individual components to determine its collective function? We approach this question for networks with linear time-invariant dynamics by relating internal network feedbacks to the statistical prevalence of connectivity motifs, a set of surprisingly simple and local statistics of connectivity. This results in a reduced order model of the network input-output dynamics in terms of motifs structures. As an example, the new formulation dramatically simplifies the classic Erdos-Renyi graph, reducing the overall network behavior to one proportional feedback wrapped around the dynamics of a single node. For general networks, higher-order motifs systematically provide further layers and types of feedback to regulate the network response. Thus, the local connectivity shapes temporal and spectral processing by the network as a whole, and we show how this enables robust, yet tunable, functionality such as extending the time constant with which networks remember past signals. The theory also extends to networks composed from heterogeneous nodes with distinct dynamics and connectivity, and patterned input to (and readout from) subsets of nodes. These statistical descriptions provide a powerful theoretical framework to understand the functionality of real-world network systems, as we illustrate with examples including the mouse brain connectome.

Submitted 18 December, 2018; v1 submitted 29 May, 2016; originally announced May 2016.

Comments: 31 pages, 20 figures

Journal ref: Phys. Rev. E 98, 062312 (2018)

arXiv:1605.08031 [pdf, other]

High resolution neural connectivity from incomplete tracing data using nonnegative spline regression

Authors: Kameron Decker Harris, Stefan Mihalas, Eric Shea-Brown

Abstract: Whole-brain neural connectivity data are now available from viral tracing experiments, which reveal the connections between a source injection site and elsewhere in the brain. These hold the promise of revealing spatial patterns of connectivity throughout the mammalian brain. To achieve this goal, we seek to fit a weighted, nonnegative adjacency matrix among 100 $μ$m brain "voxels" using viral tracer data. Despite a multi-year experimental effort, injections provide incomplete coverage, and the number of voxels in our data is orders of magnitude larger than the number of injections, making the problem severely underdetermined. Furthermore, projection data are missing within the injection site because local connections there are not separable from the injection signal. We use a novel machine-learning algorithm to meet these challenges and develop a spatially explicit, voxel-scale connectivity map of the mouse visual system. Our method combines three features: a matrix completion loss for missing data, a smoothing spline penalty to regularize the problem, and (optionally) a low rank factorization. We demonstrate the consistency of our estimator using synthetic data and then apply it to newly available Allen Mouse Brain Connectivity Atlas data for the visual system. Our algorithm is significantly more predictive than current state of the art approaches which assume regions to be homogeneous. We demonstrate the efficacy of a low rank version on visual cortex data and discuss the possibility of extending this to a whole-brain connectivity matrix at the voxel scale.

Submitted 26 October, 2016; v1 submitted 24 May, 2016; originally announced May 2016.

Comments: Supplement at https://github.com/kharris/high-res-connectivity-nips-2016

MSC Class: 62J07; 92C20

Journal ref: NIPS, 2016

arXiv:1306.1200 [pdf]

doi 10.3389/fnbeh.2014.00061

A general theory of intertemporal decision-making and the perception of time

Authors: Vijay Mohan K Namboodiri, Stefan Mihalas, Tanya Marton, Marshall G Hussain Shuler

Abstract: Animals and humans make decisions based on their expected outcomes. Since relevant outcomes are often delayed, perceiving delays and choosing between earlier versus later rewards (intertemporal decision-making) is an essential component of animal behavior. The myriad observations made in experiments studying intertemporal decision-making and time perception have not yet been rationalized within a single theory. Here we present a theory, Training-Integrated Maximized Estimation of Reinforcement Rate (TIMERR), that explains a wide variety of behavioral observations made in intertemporal decision-making and the perception of time. Our theory postulates that animals make intertemporal choices to optimize expected reward rates over a limited temporal window; this window includes a past integration interval (over which experienced reward rate is estimated) and the expected delay to future reward. Using this theory, we derive a mathematical expression for the subjective representation of time. A unique contribution of our work is in finding that the past integration interval directly determines the steepness of temporal discounting and the nonlinearity of time perception. In so doing, our theory provides a single framework to understand both intertemporal decision-making and time perception.
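The decision rule described in the abstract above (maximize expected reward rate over a window made of a past integration interval plus the delay to the future reward) can be sketched as follows. This is one reading of the abstract, not the authors' published formulation: the function names, the parameter names, and the numeric values are all hypothetical.

```python
def windowed_reward_rate(reward, delay, past_rate, integration_window):
    """Reward rate over the window [past integration interval + delay].

    Assumption: reward accumulated in the past interval is estimated as
    past_rate * integration_window, and the option adds `reward` after `delay`.
    """
    total_reward = past_rate * integration_window + reward
    total_time = integration_window + delay
    return total_reward / total_time


def choose(options, past_rate, integration_window):
    """Pick the (reward, delay) option with the highest windowed reward rate."""
    return max(
        options,
        key=lambda opt: windowed_reward_rate(
            opt[0], opt[1], past_rate, integration_window
        ),
    )


# Hypothetical choice between a small, soon reward and a large, late one.
options = [(1.0, 1.0), (3.0, 10.0)]  # (reward, delay)
short_window_choice = choose(options, past_rate=0.1, integration_window=2.0)
long_window_choice = choose(options, past_rate=0.1, integration_window=50.0)
```

In this toy setting the short integration window selects the small, soon reward and the long window selects the large, late one, which illustrates how the length of the past integration interval could set the steepness of temporal discounting.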

Submitted 9 November, 2013; v1 submitted 5 June, 2013; originally announced June 2013.

Comments: 37 pages, 4 main figures, 3 supplementary figures

Journal ref: Front. Behav. Neurosci. 8:61

Showing 1–8 of 8 results for author: Mihalas, S