-
Expressivity of Neural Networks with Random Weights and Learned Biases
Authors:
Ezekiel Williams,
Avery Hee-Woon Ryoo,
Thomas Jiralerspong,
Alexandre Payeur,
Matthew G. Perich,
Luca Mazzucato,
Guillaume Lajoie
Abstract:
Landmark universal function approximation results for neural networks with trained weights and biases provided impetus for the ubiquitous use of neural networks as learning models in Artificial Intelligence (AI) and neuroscience. Recent work has pushed the bounds of universal approximation by showing that arbitrary functions can similarly be learned by tuning smaller subsets of parameters, for example the output weights, within randomly initialized networks. Motivated by the fact that biases can be interpreted as biologically plausible mechanisms for adjusting unit outputs in neural networks, such as tonic inputs or activation thresholds, we investigate the expressivity of neural networks with random weights where only biases are optimized. We provide theoretical and numerical evidence demonstrating that feedforward neural networks with fixed random weights can be trained to perform multiple tasks by learning biases only. We further show that an equivalent result holds for recurrent neural networks predicting dynamical system trajectories. Our results are relevant to neuroscience, where they demonstrate the potential for behaviourally relevant changes in dynamics without modifying synaptic weights, as well as for AI, where they shed light on multi-task methods such as bias fine-tuning and unit masking.
Submitted 2 July, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
-
A Unified, Scalable Framework for Neural Population Decoding
Authors:
Mehdi Azabou,
Vinam Arora,
Venkataramana Ganesh,
Ximeng Mao,
Santosh Nachimuthu,
Michael J. Mendelson,
Blake Richards,
Matthew G. Perich,
Guillaume Lajoie,
Eva L. Dyer
Abstract:
Our ability to use deep learning approaches to decipher neural activity would likely benefit from greater scale, in terms of both model size and datasets. However, the integration of many neural recordings into one unified model is challenging, as each recording contains the activity of different neurons from different individual animals. In this paper, we introduce a training framework and architecture designed to model the population dynamics of neural activity across diverse, large-scale neural recordings. Our method first tokenizes individual spikes within the dataset to build an efficient representation of neural events that captures the fine temporal structure of neural activity. We then employ cross-attention and a PerceiverIO backbone to further construct a latent tokenization of neural population activities. Utilizing this architecture and training framework, we construct a large-scale multi-session model trained on large datasets from seven nonhuman primates, spanning over 158 different sessions of recording from over 27,373 neural units and over 100 hours of recordings. In a number of different tasks, we demonstrate that our pretrained model can be rapidly adapted to new, unseen sessions with unspecified neuron correspondence, enabling few-shot performance with minimal labels. This work presents a powerful new approach for building deep learning tools to analyze neural data and stakes out a clear path to training at scale.
Submitted 24 October, 2023;
originally announced October 2023.
-
How connectivity structure shapes rich and lazy learning in neural circuits
Authors:
Yuhan Helena Liu,
Aristide Baratin,
Jonathan Cornford,
Stefan Mihalas,
Eric Shea-Brown,
Guillaume Lajoie
Abstract:
In theoretical neuroscience, recent work leverages deep learning tools to explore how some network attributes critically influence its learning dynamics. Notably, initial weight distributions with small (resp. large) variance may yield a rich (resp. lazy) regime, where significant (resp. minor) changes to network states and representation are observed over the course of learning. However, in biology, neural circuit connectivity could exhibit a low-rank structure and therefore differs markedly from the random initializations generally used for these studies. As such, here we investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime. Through both empirical and theoretical analyses, we discover that high-rank initializations typically yield smaller network changes indicative of lazier learning, a finding we also confirm with experimentally-driven initial connectivity in recurrent neural networks. Conversely, low-rank initialization biases learning towards richer learning. Importantly, however, as an exception to this rule, we find lazier learning can still occur with a low-rank initialization that aligns with task and data statistics. Our research highlights the pivotal role of initial weight structures in shaping learning regimes, with implications for metabolic costs of plasticity and risks of catastrophic forgetting.
Submitted 19 February, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Synaptic Weight Distributions Depend on the Geometry of Plasticity
Authors:
Roman Pogodin,
Jonathan Cornford,
Arna Ghosh,
Gauthier Gidel,
Guillaume Lajoie,
Blake Richards
Abstract:
A growing literature in computational neuroscience leverages gradient descent and learning algorithms that approximate it to study synaptic plasticity in the brain. However, the vast majority of this work ignores a critical underlying assumption: the choice of distance for synaptic changes - i.e. the geometry of synaptic plasticity. Gradient descent assumes that the distance is Euclidean, but many other distances are possible, and there is no reason that biology necessarily uses Euclidean geometry. Here, using the theoretical tools provided by mirror descent, we show that the distribution of synaptic weights will depend on the geometry of synaptic plasticity. We use these results to show that experimentally-observed log-normal weight distributions found in several brain areas are not consistent with standard gradient descent (i.e. a Euclidean geometry), but rather with non-Euclidean distances. Finally, we show that it should be possible to experimentally test for different synaptic geometries by comparing synaptic weight distributions before and after learning. Overall, our work shows that the current paradigm in theoretical work on synaptic plasticity that assumes Euclidean synaptic geometry may be misguided and that it should be possible to experimentally determine the true geometry of synaptic plasticity in the brain.
Submitted 4 March, 2024; v1 submitted 30 May, 2023;
originally announced May 2023.
-
Sources of Richness and Ineffability for Phenomenally Conscious States
Authors:
Xu Ji,
Eric Elmoznino,
George Deane,
Axel Constant,
Guillaume Dumas,
Guillaume Lajoie,
Jonathan Simon,
Yoshua Bengio
Abstract:
Conscious states (states that there is something it is like to be in) seem both rich or full of detail, and ineffable or hard to fully describe or recall. The problem of ineffability, in particular, is a longstanding issue in philosophy that partly motivates the explanatory gap: the belief that consciousness cannot be reduced to underlying physical processes. Here, we provide an information theoretic dynamical systems perspective on the richness and ineffability of consciousness. In our framework, the richness of conscious experience corresponds to the amount of information in a conscious state and ineffability corresponds to the amount of information lost at different stages of processing. We describe how attractor dynamics in working memory would induce impoverished recollections of our original experiences, how the discrete symbolic nature of language is insufficient for describing the rich and high-dimensional structure of experiences, and how similarity in the cognitive function of two individuals relates to improved communicability of their experiences to each other. While our model may not settle all questions relating to the explanatory gap, it makes progress toward a fully physicalist explanation of the richness and ineffability of conscious experience: two important aspects that seem to be part of what makes qualitative character so puzzling.
Submitted 20 June, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
-
Beyond accuracy: generalization properties of bio-plausible temporal credit assignment rules
Authors:
Yuhan Helena Liu,
Arna Ghosh,
Blake A. Richards,
Eric Shea-Brown,
Guillaume Lajoie
Abstract:
To unveil how the brain learns, ongoing work seeks biologically-plausible approximations of gradient descent algorithms for training recurrent neural networks (RNNs). Yet, beyond task accuracy, it is unclear if such learning rules converge to solutions that exhibit different levels of generalization than their nonbiologically-plausible counterparts. Leveraging results from deep learning theory based on loss landscape curvature, we ask: how do biologically-plausible gradient approximations affect generalization? We first demonstrate that state-of-the-art biologically-plausible learning rules for training RNNs exhibit worse and more variable generalization performance compared to their machine learning counterparts that follow the true gradient more closely. Next, we verify that such generalization performance is correlated significantly with loss landscape curvature, and we show that biologically-plausible learning rules tend to approach high-curvature regions in synaptic weight space. Using tools from dynamical systems, we derive theoretical arguments and present a theorem explaining this phenomenon. This predicts our numerical results, and explains why biologically-plausible rules lead to worse and more variable generalization properties. Finally, we suggest potential remedies that could be used by the brain to mitigate this effect. To our knowledge, our analysis is the first to identify the reason for this generalization gap between artificial and biologically-plausible learning rules, which can help guide future investigations into how the brain learns solutions that generalize.
Submitted 13 January, 2023; v1 submitted 1 June, 2022;
originally announced June 2022.
-
Learning shared neural manifolds from multi-subject FMRI data
Authors:
Jessie Huang,
Erica L. Busch,
Tom Wallenstein,
Michal Gerasimiuk,
Andrew Benz,
Guillaume Lajoie,
Guy Wolf,
Nicholas B. Turk-Browne,
Smita Krishnaswamy
Abstract:
Functional magnetic resonance imaging (fMRI) is a notoriously noisy measurement of brain activity because of the large variations between individuals, signals marred by environmental differences during collection, and spatiotemporal averaging required by the measurement resolution. In addition, the data is extremely high dimensional, with the space of the activity typically having much lower intrinsic dimension. In order to understand the connection between stimuli of interest and brain activity, and analyze differences and commonalities between subjects, it becomes important to learn a meaningful embedding of the data that denoises, and reveals its intrinsic structure. Specifically, we assume that while noise varies significantly between individuals, true responses to stimuli will share common, low-dimensional features between subjects which are jointly discoverable. Similar approaches have been exploited previously but they have mainly used linear methods such as PCA and shared response modeling (SRM). In contrast, we propose a neural network called MRMD-AE (manifold-regularized multiple decoder, autoencoder), that learns a common embedding from multiple subjects in an experiment while retaining the ability to decode to individual raw fMRI signals. We show that our learned common space represents an extensible manifold (where new points not seen during training can be mapped), improves the classification accuracy of stimulus features of unseen timepoints, as well as improves cross-subject translation of fMRI signals. We believe this framework can be used for many downstream applications such as guided brain-computer interface (BCI) training in the future.
Submitted 22 December, 2021;
originally announced January 2022.
-
Efficient and robust multi-task learning in the brain with modular latent primitives
Authors:
Christian David Márton,
Léo Gagnon,
Guillaume Lajoie,
Kanaka Rajan
Abstract:
Biological agents do not have infinite resources to learn new things. For this reason, a central aspect of human learning is the ability to recycle previously acquired knowledge in a way that allows for faster, less resource-intensive acquisition of new skills. In spite of that, how neural networks in the brain leverage existing knowledge to learn new computations is not well understood. In this work, we study this question in artificial recurrent neural networks (RNNs) trained on a corpus of commonly used neuroscience tasks. Combining brain-inspired inductive biases we call functional and structural, we propose a system that learns new tasks by building on top of pre-trained latent dynamics organised into separate recurrent modules. These modules, acting as prior knowledge acquired previously through evolution or development, are pre-trained on the statistics of the full corpus of tasks so as to be independent and maximally informative. The resulting model, which we call a Modular Latent Primitives (MoLaP) network, allows for learning multiple tasks while keeping parameter counts, and updates, low. We also show that the skills acquired with our approach are more robust to a broad range of perturbations compared to those acquired with other multi-task learning strategies, and that generalisation to new tasks is facilitated. This work offers a new perspective on achieving efficient multi-task learning in the brain, illustrating the benefits of leveraging pre-trained latent dynamical primitives.
Submitted 25 May, 2022; v1 submitted 28 May, 2021;
originally announced May 2021.
-
Advantages of biologically-inspired adaptive neural activation in RNNs during learning
Authors:
Victor Geadah,
Giancarlo Kerg,
Stefan Horoi,
Guy Wolf,
Guillaume Lajoie
Abstract:
Dynamic adaptation in single-neuron response plays a fundamental role in neural coding in biological neural networks. Yet, most neural activation functions used in artificial networks are fixed and mostly considered as an inconsequential architecture choice. In this paper, we investigate nonlinear activation function adaptation over the large time scale of learning, and outline its impact on sequential processing in recurrent neural networks. We introduce a novel parametric family of nonlinear activation functions, inspired by input-frequency response curves of biological neurons, which allows interpolation between well-known activation functions such as ReLU and sigmoid. Using simple numerical experiments and tools from dynamical systems and information theory, we study the role of neural activation features in learning dynamics. We find that activation adaptation provides distinct task-specific solutions and in some cases, improves both learning speed and performance. Importantly, we find that optimal activation features emerging from our parametric family are considerably different from typical functions used in the literature, suggesting that exploiting the gap between these usual configurations can help learning. Finally, we outline situations where neural activation adaptation alone may help mitigate changes in input statistics in a given task, suggesting mechanisms for transfer learning optimization.
Submitted 22 June, 2020;
originally announced June 2020.
-
Correlation-based model of artificially induced plasticity in motor cortex by a bidirectional brain-computer interface
Authors:
Guillaume Lajoie,
Nedialko I. Krouchev,
John F. Kalaska,
Adrienne L. Fairhall,
Eberhard E. Fetz
Abstract:
Experiments show that spike-triggered stimulation performed with Bidirectional Brain-Computer-Interfaces (BBCI) can artificially strengthen connections between separate neural sites in motor cortex (MC). What are the neuronal mechanisms responsible for these changes and how does targeted stimulation by a BBCI shape population-level synaptic connectivity? The present work describes a recurrent neural network model with probabilistic spiking mechanisms and plastic synapses capable of capturing both neural and synaptic activity statistics relevant to BBCI conditioning protocols. When spikes from a neuron recorded at one MC site trigger stimuli at a second target site after a fixed delay, the connections between sites are strengthened for spike-stimulus delays consistent with experimentally derived spike time dependent plasticity (STDP) rules. However, the relationship between STDP mechanisms at the level of networks, and their modification with neural implants remains poorly understood. Using our model, we successfully reproduce key experimental results and use analytical derivations, along with novel experimental data. We then derive optimal operational regimes for BBCIs, and formulate predictions concerning the efficacy of spike-triggered stimulation in different regimes of cortical activity.
Submitted 2 September, 2016;
originally announced September 2016.
-
Revisiting chaos in stimulus-driven spiking networks: signal encoding and discrimination
Authors:
Guillaume Lajoie,
Kevin K Lin,
Jean-Philippe Thivierge,
Eric Shea-Brown
Abstract:
Highly connected recurrent neural networks often produce chaotic dynamics, meaning their precise activity is sensitive to small perturbations. What are the consequences for how such networks encode streams of temporal stimuli? On the one hand, chaos is a strong source of randomness, suggesting that small changes in stimuli will be obscured by intrinsically generated variability. On the other hand, recent work shows that the type of chaos that occurs in spiking networks can have a surprisingly low-dimensional structure, suggesting that there may be "room" for fine stimulus features to be precisely resolved. Here we show that strongly chaotic networks produce patterned spikes that reliably encode time-dependent stimuli: using a decoder sensitive to spike times on timescales of tens of ms, one can easily distinguish responses to very similar inputs. Moreover, recurrence serves to distribute signals throughout chaotic networks so that small groups of cells can encode substantial information about signals arriving elsewhere. A conclusion is that the presence of strong chaos in recurrent networks does not prohibit precise stimulus encoding.
Submitted 25 April, 2016;
originally announced April 2016.
-
Dynamic signal tracking in a simple V1 spiking model
Authors:
Guillaume Lajoie,
Lai-Sang Young
Abstract:
This work is part of an effort to understand the neural basis for our visual system's ability, or failure, to accurately track moving visual signals. We consider here a ring model of spiking neurons, intended as a simplified computational model of a single hypercolumn of the primary visual cortex. Signals that consist of edges with time-varying orientations localized in space are considered. Our model is calibrated to produce spontaneous and driven firing rates roughly consistent with experiments, and our two main findings, for which we offer dynamical explanation on the level of neuronal interactions, are the following: (1) We have documented consistent transient overshoots in signal perception following signal switches due to emergent interactions of the E- and I-populations, and (2) for continuously moving signals, we have found that accuracy is considerably lower at reversals of orientation than when continuing in the same direction (as when the signal is a rotating bar). To measure performance, we use two metrics, called fidelity and reliability, to compare signals reconstructed by the system to the ones presented, and to assess trial-to-trial variability. We propose that the same population mechanisms responsible for orientation selectivity also impose constraints on dynamic signal tracking that manifest in perception failures consistent with psychophysical observations.
Submitted 11 January, 2016;
originally announced January 2016.
-
Structured chaos shapes spike-response noise entropy in balanced neural networks
Authors:
Guillaume Lajoie,
Jean-Philippe Thivierge,
Eric Shea-Brown
Abstract:
Large networks of sparsely coupled, excitatory and inhibitory cells occur throughout the brain. A striking feature of these networks is that they are chaotic. How does this chaos manifest in the neural code? Specifically, how variable are the spike patterns that such a network produces in response to an input signal? To answer this, we derive a bound for the entropy of multi-cell spike pattern distributions in large recurrent networks of spiking neurons responding to fluctuating inputs. The analysis is based on results from random dynamical systems theory and is complemented by detailed numerical simulations. We find that the spike pattern entropy is an order of magnitude lower than what would be extrapolated from single cells. This holds despite the fact that network coupling becomes vanishingly sparse as network size grows -- a phenomenon that depends on "extensive chaos," as previously discovered for balanced networks without stimulus drive. Moreover, we show how spike pattern entropy is controlled by temporal features of the inputs. Our findings provide insight into how neural networks may encode stimuli in the presence of inherently chaotic dynamics.
Submitted 22 February, 2014; v1 submitted 27 November, 2013;
originally announced November 2013.
-
Chaos and reliability in balanced spiking networks with temporal drive
Authors:
Guillaume Lajoie,
Kevin K. Lin,
Eric Shea-Brown
Abstract:
Biological information processing is often carried out by complex networks of interconnected dynamical units. A basic question about such networks is that of reliability: if the same signal is presented many times with the network in different initial states, will the system entrain to the signal in a repeatable way? Reliability is of particular interest in neuroscience, where large, complex networks of excitatory and inhibitory cells are ubiquitous. These networks are known to autonomously produce strongly chaotic dynamics - an obvious threat to reliability. Here, we show that such chaos persists in the presence of weak and strong stimuli, but that even in the presence of chaos, intermittent periods of highly reliable spiking often coexist with unreliable activity. We elucidate the local dynamical mechanisms involved in this intermittent reliability, and investigate the relationship between this phenomenon and certain time-dependent attractors arising from the dynamics. A conclusion is that chaotic dynamics do not have to be an obstacle to precise spike responses, a fact with implications for signal coding in large networks.
Submitted 10 April, 2013; v1 submitted 13 September, 2012;
originally announced September 2012.