-
Duality Principle and Biologically Plausible Learning: Connecting the Representer Theorem and Hebbian Learning
Authors:
Yanis Bahroun,
Dmitri B. Chklovskii,
Anirvan M. Sengupta
Abstract:
A normative approach called Similarity Matching was recently introduced for deriving and understanding the algorithmic basis of neural computation focused on unsupervised problems. It involves deriving algorithms from computational objectives and evaluating their compatibility with anatomical and physiological observations. In particular, it introduces neural architectures by considering dual alternatives instead of primal formulations of popular models such as PCA. However, its connection to the Representer theorem remains unexplored. In this work, we propose to use teachings from this approach to explore supervised learning algorithms and clarify the notion of Hebbian learning. We examine regularized supervised learning and elucidate the emergence of neural architecture and additive versus multiplicative update rules. In this work, we focus not on developing new algorithms but on showing that the Representer theorem offers the perfect lens to study biologically plausible learning algorithms. We argue that many past and current advancements in the field rely on some form of dual formulation to introduce biological plausibility. In short, as long as a dual formulation exists, it is possible to derive biologically plausible algorithms. Our work sheds light on the pivotal role of the Representer theorem in advancing our comprehension of neural computation.
Submitted 2 August, 2023;
originally announced September 2023.
-
Unlocking the Potential of Similarity Matching: Scalability, Supervision and Pre-training
Authors:
Yanis Bahroun,
Shagesh Sridharan,
Atithi Acharya,
Dmitri B. Chklovskii,
Anirvan M. Sengupta
Abstract:
While effective, the backpropagation (BP) algorithm exhibits limitations in terms of biological plausibility, computational cost, and suitability for online learning. As a result, there has been a growing interest in developing alternative biologically plausible learning approaches that rely on local learning rules. This study focuses on the primarily unsupervised similarity matching (SM) framework, which aligns with observed mechanisms in biological systems and offers online, localized, and biologically plausible algorithms. i) To scale SM to large datasets, we propose an implementation of Convolutional Nonnegative SM using PyTorch. ii) We introduce a localized supervised SM objective reminiscent of canonical correlation analysis, facilitating stacking SM layers. iii) We leverage the PyTorch implementation for pre-training architectures such as LeNet and compare the evaluation of features against BP-trained models. This work combines biologically plausible algorithms with computational efficiency opening multiple avenues for further explorations.
Submitted 2 August, 2023;
originally announced August 2023.
-
Normative framework for deriving neural networks with multi-compartmental neurons and non-Hebbian plasticity
Authors:
David Lipshutz,
Yanis Bahroun,
Siavash Golkar,
Anirvan M. Sengupta,
Dmitri B. Chklovskii
Abstract:
An established normative approach for understanding the algorithmic basis of neural computation is to derive online algorithms from principled computational objectives and evaluate their compatibility with anatomical and physiological observations. Similarity matching objectives have served as successful starting points for deriving online algorithms that map onto neural networks (NNs) with point neurons and Hebbian/anti-Hebbian plasticity. These NN models account for many anatomical and physiological observations; however, the objectives have limited computational power and the derived NNs do not explain multi-compartmental neuronal structures and non-Hebbian forms of plasticity that are prevalent throughout the brain. In this article, we unify and generalize recent extensions of the similarity matching approach to address more complex objectives, including a large class of unsupervised and self-supervised learning tasks that can be formulated as symmetric generalized eigenvalue problems or nonnegative matrix factorization problems. Interestingly, the online algorithms derived from these objectives naturally map onto NNs with multi-compartmental neurons and local, non-Hebbian learning rules. Therefore, this unified extension of the similarity matching approach provides a normative framework that facilitates understanding multi-compartmental neuronal structures and non-Hebbian plasticity found throughout the brain.
Submitted 3 August, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
A Normative and Biologically Plausible Algorithm for Independent Component Analysis
Authors:
Yanis Bahroun,
Dmitri B Chklovskii,
Anirvan M Sengupta
Abstract:
The brain effortlessly solves blind source separation (BSS) problems, but the algorithm it uses remains elusive. In signal processing, linear BSS problems are often solved by Independent Component Analysis (ICA). To serve as a model of a biological circuit, the ICA neural network (NN) must satisfy at least the following requirements: 1. The algorithm must operate in the online setting where data samples are streamed one at a time, and the NN computes the sources on the fly without storing any significant fraction of the data in memory. 2. The synaptic weight update is local, i.e., it depends only on the biophysical variables present in the vicinity of a synapse. Here, we propose a novel objective function for ICA from which we derive a biologically plausible NN, including both the neural architecture and the synaptic learning rules. Interestingly, our algorithm relies on modulating synaptic plasticity by the total activity of the output neurons. In the brain, this could be accomplished by neuromodulators, extracellular calcium, local field potential, or nitric oxide.
Submitted 16 November, 2021;
originally announced November 2021.
-
Neural optimal feedback control with local learning rules
Authors:
Johannes Friedrich,
Siavash Golkar,
Shiva Farashahi,
Alexander Genkin,
Anirvan M. Sengupta,
Dmitri B. Chklovskii
Abstract:
A major problem in motor control is understanding how the brain plans and executes proper movements in the face of delayed and noisy stimuli. A prominent framework for addressing such control problems is Optimal Feedback Control (OFC). OFC generates control actions that optimize behaviorally relevant criteria by integrating noisy sensory stimuli and the predictions of an internal model using the Kalman filter or its extensions. However, a satisfactory neural model of Kalman filtering and control is lacking because existing proposals have the following limitations: not considering the delay of sensory feedback, training in alternating phases, and requiring knowledge of the noise covariance matrices, as well as that of systems dynamics. Moreover, the majority of these studies considered Kalman filtering in isolation, and not jointly with control. To address these shortcomings, we introduce a novel online algorithm which combines adaptive Kalman filtering with a model free control approach (i.e., policy gradient algorithm). We implement this algorithm in a biologically plausible neural network with local synaptic plasticity rules. This network performs system identification and Kalman filtering, without the need for multiple phases with distinct update rules or the knowledge of the noise covariances. It can perform state estimation with delayed sensory feedback, with the help of an internal model. It learns the control policy without requiring any knowledge of the dynamics, thus avoiding the need for weight transport. In this way, our implementation of OFC solves the credit assignment problem needed to produce the appropriate sensory-motor control in the presence of stimulus delay.
Submitted 12 November, 2021;
originally announced November 2021.
-
Informationally complete POVM-based shadow tomography
Authors:
Atithi Acharya,
Siddhartha Saha,
Anirvan M. Sengupta
Abstract:
Recently introduced shadow tomography protocols use classical shadows of quantum states to predict many target functions of an unknown quantum state. Unlike full quantum state tomography, shadow tomography does not insist on accurate recovery of the density matrix for high rank mixed states. Yet, such a protocol makes multiple accurate predictions with high confidence, based on a moderate number of quantum measurements. One particular influential algorithm, proposed by Huang, Kueng, and Preskill arXiv:2002.08953, requires additional circuits for performing certain random unitary transformations. In this paper, we avoid these transformations but employ an arbitrary informationally complete POVM and show that such a procedure can compute k-bit correlation functions for quantum states reliably. We also show that, for this application, we do not need the median of means procedure of Huang et al. Finally, we discuss the contrast between the computation of correlation functions and fidelity of reconstruction of low rank density matrices.
Submitted 26 May, 2021; v1 submitted 12 May, 2021;
originally announced May 2021.
-
A Similarity-preserving Neural Network Trained on Transformed Images Recapitulates Salient Features of the Fly Motion Detection Circuit
Authors:
Yanis Bahroun,
Anirvan M. Sengupta,
Dmitri B. Chklovskii
Abstract:
Learning to detect content-independent transformations from data is one of the central problems in biological and artificial intelligence. An example of such problem is unsupervised learning of a visual motion detector from pairs of consecutive video frames. Rao and Ruderman formulated this problem in terms of learning infinitesimal transformation operators (Lie group generators) via minimizing image reconstruction error. Unfortunately, it is difficult to map their model onto a biologically plausible neural network (NN) with local learning rules. Here we propose a biologically plausible model of motion detection. We also adopt the transformation-operator approach but, instead of reconstruction-error minimization, start with a similarity-preserving objective function. An online algorithm that optimizes such an objective function naturally maps onto an NN with biologically plausible learning rules. The trained NN recapitulates major features of the well-studied motion detector in the fly. In particular, it is consistent with the experimental observation that local motion detectors combine information from at least three adjacent pixels, something that contradicts the celebrated Hassenstein-Reichardt model.
Submitted 10 February, 2021;
originally announced February 2021.
-
A biologically plausible neural network for local supervision in cortical microcircuits
Authors:
Siavash Golkar,
David Lipshutz,
Yanis Bahroun,
Anirvan M. Sengupta,
Dmitri B. Chklovskii
Abstract:
The backpropagation algorithm is an invaluable tool for training artificial neural networks; however, because of a weight sharing requirement, it does not provide a plausible model of brain function. Here, in the context of a two-layer network, we derive an algorithm for training a neural network which avoids this problem by not requiring explicit error computation and backpropagation. Furthermore, our algorithm maps onto a neural network that bears a remarkable resemblance to the connectivity structure and learning rules of the cortex. We find that our algorithm empirically performs comparably to backprop on a number of datasets.
Submitted 30 November, 2020;
originally announced November 2020.
-
A simple normative network approximates local non-Hebbian learning in the cortex
Authors:
Siavash Golkar,
David Lipshutz,
Yanis Bahroun,
Anirvan M. Sengupta,
Dmitri B. Chklovskii
Abstract:
To guide behavior, the brain extracts relevant features from high-dimensional data streamed by sensory organs. Neuroscience experiments demonstrate that the processing of sensory inputs by cortical neurons is modulated by instructive signals which provide context and task-relevant information. Here, adopting a normative approach, we model these instructive signals as supervisory inputs guiding the projection of the feedforward data. Mathematically, we start with a family of Reduced-Rank Regression (RRR) objective functions which include Reduced Rank (minimum) Mean Square Error (RRMSE) and Canonical Correlation Analysis (CCA), and derive novel offline and online optimization algorithms, which we call Bio-RRR. The online algorithms can be implemented by neural networks whose synaptic learning rules resemble calcium plateau potential dependent plasticity observed in the cortex. We detail how, in our model, the calcium plateau potential can be interpreted as a backpropagating error signal. We demonstrate that, despite relying exclusively on biologically plausible local learning rules, our algorithms perform competitively with existing implementations of RRMSE and CCA.
Submitted 23 October, 2020;
originally announced October 2020.
-
A biologically plausible neural network for multi-channel Canonical Correlation Analysis
Authors:
David Lipshutz,
Yanis Bahroun,
Siavash Golkar,
Anirvan M. Sengupta,
Dmitri B. Chklovskii
Abstract:
Cortical pyramidal neurons receive inputs from multiple distinct neural populations and integrate these inputs in separate dendritic compartments. We explore the possibility that cortical microcircuits implement Canonical Correlation Analysis (CCA), an unsupervised learning method that projects the inputs onto a common subspace so as to maximize the correlations between the projections. To this end, we seek a multi-channel CCA algorithm that can be implemented in a biologically plausible neural network. For biological plausibility, we require that the network operates in the online setting and its synaptic update rules are local. Starting from a novel CCA objective function, we derive an online optimization algorithm whose optimization steps can be implemented in a single-layer neural network with multi-compartmental neurons and local non-Hebbian learning rules. We also derive an extension of our online CCA algorithm with adaptive output rank and output whitening. Interestingly, the extension maps onto a neural network whose neural architecture and synaptic updates resemble neural circuitry and synaptic plasticity observed experimentally in cortical pyramidal neurons.
Submitted 26 March, 2021; v1 submitted 1 October, 2020;
originally announced October 2020.
-
A Neural Network for Semi-Supervised Learning on Manifolds
Authors:
Alexander Genkin,
Anirvan M. Sengupta,
Dmitri Chklovskii
Abstract:
Semi-supervised learning algorithms typically construct a weighted graph of data points to represent a manifold. However, an explicit graph representation is problematic for neural networks operating in the online setting. Here, we propose a feed-forward neural network capable of semi-supervised learning on manifolds without using an explicit graph representation. Our algorithm uses channels that represent localities on the manifold such that correlations between channels represent manifold structure. The proposed neural network has two layers. The first layer learns to build a representation of low-dimensional manifolds in the input data as proposed recently in [8]. The second learns to classify data using both occasional supervision and similarity of the manifold representation of the data. The channel carrying label information for the second layer is assumed to be "silent" most of the time. Learning in both layers is Hebbian, making our network design biologically plausible. We experimentally demonstrate the effect of semi-supervised learning on non-trivial manifolds.
Submitted 21 August, 2019;
originally announced August 2019.
-
Clustering is semidefinitely not that hard: Nonnegative SDP for manifold disentangling
Authors:
Mariano Tepper,
Anirvan M. Sengupta,
Dmitri Chklovskii
Abstract:
In solving hard computational problems, semidefinite program (SDP) relaxations often play an important role because they come with a guarantee of optimality. Here, we focus on a popular semidefinite relaxation of K-means clustering which yields the same solution as the non-convex original formulation for well segregated datasets. We report an unexpected finding: when data contains (greater than zero-dimensional) manifolds, the SDP solution captures such geometrical structures. Unlike traditional manifold embedding techniques, our approach does not rely on manually defining a kernel but rather enforces locality via a nonnegativity constraint. We thus call our approach NOnnegative MAnifold Disentangling, or NOMAD. To build an intuitive understanding of its manifold learning capabilities, we develop a theoretical analysis of NOMAD on idealized datasets. While NOMAD is convex and the globally optimal solution can be found by generic SDP solvers with polynomial time complexity, they are too slow for modern datasets. To address this problem, we analyze a non-convex heuristic and present a new, convex and yet efficient, algorithm, based on the conditional gradient method. Our results render NOMAD a versatile, understandable, and powerful tool for manifold learning.
Submitted 5 September, 2018; v1 submitted 19 June, 2017;
originally announced June 2017.
-
Critical Behavior and Universality Classes for an Algorithmic Phase Transition in Sparse Reconstruction
Authors:
Mohammad Ramezanali,
Partha P. Mitra,
Anirvan M. Sengupta
Abstract:
Recovery of an $N$-dimensional, $K$-sparse solution $\mathbf{x}$ from an $M$-dimensional vector of measurements $\mathbf{y}$ for multivariate linear regression can be accomplished by minimizing a suitably penalized least-mean-square cost $||\mathbf{y}-\mathbf{H} \mathbf{x}||_2^2+λV(\mathbf{x})$. Here $\mathbf{H}$ is a known matrix and $V(\mathbf{x})$ is an algorithm-dependent sparsity-inducing penalty. For `random' $\mathbf{H}$, in the limit $λ\rightarrow 0$ and $M,N,K\rightarrow \infty$, keeping $ρ=K/N$ and $α=M/N$ fixed, exact recovery is possible for $α$ past a critical value $α_c = α(ρ)$. Assuming $\mathbf{x}$ has iid entries, the critical curve exhibits some universality, in that its shape does not depend on the distribution of $\mathbf{x}$. However, the algorithmic phase transition occurring at $α=α_c$ and associated universality classes remain ill-understood from a statistical physics perspective, i.e. in terms of scaling exponents near the critical curve. In this article, we analyze the mean-field equations for two algorithms, Basis Pursuit ($V(\mathbf{x})=||\mathbf{x}||_{1} $) and Elastic Net ($V(\mathbf{x})= ||\mathbf{x}||_{1} + \tfrac{g}{2} ||\mathbf{x}||_{2}^2$) and show that they belong to different universality classes in the sense of scaling exponents, with Mean Squared Error (MSE) of the recovered vector scaling as $λ^\frac{4}{3}$ and $λ$ respectively, for small $λ$ on the critical line. In the presence of additive noise, we find that, when $α>α_c$, MSE is minimized at a non-zero value for $λ$, whereas at $α=α_c$, MSE always increases with $λ$.
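As an illustration only (not code from the paper), the penalized cost written in this abstract can be evaluated with a short sketch. The function name `penalized_cost` and its argument names are hypothetical; setting `g=0` gives the Basis Pursuit penalty and `g>0` gives the Elastic Net penalty as defined above. The sketch only evaluates the cost for a given candidate vector; it does not minimize it.

```python
def penalized_cost(y, H, x, lam, g=0.0):
    """Evaluate ||y - H x||_2^2 + lam * V(x), with
    V(x) = ||x||_1 + (g/2) * ||x||_2^2 (g = 0 reduces to the L1 penalty).

    y: list of M measurements; H: M x N matrix as a list of rows;
    x: list of N candidate coefficients. All names are illustrative.
    """
    # Predicted measurements H x, computed row by row.
    predicted = [sum(h_ij * x_j for h_ij, x_j in zip(row, x)) for row in H]
    # Squared Euclidean norm of the residual y - H x.
    squared_error = sum((y_i - p_i) ** 2 for y_i, p_i in zip(y, predicted))
    # Sparsity-inducing penalty V(x).
    penalty = sum(abs(x_j) for x_j in x) + 0.5 * g * sum(x_j ** 2 for x_j in x)
    return squared_error + lam * penalty
```

For example, with `H` the 2x2 identity, `x = [1.0, -2.0]` and `y = [1.0, -2.0]`, the residual vanishes and the cost reduces to `lam` times the penalty alone.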
Submitted 28 October, 2019; v1 submitted 29 September, 2015;
originally announced September 2015.
-
The cavity method for analysis of large-scale penalized regression
Authors:
Mohammad Ramezanali,
Partha P. Mitra,
Anirvan M. Sengupta
Abstract:
Penalized regression methods aim to retrieve reliable predictors among a large set of putative ones from a limited amount of measurements. In particular, penalized regression with singular penalty functions is important for sparse reconstruction algorithms. For large-scale problems, these algorithms exhibit sharp phase transition boundaries where sparse retrieval breaks down. Large optimization problems associated with sparse reconstruction have been analyzed in the literature by setting up corresponding statistical mechanical models at a finite temperature. Using replica method for mean field approximation, and subsequently taking a zero temperature limit, this approach reproduces the algorithmic phase transition boundaries. Unfortunately, the replica trick and the non-trivial zero temperature limit obscure the underlying reasons for the failure of a sparse reconstruction algorithm, and of penalized regression methods, in general. In this paper, we employ the ``cavity method'' to give an alternative derivation of the mean field equations, working directly in the zero-temperature limit. This derivation provides insight into the origin of the different terms in the self-consistency conditions. The cavity method naturally involves a quantity, the average local susceptibility, whose behavior distinguishes different phases in this system. This susceptibility can be generalized for analysis of a broader class of sparse reconstruction algorithms.
Submitted 25 November, 2015; v1 submitted 13 January, 2015;
originally announced January 2015.
-
SLIQ: Simple Linear Inequalities for Efficient Contig Scaffolding
Authors:
Rajat S. Roy,
Kevin C. Chen,
Anirvan M. Sengupta,
Alexander Schliep
Abstract:
Scaffolding is an important subproblem in "de novo" genome assembly in which mate pair data are used to construct a linear sequence of contigs separated by gaps. Here we present SLIQ, a set of simple linear inequalities derived from the geometry of contigs on the line that can be used to predict the relative positions and orientations of contigs from individual mate pair reads and thus produce a contig digraph. The SLIQ inequalities can also filter out unreliable mate pairs and can be used as a preprocessing step for any scaffolding algorithm. We tested the SLIQ inequalities on five real data sets ranging in complexity from simple bacterial genomes to complex mammalian genomes and compared the results to the majority voting procedure used by many other scaffolding algorithms. SLIQ predicted the relative positions and orientations of the contigs with high accuracy in all cases and gave more accurate position predictions than majority voting for complex genomes, in particular the human genome. Finally, we present a simple scaffolding algorithm that produces linear scaffolds given a contig digraph. We show that our algorithm is very efficient compared to other scaffolding algorithms while maintaining high accuracy in predicting both contig positions and orientations for real data sets.
Submitted 9 November, 2011; v1 submitted 6 November, 2011;
originally announced November 2011.