Showing 1–13 of 13 results for author: Golkar, S

Searching in archive stat.

  1. arXiv:2406.02585

    cs.LG cs.AI stat.ML

    Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task

    Authors: Siavash Golkar, Alberto Bietti, Mariel Pettee, Michael Eickenberg, Miles Cranmer, Keiya Hirashima, Geraud Krawezik, Nicholas Lourie, Michael McCabe, Rudy Morel, Ruben Ohana, Liam Holden Parker, Bruno Régaldo-Saint Blancard, Kyunghyun Cho, Shirley Ho

    Abstract: Transformers have revolutionized machine learning across diverse domains, yet understanding their behavior remains crucial, particularly in high-stakes applications. This paper introduces the contextual counting task, a novel toy problem aimed at enhancing our understanding of Transformers in quantitative and scientific contexts. This task requires precise localization and computation within datas…

    Submitted 30 May, 2024; originally announced June 2024.

  2. arXiv:2401.03248

    q-bio.NC eess.SY nlin.CD stat.ML

    Neuronal Temporal Filters as Normal Mode Extractors

    Authors: Siavash Golkar, Jules Berman, David Lipshutz, Robert Mihai Haret, Tim Gollisch, Dmitri B. Chklovskii

    Abstract: To generate actions in the face of physiological delays, the brain must predict the future. Here we explore how prediction may lie at the core of brain function by considering a neuron predicting the future of a scalar time series input. Assuming that the dynamics of the lag vector (a vector composed of several consecutive elements of the time series) are locally linear, Normal Mode Decomposition…

    Submitted 6 January, 2024; originally announced January 2024.

  3. arXiv:2310.02994

    cs.LG cs.AI stat.ML

    Multiple Physics Pretraining for Physical Surrogate Models

    Authors: Michael McCabe, Bruno Régaldo-Saint Blancard, Liam Holden Parker, Ruben Ohana, Miles Cranmer, Alberto Bietti, Michael Eickenberg, Siavash Golkar, Geraud Krawezik, Francois Lanusse, Mariel Pettee, Tiberiu Tesileanu, Kyunghyun Cho, Shirley Ho

    Abstract: We introduce multiple physics pretraining (MPP), an autoregressive task-agnostic pretraining approach for physical surrogate modeling. MPP involves training large surrogate models to predict the dynamics of multiple heterogeneous physical systems simultaneously by learning features that are broadly useful across diverse physical tasks. In order to learn effectively in this setting, we introduce a…

    Submitted 4 October, 2023; originally announced October 2023.

  4. arXiv:2310.02989

    stat.ML cs.AI cs.CL cs.LG

    xVal: A Continuous Number Encoding for Large Language Models

    Authors: Siavash Golkar, Mariel Pettee, Michael Eickenberg, Alberto Bietti, Miles Cranmer, Geraud Krawezik, Francois Lanusse, Michael McCabe, Ruben Ohana, Liam Parker, Bruno Régaldo-Saint Blancard, Tiberiu Tesileanu, Kyunghyun Cho, Shirley Ho

    Abstract: Large Language Models have not yet been broadly adapted for the analysis of scientific datasets due in part to the unique difficulties of tokenizing numbers. We propose xVal, a numerical encoding scheme that represents any real number using just a single token. xVal represents a given real number by scaling a dedicated embedding vector by the number value. Combined with a modified number-inference…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 10 pages 7 figures. Supplementary: 5 pages 2 figures

  5. arXiv:2302.10051

    q-bio.NC cs.NE stat.ML

    Normative framework for deriving neural networks with multi-compartmental neurons and non-Hebbian plasticity

    Authors: David Lipshutz, Yanis Bahroun, Siavash Golkar, Anirvan M. Sengupta, Dmitri B. Chklovskii

    Abstract: An established normative approach for understanding the algorithmic basis of neural computation is to derive online algorithms from principled computational objectives and evaluate their compatibility with anatomical and physiological observations. Similarity matching objectives have served as successful starting points for deriving online algorithms that map onto neural networks (NNs) with point…

    Submitted 3 August, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Added: Figure 1, sections 2, 3

  6. arXiv:2211.07723

    stat.ML cs.LG cs.NE

    An online algorithm for contrastive Principal Component Analysis

    Authors: Siavash Golkar, David Lipshutz, Tiberiu Tesileanu, Dmitri B. Chklovskii

    Abstract: Finding informative low-dimensional representations that can be computed efficiently in large datasets is an important problem in data analysis. Recently, contrastive Principal Component Analysis (cPCA) was proposed as a more informative generalization of PCA that takes advantage of contrastive learning. However, the performance of cPCA is sensitive to hyper-parameter choice and there is currently…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: 5 pages, 4 figures

  7. arXiv:2010.12644

    q-bio.NC cs.LG cs.NE stat.ML

    A biologically plausible neural network for Slow Feature Analysis

    Authors: David Lipshutz, Charlie Windolf, Siavash Golkar, Dmitri B. Chklovskii

    Abstract: Learning latent features from time series data is an important problem in both machine learning and brain function. One approach, called Slow Feature Analysis (SFA), leverages the slowness of many salient features relative to the rapidly varying input signals. Furthermore, when trained on naturalistic stimuli, SFA reproduces interesting properties of cells in the primary visual cortex and hippocam…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: 17 pages, 7 figures

  8. arXiv:2010.00525

    q-bio.NC cs.NE stat.ML

    A biologically plausible neural network for multi-channel Canonical Correlation Analysis

    Authors: David Lipshutz, Yanis Bahroun, Siavash Golkar, Anirvan M. Sengupta, Dmitri B. Chklovskii

    Abstract: Cortical pyramidal neurons receive inputs from multiple distinct neural populations and integrate these inputs in separate dendritic compartments. We explore the possibility that cortical microcircuits implement Canonical Correlation Analysis (CCA), an unsupervised learning method that projects the inputs onto a common subspace so as to maximize the correlations between the projections. To this en…

    Submitted 26 March, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

    Comments: 46 pages, 14 figures

  9. arXiv:1911.11691

    cs.LG cs.NE q-bio.NC stat.ML

    Emergent Structures and Lifetime Structure Evolution in Artificial Neural Networks

    Authors: Siavash Golkar

    Abstract: Motivated by the flexibility of biological neural networks whose connectivity structure changes significantly during their lifetime, we introduce the Unstructured Recursive Network (URN) and demonstrate that it can exhibit similar flexibility during training via gradient descent. We show empirically that many of the different neural network structures commonly used in practice today (including ful…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: Proceedings of NeurIPS workshop on Real Neurons & Hidden Units. 5 Pages, 6 figures

  10. arXiv:1905.05843

    cs.LG stat.ML

    Task-Driven Data Verification via Gradient Descent

    Authors: Siavash Golkar, Kyunghyun Cho

    Abstract: We introduce a novel algorithm for the detection of possible sample corruption such as mislabeled samples in a training dataset given a small clean validation set. We use a set of inclusion variables which determine whether or not any element of the noisy training set should be included in the training of a network. We compute these inclusion variables by optimizing the performance of the network…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: 10 pages, 6 figures

  11. arXiv:1904.05903

    quant-ph hep-lat physics.comp-ph stat.ML

    Inferring the quantum density matrix with machine learning

    Authors: Kyle Cranmer, Siavash Golkar, Duccio Pappadopulo

    Abstract: We introduce two methods for estimating the density matrix for a quantum system: Quantum Maximum Likelihood and Quantum Variational Inference. In these methods, we construct a variational family to model the density matrix of a mixed quantum state. We also introduce quantum flows, the quantum analog of normalizing flows, which can be used to increase the expressivity of this variational family. Th…

    Submitted 11 April, 2019; originally announced April 2019.

    Comments: 12 pages, 3 figures

  12. arXiv:1903.04476

    cs.LG cs.NE q-bio.NC stat.ML

    Continual Learning via Neural Pruning

    Authors: Siavash Golkar, Michael Kagan, Kyunghyun Cho

    Abstract: We introduce Continual Learning via Neural Pruning (CLNP), a new method aimed at lifelong learning in fixed capacity models based on neuronal model sparsification. In this method, subsequent tasks are trained using the inactive neurons and filters of the sparsified network and cause zero deterioration to the performance of previous tasks. In order to deal with the possible compromise between model…

    Submitted 11 March, 2019; originally announced March 2019.

    Comments: 12 pages, 5 figures, 3 tables

  13. arXiv:1806.01337

    stat.ML cs.LG

    Backdrop: Stochastic Backpropagation

    Authors: Siavash Golkar, Kyle Cranmer

    Abstract: We introduce backdrop, a flexible and simple-to-implement method, intuitively described as dropout acting only along the backpropagation pipeline. Backdrop is implemented via one or more masking layers which are inserted at specific points along the network. Each backdrop masking layer acts as the identity in the forward pass, but randomly masks parts of the backward gradient propagation. Intuitiv…

    Submitted 4 June, 2018; originally announced June 2018.

    Comments: 11 pages, 9 figures, 2 tables. Source code available at https://github.com/dexgen/backdrop