Search | arXiv e-print repository

Binding in hippocampal-entorhinal circuits enables compositionality in cognitive maps

Authors: Christopher J. Kymn, Sonia Mazelet, Anthony Thomas, Denis Kleyko, E. Paxon Frady, Friedrich T. Sommer, Bruno A. Olshausen

Abstract: We propose a normative model for spatial representation in the hippocampal formation that combines optimality principles, such as maximizing coding range and spatial information per neuron, with an algebraic framework for computing in distributed representation. Spatial position is encoded in a residue number system, with individual residues represented by high-dimensional, complex-valued vectors. These are composed into a single vector representing position by a similarity-preserving, conjunctive vector-binding operation. Self-consistency between the representations of the overall position and of the individual residues is enforced by a modular attractor network whose modules correspond to the grid cell modules in entorhinal cortex. The vector binding operation can also associate different contexts to spatial representations, yielding a model for entorhinal cortex and hippocampus. We show that the model achieves normative desiderata including superlinear scaling of patterns with dimension, robust error correction, and hexagonal, carry-free encoding of spatial position. These properties in turn enable robust path integration and association with sensory inputs. More generally, the model formalizes how compositional computations could occur in the hippocampal formation and leads to testable experimental predictions.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 23 pages, 12 figures

arXiv:2406.03489 [pdf, ps, other]

A self-aligning recirculated crossed optical dipole trap for lithium atoms

Authors: Ming Lian, Maximillian Mrozek-McCourt, Christopher K. Angyal, Dadbeh Shaddel, Zachary J. Blogg, John R. Griffin, Ian Crawley, Ariel T. Sommer

Abstract: Crossed optical dipole traps (ODTs) provide three-dimensional confinement of cold atoms and other optically trappable particles. However, the need to maintain the intersection of the two trapping beams poses strict requirements on alignment stability, and limits the ability to move the trap. Here we demonstrate a novel crossed ODT design that features inherent stability of the beam crossing, allowing the trap to move and remain aligned. The trap consists of a single high-power laser beam, imaged back onto itself at an angle to form a crossed trap. Self-aligning behavior results from employing an imaging system with positive magnification tuned precisely to unity. We employ laser-cooled samples of $^6$Li atoms to demonstrate that the trap remains well-aligned over a 4.3 mm travel range along an axis approximately perpendicular to the plane containing the crossed beams. Our scheme can be applied to bring an atomic cloud held in a crossed ODT close to a surface or field source for various applications in quantum simulation, sensing, and information processing.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 9 pages, 5 figures

arXiv:2311.15839 [pdf, other]

Ontologising Trustworthy in the Telecommunications Domain

Authors: Ian Oliver, Pekka Kuure, Wiktor Sedkowski, Thore Sommer

Abstract: Based upon trusted and confidential computing platforms, telecommunications systems must provide guaranteed security for the processes and data running atop them. This in turn requires us to provide trustworthy systems. The term trustworthy is poorly defined with corresponding misunderstanding and misapplication. We present a definition of this term, as well as others, demonstrate its application against certain telecommunications use cases and address how the learnings from ontologising these structures contribute to standardisation and the necessity for FAIR ontologies across telecommunications standards and hosting organisations.

Submitted 27 November, 2023; originally announced November 2023.

ACM Class: D.2.1

arXiv:2311.04872 [pdf, other]

Computing with Residue Numbers in High-Dimensional Representation

Authors: Christopher J. Kymn, Denis Kleyko, E. Paxon Frady, Connor Bybee, Pentti Kanerva, Friedrich T. Sommer, Bruno A. Olshausen

Abstract: We introduce Residue Hyperdimensional Computing, a computing framework that unifies residue number systems with an algebra defined over random, high-dimensional vectors. We show how residue numbers can be represented as high-dimensional vectors in a manner that allows algebraic operations to be performed with component-wise, parallelizable operations on the vector elements. The resulting framework, when combined with an efficient method for factorizing high-dimensional vectors, can represent and operate on numerical values over a large dynamic range using vastly fewer resources than previous methods, and it exhibits impressive robustness to noise. We demonstrate the potential for this framework to solve computationally difficult problems in visual perception and combinatorial optimization, showing improvement over baseline methods. More broadly, the framework provides a possible account for the computational operations of grid cells in the brain, and it suggests new machine learning architectures for representing and manipulating numerical data.

Submitted 8 November, 2023; originally announced November 2023.

Comments: 24 pages, 10 figures
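
The residue-number encoding summarized in the abstract above can be illustrated with a minimal sketch. It assumes, for illustration only, that each residue is represented by a vector of random roots of unity and that binding is the component-wise product; the moduli, the dimension, and all function names are hypothetical choices, and the brute-force decoder stands in for the factorization method the abstract mentions.

```python
import cmath
import math
import random

MODULI = (3, 5, 7)   # pairwise coprime; representable range is 3 * 5 * 7 = 105
DIM = 2000           # vector dimension (illustrative choice)

rng = random.Random(0)
# One list of random integer phase indices per modulus; component j of the
# base vector for modulus m is exp(2*pi*i * k_j / m), an m-th root of unity.
PHASES = {m: [rng.randrange(m) for _ in range(DIM)] for m in MODULI}


def encode_residue(value, m):
    """Phasor vector for `value mod m`: the base vector raised to `value`."""
    return [cmath.exp(2j * math.pi * ((k * value) % m) / m) for k in PHASES[m]]


def bind(a, b):
    """Component-wise (Hadamard) product of two vectors."""
    return [x * y for x, y in zip(a, b)]


def encode(value):
    """Bind the residue vectors of all moduli into a single vector."""
    vec = [1 + 0j] * DIM
    for m in MODULI:
        vec = bind(vec, encode_residue(value, m))
    return vec


def similarity(a, b):
    """Real part of the normalized inner product of two vectors."""
    return sum(x * y.conjugate() for x, y in zip(a, b)).real / len(a)


def decode(vec):
    """Brute-force nearest-neighbour decoding over the full range (assumption)."""
    total = math.prod(MODULI)
    return max(range(total), key=lambda v: similarity(vec, encode(v)))
```

Under these assumptions, binding two encoded numbers adds them modulo 105 without any carry between moduli: `decode(bind(encode(17), encode(95)))` returns 7, which is (17 + 95) mod 105.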

arXiv:2305.16873 [pdf, other]

doi 10.1162/neco_a_01590

Efficient Decoding of Compositional Structure in Holistic Representations

Authors: Denis Kleyko, Connor Bybee, Ping-Chen Huang, Christopher J. Kymn, Bruno A. Olshausen, E. Paxon Frady, Friedrich T. Sommer

Abstract: We investigate the task of retrieving information from compositional distributed representations formed by Hyperdimensional Computing/Vector Symbolic Architectures and present novel techniques which achieve new information rate bounds. First, we provide an overview of the decoding techniques that can be used to approach the retrieval task. The techniques are categorized into four groups. We then evaluate the considered techniques in several settings that involve, e.g., inclusion of external noise and storage elements with reduced precision. In particular, we find that the decoding techniques from the sparse coding and compressed sensing literature (rarely used for Hyperdimensional Computing/Vector Symbolic Architectures) are also well-suited for decoding information from the compositional distributed representations. Combining these decoding techniques with interference cancellation ideas from communications improves previously reported bounds (Hersche et al., 2021) of the information rate of the distributed representations from 1.20 to 1.40 bits per dimension for smaller codebooks and from 0.60 to 1.26 bits per dimension for larger codebooks.

Submitted 26 May, 2023; originally announced May 2023.

Comments: 28 pages, 5 figures

Journal ref: Neural Computation, 2023

arXiv:2303.13691 [pdf, other]

Learning and generalization of compositional representations of visual scenes

Authors: E. Paxon Frady, Spencer Kent, Quinn Tran, Pentti Kanerva, Bruno A. Olshausen, Friedrich T. Sommer

Abstract: Complex visual scenes that are composed of multiple objects, each with attributes, such as object name, location, pose, color, etc., are challenging to describe in order to train neural networks. Usually, deep learning networks are trained supervised by categorical scene descriptions. The common categorical description of a scene contains the names of individual objects but lacks information about other attributes. Here, we use distributed representations of object attributes and vector operations in a vector symbolic architecture to create a full compositional description of a scene in a high-dimensional vector. To control the scene composition, we use artificial images composed of multiple, translated and colored MNIST digits. In contrast to learning category labels, here we train deep neural networks to output the full compositional vector description of an input image. The output of the deep network can then be interpreted by a VSA resonator network, to extract object identity or other properties of individual objects. We evaluate the performance and generalization properties of the system on randomly generated scenes. Specifically, we show that the network is able to learn the task and generalize to unseen digit shapes and scene configurations. Further, the generalisation ability of the trained model is limited. For example, with a gap in the training data, like an object not shown in a particular image location during training, the learning does not automatically fill this gap.

Submitted 23 March, 2023; originally announced March 2023.

Comments: 10 pages, 6 figures

arXiv:2301.04353 [pdf, other]

doi 10.1063/5.0142000

K-space interpretation of image-scanning-microscopy

Authors: Tal I. Sommer, Gil Weinberg, Ori Katz

Abstract: In recent years, image-scanning microscopy (ISM, also termed pixel-reassignment microscopy) has emerged as a technique that improves the resolution and signal-to-noise compared to confocal and widefield microscopy by employing a detector array at the image plane of a confocal laser scanning microscope. Here, we present a k-space analysis of coherent ISM, showing that ISM is equivalent to spotlight synthetic-aperture radar (SAR) and analogous to oblique-illumination microscopy. This insight indicates that ISM can be performed with a single detector placed in the k-space of the sample, which we numerically demonstrate.

Submitted 11 January, 2023; originally announced January 2023.

arXiv:2212.06071 [pdf, other]

3DSC - A New Dataset of Superconductors Including Crystal Structures

Authors: Timo Sommer, Roland Willa, Jörg Schmalian, Pascal Friederich

Abstract: Data-driven methods, in particular machine learning, can help to speed up the discovery of new materials by finding hidden patterns in existing data and using them to identify promising candidate materials. In the case of superconductors, which are a highly interesting but also a complex class of materials with many relevant applications, the use of data science tools is to date slowed down by a lack of accessible data. In this work, we present a new and publicly available superconductivity dataset ('3DSC'), featuring the critical temperature $T_\mathrm{c}$ of superconducting materials additionally to tested non-superconductors. In contrast to existing databases such as the SuperCon database which contains information on the chemical composition, the 3DSC is augmented by the approximate three-dimensional crystal structure of each material. We perform a statistical analysis and machine learning experiments to show that access to this structural information improves the prediction of the critical temperature $T_\mathrm{c}$ of materials. Furthermore, we see the 3DSC not as a finished dataset, but we provide ideas and directions for further research to improve the 3DSC in multiple ways. We are confident that this database will be useful in applying state-of-the-art machine learning methods to eventually find new superconductors.

Submitted 14 December, 2022; v1 submitted 12 December, 2022; originally announced December 2022.

Comments: 15 pages + 10 pages of supporting information; UPDATE: standardised formatting, removed double dash from title & updated github links

arXiv:2212.03426 [pdf, other]

Efficient Optimization with Higher-Order Ising Machines

Authors: Connor Bybee, Denis Kleyko, Dmitri E. Nikonov, Amir Khosrowshahi, Bruno A. Olshausen, Friedrich T. Sommer

Abstract: A prominent approach to solving combinatorial optimization problems on parallel hardware is Ising machines, i.e., hardware implementations of networks of interacting binary spin variables. Most Ising machines leverage second-order interactions although important classes of optimization problems, such as satisfiability problems, map more seamlessly to Ising networks with higher-order interactions. Here, we demonstrate that higher-order Ising machines can solve satisfiability problems more resource-efficiently in terms of the number of spin variables and their connections when compared to traditional second-order Ising machines. Further, our results show on a benchmark dataset of Boolean k-satisfiability problems that higher-order Ising machines implemented with coupled oscillators rapidly find solutions that are better than second-order Ising machines, thus, improving the current state-of-the-art for Ising machines.

Submitted 6 December, 2022; originally announced December 2022.

Comments: 13 pages, 4 figures

arXiv:2209.02000 [pdf, other]

doi 10.1038/s42256-024-00846-2

Visual Odometry with Neuromorphic Resonator Networks

Authors: Alpha Renner, Lazar Supic, Andreea Danielescu, Giacomo Indiveri, E. Paxon Frady, Friedrich T. Sommer, Yulia Sandamirskaya

Abstract: Visual Odometry (VO) is a method to estimate self-motion of a mobile robot using visual sensors. Unlike odometry based on integrating differential measurements that can accumulate errors, such as inertial sensors or wheel encoders, visual odometry is not compromised by drift. However, image-based VO is computationally demanding, limiting its application in use cases with low-latency, -memory, and -energy requirements. Neuromorphic hardware offers low-power solutions to many vision and AI problems, but designing such solutions is complicated and often has to be assembled from scratch. Here we propose to use Vector Symbolic Architecture (VSA) as an abstraction layer to design algorithms compatible with neuromorphic hardware. Building from a VSA model for scene analysis, described in our companion paper, we present a modular neuromorphic algorithm that achieves state-of-the-art performance on two-dimensional VO tasks. Specifically, the proposed algorithm stores and updates a working memory of the presented visual environment. Based on this working memory, a resonator network estimates the changing location and orientation of the camera. We experimentally validate the neuromorphic VSA-based approach to VO with two benchmarks: one based on an event camera dataset and the other in a dynamic scene with a robotic task.

Submitted 26 June, 2024; v1 submitted 5 September, 2022; originally announced September 2022.

Comments: 19 pages, 5 figures, minor revisions, added results for shapes_translation dataset

ACM Class: I.4.9

Journal ref: Nature Machine Intelligence 6 (2024)

arXiv:2208.12880 [pdf, other]

doi 10.1038/s42256-024-00848-0

Neuromorphic Visual Scene Understanding with Resonator Networks

Authors: Alpha Renner, Lazar Supic, Andreea Danielescu, Giacomo Indiveri, Bruno A. Olshausen, Yulia Sandamirskaya, Friedrich T. Sommer, E. Paxon Frady

Abstract: Analyzing a visual scene by inferring the configuration of a generative model is widely considered the most flexible and generalizable approach to scene understanding. Yet, one major problem is the computational challenge of the inference procedure, involving a combinatorial search across object identities and poses. Here we propose a neuromorphic solution exploiting three key concepts: (1) a computational framework based on Vector Symbolic Architectures (VSA) with complex-valued vectors; (2) the design of Hierarchical Resonator Networks (HRN) to factorize the non-commutative transforms translation and rotation in visual scenes; (3) the design of a multi-compartment spiking phasor neuron model for implementing complex-valued resonator networks on neuromorphic hardware. The VSA framework uses vector binding operations to form a generative image model in which binding acts as the equivariant operation for geometric transformations. A scene can, therefore, be described as a sum of vector products, which can then be efficiently factorized by a resonator network to infer objects and their poses. The HRN features a partitioned architecture in which vector binding is equivariant for horizontal and vertical translation within one partition and for rotation and scaling within the other partition. The spiking neuron model allows mapping the resonator network onto efficient and low-power neuromorphic hardware. Our approach is demonstrated on synthetic scenes composed of simple 2D shapes undergoing rigid geometric transformations and color changes. A companion paper demonstrates the same approach in real-world application scenarios for machine vision and robotics.

Submitted 26 June, 2024; v1 submitted 26 August, 2022; originally announced August 2022.

Comments: 23 pages, 8 figures, minor revisions and extended supplementary material

ACM Class: I.4.8

Journal ref: Nature Machine Intelligence 6 (2024)

arXiv:2208.09481 [pdf, other]

Graph neural networks for materials science and chemistry

Authors: Patrick Reiser, Marlen Neubert, André Eberhard, Luca Torresi, Chen Zhou, Chen Shao, Houssam Metni, Clint van Hoesel, Henrik Schopmans, Timo Sommer, Pascal Friederich

Abstract: Machine learning plays an increasingly important role in many areas of chemistry and materials science, e.g. to predict materials properties, to accelerate simulations, to design new materials, and to predict synthesis routes of new materials. Graph neural networks (GNNs) are one of the fastest growing classes of machine learning models. They are of particular relevance for chemistry and materials science, as they directly work on a graph or structural representation of molecules and materials and therefore have full access to all relevant information required to characterize materials. In this review article, we provide an overview of the basic principles of GNNs, widely used datasets, and state-of-the-art architectures, followed by a discussion of a wide range of recent applications of GNNs in chemistry and materials science, and concluding with a road-map for the further development and application of GNNs.

Submitted 5 August, 2022; originally announced August 2022.

Comments: 37 pages, 2 figures

arXiv:2204.07163 [pdf, other]

Cross-Frequency Coupling Increases Memory Capacity in Oscillatory Neural Networks

Authors: Connor Bybee, Alexander Belsten, Friedrich T. Sommer

Abstract: An open problem in neuroscience is to explain the functional role of oscillations in neural networks, contributing, for example, to perception, attention, and memory. Cross-frequency coupling (CFC) is associated with information integration across populations of neurons. Impaired CFC is linked to neurological disease. It is unclear what role CFC has in information processing and brain functional connectivity. We construct a model of CFC which predicts a computational role for observed $θ$-$γ$ oscillatory circuits in the hippocampus and cortex. Our model predicts that the complex dynamics in recurrent and feedforward networks of coupled oscillators performs robust information storage and pattern retrieval. Based on phasor associative memories (PAM), we present a novel oscillator neural network (ONN) model that includes subharmonic injection locking (SHIL) and which reproduces experimental observations of CFC. We show that the presence of CFC increases the memory capacity of a population of neurons connected by plastic synapses. CFC enables error-free pattern retrieval whereas pattern retrieval fails without CFC. In addition, the trade-offs between sparse connectivity, capacity, and information per connection are identified. The associative memory is based on a complex-valued neural network, or phasor neural network (PNN). We show that for values of $Q$ which are the same as the ratio of $γ$ to $θ$ oscillations observed in the hippocampus and the cortex, the associative memory achieves greater capacity and information storage than previous models. The novel contributions of this work are providing a computational framework based on oscillator dynamics which predicts the functional role of neural oscillations and connecting concepts in neural network theory and dynamical system theory.

Submitted 5 April, 2022; originally announced April 2022.

Comments: 4 pages, one figure, Presented at Computation and Systems Neuroscience (COSYNE) 2022

arXiv:2204.00507 [pdf, other]

Deep Learning in Spiking Phasor Neural Networks

Authors: Connor Bybee, E. Paxon Frady, Friedrich T. Sommer

Abstract: Spiking Neural Networks (SNNs) have attracted the attention of the deep learning community for use in low-latency, low-power neuromorphic hardware, as well as models for understanding neuroscience. In this paper, we introduce Spiking Phasor Neural Networks (SPNNs). SPNNs are based on complex-valued Deep Neural Networks (DNNs), representing phases by spike times. Our model computes robustly employing a spike timing code and gradients can be formed using the complex domain. We train SPNNs on CIFAR-10, and demonstrate that the performance exceeds that of other timing coded SNNs, approaching results with comparable real-valued DNNs.

Submitted 1 April, 2022; originally announced April 2022.

Comments: 10 pages, 5 figures, work presented at Intel Neuromorphic Community Fall 2019 workshop in Graz, Austria and the UC Berkeley Center for Computational Biology Retreat 2019

arXiv:2203.00920 [pdf, other]

doi 10.1145/3517343.3517368

Integer Factorization with Compositional Distributed Representations

Authors: Denis Kleyko, Connor Bybee, Christopher J. Kymn, Bruno A. Olshausen, Amir Khosrowshahi, Dmitri E. Nikonov, Friedrich T. Sommer, E. Paxon Frady

Abstract: In this paper, we present an approach to integer factorization using distributed representations formed with Vector Symbolic Architectures. The approach formulates integer factorization in a manner such that it can be solved using neural networks and potentially implemented on parallel neuromorphic hardware. We introduce a method for encoding numbers in distributed vector spaces and explain how the resonator network can solve the integer factorization problem. We evaluate the approach on factorization of semiprimes by measuring the factorization accuracy versus the scale of the problem. We also demonstrate how the proposed approach generalizes beyond the factorization of semiprimes; in principle, it can be used for factorization of any composite number. This work demonstrates how a well-known combinatorial search problem may be formulated and solved within the framework of Vector Symbolic Architectures, and it opens the door to solving similarly difficult problems in other domains.

Submitted 2 March, 2022; originally announced March 2022.

Comments: 8 pages, 4 figures

Journal ref: NICE 2022: Neuro-Inspired Computational Elements Conference

arXiv:2201.11108 [pdf, other]

A probabilistic latent variable model for detecting structure in binary data

Authors: Christopher Warner, Kiersten Ruda, Friedrich T. Sommer

Abstract: We introduce a novel, probabilistic binary latent variable model to detect noisy or approximate repeats of patterns in sparse binary data. The model is based on the "Noisy-OR model" (Heckerman, 1990), used previously for disease and topic modelling. The model's capability is demonstrated by extracting structure in recordings from retinal neurons, but it can be widely applied to discover and model latent structure in noisy binary data. In the context of spiking neural data, the task is to "explain" spikes of individual neurons in terms of groups of neurons, "Cell Assemblies" (CAs), that often fire together, due to mutual interactions or other causes. The model infers sparse activity in a set of binary latent variables, each describing the activity of a cell assembly. When the latent variable of a cell assembly is active, it reduces the probabilities of neurons belonging to this assembly to be inactive. The conditional probability kernels of the latent components are learned from the data in an expectation maximization scheme, involving inference of latent states and parameter adjustments to the model. We thoroughly validate the model on synthesized spike trains constructed to statistically resemble recorded retinal responses to white noise stimulus and natural movie stimulus in data. We also apply our model to spiking responses recorded in retinal ganglion cells (RGCs) during stimulation with a movie and discuss the found structure.

Submitted 26 January, 2022; originally announced January 2022.

Comments: 25 pages, 20 figures
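
The Noisy-OR mechanism named in the abstract above can be sketched in a few lines. This is the generic textbook form of a noisy-OR, not necessarily the paper's exact parameterization; the function name `prob_active` and the `leak` parameter are hypothetical.

```python
def prob_active(latents, strengths, leak=0.0):
    """Probability that one neuron fires under a generic noisy-OR.

    latents:   binary states of the cell assemblies (0 or 1)
    strengths: per-assembly probability of activating this neuron
    leak:      probability of firing with no assembly active (assumption)
    """
    # Start from the probability of staying silent with no active assembly.
    p_inactive = 1.0 - leak
    # Each active assembly independently shrinks the chance of staying silent.
    for z, p in zip(latents, strengths):
        if z:
            p_inactive *= 1.0 - p
    return 1.0 - p_inactive
```

With strengths 0.8 and 0.5, one active assembly gives a firing probability of 0.8, and both active give 1 - 0.2 * 0.5 = 0.9, so every additional active assembly lowers the probability that the neuron is inactive.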

arXiv:2201.10000 [pdf, other]

Neural Manifold Clustering and Embedding

Authors: Zengyi Li, Yubei Chen, Yann LeCun, Friedrich T. Sommer

Abstract: Given a union of non-linear manifolds, non-linear subspace clustering or manifold clustering aims to cluster data points based on manifold structures and also learn to parameterize each manifold as a linear subspace in a feature space. Deep neural networks have the potential to achieve this goal under highly non-linear settings given their large capacity and flexibility. We argue that achieving manifold clustering with neural networks requires two essential ingredients: a domain-specific constraint that ensures the identification of the manifolds, and a learning algorithm for embedding each manifold to a linear subspace in the feature space. This work shows that many constraints can be implemented by data augmentation. For subspace feature learning, Maximum Coding Rate Reduction (MCR$^2$) objective can be used. Putting them together yields Neural Manifold Clustering and Embedding (NMCE), a novel method for general purpose manifold clustering, which significantly outperforms autoencoder-based deep subspace clustering. Further, on more challenging natural image datasets, NMCE can also outperform other algorithms specifically designed for clustering. Qualitatively, we demonstrate that NMCE learns a meaningful and interpretable feature space. As the formulation of NMCE is closely related to several important Self-supervised learning (SSL) methods, we believe this work can help us build a deeper understanding on SSL representation learning.

Submitted 24 January, 2022; originally announced January 2022.

arXiv:2111.03746 [pdf, other]

Efficient Neuromorphic Signal Processing with Loihi 2

Authors: Garrick Orchard, E. Paxon Frady, Daniel Ben Dayan Rubin, Sophia Sanborn, Sumit Bam Shrestha, Friedrich T. Sommer, Mike Davies

Abstract: The biologically inspired spiking neurons used in neuromorphic computing are nonlinear filters with dynamic state variables -- very different from the stateless neuron models used in deep learning. The next version of Intel's neuromorphic research processor, Loihi 2, supports a wide range of stateful spiking neuron models with fully programmable dynamics. Here we showcase advanced spiking neuron models that can be used to efficiently process streaming data in simulation experiments on emulated Loihi 2 hardware. In one example, Resonate-and-Fire (RF) neurons are used to compute the Short Time Fourier Transform (STFT) with similar computational complexity but 47x less output bandwidth than the conventional STFT. In another example, we describe an algorithm for optical flow estimation using spatiotemporal RF neurons that requires over 90x fewer operations than a conventional DNN-based solution. We also demonstrate promising preliminary results using backpropagation to train RF neurons for audio classification tasks. Finally, we show that a cascade of Hopf resonators - a variant of the RF neuron - replicates novel properties of the cochlea and motivates an efficient spike-based spectrogram encoder.

Submitted 5 November, 2021; originally announced November 2021.

arXiv:2110.01993 [pdf, ps, other]

doi 10.1364/OE.445465

Aluminum nitride integration on silicon nitride photonic circuits: a new hybrid approach towards on-chip nonlinear optics

Authors: Giulio Terrasanta, Timo Sommer, Manuel Müller, Matthias Althammer, Rudolf Gross, Menno Poot

Abstract: Aluminum nitride (AlN) is an emerging material for integrated quantum photonics due to its large $χ^{(2)}$ nonlinearity. Here we demonstrate the hybrid integration of AlN on silicon nitride (SiN) photonic chips. Composite microrings are fabricated by reactive DC sputtering of c-axis oriented AlN on top of pre-patterned SiN. This new approach does not require any patterning of AlN and depends only on reliable SiN nanofabrication. This simplifies the nanofabrication process drastically. Optical characteristics, such as the quality factor, propagation losses and group index, are obtained. Our hybrid resonators can have a one order of magnitude increase in quality factor after the AlN integration, with propagation losses down to 0.7 dB/cm. Using finite-element simulations, phase matching in these waveguides is explored.

Submitted 6 October, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

Comments: 13 pages, 6 figures; v2: removed incorrect copyright line

arXiv:2109.03429 [pdf, other]

Computing on Functions Using Randomized Vector Representations

Authors: E. Paxon Frady, Denis Kleyko, Christopher J. Kymn, Bruno A. Olshausen, Friedrich T. Sommer

Abstract: Vector space models for symbolic processing that encode symbols by random vectors have been proposed in cognitive science and connectionist communities under the names Vector Symbolic Architecture (VSA), and, synonymously, Hyperdimensional (HD) computing. In this paper, we generalize VSAs to function spaces by mapping continuous-valued data into a vector space such that the inner product between the representations of any two data points represents a similarity kernel. By analogy to VSA, we call this new function encoding and computing framework Vector Function Architecture (VFA). In VFAs, vectors can represent individual data points as well as elements of a function space (a reproducing kernel Hilbert space). The algebraic vector operations, inherited from VSA, correspond to well-defined operations in function space. Furthermore, we study a previously proposed method for encoding continuous data, fractional power encoding (FPE), which uses exponentiation of a random base vector to produce randomized representations of data points and fulfills the kernel properties for inducing a VFA. We show that the distribution from which elements of the base vector are sampled determines the shape of the FPE kernel, which in turn induces a VFA for computing with band-limited functions. In particular, VFAs provide an algebraic framework for implementing large-scale kernel machines with random features, extending Rahimi and Recht, 2007. Finally, we demonstrate several applications of VFA models to problems in image recognition, density estimation and nonlinear regression.
Our analyses and results suggest that VFAs constitute a powerful new framework for representing and manipulating functions in distributed neural systems, with myriad applications in artificial intelligence.

Submitted 8 September, 2021; originally announced September 2021.

Comments: 33 pages, 18 Figures
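
Illustration (not part of the listing): the abstract above says that fractional power encoding exponentiates a random base vector, and that the sampling distribution of the base vector sets the kernel shape. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a complex base vector with phases drawn uniformly from [-pi, pi], in which case the expected similarity between the encodings of two points a distance d apart is sin(pi d)/(pi d). The names `fpe_encode` and `similarity` and the dimension `n` are hypothetical choices made for this example.

```python
import numpy as np

def fpe_encode(x, phases):
    """Fractional power encoding: raise the base vector exp(i*phases) to the power x."""
    return np.exp(1j * phases * x)

def similarity(u, v):
    """Real part of the normalized inner product between two complex vectors."""
    return float(np.real(np.vdot(u, v)) / len(u))

# Assumption: phases of the random base vector are uniform on [-pi, pi].
rng = np.random.default_rng(0)
n = 10_000
phases = rng.uniform(-np.pi, np.pi, size=n)

# Compare the empirical similarity with the sinc kernel expected for uniform phases.
for d in (0.0, 0.5, 1.0, 2.5):
    sim = similarity(fpe_encode(0.0, phases), fpe_encode(d, phases))
    print(f"d={d:.1f}  similarity={sim:+.3f}  sinc={np.sinc(d):+.3f}")

# Element-wise multiplication (binding) of two encodings adds the encoded values.
bound = fpe_encode(1.2, phases) * fpe_encode(0.3, phases)
print(np.allclose(bound, fpe_encode(1.5, phases)))
```

Sampling the phases from a different distribution would change the kernel shape, which is the dependence the abstract describes.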

arXiv:2106.07978 [pdf]

doi 10.1063/5.0062716

Pixel-reassignment in Ultrasound Imaging

Authors: Tal I. Sommer, Ori Katz

Abstract: We present an adaptation of the pixel-reassignment technique from confocal fluorescent microscopy to coherent ultrasound imaging. The method, Ultrasound Pixel-Reassignment (UPR), provides a resolution and signal-to-noise ratio (SNR) improvement in ultrasound imaging by computationally reassigning off-focus signals acquired using traditional plane-wave compounding ultrasonography. We theoretically analyze the analogy between the optical and ultrasound implementations of pixel reassignment, and experimentally evaluate the imaging quality on tissue-mimicking acoustic phantoms. We demonstrate that UPR provides a 25% resolution improvement and a 3 dB SNR improvement in in-vitro scans, without any change in hardware or acquisition scheme.

Submitted 24 January, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

Journal ref: Appl. Phys. Lett. 119, 123701 (2021)

arXiv:2106.05268 [pdf, other]

doi 10.1109/JPROC.2022.3209104

Vector Symbolic Architectures as a Computing Framework for Emerging Hardware

Authors: Denis Kleyko, Mike Davies, E. Paxon Frady, Pentti Kanerva, Spencer J. Kent, Bruno A. Olshausen, Evgeny Osipov, Jan M. Rabaey, Dmitri A. Rachkovskij, Abbas Rahimi, Friedrich T. Sommer

Abstract: This article reviews recent progress in the development of the computing framework vector symbolic architectures (VSA) (also known as hyperdimensional computing). This framework is well suited for implementation in stochastic, emerging hardware, and it naturally expresses the types of cognitive operations required for artificial intelligence (AI). We demonstrate in this article that the field-like algebraic structure of VSA offers simple but powerful operations on high-dimensional vectors that can support all data structures and manipulations relevant to modern computing. In addition, we illustrate the distinguishing feature of VSA, "computing in superposition," which sets it apart from conventional computing. It also opens the door to efficient solutions to the difficult combinatorial search problems inherent in AI applications. We sketch ways of demonstrating that VSA are computationally universal. We see them acting as a framework for computing with distributed representations that can play the role of an abstraction layer for emerging computing hardware. This article serves as a reference for computer architects by illustrating the philosophy behind VSA, techniques of distributed computing with them, and their relevance to emerging computing hardware, such as neuromorphic computing.

Submitted 20 July, 2023; v1 submitted 9 June, 2021; originally announced June 2021.

Comments: 31 pages, 15 figures, 4 Tables

Journal ref: Proceedings of the IEEE (2022), vol. 110, no. 10

arXiv:2105.07853 [pdf, other]

doi 10.3390/mi12080880

Efficient optomechanical mode-shape mapping of micromechanical devices

Authors: David Hoch, Kevin-Jeremy Haas, Leopold Moller, Timo Sommer, Pedro Soubelet, Jonathan Finley, Menno Poot

Abstract: We demonstrate a method to optically map multiple modes of mechanical structures simultaneously. The fast and robust method, based on a modified phase-lock-loop, is demonstrated on a silicon nitride membrane and compared with three different approaches. Line traces and two-dimensional maps of different modes are acquired. The high quality enables us to determine the weights of individual contributions in superpositions of degenerate modes.

Submitted 6 May, 2021; originally announced May 2021.

Journal ref: Micromachines 12 880 (2021)

arXiv:2103.08318 [pdf, ps, other]

doi 10.1088/2633-4356/ac08ed

Growth of Aluminum Nitride on a Silicon Nitride Substrate for Hybrid Photonic Circuits

Authors: G. Terrasanta, M. Müller, T. Sommer, S. Geprägs, R. Gross, M. Althammer, M. Poot

Abstract: Aluminum nitride (AlN) is an emerging material for integrated quantum photonics with its excellent linear and nonlinear optical properties. In particular, its second-order nonlinear susceptibility $χ^{(2)}$ allows single-photon generation. We have grown AlN thin films on silicon nitride via reactive DC magnetron sputtering. The thin films have been characterized using X-ray diffraction, optical reflectometry, atomic force microscopy, and scanning electron microscopy. The crystalline properties of the thin films have been improved by optimizing the nitrogen to argon ratio and the magnetron DC power of the deposition process. X-ray diffraction measurements confirm the fabrication of high-quality c-axis oriented thin films with a full width at half maximum of the rocking curves of 3.9 deg. for 300-nm-thick films. Atomic force microscopy measurements reveal a root mean square surface roughness below 1 nm. The AlN deposition on SiN allows us to fabricate hybrid photonic circuits with a new approach that avoids the challenging patterning of AlN.

Submitted 8 March, 2021; originally announced March 2021.

Journal ref: Mater. Quantum. Technol. 1 021002 (2021)

arXiv:2012.07881 [pdf, other]

doi 10.1109/TNNLS.2023.3237381

Perceptron Theory Can Predict the Accuracy of Neural Networks

Authors: Denis Kleyko, Antonello Rosato, E. Paxon Frady, Massimo Panella, Friedrich T. Sommer

Abstract: Multilayer neural networks set the current state of the art for many technical classification problems. But, these networks are still, essentially, black boxes in terms of analyzing them and predicting their performance. Here, we develop a statistical theory for the one-layer perceptron and show that it can predict performances of a surprisingly large variety of neural networks with different architectures. A general theory of classification with perceptrons is developed by generalizing an existing theory for analyzing reservoir computing models and connectionist models for symbolic reasoning known as vector symbolic architectures. Our statistical theory offers three formulas leveraging the signal statistics with increasing detail. The formulas are analytically intractable, but can be evaluated numerically. The description level that captures maximum details requires stochastic sampling methods. Depending on the network model, the simpler formulas already yield high prediction accuracy. The quality of the theory predictions is assessed in three experimental settings, a memorization task for echo state networks (ESNs) from reservoir computing literature, a collection of classification datasets for shallow randomly connected networks, and the ImageNet dataset for deep convolutional neural networks. We find that the second description level of the perceptron theory can predict the performance of types of ESNs, which could not be described previously. The theory can predict deep multilayer neural networks by being applied to their output layer.
While other methods for predicting neural network performance commonly require training an estimator model, the proposed theory requires only the first two moments of the distribution of the postsynaptic sums in the output neurons. The perceptron theory compares favorably to other methods that do not rely on training an estimator model.

Submitted 20 July, 2023; v1 submitted 14 December, 2020; originally announced December 2020.

Comments: 16 pages, 14 figures

Journal ref: IEEE Transactions on Neural Networks and Learning Systems (2023)

arXiv:2010.03587 [pdf, other]

A Neural Network MCMC sampler that maximizes Proposal Entropy

Authors: Zengyi Li, Yubei Chen, Friedrich T. Sommer

Abstract: Markov Chain Monte Carlo (MCMC) methods sample from unnormalized probability distributions and offer guarantees of exact sampling. However, in the continuous case, unfavorable geometry of the target distribution can greatly limit the efficiency of MCMC methods. Augmenting samplers with neural networks can potentially improve their efficiency. Previous neural network based samplers were trained with objectives that either did not explicitly encourage exploration, or used an L2 jump objective which could only be applied to well structured distributions. Thus it seems promising to instead maximize the proposal entropy for adapting the proposal to distributions of any shape. To allow direct optimization of the proposal entropy, we propose a neural network MCMC sampler that has a flexible and tractable proposal distribution. Specifically, our network architecture utilizes the gradient of the target distribution for generating proposals. Our model achieves significantly higher efficiency than previous neural network MCMC techniques in a variety of sampling tasks. Further, the sampler is applied to the training of a convergent energy-based model of natural images. The adaptive sampler achieves unbiased sampling with significantly higher proposal entropy than a Langevin dynamics sampler.

Submitted 7 October, 2020; originally announced October 2020.

Comments: conference submission preprint

arXiv:2010.03585 [pdf, other]

doi 10.1109/TNNLS.2021.3119543

Cellular Automata Can Reduce Memory Requirements of Collective-State Computing

Authors: Denis Kleyko, E. Paxon Frady, Friedrich T. Sommer

Abstract: Various non-classical approaches of distributed information processing, such as neural networks, computation with Ising models, reservoir computing, vector symbolic architectures, and others, employ the principle of collective-state computing. In this type of computing, the variables relevant in a computation are superimposed into a single high-dimensional state vector, the collective-state. The variable encoding uses a fixed set of random patterns, which has to be stored and kept available during the computation. Here we show that an elementary cellular automaton with rule 90 (CA90) enables a space-time tradeoff for collective-state computing models that use random dense binary representations, i.e., memory requirements can be traded off with computation running CA90. We investigate the randomization behavior of CA90, in particular, the relation between the length of the randomization period and the size of the grid, and how CA90 preserves similarity in the presence of the initialization noise. Based on these analyses we discuss how to optimize a collective-state computing model, in which CA90 expands representations on the fly from short seed patterns - rather than storing the full set of random patterns. The CA90 expansion is applied and tested in concrete scenarios using reservoir computing and vector symbolic architectures.
Our experimental results show that collective-state computing with CA90 expansion performs similarly to traditional collective-state models, in which random patterns are generated initially by a pseudo-random number generator and then stored in a large memory.

Submitted 7 October, 2020; originally announced October 2020.

Comments: 13 pages, 11 figures

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 6, 2022
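
Illustration (not part of the listing): the abstract above describes expanding short seed patterns on the fly with the rule-90 cellular automaton instead of storing long random patterns. The following is a minimal sketch of that expansion, assuming a cyclic (wrap-around) grid and concatenation of successive automaton states. The function names, the grid size of 37 and the number of steps are arbitrary choices for this example and are not taken from the paper.

```python
import numpy as np

def ca90_step(state):
    """One rule-90 update on a cyclic grid: each cell becomes left XOR right."""
    return np.roll(state, 1) ^ np.roll(state, -1)

def ca90_expand(seed, steps):
    """Concatenate the seed with its next `steps` rule-90 states into one long pattern."""
    states = [seed]
    for _ in range(steps):
        states.append(ca90_step(states[-1]))
    return np.concatenate(states)

# Assumption: a short random binary seed stands in for a stored seed pattern.
rng = np.random.default_rng(0)
seed = rng.integers(0, 2, size=37, dtype=np.uint8)

# Only the 37-bit seed needs to be stored; the 370-bit pattern is recomputed on demand.
pattern = ca90_expand(seed, steps=9)
print(pattern.shape)
```

Because the update is deterministic, recomputing the expansion from the same seed always yields the same pattern, which is what lets computation substitute for memory.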

arXiv:2009.06734 [pdf, other]

Variable Binding for Sparse Distributed Representations: Theory and Applications

Authors: E. Paxon Frady, Denis Kleyko, Friedrich T. Sommer

Abstract: Symbolic reasoning and neural networks are often considered incompatible approaches. Connectionist models known as Vector Symbolic Architectures (VSAs) can potentially bridge this gap. However, classical VSAs and neural networks are still considered incompatible. VSAs encode symbols by dense pseudo-random vectors, where information is distributed throughout the entire neuron population. Neural networks encode features locally, often forming sparse vectors of neural activation. Following Rachkovskij (2001); Laiho et al. (2015), we explore symbolic reasoning with sparse distributed representations. The core operations in VSAs are dyadic operations between vectors to express variable binding and the representation of sets. Thus, algebraic manipulations enable VSAs to represent and process data structures in a vector space of fixed dimensionality. Using techniques from compressed sensing, we first show that variable binding between dense vectors in VSAs is mathematically equivalent to tensor product binding between sparse vectors, an operation which increases dimensionality. This result implies that dimensionality-preserving binding for general sparse vectors must include a reduction of the tensor matrix into a single sparse vector. Two options for sparsity-preserving variable binding are investigated. One binding method for general sparse vectors extends earlier proposals to reduce the tensor product into a vector, such as circular convolution. The other method is only defined for sparse block-codes, block-wise circular convolution.
Our experiments reveal that variable binding for block-codes has ideal properties, whereas binding for general sparse vectors also works, but is lossy, similar to previous proposals. We demonstrate a VSA with sparse block-codes in example applications, cognitive reasoning and classification, and discuss its relevance for neuroscience and neural networks.

Submitted 14 September, 2020; originally announced September 2020.

Comments: 15 pages, 9 figures

arXiv:2007.03748 [pdf, other]

Resonator networks for factoring distributed representations of data structures

Authors: E. Paxon Frady, Spencer Kent, Bruno A. Olshausen, Friedrich T. Sommer

Abstract: The ability to encode and manipulate data structures with distributed neural representations could qualitatively enhance the capabilities of traditional neural networks by supporting rule-based symbolic reasoning, a central property of cognition. Here we show how this may be accomplished within the framework of Vector Symbolic Architectures (VSA) (Plate, 1991; Gayler, 1998; Kanerva, 1996), whereby data structures are encoded by combining high-dimensional vectors with operations that together form an algebra on the space of distributed representations. In particular, we propose an efficient solution to a hard combinatorial search problem that arises when decoding elements of a VSA data structure: the factorization of products of multiple code vectors. Our proposed algorithm, called a resonator network, is a new type of recurrent neural network that interleaves VSA multiplication operations and pattern completion. We show in two examples -- parsing of a tree-like data structure and parsing of a visual scene -- how the factorization problem arises and how the resonator network can solve it. More broadly, resonator networks open the possibility to apply VSAs to myriad artificial intelligence problems in real-world domains. A companion paper (Kent et al., 2020) presents a rigorous analysis and evaluation of the performance of resonator networks, showing it out-performs alternative approaches.

Submitted 7 July, 2020; originally announced July 2020.

Comments: 20 pages, 5 figures, to appear in Neural Computation 2020 with companion paper: arXiv:1906.11684

arXiv:2005.02567 [pdf, other]

A Model for Image Segmentation in Retina

Authors: Christopher Warner, Friedrich T. Sommer

Abstract: While traditional feed-forward filter models can reproduce the rate responses of retinal ganglion neurons to simple stimuli, they cannot explain why synchrony between spikes is much higher than expected by Poisson firing [6], and can be sometimes rhythmic [25, 16]. Here we investigate the hypothesis that synchrony in periodic retinal spike trains could convey contextual information of the visual input, which is extracted by computations in the retinal network. We propose a computational model for image segmentation consisting of a Kuramoto model of coupled oscillators whose phases model the timing of individual retinal spikes. The phase couplings between oscillators are shaped by the stimulus structure, causing cells to synchronize if the local contrast in their receptive fields is similar. In essence, relaxation in the oscillator network solves a graph clustering problem with the graph representing feature similarity between different points in the image. We tested different model versions on the Berkeley Image Segmentation Data Set (BSDS). Networks with phase interactions set by standard representations of the feature graph (adjacency matrix, Graph Laplacian or modularity) failed to exhibit segmentation performance significantly over the baseline, a model of independent sensors. In contrast, a network with phase interactions that takes into account not only feature similarities but also geometric distances between receptive fields exhibited segmentation performance significantly above baseline.

Submitted 5 May, 2020; originally announced May 2020.

Comments: 39 pages, 20 figures
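
Illustration (not part of the listing): the abstract above describes oscillators whose phase couplings are set by feature similarity, so that relaxation of the network groups similar units. The following toy sketch shows only that general mechanism and is not the paper's model. It assumes identical natural frequencies (dropped by moving to a rotating frame), Euler integration, and a hypothetical block-diagonal similarity matrix with two groups of eight units. All names and parameter values are choices made for this example.

```python
import numpy as np

def simulate_kuramoto(coupling, theta0, dt=0.05, steps=2000):
    """Euler-integrate d(theta_i)/dt = sum_j K_ij * sin(theta_j - theta_i)."""
    theta = theta0.copy()
    for _ in range(steps):
        diff = theta[None, :] - theta[:, None]  # diff[i, j] = theta_j - theta_i
        theta = theta + dt * np.sum(coupling * np.sin(diff), axis=1)
    return theta

def coherence(theta):
    """Kuramoto order parameter: 1 for full synchrony, near 0 for scattered phases."""
    return float(np.abs(np.mean(np.exp(1j * theta))))

# Assumption: two groups of units with similar features inside each group
# and no coupling between groups (a hypothetical stand-in for a feature graph).
labels = np.array([0] * 8 + [1] * 8)
coupling = (labels[:, None] == labels[None, :]).astype(float) / 8.0

rng = np.random.default_rng(0)
theta0 = rng.uniform(0.0, 2.0 * np.pi, size=labels.size)
theta = simulate_kuramoto(coupling, theta0)

# Units that share a label end up phase-locked; a segment is read out as a phase cluster.
for group in (0, 1):
    print(group, round(coherence(theta[labels == group]), 3))
```

In this toy setting the two groups relax independently, so their final phases are unrelated to each other; the abstract reports that useful segmentation on natural images required couplings that also account for geometric distance between receptive fields.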

arXiv:2005.01862 [pdf, other]

Complex Amplitude-Phase Boltzmann Machines

Authors: Zengyi Li, Friedrich T. Sommer

Abstract: We extend the framework of Boltzmann machines to a network of complex-valued neurons with variable amplitudes, referred to as Complex Amplitude-Phase Boltzmann machine (CAP-BM). The model is capable of performing unsupervised learning on the amplitude and relative phase distribution in complex data. The sampling rule of the Gibbs distribution and the learning rules of the model are presented. Learning in a Complex Amplitude-Phase restricted Boltzmann machine (CAP-RBM) is demonstrated on synthetic complex-valued images, and handwritten MNIST digits transformed by a complex wavelet transform. Specifically, we show the necessity of a new amplitude-amplitude coupling term in our model. The proposed model is potentially valuable for machine learning tasks involving complex-valued data with amplitude variation, and for developing algorithms for novel computation hardware, such as coupled oscillators and neuromorphic hardware, on which Boltzmann sampling can be executed in the complex domain.

Submitted 4 May, 2020; originally announced May 2020.

Comments: Short Technical Note

arXiv:2004.12691 [pdf, other]

Neuromorphic Nearest-Neighbor Search Using Intel's Pohoiki Springs

Authors: E. Paxon Frady, Garrick Orchard, David Florey, Nabil Imam, Ruokun Liu, Joyesh Mishra, Jonathan Tse, Andreas Wild, Friedrich T. Sommer, Mike Davies

Abstract: Neuromorphic computing applies insights from neuroscience to uncover innovations in computing technology. In the brain, billions of interconnected neurons perform rapid computations at extremely low energy levels by leveraging properties that are foreign to conventional computing systems, such as temporal spiking codes and finely parallelized processing units integrating both memory and computation. Here, we showcase the Pohoiki Springs neuromorphic system, a mesh of 768 interconnected Loihi chips that collectively implement 100 million spiking neurons in silicon. We demonstrate a scalable approximate k-nearest neighbor (k-NN) algorithm for searching large databases that exploits neuromorphic principles. Compared to state-of-the-art conventional CPU-based implementations, we achieve superior latency, index build time, and energy efficiency when evaluated on several standard datasets containing over 1 million high-dimensional patterns. Further, the system supports adding new data points to the indexed database online in O(1) time unlike all but brute force conventional k-NN implementations.

Submitted 27 April, 2020; originally announced April 2020.

Comments: 9 pages, 8 figures, 3 tables, submission to NICE 2020

arXiv:1912.06131 [pdf, ps, other]

doi 10.1103/PhysRevResearch.4.023231

Transport of Spin and Mass at Normal-Superfluid Interfaces in the Unitary Fermi Gas

Authors: Ding Zhang, Ariel T. Sommer

Abstract: Transport in strongly interacting Fermi gases provides a window into the non-equilibrium behavior of strongly correlated fermions. In particular, the interface between a strongly polarized normal gas and a weakly polarized superfluid at finite temperature presents a model for understanding transport at normal-superfluid and normal-superconductor interfaces. An excess of polarization in the normal phase or a deficit of polarization in the superfluid brings the system out of equilibrium, leading to transport currents across the interface. We implement a phenomenological mean-field model of the unitary Fermi gas, and investigate the transport of spin and mass under non-equilibrium conditions. We consider independently prepared normal and superfluid regions brought into contact, and calculate the instantaneous spin and mass currents across the normal-superfluid (NS) interface. For an unpolarized superfluid, we find that spin current is suppressed below a threshold value in the driving chemical potential differences, while the threshold nearly vanishes for a critically polarized superfluid. The mass current can exhibit a threshold in cases where Andreev reflection vanishes, while in general Andreev reflection prevents the occurrence of a threshold in the mass current. Our results provide guidance to future experiments aiming to characterize spin and mass transport across NS interfaces.

Submitted 6 May, 2022; v1 submitted 12 December, 2019; originally announced December 2019.

Comments: 18 pages, 15 figures

Journal ref: Physical Review Research 4, 023231 (2022)

arXiv:1910.07762 [pdf, other]

Learning Energy-Based Models in High-Dimensional Spaces with Multi-scale Denoising Score Matching

Authors: Zengyi Li, Yubei Chen, Friedrich T. Sommer

Abstract: Energy-Based Models (EBMs) assign unnormalized log-probability to data samples. This functionality has a variety of applications, such as sample synthesis, data denoising, sample restoration, outlier detection, Bayesian reasoning, and many more. But training of EBMs using standard maximum likelihood is extremely slow because it requires sampling from the model distribution. Score matching potentially alleviates this problem. In particular, denoising score matching (Vincent, 2011) has been successfully used to train EBMs. Using noisy data samples with one fixed noise level, these models learn fast and yield good results in data denoising (Saremi and Hyvärinen, 2019). However, demonstrations of such models in high quality sample synthesis of high dimensional data were lacking. Recently, Song and Ermon (2019) have shown that a generative model trained by denoising score matching accomplishes excellent sample synthesis, when trained with data samples corrupted with multiple levels of noise. Here we provide analysis and empirical evidence showing that training with multiple noise levels is necessary when the data dimension is high. Leveraging this insight, we propose a novel EBM trained with multi-scale denoising score matching. Our model exhibits data generation performance comparable to state-of-the-art techniques such as GANs, and sets a new baseline for EBMs. The proposed model also provides density information and performs well in an image inpainting task.

Submitted 19 December, 2019; v1 submitted 17 October, 2019; originally announced October 2019.

arXiv:1909.09521 [pdf]

The limits for complete photonic bandgaps in low-contrast media

Authors: Lukas Maiwald, Timo Sommer, Marvin Schulz, Manfred Eich, Alexander Yu. Petrov

Abstract: The minimal refractive index contrast needed to obtain a complete photonic bandgap (CPBG) in structured media has not been identified so far. We address this problem by considering distributed quasicrystals with an arbitrary number and positions of Bragg peaks in reciprocal space. For these structures an analytical estimation is derived which predicts that there is an optimal number of Bragg peaks for any refractive index contrast and finite CPBGs for an arbitrarily small refractive index contrast in 2D and 3D. Results of numerical simulations of dipole emission in 2D structures support our estimation. In 3D an emission suppression of almost 10 dB was demonstrated with a refractive index contrast of 1.6. The reason for residual leakage in 3D structures has to be further investigated.

Submitted 20 September, 2019; originally announced September 2019.

arXiv:1906.11684 [pdf, other]

Resonator Networks outperform optimization methods at solving high-dimensional vector factorization

Authors: Spencer J. Kent, E. Paxon Frady, Friedrich T. Sommer, Bruno A. Olshausen

Abstract: We develop theoretical foundations of Resonator Networks, a new type of recurrent neural network introduced in Frady et al. (2020) to solve a high-dimensional vector factorization problem arising in Vector Symbolic Architectures. Given a composite vector formed by the Hadamard product between a discrete set of high-dimensional vectors, a Resonator Network can efficiently decompose the composite into these factors. We compare the performance of Resonator Networks against optimization-based methods, including Alternating Least Squares and several gradient-based algorithms, showing that Resonator Networks are superior in several important ways. This advantage is achieved by leveraging a combination of nonlinear dynamics and "searching in superposition," by which estimates of the correct solution are formed from a weighted superposition of all possible solutions. While the alternative methods also search in superposition, the dynamics of Resonator Networks allow them to strike a more effective balance between exploring the solution space and exploiting local information to drive the network toward probable solutions. Resonator Networks are not guaranteed to converge, but within a particular regime they almost always do. In exchange for relaxing this guarantee of global convergence, Resonator Networks are dramatically more effective at finding factorizations than all alternative approaches considered.
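The composite vector described in this abstract can be illustrated with a minimal sketch, assuming random bipolar (+1/-1) codebooks. The dimensionality, codebook size, and variable names are illustrative choices, not taken from the paper. The sketch shows only how a Hadamard-product composite is formed and how one factor is unbound when the other two are known; it does not implement the Resonator Network itself, which searches for all factors when none is known.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 1000      # vector dimensionality (illustrative choice)
n_codes = 10    # codebook size per factor (illustrative choice)

# Three hypothetical codebooks of random bipolar (+1/-1) vectors.
codebooks = [rng.choice([-1, 1], size=(n_codes, dim)) for _ in range(3)]

# Pick one vector from each codebook and bind them with the Hadamard product.
true_indices = (3, 7, 1)
factors = [cb[i] for cb, i in zip(codebooks, true_indices)]
composite = factors[0] * factors[1] * factors[2]

# Bipolar vectors are their own inverse under the Hadamard product, so
# knowing two factors lets us unbind the third exactly.
recovered = composite * factors[0] * factors[1]

# Identify the recovered vector by nearest-neighbour lookup in its codebook.
similarities = codebooks[2] @ recovered
decoded_index = int(np.argmax(similarities))
```

The hard problem the abstract addresses starts where this sketch stops: with no factor known in advance, a brute-force search would have to test every combination of codebook entries.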

Submitted 14 July, 2020; v1 submitted 19 June, 2019; originally announced June 2019.

Comments: arXiv's LaTeX compiler contains a compatibility issue with the subcaption package, screwing up the placement of Figure 6 (and subsequent figures) in V3. This update simply remedies that issue

arXiv:1904.00986 [pdf]

A simple method for detecting chaos in nature

Authors: Daniel Toker, Friedrich T. Sommer, Mark D'Esposito

Abstract: Chaos, or exponential sensitivity to small perturbations, appears everywhere in nature. Moreover, chaos is predicted to play diverse functional roles in living systems. A method for detecting chaos from empirical measurements should therefore be a key component of the biologist's toolkit. But classic chaos-detection tools are highly sensitive to measurement noise and break down for common edge cases, making it difficult to detect chaos in domains, like biology, where measurements are noisy. However, newer tools promise to overcome these limitations. Here, we combine several such tools into an automated processing pipeline, and show that our pipeline can detect the presence (or absence) of chaos in noisy recordings, even for difficult edge cases. As a first-pass application of our pipeline, we show that heart rate variability is not chaotic as some have proposed, and instead reflects a stochastic process in both health and disease. Our tool is easy to use and freely available.

Submitted 9 January, 2020; v1 submitted 26 March, 2019; originally announced April 2019.

arXiv:1904.00929 [pdf, other]

Unsupervised Abbreviation Disambiguation: Contextual disambiguation using word embeddings

Authors: Manuel Ciosici, Tobias Sommer, Ira Assent

Abstract: Abbreviations often have several distinct meanings, which makes their use in text ambiguous. Expanding them to their intended meaning in context is important for Machine Reading tasks such as document search, recommendation and question answering. Existing approaches mostly rely on manually labeled examples of abbreviations and their correct long-forms. Such data sets are costly to create and result in trained models with limited applicability and flexibility. Importantly, most current methods must be subjected to a full empirical evaluation in order to understand their limitations, which is cumbersome in practice. In this paper, we present an entirely unsupervised abbreviation disambiguation method (called UAD) that picks up abbreviation definitions from unstructured text. Creating distinct tokens per meaning, we learn context representations as word vectors. We demonstrate how to further boost abbreviation disambiguation performance by obtaining better context representations using additional unstructured text. Our method is the first abbreviation disambiguation approach with a transparent model that allows performance analysis without requiring full-scale evaluation, making it highly relevant for real-world deployments. In our thorough empirical evaluation, UAD achieves high performance on large real-world data sets from different domains and outperforms both baseline and state-of-the-art methods. UAD scales well and supports thousands of abbreviations with multiple different meanings within a single model.
In order to spur more research into abbreviation disambiguation, we publish a new data set that we also use in our experiments.

Submitted 22 May, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

Comments: Fixed author names; Revised text and experimental section

arXiv:1901.07718 [pdf, other]

Robust computation with rhythmic spike patterns

Authors: E. Paxon Frady, Friedrich T. Sommer

Abstract: Information coding by precise timing of spikes can be faster and more energy-efficient than traditional rate coding. However, spike-timing codes are often brittle, which has limited their use in theoretical neuroscience and computing applications. Here, we propose a novel type of attractor neural network in complex state space, and show how it can be leveraged to construct spiking neural networks with robust computational properties through a phase-to-timing mapping. Building on Hebbian neural associative memories, like Hopfield networks, we first propose threshold phasor associative memory (TPAM) networks. Complex phasor patterns whose components can assume continuous-valued phase angles and binary magnitudes can be stored and retrieved as stable fixed points in the network dynamics. TPAM achieves high memory capacity when storing sparse phasor patterns, and we derive the energy function that governs its fixed point attractor dynamics. Second, through simulation experiments we show how the complex algebraic computations in TPAM can be approximated by a biologically plausible network of integrate-and-fire neurons with synaptic delays and recurrently connected inhibitory interneurons. The fixed points of TPAM in the complex domain are commensurate with stable periodic states of precisely timed spiking activity that are robust to perturbation.
The link established between rhythmic firing patterns and complex attractor dynamics has implications for the interpretation of spike patterns seen in neuroscience, and can serve as a framework for computation in emerging neuromorphic devices.

Submitted 22 January, 2019; originally announced January 2019.

Comments: 22 pages, 8 figures

arXiv:1806.09335 [pdf]

doi 10.13140/RG.2.2.32493.49129

Request for Comments: Proposal of a Blockchain for the Automatic Management and Acceptance of Student Achievements

Authors: Thorsten Sommer, Gergana Deppe, Valerie Stehling, Max Haberstroh, Frank Hees

Abstract: Staying abroad during their studies is increasingly popular for students. However, there are various challenges for both students and universities. One important question for students is whether or not achievements performed at different universities can be taken into account for either enrolling at a foreign university or for completing the studies at their home university. In addition to university achievements, an increasing proportion of the 195 million students worldwide receive certificates from MOOCs or other social media services. The integration of such services into university teaching is still in the initial stages and presents some challenges. In this paper we describe the idea of managing all these study achievements worldwide in a blockchain, which might solve the national and international challenges regarding the recognition of student achievements. The aim of this paper is to encourage discussion in the global community instead of presenting a finished concept. Some of the open research questions are: How to ensure student data protection, how to deal with fraud and how to deal with the possibility that students can analytically calculate the easiest way through their studies?

Submitted 25 June, 2018; originally announced June 2018.

arXiv:1803.00412 [pdf]

A theory of sequence indexing and working memory in recurrent neural networks

Authors: E. Paxon Frady, Denis Kleyko, Friedrich T. Sommer

Abstract: To accommodate structured approaches of neural computation, we propose a class of recurrent neural networks for indexing and storing sequences of symbols or analog data vectors. These networks with randomized input weights and orthogonal recurrent weights implement coding principles previously described in vector symbolic architectures (VSA), and leverage properties of reservoir computing. In general, the storage in reservoir computing is lossy and crosstalk noise limits the retrieval accuracy and information capacity. A novel theory to optimize memory performance in such networks is presented and compared with simulation experiments. The theory describes linear readout of analog data, and readout with winner-take-all error correction of symbolic data as proposed in VSA models. We find that diverse VSA models from the literature have universal performance properties, which are superior to what previous analyses predicted. Further, we propose novel VSA models with the statistically optimal Wiener filter in the readout that exhibit much higher information capacity, in particular for storing analog data. The presented theory also applies to memory buffers, networks with gradual forgetting, which can operate on infinite data streams without memory overflow. Interestingly, we find that different forgetting mechanisms, such as attenuating recurrent weights or neural nonlinearities, produce very similar behavior if the forgetting time constants are aligned.
Such models exhibit extensive capacity when their forgetting time constant is optimized for given noise conditions and network size. These results enable the design of new types of VSA models for the online processing of data streams.

Submitted 28 February, 2018; originally announced March 2018.

Comments: 62 pages, 19 Figures, 85 equations, accepted in Neural Computation. arXiv admin note: text overlap with arXiv:1707.01429

arXiv:1708.02967 [pdf]

Information Integration In Large Brain Networks

Authors: Daniel Toker, Friedrich T. Sommer

Abstract: An outstanding problem in neuroscience is to understand how information is integrated across the many modules of the brain. While classic information-theoretic measures have transformed our understanding of feedforward information processing in the brain's sensory periphery, comparable measures for information flow in the massively recurrent networks of the rest of the brain have been lacking. To address this, recent work in information theory has produced a sound measure of network-wide "integrated information," which can be estimated from time-series data. But a computational hurdle has stymied attempts to measure large-scale information integration in real brains. Specifically, the measurement of integrated information involves a combinatorial search for the informational "weakest link" of a network, a process whose computation time explodes super-exponentially with network size. Here, we show that spectral clustering, applied on the correlation matrix of time-series data, provides an approximate but robust solution to the search for the informational weakest link of large networks. This reduces the computation time for integrated information in large systems from longer than the lifespan of the universe to just minutes. We evaluate this solution in brain-like systems of coupled oscillators as well as in high-density electrocorticography data from two macaque monkeys, and show that the informational "weakest link" of the monkey cortex splits posterior sensory areas from anterior association areas.
Finally, we use our solution to provide evidence in support of the long-standing hypothesis that information integration is maximized by networks with a high global efficiency, and that modular network structures promote the segregation of information.
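The idea of applying spectral clustering to a correlation matrix can be sketched as follows. This is a generic Fiedler-vector bipartition on a toy matrix, not the authors' pipeline: the unnormalized graph Laplacian, the sign-based split, the function name `spectral_bipartition`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def spectral_bipartition(corr):
    """Split variables into two groups using the Fiedler vector."""
    # Use absolute correlations as edge weights of an undirected graph.
    weights = np.abs(np.asarray(corr, dtype=float))
    np.fill_diagonal(weights, 0.0)  # ignore self-correlations
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian)
    fiedler = eigenvectors[:, 1]  # eigenvector of the second-smallest eigenvalue
    return fiedler >= 0

# Toy correlation matrix: two tightly coupled blocks, weakly coupled to each other.
corr = np.full((6, 6), 0.1)
corr[:3, :3] = 0.9
corr[3:, 3:] = 0.9
np.fill_diagonal(corr, 1.0)

labels = spectral_bipartition(corr)
group_a = set(np.flatnonzero(labels).tolist())
group_b = set(np.flatnonzero(~labels).tolist())
```

A single eigendecomposition replaces the combinatorial search over bipartitions, which is where the speed-up described in the abstract would come from; which group receives which label depends on the arbitrary sign of the eigenvector.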

Submitted 8 February, 2019; v1 submitted 9 August, 2017; originally announced August 2017.

arXiv:1707.03925 [pdf, other]

doi 10.1103/PhysRevLett.119.143001

Long-Lived Ultracold Molecules with Electric and Magnetic Dipole Moments

Authors: Timur M. Rvachov, Hyungmok Son, Ariel T. Sommer, Sepehr Ebadi, Juliana J. Park, Martin W. Zwierlein, Wolfgang Ketterle, Alan O. Jamison

Abstract: We create fermionic dipolar $^{23}$Na$^6$Li molecules in their triplet ground state from an ultracold mixture of $^{23}$Na and $^6$Li. Using magneto-association across a narrow Feshbach resonance followed by a two-photon STIRAP transfer to the triplet ground state, we produce $3\,{\times}\,10^4$ ground state molecules in a spin-polarized state. We observe a lifetime of $4.6\,\text{s}$ in an isolated molecular sample, approaching the $p$-wave universal rate limit. Electron spin resonance spectroscopy of the triplet state was used to determine the hyperfine structure of this previously unobserved molecular state.

Submitted 12 February, 2018; v1 submitted 12 July, 2017; originally announced July 2017.

Comments: 5 pages, 5 figures

Journal ref: Phys. Rev. Lett. 119, 143001 (2017)

arXiv:1707.01429 [pdf, other]

Theory of the superposition principle for randomized connectionist representations in neural networks

Authors: E. Paxon Frady, Denis Kleyko, Friedrich T. Sommer

Abstract: To understand cognitive reasoning in the brain, it has been proposed that symbols and compositions of symbols are represented by activity patterns (vectors) in a large population of neurons. Formal models implementing this idea [Plate 2003], [Kanerva 2009], [Gayler 2003], [Eliasmith 2012] include a reversible superposition operation for representing with a single vector an entire set of symbols or an ordered sequence of symbols. If the representation space is high-dimensional, large sets of symbols can be superposed and individually retrieved. However, crosstalk noise limits the accuracy of retrieval and information capacity. To understand information processing in the brain and to design artificial neural systems for cognitive reasoning, a theory of this superposition operation is essential. Here, such a theory is presented. The superposition operations in different existing models are mapped to linear neural networks with unitary recurrent matrices, in which retrieval accuracy can be analyzed by a single equation. We show that networks representing information in superposition can achieve a channel capacity of about half a bit per neuron, a significant fraction of the total available entropy. Going beyond existing models, superposition operations with recency effects are proposed that avoid catastrophic forgetting when representing the history of infinite data streams. These novel models correspond to recurrent networks with non-unitary matrices or with nonlinear neurons, and can be analyzed and optimized with an extension of our theory.

Submitted 5 July, 2017; originally announced July 2017.

Comments: 42 pages, 13 figures

arXiv:1606.03474 [pdf, other]

Learning overcomplete, low coherence dictionaries with linear inference

Authors: Jesse A. Livezey, Alejandro F. Bujan, Friedrich T. Sommer

Abstract: Finding overcomplete latent representations of data has applications in data analysis, signal processing, machine learning, theoretical neuroscience and many other fields. In an overcomplete representation, the number of latent features exceeds the data dimensionality, which is useful when the data is undersampled by the measurements (compressed sensing, information bottlenecks in neural systems) or composed from multiple complete sets of linear features, each spanning the data space. Independent Components Analysis (ICA) is a linear technique for learning sparse latent representations, which typically has a lower computational cost than sparse coding, its nonlinear, recurrent counterpart. While well suited for finding complete representations, we show that overcompleteness poses a challenge to existing ICA algorithms. Specifically, the coherence control in existing ICA algorithms, necessary to prevent the formation of duplicate dictionary features, is ill-suited in the overcomplete case. We show that in this case several existing ICA algorithms have undesirable global minima that maximize coherence. Further, by comparing ICA algorithms on synthetic data and natural images to the computationally more expensive sparse coding solution, we show that the coherence control biases the exploration of the data manifold, sometimes yielding suboptimal solutions. We provide a theoretical explanation of these failures and, based on the theory, propose improved overcomplete ICA algorithms.
All told, this study contributes new insights into and methods for coherence control for linear ICA, some of which are applicable to many other, potentially nonlinear, unsupervised learning methods.

Submitted 15 October, 2018; v1 submitted 10 June, 2016; originally announced June 2016.

Comments: 27 pages, 11 figures

Journal ref: JMLR 20(174) 1-42 (2019)

arXiv:1311.2097 [pdf, ps, other]

doi 10.1162/NECO_a_00600

Risk-sensitive Reinforcement Learning

Authors: Yun Shen, Michael J. Tobia, Tobias Sommer, Klaus Obermayer

Abstract: We derive a family of risk-sensitive reinforcement learning methods for agents who face sequential decision-making tasks in uncertain environments. By applying a utility function to the temporal difference (TD) error, nonlinear transformations are effectively applied not only to the received rewards but also to the true transition probabilities of the underlying Markov decision process. When appropriate utility functions are chosen, the agents' behaviors express key features of human behavior as predicted by prospect theory (Kahneman and Tversky, 1979), for example different risk-preferences for gains and losses as well as the shape of subjective probability curves. We derive a risk-sensitive Q-learning algorithm, which is necessary for modeling human behavior when transition probabilities are unknown, and prove its convergence. As a proof of principle for the applicability of the new framework we apply it to quantify human behavior in a sequential investment task. We find that the risk-sensitive variant provides a significantly better fit to the behavioral data and that it leads to an interpretation of the subject's responses which is indeed consistent with prospect theory. The analysis of simultaneously measured fMRI signals shows a significant correlation of the risk-sensitive TD error with BOLD signal change in the ventral striatum. In addition we find a significant correlation of the risk-sensitive Q-values with neural activity in the striatum, cingulate cortex and insula, which is not present if standard Q-values are used.
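Applying a utility function to the TD error can be sketched as a single tabular update step. This is a minimal illustration under stated assumptions, not the algorithm as derived in the paper: the piecewise-linear `utility`, its weights, the learning rate, the discount factor, and the two-state toy table are all hypothetical choices.

```python
import numpy as np

def utility(td_error, gain_weight=0.5, loss_weight=1.5):
    """Hypothetical piecewise-linear utility: losses weigh more than gains."""
    return gain_weight * td_error if td_error >= 0 else loss_weight * td_error

def risk_sensitive_q_update(q, state, action, reward, next_state,
                            alpha=0.1, gamma=0.9):
    """One tabular Q-learning step with the utility applied to the TD error."""
    td_error = reward + gamma * np.max(q[next_state]) - q[state, action]
    q[state, action] += alpha * utility(td_error)  # updates q in place
    return td_error

# Toy table with two states and two actions, initialised to zero.
q_table = np.zeros((2, 2))
td_gain = risk_sensitive_q_update(q_table, state=0, action=0,
                                  reward=1.0, next_state=1)
td_loss = risk_sensitive_q_update(q_table, state=1, action=1,
                                  reward=-1.0, next_state=0)
```

With these weights a negative TD error moves the estimate three times as far as a positive one of the same size, a risk-averse asymmetry of the kind the abstract relates to different risk preferences for gains and losses.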

Submitted 23 January, 2014; v1 submitted 8 November, 2013; originally announced November 2013.

Comments: 27 pages, 7 figures

Journal ref: Neural Computation, Vol. 26, Nr. 7, pp. 1298--1328, 2014

arXiv:1302.4736 [pdf, other]

doi 10.1038/nature12338

Heavy Solitons in a Fermionic Superfluid

Authors: Tarik Yefsah, Ariel T. Sommer, Mark J. H. Ku, Lawrence W. Cheuk, Wenjie Ji, Waseem S. Bakr, Martin W. Zwierlein

Abstract: Topological excitations are found throughout nature, in proteins and DNA, as dislocations in crystals, as vortices and solitons in superfluids and superconductors, and generally in the wake of symmetry-breaking phase transitions. In fermionic systems, topological defects may provide bound states for fermions that often play a crucial role for the system's transport properties. Famous examples are Andreev bound states inside vortex cores, fractionally charged solitons in relativistic quantum field theory, and the spinless charged solitons responsible for the high conductivity of polymers. However, the free motion of topological defects in electronic systems is hindered by pinning at impurities. Here we create long-lived solitons in a strongly interacting fermionic superfluid by imprinting a phase step into the superfluid wavefunction, and directly observe their oscillatory motion in the trapped superfluid. As the interactions are tuned from the regime of Bose-Einstein condensation (BEC) of tightly bound molecules towards the Bardeen-Cooper-Schrieffer (BCS) limit of long-range Cooper pairs, the effective mass of the solitons increases dramatically to more than 200 times their bare mass. This signals their filling with Andreev states and strong quantum fluctuations. For the unitary Fermi gas, the mass enhancement is more than fifty times larger than expectations from mean-field Bogoliubov-de Gennes theory. Our work paves the way towards the experimental study and control of Andreev bound states in ultracold atomic gases.
In the presence of spin imbalance, the solitons created here represent one limit of the long sought-after Fulde-Ferrell-Larkin-Ovchinnikov (FFLO) state of mobile Cooper pairs.

Submitted 19 February, 2013; originally announced February 2013.

Comments: 8 pages, 6 figures

Journal ref: Nature 499, 426-430 (2013)

arXiv:1205.3483 [pdf, other]

doi 10.1103/PhysRevLett.109.095302

Spin-Injection Spectroscopy of a Spin-Orbit Coupled Fermi Gas

Authors: Lawrence W. Cheuk, Ariel T. Sommer, Zoran Hadzibabic, Tarik Yefsah, Waseem S. Bakr, Martin W. Zwierlein

Abstract: The coupling of the spin of electrons to their motional state lies at the heart of recently discovered topological phases of matter. Here we create and detect spin-orbit coupling in an atomic Fermi gas, a highly controllable form of quantum degenerate matter. We reveal the spin-orbit gap via spin-injection spectroscopy, which characterizes the energy-momentum dispersion and spin composition of the quantum states. For energies within the spin-orbit gap, the system acts as a spin diode. To fully inhibit transport, we open an additional spin gap, thereby creating a spin-orbit coupled lattice whose spinful band structure we probe. In the presence of s-wave interactions, such systems should display induced p-wave pairing, topological superfluidity, and Majorana edge states.

Submitted 15 May, 2012; originally announced May 2012.

Journal ref: Phys. Rev. Lett. 109, 095302 (2012)

arXiv:1112.1125

Learning in embodied action-perception loops through exploration

Authors: Daniel Y. Little, Friedrich T. Sommer

Abstract: Although exploratory behaviors are ubiquitous in the animal kingdom, their computational underpinnings are still largely unknown. Behavioral Psychology has identified learning as a primary drive underlying many exploratory behaviors. Exploration is seen as a means for an animal to gather sensory data useful for reducing its ignorance about the environment. While related problems have been addressed in Data Mining and Reinforcement Learning, the computational modeling of learning-driven exploration by embodied agents is largely unrepresented. Here, we propose a computational theory for learning-driven exploration based on the concept of missing information that allows an agent to identify informative actions using Bayesian inference. We demonstrate that when embodiment constraints are high, agents must actively coordinate their actions to learn efficiently. Compared to earlier approaches, our exploration policy yields more efficient learning across a range of worlds with diverse structures. The improved learning in turn affords greater success in general tasks including navigation and reward gathering. We conclude by discussing how the proposed theory relates to previous information-theoretic objectives of behavior, such as predictive information and the free energy principle, and how it might contribute to a general theory of exploratory behavior.

Submitted 9 December, 2011; v1 submitted 5 December, 2011; originally announced December 2011.

arXiv:1110.3747

doi 10.1038/NPHYS2273

Feynman diagrams versus Fermi-gas Feynman emulator

Authors: K. Van Houcke, F. Werner, E. Kozik, N. Prokofev, B. Svistunov, M. J. H. Ku, A. T. Sommer, L. W. Cheuk, A. Schirotzek, M. W. Zwierlein

Abstract: Precise understanding of strongly interacting fermions, from electrons in modern materials to nuclear matter, presents a major goal in modern physics. However, the theoretical description of interacting Fermi systems is usually plagued by the intricate quantum statistics at play. Here we present a cross-validation between a new theoretical approach, Bold Diagrammatic Monte Carlo (BDMC), and precision experiments on ultra-cold atoms. Specifically, we compute and measure with unprecedented accuracy the normal-state equation of state of the unitary gas, a prototypical example of a strongly correlated fermionic system. Excellent agreement demonstrates that a series of Feynman diagrams can be controllably resummed in a non-perturbative regime using BDMC. This opens the door to the solution of some of the most challenging problems across many areas of physics.

Submitted 19 March, 2012; v1 submitted 17 October, 2011; originally announced October 2011.

Journal ref: Nature Phys. 8, 366 (2012)

Showing 1–50 of 66 results for author: Sommer, T