-
A probabilistic latent variable model for detecting structure in binary data
Authors:
Christopher Warner,
Kiersten Ruda,
Friedrich T. Sommer
Abstract:
We introduce a novel, probabilistic binary latent variable model to detect noisy or approximate repeats of patterns in sparse binary data. The model is based on the "Noisy-OR model" (Heckerman, 1990), used previously for disease and topic modelling. The model's capability is demonstrated by extracting structure in recordings from retinal neurons, but it can be widely applied to discover and model…
▽ More
We introduce a novel, probabilistic binary latent variable model to detect noisy or approximate repeats of patterns in sparse binary data. The model is based on the "Noisy-OR model" (Heckerman, 1990), used previously for disease and topic modelling. The model's capability is demonstrated by extracting structure in recordings from retinal neurons, but it can be widely applied to discover and model latent structure in noisy binary data. In the context of spiking neural data, the task is to "explain" spikes of individual neurons in terms of groups of neurons, "Cell Assemblies" (CAs), that often fire together, due to mutual interactions or other causes. The model infers sparse activity in a set of binary latent variables, each describing the activity of a cell assembly. When the latent variable of a cell assembly is active, it reduces the probabilities of neurons belonging to this assembly to be inactive. The conditional probability kernels of the latent components are learned from the data in an expectation maximization scheme, involving inference of latent states and parameter adjustments to the model. We thoroughly validate the model on synthesized spike trains constructed to statistically resemble recorded retinal responses to white noise stimulus and natural movie stimulus in data. We also apply our model to spiking responses recorded in retinal ganglion cells (RGCs) during stimulation with a movie and discuss the found structure.
△ Less
Submitted 26 January, 2022;
originally announced January 2022.
-
A Neural Network MCMC sampler that maximizes Proposal Entropy
Authors:
Zengyi Li,
Yubei Chen,
Friedrich T. Sommer
Abstract:
Markov Chain Monte Carlo (MCMC) methods sample from unnormalized probability distributions and offer guarantees of exact sampling. However, in the continuous case, unfavorable geometry of the target distribution can greatly limit the efficiency of MCMC methods. Augmenting samplers with neural networks can potentially improve their efficiency. Previous neural network based samplers were trained wit…
▽ More
Markov Chain Monte Carlo (MCMC) methods sample from unnormalized probability distributions and offer guarantees of exact sampling. However, in the continuous case, unfavorable geometry of the target distribution can greatly limit the efficiency of MCMC methods. Augmenting samplers with neural networks can potentially improve their efficiency. Previous neural network based samplers were trained with objectives that either did not explicitly encourage exploration, or used a L2 jump objective which could only be applied to well structured distributions. Thus it seems promising to instead maximize the proposal entropy for adapting the proposal to distributions of any shape. To allow direct optimization of the proposal entropy, we propose a neural network MCMC sampler that has a flexible and tractable proposal distribution. Specifically, our network architecture utilizes the gradient of the target distribution for generating proposals. Our model achieves significantly higher efficiency than previous neural network MCMC techniques in a variety of sampling tasks. Further, the sampler is applied on training of a convergent energy-based model of natural images. The adaptive sampler achieves unbiased sampling with significantly higher proposal entropy than Langevin dynamics sampler.
△ Less
Submitted 7 October, 2020;
originally announced October 2020.
-
Complex Amplitude-Phase Boltzmann Machines
Authors:
Zengyi Li,
Friedrich T. Sommer
Abstract:
We extend the framework of Boltzmann machines to a network of complex-valued neurons with variable amplitudes, referred to as Complex Amplitude-Phase Boltzmann machine (CAP-BM). The model is capable of performing unsupervised learning on the amplitude and relative phase distribution in complex data. The sampling rule of the Gibbs distribution and the learning rules of the model are presented. Lear…
▽ More
We extend the framework of Boltzmann machines to a network of complex-valued neurons with variable amplitudes, referred to as Complex Amplitude-Phase Boltzmann machine (CAP-BM). The model is capable of performing unsupervised learning on the amplitude and relative phase distribution in complex data. The sampling rule of the Gibbs distribution and the learning rules of the model are presented. Learning in a Complex Amplitude-Phase restricted Boltzmann machine (CAP-RBM) is demonstrated on synthetic complex-valued images, and handwritten MNIST digits transformed by a complex wavelet transform. Specifically, we show the necessity of a new amplitude-amplitude coupling term in our model. The proposed model is potentially valuable for machine learning tasks involving complex-valued data with amplitude variation, and for develo** algorithms for novel computation hardware, such as coupled oscillators and neuromorphic hardware, on which Boltzmann sampling can be executed in the complex domain.
△ Less
Submitted 4 May, 2020;
originally announced May 2020.
-
Learning Energy-Based Models in High-Dimensional Spaces with Multi-scale Denoising Score Matching
Authors:
Zengyi Li,
Yubei Chen,
Friedrich T. Sommer
Abstract:
Energy-Based Models (EBMs) assign unnormalized log-probability to data samples. This functionality has a variety of applications, such as sample synthesis, data denoising, sample restoration, outlier detection, Bayesian reasoning, and many more. But training of EBMs using standard maximum likelihood is extremely slow because it requires sampling from the model distribution. Score matching potentia…
▽ More
Energy-Based Models (EBMs) assign unnormalized log-probability to data samples. This functionality has a variety of applications, such as sample synthesis, data denoising, sample restoration, outlier detection, Bayesian reasoning, and many more. But training of EBMs using standard maximum likelihood is extremely slow because it requires sampling from the model distribution. Score matching potentially alleviates this problem. In particular, denoising score matching \citep{vincent2011connection} has been successfully used to train EBMs. Using noisy data samples with one fixed noise level, these models learn fast and yield good results in data denoising \citep{saremi2019neural}. However, demonstrations of such models in high quality sample synthesis of high dimensional data were lacking. Recently, \citet{song2019generative} have shown that a generative model trained by denoising score matching accomplishes excellent sample synthesis, when trained with data samples corrupted with multiple levels of noise. Here we provide analysis and empirical evidence showing that training with multiple noise levels is necessary when the data dimension is high. Leveraging this insight, we propose a novel EBM trained with multi-scale denoising score matching. Our model exhibits data generation performance comparable to state-of-the-art techniques such as GANs, and sets a new baseline for EBMs. The proposed model also provides density information and performs well in an image inpainting task.
△ Less
Submitted 19 December, 2019; v1 submitted 17 October, 2019;
originally announced October 2019.
-
Resonator Networks outperform optimization methods at solving high-dimensional vector factorization
Authors:
Spencer J. Kent,
E. Paxon Frady,
Friedrich T. Sommer,
Bruno A. Olshausen
Abstract:
We develop theoretical foundations of Resonator Networks, a new type of recurrent neural network introduced in Frady et al. (2020) to solve a high-dimensional vector factorization problem arising in Vector Symbolic Architectures. Given a composite vector formed by the Hadamard product between a discrete set of high-dimensional vectors, a Resonator Network can efficiently decompose the composite in…
▽ More
We develop theoretical foundations of Resonator Networks, a new type of recurrent neural network introduced in Frady et al. (2020) to solve a high-dimensional vector factorization problem arising in Vector Symbolic Architectures. Given a composite vector formed by the Hadamard product between a discrete set of high-dimensional vectors, a Resonator Network can efficiently decompose the composite into these factors. We compare the performance of Resonator Networks against optimization-based methods, including Alternating Least Squares and several gradient-based algorithms, showing that Resonator Networks are superior in several important ways. This advantage is achieved by leveraging a combination of nonlinear dynamics and "searching in superposition," by which estimates of the correct solution are formed from a weighted superposition of all possible solutions. While the alternative methods also search in superposition, the dynamics of Resonator Networks allow them to strike a more effective balance between exploring the solution space and exploiting local information to drive the network toward probable solutions. Resonator Networks are not guaranteed to converge, but within a particular regime they almost always do. In exchange for relaxing this guarantee of global convergence, Resonator Networks are dramatically more effective at finding factorizations than all alternative approaches considered.
△ Less
Submitted 14 July, 2020; v1 submitted 19 June, 2019;
originally announced June 2019.