-
SemiMultiPose: A Semi-supervised Multi-animal Pose Estimation Framework
Authors:
Ari Blau,
Christoph Gebhardt,
Andres Bendesky,
Liam Paninski,
Anqi Wu
Abstract:
Multi-animal pose estimation is essential for studying animals' social behaviors in neuroscience and neuroethology. Advanced approaches have been proposed to support multi-animal estimation and achieve state-of-the-art performance. However, these models rarely exploit unlabeled data during training even though real-world applications have exponentially more unlabeled frames than labeled frames. Manually adding dense annotations for a large number of images or videos is costly and labor-intensive, especially for multiple instances. Given these deficiencies, we propose a novel semi-supervised architecture for multi-animal pose estimation, leveraging the abundant structures pervasive in unlabeled frames in behavior videos to enhance training, which is critical for sparsely-labeled problems. The resulting algorithm provides superior multi-animal pose estimation results on three animal experiments compared to the state-of-the-art baseline and exhibits more predictive power in sparsely-labeled data regimes.
Submitted 14 April, 2022;
originally announced April 2022.
-
Amortized Probabilistic Detection of Communities in Graphs
Authors:
Yueqi Wang,
Yoonho Lee,
Pallab Basu,
Juho Lee,
Yee Whye Teh,
Liam Paninski,
Ari Pakman
Abstract:
Learning community structures in graphs has broad applications across scientific domains. While graph neural networks (GNNs) have been successful in encoding graph structures, existing GNN-based methods for community detection are limited by requiring knowledge of the number of communities in advance, in addition to lacking a proper probabilistic formulation to handle uncertainty. We propose a simple framework for amortized community detection, which addresses both of these issues by combining the expressive power of GNNs with recent methods for amortized clustering. Our models consist of a graph representation backbone that extracts structural information and an amortized clustering network that naturally handles variable numbers of clusters. Both components combine into well-defined models of the posterior distribution of graph communities and are jointly optimized given labeled graphs. At inference time, the models yield parallel samples from the posterior of community labels, quantifying uncertainty in a principled way. We evaluate several models from our framework on synthetic and real datasets and demonstrate superior performance to previous methods. As a separate contribution, we extend recent amortized probabilistic clustering architectures by adding attention modules, which yield further improvements on community detection tasks.
Submitted 15 June, 2021; v1 submitted 29 October, 2020;
originally announced October 2020.
-
Chronic, cortex-wide imaging of specific cell populations during behavior
Authors:
Joao Couto,
Simon Musall,
Xiaonan R Sun,
Anup Khanal,
Steven Gluf,
Shreya Saxena,
Ian Kinsella,
Taiga Abe,
John P. Cunningham,
Liam Paninski,
Anne K Churchland
Abstract:
Measurements of neuronal activity across brain areas are important for understanding the neural correlates of cognitive and motor processes like attention, decision-making, and action selection. However, techniques that allow cellular resolution measurements are expensive and require a high degree of technical expertise, which limits their broad use. Widefield imaging of genetically encoded indicators is a high-throughput, cost-effective, and flexible approach to measure activity of specific cell populations with high temporal resolution and a cortex-wide field of view. Here we outline our protocol for assembling a widefield setup, a surgical preparation to image through the intact skull, and imaging neural activity chronically in behaving, transgenic mice that express a calcium indicator in specific subpopulations of cortical neurons. Further, we highlight a processing pipeline that leverages novel, cloud-based methods to analyze large-scale imaging datasets. The protocol targets labs that are seeking to build macroscopes, optimize surgical procedures for long-term chronic imaging, and/or analyze cortex-wide neuronal recordings.
Submitted 28 October, 2020;
originally announced October 2020.
-
A zero-inflated gamma model for deconvolved calcium imaging traces
Authors:
Xue-Xin Wei,
Ding Zhou,
Andres Grosmark,
Zaki Ajabi,
Fraser Sparks,
Pengcheng Zhou,
Mark Brandon,
Attila Losonczy,
Liam Paninski
Abstract:
Calcium imaging is a critical tool for measuring the activity of large neural populations. Much effort has been devoted to developing "pre-processing" tools for calcium video data, addressing the important issues of e.g., motion correction, denoising, compression, demixing, and deconvolution. However, statistical modeling of deconvolved calcium signals (i.e., the estimated activity extracted by a pre-processing pipeline) is just as critical for interpreting calcium measurements, and for incorporating these observations into downstream probabilistic encoding and decoding models. Surprisingly, these issues have to date received significantly less attention. In this work we examine the statistical properties of the deconvolved activity estimates, and compare probabilistic models for these random signals. In particular, we propose a zero-inflated gamma (ZIG) model, which characterizes the calcium responses as a mixture of a gamma distribution and a point mass that serves to model zero responses. We apply the resulting models to neural encoding and decoding problems. We find that the ZIG model outperforms simpler models (e.g., Poisson or Bernoulli models) in the context of both simulated and real neural data, and can therefore play a useful role in bridging calcium imaging analysis methods with tools for analyzing activity in large neural populations.
Submitted 5 June, 2020;
originally announced June 2020.
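The mixture described in the zero-inflated gamma abstract above can be illustrated with a minimal sketch. It assumes the simplest form of such a mixture: a point mass at zero with weight 1 - q and an unshifted gamma density with weight q. The function name and the parameter names `q`, `shape`, and `scale` are hypothetical and are not taken from the paper, whose exact parameterization may differ.

```python
import math


def zig_log_likelihood(y, q, shape, scale):
    """Log-likelihood of one observation under a zero-inflated gamma mixture.

    Assumed form (illustrative only):
        y == 0 with probability 1 - q       (point mass at zero)
        y  > 0 drawn from Gamma(shape, scale) with probability q
    Requires 0 < q < 1, shape > 0, scale > 0.
    """
    if y < 0:
        return -math.inf          # negative responses have zero probability
    if y == 0:
        return math.log1p(-q)     # log(1 - q): mass of the zero component
    # Log-density of the gamma component evaluated at y > 0.
    log_gamma_pdf = (
        (shape - 1.0) * math.log(y)
        - y / scale
        - math.lgamma(shape)
        - shape * math.log(scale)
    )
    return math.log(q) + log_gamma_pdf
```

Note that the value returned for y = 0 is a log-probability mass while the value for y > 0 is a log-density, which is the usual convention for likelihoods of mixed discrete-continuous models.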
-
Disentangled Sticky Hierarchical Dirichlet Process Hidden Markov Model
Authors:
Ding Zhou,
Yuanjun Gao,
Liam Paninski
Abstract:
The Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) has been used widely as a natural Bayesian nonparametric extension of the classical Hidden Markov Model for learning from sequential and time-series data. A sticky extension of the HDP-HMM has been proposed to strengthen the self-persistence probability in the HDP-HMM. However, the sticky HDP-HMM entangles the strength of the self-persistence prior and transition prior together, limiting its expressiveness. Here, we propose a more general model: the disentangled sticky HDP-HMM (DS-HDP-HMM). We develop novel Gibbs sampling algorithms for efficient inference in this model. We show that the disentangled sticky HDP-HMM outperforms the sticky HDP-HMM and HDP-HMM on both synthetic and real data, and apply the new approach to analyze neural data and segment behavioral video data.
Submitted 21 June, 2020; v1 submitted 6 April, 2020;
originally announced April 2020.
-
Linear-time inference for Gaussian Processes on one dimension
Authors:
Jackson Loper,
David Blei,
John P. Cunningham,
Liam Paninski
Abstract:
Gaussian Processes (GPs) provide powerful probabilistic frameworks for interpolation, forecasting, and smoothing, but have been hampered by computational scaling issues. Here we investigate data sampled on one dimension (e.g., a scalar or vector time series sampled at arbitrarily-spaced intervals), for which state-space models are popular due to their linearly-scaling computational costs. It has long been conjectured that state-space models are general, able to approximate any one-dimensional GP. We provide the first general proof of this conjecture, showing that any stationary GP on one dimension with vector-valued observations governed by a Lebesgue-integrable continuous kernel can be approximated to any desired precision using a specifically-chosen state-space model: the Latent Exponentially Generated (LEG) family. This new family offers several advantages compared to the general state-space model: it is always stable (no unbounded growth), the covariance can be computed in closed form, and its parameter space is unconstrained (allowing straightforward estimation via gradient descent). The theorem's proof also draws connections to Spectral Mixture Kernels, providing insight about this popular family of kernels. We develop parallelized algorithms for performing inference and learning in the LEG model, test the algorithm on real and synthetic data, and demonstrate scaling to datasets with billions of samples.
Submitted 12 October, 2021; v1 submitted 11 March, 2020;
originally announced March 2020.
-
Neural Clustering Processes
Authors:
Ari Pakman,
Yueqi Wang,
Catalin Mitelut,
JinHyung Lee,
Liam Paninski
Abstract:
Probabilistic clustering models (or equivalently, mixture models) are basic building blocks in countless statistical models and involve latent random variables over discrete spaces. For these models, posterior inference methods can be inaccurate and/or very slow. In this work we introduce deep network architectures trained with labeled samples from any generative model of clustered datasets. At test time, the networks generate approximate posterior samples of cluster labels for any new dataset of arbitrary size. We develop two complementary approaches to this task, requiring either O(N) or O(K) network forward passes per dataset, where N is the dataset size and K the number of clusters. Unlike previous approaches, our methods sample the labels of all the data points from a well-defined posterior, and can learn nonparametric Bayesian posteriors since they do not limit the number of mixture components. As a scientific application, we present a novel approach to neural spike sorting for high-density multielectrode arrays.
Submitted 23 June, 2020; v1 submitted 28 December, 2018;
originally announced January 2019.
-
Amortized Bayesian inference for clustering models
Authors:
Ari Pakman,
Liam Paninski
Abstract:
We develop methods for efficient amortized approximate Bayesian inference over posterior distributions of probabilistic clustering models, such as Dirichlet process mixture models. The approach is based on mapping distributed, symmetry-invariant representations of cluster arrangements into conditional probabilities. The method parallelizes easily, yields iid samples from the approximate posterior of cluster assignments at the same computational cost as a single Gibbs sampler sweep, and can easily be applied to both conjugate and non-conjugate models, as training only requires samples from the generative model.
Submitted 23 November, 2018;
originally announced November 2018.
-
Nonlinear Evolution via Spatially-Dependent Linear Dynamics for Electrophysiology and Calcium Data
Authors:
Daniel Hernandez,
Antonio Khalil Moretti,
Ziqiang Wei,
Shreya Saxena,
John Cunningham,
Liam Paninski
Abstract:
Latent variable models have been widely applied for the analysis of time series resulting from experimental neuroscience techniques. In these datasets, observations are relatively smooth and possibly nonlinear. We present Variational Inference for Nonlinear Dynamics (VIND), a variational inference framework that is able to uncover nonlinear, smooth latent dynamics from sequential data. The framework is a direct extension of PfLDS, including a structured approximate posterior describing spatially-dependent linear dynamics, as well as an algorithm that relies on the fixed-point iteration method to achieve convergence. We apply VIND to electrophysiology, single-cell voltage, and widefield imaging datasets with state-of-the-art results in reconstruction error. In single-cell voltage data, VIND finds a 5D latent space, with variables akin to those of Hodgkin-Huxley-like models. VIND's learned dynamics are further quantified by predicting future neural activity. VIND excels in this task, in some cases substantially outperforming current methods.
Submitted 16 June, 2020; v1 submitted 6 November, 2018;
originally announced November 2018.
-
Penalized matrix decomposition for denoising, compression, and improved demixing of functional imaging data
Authors:
E. Kelly Buchanan,
Ian Kinsella,
Ding Zhou,
Rong Zhu,
Pengcheng Zhou,
Felipe Gerhard,
John Ferrante,
Ying Ma,
Sharon Kim,
Mohammed Shaik,
Yajie Liang,
Rongwen Lu,
Jacob Reimer,
Paul Fahey,
Taliah Muhammad,
Graham Dempsey,
Elizabeth Hillman,
Na Ji,
Andreas Tolias,
Liam Paninski
Abstract:
Calcium imaging has revolutionized systems neuroscience, providing the ability to image large neural populations with single-cell resolution. The resulting datasets are quite large, which has presented a barrier to routine open sharing of this data, slowing progress in reproducible research. State of the art methods for analyzing this data are based on non-negative matrix factorization (NMF); these approaches solve a non-convex optimization problem, and are effective when good initializations are available, but can break down in low-SNR settings where common initialization approaches fail. Here we introduce an approach to compressing and denoising functional imaging data. The method is based on a spatially-localized penalized matrix decomposition (PMD) of the data to separate (low-dimensional) signal from (temporally-uncorrelated) noise. This approach can be applied in parallel on local spatial patches and is therefore highly scalable, does not impose non-negativity constraints or require stringent identifiability assumptions (leading to significantly more robust results compared to NMF), and estimates all parameters directly from the data, so no hand-tuning is required. We have applied the method to a wide range of functional imaging data (including one-photon, two-photon, three-photon, widefield, somatic, axonal, dendritic, calcium, and voltage imaging datasets): in all cases, we observe ~2-4x increases in SNR and compression rates of 20-300x with minimal visible loss of signal, with no adjustment of hyperparameters; this in turn facilitates the process of demixing the observed activity into contributions from individual neurons. We focus on two challenging applications: dendritic calcium imaging data and voltage imaging data in the context of optogenetic stimulation. In both cases, we show that our new approach leads to faster and much more robust extraction of activity from the data.
Submitted 17 July, 2018;
originally announced July 2018.
-
Reparameterizing the Birkhoff Polytope for Variational Permutation Inference
Authors:
Scott W. Linderman,
Gonzalo E. Mena,
Hal Cooper,
Liam Paninski,
John P. Cunningham
Abstract:
Many matching, tracking, sorting, and ranking problems require probabilistic reasoning about possible permutations, a set that grows factorially with dimension. Combinatorial optimization algorithms may enable efficient point estimation, but fully Bayesian inference poses a severe challenge in this high-dimensional, discrete space. To surmount this challenge, we start with the usual step of relaxing a discrete set (here, of permutation matrices) to its convex hull, which here is the Birkhoff polytope: the set of all doubly-stochastic matrices. We then introduce two novel transformations: first, an invertible and differentiable stick-breaking procedure that maps unconstrained space to the Birkhoff polytope; second, a map that rounds points toward the vertices of the polytope. Both transformations include a temperature parameter that, in the limit, concentrates the densities on permutation matrices. We then exploit these transformations and reparameterization gradients to introduce variational inference over permutation matrices, and we demonstrate its utility in a series of experiments.
Submitted 25 October, 2017;
originally announced October 2017.
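The relaxation mentioned in the Birkhoff polytope abstract above rests on a standard fact: permutation matrices are the vertices of the Birkhoff polytope, so any convex combination of them is doubly stochastic. The sketch below only checks that fact numerically. It does not implement the stick-breaking or rounding transformations of the paper, and all function names are hypothetical.

```python
def permutation_matrix(perm):
    """Return the n x n matrix with a 1 at (i, perm[i]) and 0 elsewhere."""
    n = len(perm)
    return [[1.0 if perm[i] == j else 0.0 for j in range(n)] for i in range(n)]


def convex_combination(weights, matrices):
    """Entrywise weighted sum; weights are assumed non-negative and sum to 1."""
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]


def is_doubly_stochastic(matrix, tol=1e-9):
    """Check non-negativity and that every row and column sums to 1."""
    n = len(matrix)
    non_negative = all(x >= -tol for row in matrix for x in row)
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in matrix)
    cols_ok = all(
        abs(sum(matrix[i][j] for i in range(n)) - 1.0) <= tol for j in range(n)
    )
    return non_negative and rows_ok and cols_ok


# A point strictly inside the polytope, built from two of its vertices.
mix = convex_combination(
    [0.25, 0.75],
    [permutation_matrix([0, 1, 2]), permutation_matrix([1, 2, 0])],
)
```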
-
Automated scalable segmentation of neurons from multispectral images
Authors:
Uygar Sümbül,
Douglas Roussien Jr.,
Fei Chen,
Nicholas Barry,
Edward S. Boyden,
Dawen Cai,
John P. Cunningham,
Liam Paninski
Abstract:
Reconstruction of neuroanatomy is a fundamental problem in neuroscience. Stochastic expression of colors in individual cells is a promising tool, although its use in the nervous system has been limited due to various sources of variability in expression. Moreover, the intermingled anatomy of neuronal trees is challenging for existing segmentation algorithms. Here, we propose a method to automate the segmentation of neurons in such (potentially pseudo-colored) images. The method uses spatio-color relations between the voxels, generates supervoxels to reduce the problem size by four orders of magnitude before the final segmentation, and is parallelizable over the supervoxels. To quantify performance and gain insight, we generate simulated images, where the noise level and characteristics, the density of expression, and the number of fluorophore types are variable. We also present segmentations of real Brainbow images of the mouse hippocampus, which reveal many of the dendritic segments.
Submitted 21 January, 2017; v1 submitted 1 November, 2016;
originally announced November 2016.
-
Recurrent switching linear dynamical systems
Authors:
Scott W. Linderman,
Andrew C. Miller,
Ryan P. Adams,
David M. Blei,
Liam Paninski,
Matthew J. Johnson
Abstract:
Many natural systems, such as neurons firing in the brain or basketball teams traversing a court, give rise to time series data with complex, nonlinear dynamics. We can gain insight into these systems by decomposing the data into segments that are each explained by simpler dynamic units. Building on switching linear dynamical systems (SLDS), we present a new model class that not only discovers these dynamical units, but also explains how their switching behavior depends on observations or continuous latent states. These "recurrent" switching linear dynamical systems provide further insight by discovering the conditions under which each unit is deployed, something that traditional SLDS models fail to do. We leverage recent algorithmic advances in approximate inference to make Bayesian inference in these models easy, fast, and scalable.
Submitted 26 October, 2016;
originally announced October 2016.
-
Stochastic Bouncy Particle Sampler
Authors:
Ari Pakman,
Dar Gilboa,
David Carlson,
Liam Paninski
Abstract:
We introduce a novel stochastic version of the non-reversible, rejection-free Bouncy Particle Sampler (BPS), a Markov process whose sample trajectories are piecewise linear. The algorithm is based on simulating first arrival times in a doubly stochastic Poisson process using the thinning method, and allows efficient sampling of Bayesian posteriors in big datasets. We prove that in the BPS no bias is introduced by noisy evaluations of the log-likelihood gradient. On the other hand, we argue that efficiency considerations favor a small, controllable bias in the construction of the thinning proposals, in exchange for faster mixing. We introduce a simple regression-based proposal intensity for the thinning method that controls this trade-off. We illustrate the algorithm in several examples in which it outperforms both unbiased but slowly mixing stochastic versions of BPS and biased stochastic gradient-based samplers.
Submitted 13 June, 2017; v1 submitted 2 September, 2016;
originally announced September 2016.
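The thinning method named in the Stochastic Bouncy Particle Sampler abstract above can be sketched in its generic textbook form. This sketch assumes a deterministic rate function with a known constant upper bound. It is not the doubly stochastic, regression-based construction of the paper, and the function and parameter names are hypothetical.

```python
import math
import random


def first_arrival_by_thinning(rate, rate_bound, rng, t_max=math.inf):
    """Sample the first arrival time of an inhomogeneous Poisson process.

    rate       : callable t -> intensity, assumed to satisfy
                 0 <= rate(t) <= rate_bound for all t >= 0
    rate_bound : constant upper bound on the intensity (must be > 0)
    rng        : a random.Random instance
    t_max      : give up and return None if no arrival occurs before t_max
    """
    t = 0.0
    while True:
        # Propose the next event of a homogeneous process with rate rate_bound.
        t += rng.expovariate(rate_bound)
        if t > t_max:
            return None
        # Accept the proposal with probability rate(t) / rate_bound.
        if rng.random() * rate_bound < rate(t):
            return t


# Usage: intensity that decays from 2.0 toward 1.0, bounded above by 2.0.
arrival = first_arrival_by_thinning(
    lambda t: 1.0 + math.exp(-t), 2.0, random.Random(0)
)
```

When the bound is loose, most proposals are rejected, which is why the choice of proposal intensity affects efficiency.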
-
Fast Online Deconvolution of Calcium Imaging Data
Authors:
Johannes Friedrich,
Pengcheng Zhou,
Liam Paninski
Abstract:
Fluorescent calcium indicators are a popular means for observing the spiking activity of large neuronal populations, but extracting the activity of each neuron from raw fluorescence calcium imaging data is a nontrivial problem. We present a fast online active set method to solve this sparse non-negative deconvolution problem. Importantly, the algorithm progresses through each time series sequentially from beginning to end, thus enabling real-time online estimation of neural activity during the imaging session. Our algorithm is a generalization of the pool adjacent violators algorithm (PAVA) for isotonic regression and inherits its linear-time computational complexity. We gain remarkable increases in processing speed: more than one order of magnitude compared to currently employed state of the art convex solvers relying on interior point methods. Unlike these approaches, our method can exploit warm starts; therefore optimizing model hyperparameters only requires a handful of passes through the data. A minor modification can further improve the quality of activity inference by imposing a constraint on the minimum spike size. The algorithm enables real-time simultaneous deconvolution of $O(10^5)$ traces of whole-brain larval zebrafish imaging data on a laptop.
Submitted 16 March, 2017; v1 submitted 2 September, 2016;
originally announced September 2016.
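For readers unfamiliar with the pool adjacent violators algorithm (PAVA) that the deconvolution abstract above says it generalizes, here is a minimal sketch of the classic algorithm for isotonic (non-decreasing) least-squares regression. It is the textbook PAVA only, not the authors' deconvolution method, and the function name `pava` is hypothetical.

```python
def pava(y, weights=None):
    """Weighted least-squares non-decreasing fit to the sequence y.

    Scans the data once from beginning to end. Each point is pushed once
    and each merge removes a block, so the total work is linear in len(y).
    """
    if weights is None:
        weights = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for value, weight in zip(y, weights):
        blocks.append([float(value), float(weight), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, weight_b, count_b = blocks.pop()
            mean_a, weight_a, count_a = blocks.pop()
            total = weight_a + weight_b
            pooled_mean = (mean_a * weight_a + mean_b * weight_b) / total
            blocks.append([pooled_mean, total, count_a + count_b])
    # Expand the blocks back to one fitted value per input point.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit
```

The sequential, left-to-right structure is what makes this family of algorithms suitable for online use, as the abstract emphasizes.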
-
Robust and scalable Bayesian analysis of spatial neural tuning function data
Authors:
Kamiar Rahnama Rad,
Timothy A. Machado,
Liam Paninski
Abstract:
A common analytical problem in neuroscience is the interpretation of neural activity with respect to sensory input or behavioral output. This is typically achieved by regressing measured neural activity against known stimuli or behavioral variables to produce a "tuning function" for each neuron. Unfortunately, because this approach handles neurons individually, it cannot take advantage of simultaneous measurements from spatially adjacent neurons that often have similar tuning properties. On the other hand, sharing information between adjacent neurons can errantly degrade estimates of tuning functions across space if there are sharp discontinuities in tuning between nearby neurons. In this paper, we develop a computationally efficient block Gibbs sampler that effectively pools information between neurons to de-noise tuning function estimates while simultaneously preserving sharp discontinuities that might exist in the organization of tuning across space. This method is fully Bayesian and its computational cost per iteration scales sub-quadratically with total parameter dimensionality. We demonstrate the robustness and scalability of this approach by applying it to both real and synthetic datasets. In particular, an application to data from the spinal cord illustrates that the proposed methods can dramatically decrease the experimental time required to accurately estimate tuning functions.
Submitted 24 June, 2016;
originally announced June 2016.
-
Linear dynamical neural population models through nonlinear embeddings
Authors:
Yuanjun Gao,
Evan Archer,
Liam Paninski,
John P. Cunningham
Abstract:
A body of recent work in modeling neural activity focuses on recovering low-dimensional latent features that capture the statistical structure of large-scale neural populations. Most such approaches have focused on linear generative models, where inference is computationally tractable. Here, we propose fLDS, a general class of nonlinear generative models that permits the firing rate of each neuron to vary as an arbitrary smooth function of a latent, linear dynamical state. This extra flexibility allows the model to capture a richer set of neural variability than a purely linear model, but retains an easily visualizable low-dimensional latent space. To fit this class of non-conjugate models we propose a variational inference scheme, along with a novel approximate posterior capable of capturing rich temporal correlations. We show that our techniques permit inference in a wide class of generative models. We also show in application to two neural datasets that, compared to state-of-the-art neural population models, fLDS captures a much larger proportion of neural variability with a small number of latent dimensions, providing superior predictive performance and interpretability.
Submitted 25 October, 2016; v1 submitted 26 May, 2016;
originally announced May 2016.
-
Efficient and accurate extraction of in vivo calcium signals from microendoscopic video data
Authors:
Pengcheng Zhou,
Shanna L. Resendez,
Jose Rodriguez-Romaguera,
Jessica C. Jimenez,
Shay Q. Neufeld,
Garret D. Stuber,
Rene Hen,
Mazen A. Kheirbek,
Bernardo L. Sabatini,
Robert E. Kass,
Liam Paninski
Abstract:
In vivo calcium imaging through microendoscopes has enabled deep brain imaging of previously inaccessible neuronal populations within the brains of freely moving subjects. However, microendoscopic data suffer from high levels of background fluorescence as well as an increased potential for overlapping neuronal signals. Previous methods fail in identifying neurons and demixing their temporal activity because the cellular signals are often submerged in the large fluctuating background. Here we develop an efficient method to extract cellular signals with minimal influence from the background. We model the background with two realistic components: (1) one models the constant baseline and slow trends of each pixel, and (2) the other models the fast fluctuations from out-of-focus signals and is therefore constrained to have low spatial-frequency structure. This decomposition avoids cellular signals being absorbed into the background term. After subtracting the background approximated with this model, we use Constrained Nonnegative Matrix Factorization (CNMF, Pnevmatikakis et al. (2016)) to better demix neural signals and get their denoised and deconvolved temporal activity. We validate our method on simulated and experimental data, where it shows fast, reliable, and high quality signal extraction under a wide variety of imaging parameters.
Submitted 25 May, 2017; v1 submitted 23 May, 2016;
originally announced May 2016.
-
Bayesian methods for event analysis of intracellular currents
Authors:
Josh Merel,
Ben Shababo,
Alex Naka,
Hillel Adesnik,
Liam Paninski
Abstract:
Investigation of neural circuit functioning often requires statistical interpretation of events in subthreshold electrophysiological recordings. This problem is non-trivial because recordings may have moderate levels of structured noise and events may have distinct kinetics. In addition, novel experimental designs that combine optical and electrophysiological methods will depend upon statistical tools that combine multimodal data. We present a Bayesian approach for inferring the timing, strength, and kinetics of postsynaptic currents (PSCs) from voltage-clamp recordings on a per event basis. The simple generative model for a single voltage-clamp recording flexibly extends to include network-level structure to enable experiments designed to probe synaptic connectivity. We validate the approach on simulated and real data. We also demonstrate that extensions of the basic PSC detection algorithm can handle recordings contaminated with optically evoked currents, and we simulate a scenario in which calcium imaging observations, available for a subset of neurons, can be fused with electrophysiological data to achieve higher temporal resolution. We apply this approach to simulated and real ground truth data to demonstrate its higher sensitivity in detecting small signal-to-noise events and its increased robustness to noise compared to standard methods for detecting PSCs. The new Bayesian event analysis approach for electrophysiological recordings should allow for better estimation of physiological parameters under more variable conditions and help support new experimental designs for circuit mapping.
Submitted 18 May, 2016; v1 submitted 20 March, 2016;
originally announced March 2016.
-
Partition Functions from Rao-Blackwellized Tempered Sampling
Authors:
David Carlson,
Patrick Stinson,
Ari Pakman,
Liam Paninski
Abstract:
Partition functions of probability distributions are important quantities for model evaluation and comparisons. We present a new method to compute partition functions of complex and multimodal distributions. Such distributions are often sampled using simulated tempering, which augments the target space with an auxiliary inverse temperature variable. Our method exploits the multinomial probability law of the inverse temperatures, and provides estimates of the partition function in terms of a simple quotient of Rao-Blackwellized marginal inverse temperature probability estimates, which are updated while sampling. We show that the method has interesting connections with several alternative popular methods, and offers some significant advantages. In particular, we empirically find that the new method provides more accurate estimates than Annealed Importance Sampling when calculating partition functions of large Restricted Boltzmann Machines (RBM); moreover, the method is sufficiently accurate to track training and validation log-likelihoods during learning of RBMs, at minimal computational cost.
Submitted 25 May, 2016; v1 submitted 6 March, 2016;
originally announced March 2016.
-
Black box variational inference for state space models
Authors:
Evan Archer,
Il Memming Park,
Lars Buesing,
John Cunningham,
Liam Paninski
Abstract:
Latent variable time-series models are among the most heavily used tools from machine learning and applied statistics. These models have the advantage of learning latent structure both from noisy observations and from the temporal ordering in the data, where it is assumed that meaningful correlation structure exists across time. A few highly-structured models, such as the linear dynamical system with linear-Gaussian observations, have closed-form inference procedures (e.g. the Kalman Filter), but this case is an exception to the general rule that exact posterior inference in more complex generative models is intractable. Consequently, much work in time-series modeling focuses on approximate inference procedures for one particular class of models. Here, we extend recent developments in stochastic variational inference to develop a 'black-box' approximate inference technique for latent variable models with latent dynamical structure. We propose a structured Gaussian variational approximate posterior that carries the same intuition as the standard Kalman filter-smoother but, importantly, permits us to use the same inference approach to approximate the posterior of much more general, nonlinear latent variable generative models. We show that our approach recovers accurate estimates in the case of basic models with closed-form posteriors, and more interestingly performs well in comparison to variational approaches that were designed in a bespoke fashion for specific non-conjugate models.
Submitted 23 November, 2015;
originally announced November 2015.
-
Neuroprosthetic decoder training as imitation learning
Authors:
Josh Merel,
David Carlson,
Liam Paninski,
John P. Cunningham
Abstract:
Neuroprosthetic brain-computer interfaces function via an algorithm which decodes neural activity of the user into movements of an end effector, such as a cursor or robotic arm. In practice, the decoder is often learned by updating its parameters while the user performs a task. When the user's intention is not directly observable, recent methods have demonstrated value in training the decoder against a surrogate for the user's intended movement. We describe how training a decoder in this way is a novel variant of an imitation learning problem, where an oracle or expert is employed for supervised training in lieu of direct observations, which are not available. Specifically, we describe how a generic imitation learning meta-algorithm, dataset aggregation (DAgger, [1]), can be adapted to train a generic brain-computer interface. By deriving existing learning algorithms for brain-computer interfaces in this framework, we provide a novel analysis of regret (an important metric of learning efficacy) for brain-computer interfaces. This analysis allows us to characterize the space of algorithmic variants and bounds on their regret rates. Existing approaches for decoder learning have been performed in the cursor control setting, but the available design principles for these decoders are such that it has been impossible to scale them to naturalistic settings. Leveraging our findings, we then offer an algorithm that combines imitation learning with optimal control, which should allow for training of arbitrary effectors for which optimal control can generate goal-oriented control. We demonstrate this novel and general BCI algorithm with simulated neuroprosthetic control of a 26 degree-of-freedom model of an arm, a sophisticated and realistic end effector.
Submitted 14 March, 2016; v1 submitted 12 November, 2015;
originally announced November 2015.
-
A structured matrix factorization framework for large scale calcium imaging data analysis
Authors:
Eftychios A. Pnevmatikakis,
Yuanjun Gao,
Daniel Soudry,
David Pfau,
Clay Lacefield,
Kira Poskanzer,
Randy Bruno,
Rafael Yuste,
Liam Paninski
Abstract:
We present a structured matrix factorization approach to analyzing calcium imaging recordings of large neuronal ensembles. Our goal is to simultaneously identify the locations of the neurons, demix spatially overlapping components, and denoise and deconvolve the spiking activity of each neuron from the slow dynamics of the calcium indicator. The matrix factorization approach relies on the observation that the spatiotemporal fluorescence activity can be expressed as a product of two matrices: a spatial matrix that encodes the location of each neuron in the optical field and a temporal matrix that characterizes the calcium concentration of each neuron over time. We present a simple approach for estimating the dynamics of the calcium indicator as well as the observation noise statistics from the observed data. These parameters are then used to set up the matrix factorization problem in a constrained form that requires no further parameter tuning. We discuss initialization and post-processing techniques that enhance the performance of our method, along with efficient and largely parallelizable algorithms. We apply our method to in vivo large scale multi-neuronal imaging data and also demonstrate how similar methods can be used for the analysis of in vivo dendritic imaging data.
Submitted 9 September, 2014;
originally announced September 2014.
-
Bayesian spike inference from calcium imaging data
Authors:
Eftychios A. Pnevmatikakis,
Josh Merel,
Ari Pakman,
Liam Paninski
Abstract:
We present efficient Bayesian methods for extracting neuronal spiking information from calcium imaging data. The goal of our methods is to sample from the posterior distribution of spike trains and model parameters (baseline concentration, spike amplitude, etc.) given noisy calcium imaging data. We present discrete time algorithms where we sample the existence of a spike at each time bin using Gibbs methods, as well as continuous time algorithms where we sample over the number of spikes and their locations at an arbitrary resolution using Metropolis-Hastings methods for point processes. We provide Rao-Blackwellized extensions that (i) marginalize over several model parameters and (ii) provide smooth estimates of the marginal spike posterior distribution in continuous time. Our methods serve as complements to standard point estimates and allow for quantification of uncertainty in estimating the underlying spike train and model parameters.
Submitted 26 November, 2013;
originally announced November 2013.
-
Auxiliary-variable Exact Hamiltonian Monte Carlo Samplers for Binary Distributions
Authors:
Ari Pakman,
Liam Paninski
Abstract:
We present a new approach to sample from generic binary distributions, based on an exact Hamiltonian Monte Carlo algorithm applied to a piecewise continuous augmentation of the binary distribution of interest. An extension of this idea to distributions over mixtures of binary and possibly-truncated Gaussian or exponential variables allows us to sample from posteriors of linear and probit regression models with spike-and-slab priors and truncated parameters. We illustrate the advantages of these algorithms in several examples in which they outperform the Metropolis or Gibbs samplers.
Submitted 12 October, 2015; v1 submitted 9 November, 2013;
originally announced November 2013.
-
A shotgun sampling solution for the common input problem in neural connectivity inference
Authors:
Daniel Soudry,
Suraj Keshri,
Patrick Stinson,
Min-hwan Oh,
Garud Iyengar,
Liam Paninski
Abstract:
Inferring connectivity in neuronal networks remains a key challenge in statistical neuroscience. The 'common input' problem presents the major roadblock: it is difficult to reliably distinguish causal connections between pairs of observed neurons from correlations induced by common input from unobserved neurons. Since available recording techniques allow us to sample from only a small fraction of large networks simultaneously with sufficient temporal resolution, naive connectivity estimators that neglect these common input effects are highly biased. This work proposes a 'shotgun' experimental design, in which we observe multiple sub-networks briefly, in a serial manner. Thus, while the full network cannot be observed simultaneously at any given time, we may be able to observe most of it during the entire experiment. Using a generalized linear model for a spiking recurrent neural network, we develop scalable approximate Bayesian methods to perform network inference given this type of data, in which only a small fraction of the network is observed in each time bin. We demonstrate in simulation that, using this method: (1) The shotgun experimental design can eliminate the biases induced by common input effects. (2) Networks with thousands of neurons, in which only a small fraction of the neurons is observed in each time bin, could be quickly and accurately estimated. (3) Performance can be improved if we exploit prior information about the probability of having a connection between two neurons, its dependence on neuronal cell types (e.g., Dale's law), or its dependence on the distance between neurons.
Submitted 17 December, 2014; v1 submitted 15 September, 2013;
originally announced September 2013.
-
Fast inference in generalized linear models via expected log-likelihoods
Authors:
Alexandro D. Ramirez,
Liam Paninski
Abstract:
Generalized linear models play an essential role in a wide variety of statistical applications. This paper discusses an approximation of the likelihood in these models that can greatly facilitate computation. The basic idea is to replace a sum that appears in the exact log-likelihood by an expectation over the model covariates; the resulting "expected log-likelihood" can in many cases be computed significantly faster than the exact log-likelihood. In many neuroscience experiments the distribution over model covariates is controlled by the experimenter and the expected log-likelihood approximation becomes particularly useful; for example, estimators based on maximizing this expected log-likelihood (or a penalized version thereof) can often be obtained with orders of magnitude computational savings compared to the exact maximum likelihood estimators. A risk analysis establishes that these maximum EL estimators often come with little cost in accuracy (and in some cases even improved accuracy) compared to standard maximum likelihood estimates. Finally, we find that these methods can significantly decrease the computation time of marginal likelihood calculations for model selection and of Markov chain Monte Carlo methods for sampling from the posterior parameter distribution. We illustrate our results by applying these methods to a computationally-challenging dataset of neural spike trains obtained via large-scale multi-electrode recordings in the primate retina.
Submitted 24 May, 2013;
originally announced May 2013.
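As a rough illustration of the expected log-likelihood idea described in the abstract above, the sketch below compares the exact and expected log-likelihoods for a one-dimensional Poisson GLM. The exponential link, the zero-mean Gaussian covariate distribution, the closed-form expectation exp(theta^2 sigma^2 / 2), the placeholder counts, and all function names are assumptions chosen for this example; they are not taken from the paper.

```python
import math
import random


def exact_loglik(theta, xs, ys):
    """Poisson GLM log-likelihood with rate exp(theta * x), up to a constant
    that does not depend on theta: sum_i [y_i * theta * x_i - exp(theta * x_i)]."""
    return sum(y * theta * x - math.exp(theta * x) for x, y in zip(xs, ys))


def expected_loglik(theta, xs, ys, sigma):
    """Keep the linear term, but replace sum_i exp(theta * x_i) by
    N * E[exp(theta * x)], which equals N * exp(theta^2 * sigma^2 / 2)
    under the assumption x ~ N(0, sigma^2)."""
    linear = theta * sum(y * x for x, y in zip(xs, ys))
    return linear - len(xs) * math.exp(0.5 * (theta * sigma) ** 2)


random.seed(0)
sigma, theta = 1.0, 0.5
xs = [random.gauss(0.0, sigma) for _ in range(200_000)]
ys = [1 if x > 0 else 0 for x in xs]  # arbitrary placeholder counts, not Poisson draws

# Per-sample difference between the two objectives; it comes only from
# approximating the empirical mean of exp(theta * x_i) by its expectation.
per_sample_gap = abs(
    exact_loglik(theta, xs, ys) - expected_loglik(theta, xs, ys, sigma)
) / len(xs)
```

In this toy setting the sum of y_i * x_i can be computed once and reused, so evaluating the expected version at a new theta needs no further pass over the data, which is one plausible source of the savings the abstract mentions.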
-
Exact Hamiltonian Monte Carlo for Truncated Multivariate Gaussians
Authors:
Ari Pakman,
Liam Paninski
Abstract:
We present a Hamiltonian Monte Carlo algorithm to sample from multivariate Gaussian distributions in which the target space is constrained by linear and quadratic inequalities or products thereof. The Hamiltonian equations of motion can be integrated exactly and there are no parameters to tune. The algorithm mixes faster and is more efficient than Gibbs sampling. The runtime depends on the number and shape of the constraints but the algorithm is highly parallelizable. In many cases, we can exploit special structure in the covariance matrices of the untruncated Gaussian to further speed up the runtime. A simple extension of the algorithm permits sampling from distributions whose log-density is piecewise quadratic, as in the "Bayesian Lasso" model.
Submitted 5 June, 2013; v1 submitted 20 August, 2012;
originally announced August 2012.
-
Efficient methods for sampling spike trains in networks of coupled neurons
Authors:
Yuriy Mishchenko,
Liam Paninski
Abstract:
Monte Carlo approaches have recently been proposed to quantify connectivity in neuronal networks. The key problem is to sample from the conditional distribution of a single neuronal spike train, given the activity of the other neurons in the network. Dependencies between neurons are usually relatively weak; however, temporal dependencies within the spike train of a single neuron are typically strong. In this paper we develop several specialized Metropolis-Hastings samplers which take advantage of this dependency structure. These samplers are based on two ideas: (1) an adaptation of fast forward-backward algorithms from the theory of hidden Markov models to take advantage of the local dependencies inherent in spike trains, and (2) a first-order expansion of the conditional likelihood which allows for efficient exact sampling in the limit of weak coupling between neurons. We also demonstrate that these samplers can effectively incorporate side information, in particular, noisy fluorescence observations in the context of calcium-sensitive imaging experiments. We quantify the efficiency of these samplers in a variety of simulated experiments in which the network parameters are closely matched to data measured in real cortical networks, and also demonstrate the sampler applied to real calcium imaging data.
Submitted 30 November, 2011;
originally announced November 2011.
-
A Bayesian approach for inferring neuronal connectivity from calcium fluorescent imaging data
Authors:
Yuriy Mishchencko,
Joshua T. Vogelstein,
Liam Paninski
Abstract:
Deducing the structure of neural circuits is one of the central problems of modern neuroscience. Recently-introduced calcium fluorescent imaging methods permit experimentalists to observe network activity in large populations of neurons, but these techniques provide only indirect observations of neural spike trains, with limited time resolution and signal quality. In this work we present a Bayesian approach for inferring neural circuitry given this type of imaging data. We model the network activity in terms of a collection of coupled hidden Markov chains, with each chain corresponding to a single neuron in the network and the coupling between the chains reflecting the network's connectivity matrix. We derive a Monte Carlo Expectation-Maximization algorithm for fitting the model parameters; to obtain the sufficient statistics in a computationally-efficient manner, we introduce a specialized blockwise-Gibbs algorithm for sampling from the joint activity of all observed neurons given the observed fluorescence data. We perform large-scale simulations of randomly connected neuronal networks with biophysically realistic parameters and find that the proposed methods can accurately infer the connectivity in these networks given reasonable experimental and computational constraints. In addition, the estimation accuracy may be improved significantly by incorporating prior knowledge about the sparseness of connectivity in the network, via standard L$_1$ penalization methods.
Submitted 21 July, 2011;
originally announced July 2011.
-
Fast non-negative deconvolution for spike train inference from population calcium imaging
Authors:
Joshua T. Vogelstein,
Adam M. Packer,
Tim A. Machado,
Tanya Sippy,
Baktash Babadi,
Rafael Yuste,
Liam Paninski
Abstract:
Calcium imaging for observing spiking activity from large populations of neurons is quickly gaining popularity. While the raw data are fluorescence movies, the underlying spike trains are of interest. This work presents a fast non-negative deconvolution filter to infer the approximately most likely spike train for each neuron, given the fluorescence observations. This algorithm outperforms optimal linear deconvolution (Wiener filtering) on both simulated and biological data. The performance gains come from restricting the inferred spike trains to be positive (using an interior-point method), unlike the Wiener filter. The algorithm is fast enough that even when imaging over 100 neurons, inference can be performed on the set of all observed traces faster than real-time. Performing optimal spatial filtering on the images further refines the estimates. Importantly, all the parameters required to perform the inference can be estimated using only the fluorescence data, obviating the need to perform joint electrophysiological and imaging calibration experiments.
Submitted 9 December, 2009;
originally announced December 2009.
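To make the non-negativity constraint in the abstract above concrete, here is a minimal sketch of non-negative deconvolution under a first-order autoregressive calcium model, C[t] = gamma * C[t-1] + n[t] with n[t] >= 0. The AR(1) model, the least-squares objective, the parameter gamma, and all function names are assumptions made for illustration. The solver is plain projected gradient descent, swapped in for simplicity; it is not the interior-point method the abstract refers to.

```python
def ar1_forward(n, gamma):
    """Map a spike signal n to a calcium trace via C[t] = gamma * C[t-1] + n[t]."""
    trace, prev = [], 0.0
    for x in n:
        prev = gamma * prev + x
        trace.append(prev)
    return trace


def ar1_adjoint(r, gamma):
    """Apply the transpose of the AR(1) forward map to a residual r,
    using the backward recursion g[t] = r[t] + gamma * g[t+1]."""
    g, nxt = [0.0] * len(r), 0.0
    for t in range(len(r) - 1, -1, -1):
        nxt = r[t] + gamma * nxt
        g[t] = nxt
    return g


def nonneg_deconvolve(f, gamma, n_iter=500):
    """Minimize 0.5 * ||forward(n) - f||^2 subject to n >= 0 by projected
    gradient descent. Requires 0 <= gamma < 1, so that (1 - gamma)^2 is a
    safe step size (the forward map has operator norm at most 1 / (1 - gamma))."""
    step = (1.0 - gamma) ** 2
    n = [0.0] * len(f)
    for _ in range(n_iter):
        resid = [c - obs for c, obs in zip(ar1_forward(n, gamma), f)]
        grad = ar1_adjoint(resid, gamma)
        # Gradient step followed by projection onto the non-negative orthant.
        n = [max(0.0, ni - step * gi) for ni, gi in zip(n, grad)]
    return n


# Noise-free toy trace built from three known events.
true_n = [0.0] * 30
true_n[5], true_n[12], true_n[20] = 1.0, 2.0, 1.0
fluorescence = ar1_forward(true_n, 0.5)
estimate = nonneg_deconvolve(fluorescence, 0.5)
```

On this noise-free toy trace the estimate should converge to the three planted events. With noisy data, the projection step is what keeps the inferred spike signal from going negative, which an unconstrained linear filter would allow.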