-
More powerful selective inference for the graph fused lasso
Authors:
Yiqun T. Chen,
Sean W. Jewell,
Daniela M. Witten
Abstract:
The graph fused lasso -- which includes as a special case the one-dimensional fused lasso -- is widely used to reconstruct signals that are piecewise constant on a graph, meaning that nodes connected by an edge tend to have identical values. We consider testing for a difference in the means of two connected components estimated using the graph fused lasso. A naive procedure such as a z-test for a difference in means will not control the selective Type I error, since the hypothesis that we are testing is itself a function of the data. In this work, we propose a new test for this task that controls the selective Type I error, and conditions on less information than existing approaches, leading to substantially higher power. We illustrate our approach in simulation and on datasets of drug overdose death rates and teenage birth rates in the contiguous United States. Our approach yields more discoveries on both datasets.
Submitted 9 February, 2022; v1 submitted 21 September, 2021;
originally announced September 2021.
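The abstract's claim that a naive z-test does not control the selective Type I error can be illustrated in the one-dimensional setting. The following is a minimal Python sketch, not the authors' method: instead of the graph fused lasso it uses a single data-chosen split (the split that maximizes the standardized difference in means, as in one step of binary segmentation). The function names, the sample size, and the known noise level `sigma = 1` are assumptions made for illustration.

```python
import math
import random


def best_split_z(y, sigma=1.0):
    """Return (tau, z) for the split y[:tau] | y[tau:] with the largest |z|.

    z is the usual z-statistic for a difference in means with known sigma.
    Choosing tau from the data is what makes the later test "selective".
    """
    n = len(y)
    total = sum(y)
    left_sum = 0.0
    best_tau, best_z = 1, 0.0
    for tau in range(1, n):
        left_sum += y[tau - 1]
        mean_left = left_sum / tau
        mean_right = (total - left_sum) / (n - tau)
        se = sigma * math.sqrt(1.0 / tau + 1.0 / (n - tau))
        z = (mean_left - mean_right) / se
        if abs(z) > abs(best_z):
            best_tau, best_z = tau, z
    return best_tau, best_z


def naive_p_value(z):
    """Two-sided normal p-value that ignores how the split was chosen."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def naive_rejection_rate(n=50, reps=2000, alpha=0.05, seed=0):
    """Fraction of pure-noise datasets on which the naive test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]  # no true change in mean
        _, z = best_split_z(y)
        if naive_p_value(z) < alpha:
            rejections += 1
    return rejections / reps
```

Under these assumptions, `naive_rejection_rate()` on pure-noise data returns a rate well above the nominal 0.05 level, because the split is chosen precisely to make the difference in means look as large as possible.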
-
Quantifying uncertainty in spikes estimated from calcium imaging data
Authors:
Yiqun T. Chen,
Sean W. Jewell,
Daniela M. Witten
Abstract:
In recent years, a number of methods have been proposed to estimate the times at which a neuron spikes on the basis of calcium imaging data. However, quantifying the uncertainty associated with these estimated spikes remains an open problem. We consider a simple and well-studied model for calcium imaging data, which states that calcium decays exponentially in the absence of a spike, and instantaneously increases when a spike occurs. We wish to test the null hypothesis that the neuron did not spike -- i.e., that there was no increase in calcium -- at a particular timepoint at which a spike was estimated. In this setting, classical hypothesis tests lead to inflated Type I error, because the spike was estimated on the same data used for testing. To overcome this problem, we propose a selective inference approach. We describe an efficient algorithm to compute finite-sample p-values that control selective Type I error, and confidence intervals with correct selective coverage, for spikes estimated using a recent proposal from the literature. We apply our proposal in simulation and on calcium imaging data from the spikefinder challenge.
Submitted 29 July, 2021; v1 submitted 13 March, 2021;
originally announced March 2021.
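The calcium model described in the abstract (exponential decay in the absence of a spike, an instantaneous increase when a spike occurs) can be simulated in a few lines. This is a minimal Python sketch under stated assumptions: the decay rate `gamma`, the fixed `spike_size`, the additive Gaussian observation noise, and the function name `simulate_calcium` are illustrative choices and are not taken from the paper.

```python
import random


def simulate_calcium(n, gamma, spike_times, spike_size=1.0,
                     noise_sd=0.1, c0=1.0, seed=0):
    """Simulate calcium c_t and noisy observations y_t.

    Without a spike, c_t = gamma * c_{t-1} (exponential decay).
    At a spike time, c_t = gamma * c_{t-1} + spike_size (instant jump).
    Observations are assumed to be y_t = c_t + Gaussian noise.
    """
    rng = random.Random(seed)
    spikes = set(spike_times)
    calcium, observed = [], []
    c = c0
    for t in range(n):
        if t > 0:
            c = gamma * c
            if t in spikes:
                c += spike_size
        calcium.append(c)
        observed.append(c + rng.gauss(0.0, noise_sd))
    return calcium, observed


calcium, observed = simulate_calcium(n=100, gamma=0.95, spike_times=[20, 60])
```

In this notation, the null hypothesis of no spike at time `t` corresponds to `calcium[t] == gamma * calcium[t - 1]`, that is, no increase in calcium at that timepoint.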
-
Testing for a Change in Mean After Changepoint Detection
Authors:
Sean Jewell,
Paul Fearnhead,
Daniela Witten
Abstract:
While many methods are available to detect structural changes in a time series, few procedures are available to quantify the uncertainty of these estimates post-detection. In this work, we fill this gap by proposing a new framework to test the null hypothesis that there is no change in mean around an estimated changepoint. We further show that it is possible to efficiently carry out this framework in the case of changepoints estimated by binary segmentation and its variants, $\ell_{0}$ segmentation, or the fused lasso. Our setup allows us to condition on much less information than existing approaches, which yields higher powered tests. We apply our proposals in a simulation study and on a dataset of chromosomal guanine-cytosine content. These approaches are freely available in the R package ChangepointInference at https://jewellsean.github.io/changepoint-inference/.
Submitted 14 April, 2021; v1 submitted 9 October, 2019;
originally announced October 2019.
-
Fast Nonconvex Deconvolution of Calcium Imaging Data
Authors:
Sean Jewell,
Toby Dylan Hocking,
Paul Fearnhead,
Daniela Witten
Abstract:
Calcium imaging data promises to transform the field of neuroscience by making it possible to record from large populations of neurons simultaneously. However, determining the exact moment in time at which a neuron spikes, from a calcium imaging data set, amounts to a non-trivial deconvolution problem which is of critical importance for downstream analyses. While a number of formulations have been proposed for this task in the recent literature, in this paper we focus on a formulation recently proposed in Jewell and Witten (2017) which has shown initial promising results. However, this proposal is slow to run on fluorescence traces of hundreds of thousands of timesteps.
Here we develop a much faster online algorithm for solving the optimization problem of Jewell and Witten (2017) that can be used to deconvolve a fluorescence trace of 100,000 timesteps in less than a second. Furthermore, this algorithm overcomes a technical challenge of Jewell and Witten (2017) by avoiding the occurrence of so-called "negative" spikes. We demonstrate that this algorithm has superior performance relative to existing methods for spike deconvolution on calcium imaging datasets that were recently released as part of the spikefinder challenge (http://spikefinder.codeneuro.org/).
Our C++ implementation, along with R and Python wrappers, is publicly available on GitHub at https://github.com/jewellsean/FastLZeroSpikeInference.
Submitted 20 February, 2018;
originally announced February 2018.
-
Exact Spike Train Inference Via $\ell_0$ Optimization
Authors:
Sean Jewell,
Daniela Witten
Abstract:
In recent years, new technologies in neuroscience have made it possible to measure the activities of large numbers of neurons simultaneously in behaving animals. For each neuron, a fluorescence trace is measured; this can be seen as a first-order approximation of the neuron's activity over time. Determining the exact time at which a neuron spikes on the basis of its fluorescence trace is an important open problem in the field of computational neuroscience.
Recently, a convex optimization problem involving an $\ell_1$ penalty was proposed for this task. In this paper, we slightly modify that recent proposal by replacing the $\ell_1$ penalty with an $\ell_0$ penalty. In stark contrast to the conventional wisdom that $\ell_0$ optimization problems are computationally intractable, we show that the resulting optimization problem can be efficiently solved for the global optimum using an extremely simple and efficient dynamic programming algorithm. Our R-language implementation of the proposed algorithm runs in a few minutes on fluorescence traces of 100,000 timesteps. Furthermore, our proposal leads to substantial improvements over the previous $\ell_1$ proposal, in simulations as well as on two calcium imaging data sets.
R-language software for our proposal is available on CRAN in the package LZeroSpikeInference. Instructions for running this software in Python can be found at https://github.com/jewellsean/LZeroSpikeInference.
Submitted 12 November, 2017; v1 submitted 24 March, 2017;
originally announced March 2017.
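To make the dynamic programming idea in the abstract above concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the objective takes the form $\frac{1}{2}\sum_t (y_t - c_t)^2 + \lambda \sum_t 1(c_t \neq \gamma c_{t-1})$, where $\gamma$ is an exponential decay rate; this form, the function name `l0_spike_dp`, and the simple quadratic-time recursion are assumptions made for illustration. The sketch does not constrain spikes to be positive.

```python
def l0_spike_dp(y, gamma, lam):
    """Exact O(n^2) dynamic program for the assumed l0 spike objective.

    Between spikes the fitted calcium decays as c_t = gamma * c_{t-1}.
    F[s] is the best objective for y[0:s]; each segment costs its
    residual sum of squares (times 1/2) plus the penalty lam.
    Returns (sorted 0-indexed spike times, optimal objective value).
    """
    n = len(y)
    F = [0.0] * (n + 1)
    F[0] = -lam  # the first segment should not be charged for a spike
    last = [0] * (n + 1)
    for s in range(1, n + 1):
        best, arg = float("inf"), 0
        s1 = s2 = syy = 0.0
        # Grow the final segment leftwards: it covers y[a-1 : s].
        for a in range(s, 0, -1):
            v = y[a - 1]
            s1 = v + gamma * s1             # sum of y_t * gamma^(t - start)
            s2 = 1.0 + gamma * gamma * s2   # sum of gamma^(2 * (t - start))
            syy += v * v
            cost = 0.5 * (syy - s1 * s1 / s2)  # best decaying fit on segment
            cand = F[a - 1] + cost + lam
            if cand < best:
                best, arg = cand, a - 1
        F[s], last[s] = best, arg
    # Backtrack: each segment start after time 0 is an estimated spike.
    spikes = []
    s = n
    while s > 0:
        tau = last[s]
        if tau > 0:
            spikes.append(tau)
        s = tau
    return sorted(spikes), F[n]
```

For example, on a noise-free trace that decays with `gamma = 0.9` and jumps by 1 at times 10 and 25, calling `l0_spike_dp(y, 0.9, 0.1)` recovers those two times as the estimated spikes. The inner loop updates the segment sums in constant time per step, which keeps the sketch quadratic rather than cubic in the trace length.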