Search | arXiv e-print repository

Sketching the Heat Kernel: Using Gaussian Processes to Embed Data

Abstract: This paper introduces a novel, non-deterministic method for embedding data in low-dimensional Euclidean space based on computing realizations of a Gaussian process depending on the geometry of the data. This type of embedding first appeared in (Adler et al, 2018) as a theoretical model for a generic manifold in high dimensions. In particular, we take the covariance function of the Gaussian process to be the heat kernel, and computing the embedding amounts to sketching a matrix representing the heat kernel. The Karhunen-Loève expansion reveals that the straight-line distances in the embedding approximate the diffusion distance in a probabilistic sense, avoiding the need for sharp cutoffs and maintaining some of the smaller-scale structure. Our method demonstrates further advantage in its robustness to outliers. We justify the approach with both theory and experiments.

Submitted 1 March, 2024; originally announced March 2024.

Comments: 28 pages

arXiv:2309.13478 [pdf, other]

CA-PCA: Manifold Dimension Estimation, Adapted for Curvature

Authors: Anna C. Gilbert, Kevin O'Neill

Abstract: The success of algorithms in the analysis of high-dimensional data is often attributed to the manifold hypothesis, which supposes that these data lie on or near a manifold of much lower dimension. It is often useful to determine or estimate the dimension of this manifold, for instance before performing dimension reduction. Existing methods for dimension estimation are calibrated using a flat unit ball. In this paper, we develop CA-PCA, a version of local PCA based instead on a calibration of a quadratic embedding, acknowledging the curvature of the underlying manifold. Numerous careful experiments show that this adaptation improves the estimator in a wide range of settings.

Submitted 23 September, 2023; originally announced September 2023.

Comments: 26 pages

MSC Class: 62H25; 62R30

arXiv:2208.06676 [pdf, other]

May the force be with you

Authors: Yulan Zhang, Anna C. Gilbert, Stefan Steinerberger

Abstract: Modern methods in dimensionality reduction are dominated by nonlinear attraction-repulsion force-based methods (this includes t-SNE, UMAP, ForceAtlas2, LargeVis, and many more). The purpose of this paper is to demonstrate that all such methods, by design, come with an additional feature that is being automatically computed along the way, namely the vector field associated with these forces. We show how this vector field gives additional high-quality information and propose a general refinement strategy based on ideas from Morse theory. The efficiency of these ideas is illustrated specifically using t-SNE on synthetic and real-life data sets.

Submitted 13 August, 2022; originally announced August 2022.

Comments: 23 pages, 17 figures

arXiv:2110.11430 [pdf, other]

How can classical multidimensional scaling go wrong?

Authors: Rishi Sonthalia, Gregory Van Buskirk, Benjamin Raichel, Anna C. Gilbert

Abstract: Given a matrix $D$ describing the pairwise dissimilarities of a data set, a common task is to embed the data points into Euclidean space. The classical multidimensional scaling (cMDS) algorithm is a widespread method to do this. However, a theoretical analysis of the robustness of the algorithm and an in-depth analysis of its performance on non-Euclidean metrics are lacking. In this paper, we derive a formula, based on the eigenvalues of a matrix obtained from $D$, for the Frobenius norm of the difference between $D$ and the metric $D_{\text{cmds}}$ returned by cMDS. This error analysis leads us to the conclusion that when the derived matrix has a significant number of negative eigenvalues, then $\|D-D_{\text{cmds}}\|_F$, after initially decreasing, will eventually increase as we increase the dimension. Hence, counterintuitively, the quality of the embedding degrades as we increase the dimension. We empirically verify that the Frobenius norm increases as we increase the dimension for a variety of non-Euclidean metrics. We also show on several benchmark datasets that this degradation in the embedding results in the classification accuracy of both simple (e.g., 1-nearest neighbor) and complex (e.g., multi-layer neural nets) classifiers decreasing as we increase the embedding dimension. Finally, our analysis leads us to a new efficiently computable algorithm that returns a matrix $D_l$ that is at least as close to the original distances as $D_t$ (the Euclidean metric closest in $\ell_2$ distance).
While $D_l$ is not metric, when given as input to cMDS instead of $D$, it empirically results in solutions whose distance to $D$ does not increase when we increase the dimension, and the classification accuracy degrades less than for the cMDS solution.

Submitted 28 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

Comments: Accepted to NeurIPS 2021

arXiv:2012.03126 [pdf, other]

Dual Regularized Optimal Transport

Authors: Rishi Sonthalia, Anna C. Gilbert

Abstract: In this paper, we present a new formulation of unbalanced optimal transport called Dual Regularized Optimal Transport (DROT). We argue that regularizing the dual formulation of optimal transport results in a version of unbalanced optimal transport that leads to sparse solutions and that gives us control over mass creation and destruction. We build intuition behind such control and present theoretical properties of the solutions to DROT. We demonstrate that due to recent advances in optimization techniques, we can feasibly solve such a formulation at large scales and present extensive experimental evidence for this formulation and its solution.

Submitted 5 December, 2020; originally announced December 2020.

arXiv:2007.01346 [pdf, other]

Spectral Methods for Ranking with Scarce Data

Authors: Umang Varma, Lalit Jain, Anna C. Gilbert

Abstract: Given a number of pairwise preferences of items, a common task is to rank all the items. Examples include pairwise movie ratings, New Yorker cartoon caption contests, and many other consumer preference tasks. What these settings have in common is two-fold: a scarcity of data (it may be costly to get comparisons for all the pairs of items) and additional feature information about the items (e.g., movie genre, director, and cast). In this paper we modify RankCentrality, a popular and well-studied method for rank aggregation, to account for few comparisons and to incorporate additional feature information. This method returns meaningful rankings even under scarce comparisons. Using diffusion-based methods, we incorporate feature information in a way that outperforms state-of-the-art methods in practice. We also provide improved sample complexity for RankCentrality in a variety of sampling schemes.

Submitted 2 July, 2020; originally announced July 2020.

Comments: To appear in Proceedings of Uncertainty in Artificial Intelligence (UAI) 2020

MSC Class: 68T05

arXiv:2005.03853 [pdf, other]

Project and Forget: Solving Large-Scale Metric Constrained Problems

Authors: Rishi Sonthalia, Anna C. Gilbert

Abstract: Given a set of dissimilarity measurements amongst data points, determining what metric representation is most "consistent" with the input measurements or the metric that best captures the relevant geometric features of the data is a key step in many machine learning algorithms. Existing methods are restricted to specific kinds of metrics or small problem sizes because of the large number of metric constraints in such problems. In this paper, we provide an active set algorithm, Project and Forget, that uses Bregman projections to solve metric constrained problems with many (possibly exponentially many) inequality constraints. We provide a theoretical analysis of Project and Forget and prove that our algorithm converges to the global optimal solution and that the $L_2$ distance of the current iterate to the optimal solution decays asymptotically at an exponential rate. We demonstrate that using our method we can solve large problem instances of three types of metric constrained problems: general weight correlation clustering, metric nearness, and metric learning; in each case, outperforming the state-of-the-art methods with respect to CPU times and problem sizes.

Submitted 26 September, 2022; v1 submitted 8 May, 2020; originally announced May 2020.

arXiv:2005.03847 [pdf, other]

Tree! I am no Tree! I am a Low Dimensional Hyperbolic Embedding

Authors: Rishi Sonthalia, Anna C. Gilbert

Abstract: Given data, finding a faithful low-dimensional hyperbolic embedding of the data is a key method by which we can extract hierarchical information or learn representative geometric features of the data. In this paper, we explore a new method for learning hyperbolic representations by taking a metric-first approach. Rather than determining the low-dimensional hyperbolic embedding directly, we learn a tree structure on the data. This tree structure can then be used directly to extract hierarchical information, embedded into a hyperbolic manifold using Sarkar's construction, or used as a tree approximation of the original metric. To this end, we present a novel fast algorithm TreeRep such that, given a $δ$-hyperbolic metric (for any $δ\geq 0$), the algorithm learns a tree structure that approximates the original metric. In the case when $δ= 0$, we show analytically that TreeRep exactly recovers the original tree structure. We show empirically that TreeRep is not only many orders of magnitude faster than previously known algorithms, but also produces metrics with lower average distortion and higher mean average precision than most previous algorithms for learning hyperbolic embeddings, extracting hierarchical information, and approximating metrics via tree metrics.

Submitted 22 October, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

Comments: Code available at https://github.com/rsonthal/TreeRep

arXiv:1908.08411 [pdf, other]

Generalized Metric Repair on Graphs

Authors: Chenglin Fan, Anna C. Gilbert, Benjamin Raichel, Rishi Sonthalia, Gregory Van Buskirk

Abstract: Many modern data analysis algorithms either assume, or are considerably more efficient if, the distances between the data points satisfy a metric. These algorithms include metric learning, clustering, and dimension reduction. As real data sets are noisy, distances often fail to satisfy a metric. For this reason, Gilbert and Jain and Fan et al. introduced the closely related sparse metric repair and metric violation distance problems. The goal of these problems is to repair as few distances as possible to ensure they satisfy a metric. Three variants were considered, one admitting a polynomial time algorithm. The other variants were shown to be APX-hard, and an $O(OPT^{1/3})$-approximation was given, where $OPT$ is the optimal solution size. In this paper, we generalize these problems to no longer consider all distances between the data points. That is, we consider a weighted graph $G$ with corrupted weights $w$, and our goal is to find the smallest number of weight modifications so that the resulting weighted graph distances satisfy a metric. This is a natural generalization and is more flexible as it takes into account different relationships among the data points. As in previous work, we distinguish among the types of repairs permitted and focus on the increase-only and general versions. We demonstrate the inherent combinatorial structure of the problem, and give an approximation-preserving reduction from MULTICUT. Conversely, we show that for any fixed constant $ς$, for the large class of $ς$-chordal graphs, the problems are fixed parameter tractable.
Call a cycle broken if it contains an edge whose weight is larger than the sum of all its other edges, and call the amount of this difference its deficit. We present approximation algorithms, one which depends on the maximum number of edges in a broken cycle, and one which depends on the number of distinct deficit values.

Submitted 21 August, 2019; originally announced August 2019.

Comments: arXiv admin note: text overlap with arXiv:1807.08078

arXiv:1903.10875 [pdf, other]

Nonlinear Iterative Hard Thresholding for Inverse Scattering

Authors: Anna C. Gilbert, Howard W. Levinson, John C. Schotland

Abstract: We consider the inverse scattering problem for sparse scatterers. An image reconstruction algorithm is proposed that is based on a nonlinear generalization of iterative hard thresholding. The convergence and error of the method are analyzed by means of coherence estimates and compared to numerical simulations.

Submitted 22 March, 2019; originally announced March 2019.

Comments: 30 pages, 10 figures

arXiv:1807.07619 [pdf, other]

Generalized Metric Repair on Graphs

Authors: Anna C. Gilbert, Rishi Sonthalia

Abstract: Many modern data analysis algorithms either assume that, or are considerably more efficient if, the distances between the data points satisfy a metric. These algorithms include metric learning, clustering, and dimensionality reduction. Because real data sets are noisy, the similarity measures often fail to satisfy a metric. For this reason, Gilbert and Jain [11] and Fan, et al. [8] introduce the closely related problems of $\textit{sparse metric repair}$ and $\textit{metric violation distance}$. The goal of each problem is to repair as few distances as possible to ensure that the distances between the data points satisfy a metric. We generalize these problems so as to no longer require all the distances between the data points. That is, we consider a weighted graph $G$ with corrupted weights $w$ and our goal is to find the smallest number of modifications to the weights so that the resulting weighted graph distances satisfy a metric. This problem is a natural generalization of the sparse metric repair problem and is more flexible as it takes into account different relationships amongst the input data points. As in previous work, we distinguish amongst the types of repairs permitted (decrease, increase, and general repairs). We focus on the increase and general versions, establish hardness results, and show the inherent combinatorial structure of the problem. We then show that if we restrict to the case when $G$ is a chordal graph, then the problem is fixed parameter tractable. We also present several classes of approximation algorithms.
These include and improve upon previous metric repair algorithms for the special case when $G = K_n$.

Submitted 19 July, 2018; originally announced July 2018.

arXiv:1807.07610 [pdf, other]

doi 10.1109/ALLERTON.2018.8635955

Unsupervised Metric Learning in Presence of Missing Data

Authors: Anna C. Gilbert, Rishi Sonthalia

Abstract: For many machine learning tasks, the input data lie on a low-dimensional manifold embedded in a high dimensional space and, because of this high-dimensional structure, most algorithms are inefficient. The typical solution is to reduce the dimension of the input data using standard dimension reduction algorithms such as ISOMAP, LAPLACIAN EIGENMAPS or LLES. This approach, however, does not always work in practice as these algorithms require that we have somewhat ideal data. Unfortunately, most data sets either have missing entries or unacceptably noisy values. That is, real data are far from ideal and we cannot use these algorithms directly. In this paper, we focus on the case when we have missing data. Some techniques, such as matrix completion, can be used to fill in missing data but these methods do not capture the non-linear structure of the manifold. Here, we present a new algorithm MR-MISSING that extends these previous algorithms and can be used to compute low dimensional representation on data sets with missing entries. We demonstrate the effectiveness of our algorithm by running three different experiments. We visually verify the effectiveness of our algorithm on synthetic manifolds, we numerically compare our projections against those computed by first filling in data using nlPCA and mDRUR on the MNIST data set, and we also show that we can do classification on MNIST with missing data. We also provide a theoretical guarantee for MR-MISSING under some simplifying assumptions.

Submitted 3 March, 2019; v1 submitted 19 July, 2018; originally announced July 2018.

Journal ref: 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)

arXiv:1710.10655 [pdf, other]

If it ain't broke, don't fix it: Sparse metric repair

Authors: Anna C. Gilbert, Lalit Jain

Abstract: Many modern data-intensive computational problems either require, or benefit from, distance or similarity data that adhere to a metric. The algorithms run faster or have better performance guarantees. Unfortunately, in real applications, the data are messy and values are noisy. The distances between the data points are far from satisfying a metric. Indeed, there are a number of different algorithms for finding the closest set of distances to the given ones that also satisfy a metric (sometimes with the extra condition of being Euclidean). These algorithms can have unintended consequences: they can change a large number of the original data points and alter many other features of the data. The goal of sparse metric repair is to make as few changes as possible to the original data set or underlying distances so as to ensure the resulting distances satisfy the properties of a metric. In other words, we seek to minimize the sparsity (or the $\ell_0$ "norm") of the changes we make to the distances subject to the new distances satisfying a metric. We give three different combinatorial algorithms to repair a metric sparsely. In one setting the algorithm is guaranteed to return the sparsest solution and in the other settings, the algorithms repair the metric. Without prior information, the algorithms run in time proportional to the cube of the number of input data points and, with prior information, we can reduce the running time considerably.

Submitted 29 October, 2017; originally announced October 2017.

arXiv:1708.00128 [pdf, ps, other]

doi 10.1364/OL.43.003005

Imaging from the Inside Out: Inverse Scattering with Photoactivated Internal Sources

Authors: Anna C. Gilbert, Howard W. Levinson, John C. Schotland

Abstract: We propose a method to reconstruct the optical properties of a scattering medium with subwavelength resolution. The method is based on the solution to the inverse scattering problem with photoactivated internal sources. Numerical simulations of three-dimensional structures demonstrate that a resolution of approximately $λ/25$ is achievable.

Submitted 31 July, 2017; originally announced August 2017.

arXiv:1706.05916 [pdf, other]

Local Differential Privacy for Physical Sensor Data and Sparse Recovery

Authors: Anna C. Gilbert, Audra McMillan

Abstract: In this work we explore the utility of locally differentially private thermal sensor data. We design a locally differentially private recovery algorithm for the 1-dimensional, discrete heat source location problem and analyse its performance in terms of the Earth Mover Distance error. Our work indicates that it is possible to produce locally private sensor measurements that both keep the exact locations of the heat sources private and permit recovery of the "general geographic vicinity" of the sources. We also discuss the relationship between the property of an inverse problem being ill-conditioned and the amount of noise needed to maintain privacy.

Submitted 23 March, 2018; v1 submitted 30 May, 2017; originally announced June 2017.

Comments: appeared at CISS 2018

arXiv:1705.08664 [pdf, other]

Towards Understanding the Invertibility of Convolutional Neural Networks

Authors: Anna C. Gilbert, Yi Zhang, Kibok Lee, Yuting Zhang, Honglak Lee

Abstract: Several recent works have empirically observed that Convolutional Neural Nets (CNNs) are (approximately) invertible. To understand this approximate invertibility phenomenon and how to leverage it more effectively, we focus on a theoretical explanation and develop a mathematical model of sparse signal recovery that is consistent with CNNs with random weights. We give an exact connection to a particular model of model-based compressive sensing (and its recovery algorithms) and random-weight CNNs. We show empirically that several learned networks are consistent with our mathematical analysis and then demonstrate that with such a simple theoretical framework, we can obtain reasonable reconstruction results on real images. We also discuss gaps between our model assumptions and the CNN trained for classification in practical scenarios.

Submitted 24 May, 2017; originally announced May 2017.

Journal ref: IJCAI 2017

arXiv:1609.03041 [pdf, other]

doi 10.1088/1361-6420/aa66d1

Optical tomography on graphs

Authors: Francis J. Chung, Anna C. Gilbert, Jeremy G. Hoskins, John C. Schotland

Abstract: We present an algorithm for solving inverse problems on graphs analogous to those arising in diffuse optical tomography for continuous media. In particular, we formulate and analyze a discrete version of the inverse Born series, proving estimates characterizing the domain of convergence, approximation errors, and stability of our approach. We also present a modification which allows additional information on the structure of the potential to be incorporated, facilitating recovery for a broader class of problems.

Submitted 10 September, 2016; originally announced September 2016.

arXiv:1404.5190 [pdf, other]

Sparse Approximation, List Decoding, and Uncertainty Principles

Authors: Mahmoud Abo Khamis, Anna C. Gilbert, Hung Q. Ngo, Atri Rudra

Abstract: We consider list versions of sparse approximation problems, where unlike the existing results in sparse approximation that consider situations with unique solutions, we are interested in multiple solutions. We introduce these problems and present the first combinatorial results on the output list size. These generalize and enhance some of the existing results on threshold phenomena and uncertainty principles in sparse approximations. Our definitions and results are inspired by similar results in list decoding. We also present lower bound examples that bolster our results and show they are of the appropriate size.

Submitted 8 August, 2014; v1 submitted 18 April, 2014; originally announced April 2014.

arXiv:1402.1726 [pdf, ps, other]

For-all Sparse Recovery in Near-Optimal Time

Authors: Anna C. Gilbert, Yi Li, Ely Porat, Martin J. Strauss

Abstract: An approximate sparse recovery system in $\ell_1$ norm consists of parameters $k$, $ε$, $N$, an $m$-by-$N$ measurement $Φ$, and a recovery algorithm, $\mathcal{R}$. Given a vector, $\mathbf{x}$, the system approximates $\mathbf{x}$ by $\widehat{\mathbf{x}} = \mathcal{R}(Φ\mathbf{x})$, which must satisfy $\|\widehat{\mathbf{x}}-\mathbf{x}\|_1 \leq (1+ε)\|\mathbf{x}-\mathbf{x}_k\|_1$. We consider the 'for all' model, in which a single matrix $Φ$, possibly 'constructed' non-explicitly using the probabilistic method, is used for all signals $\mathbf{x}$. The best existing sublinear algorithm by Porat and Strauss (SODA'12) uses $O(ε^{-3} k\log(N/k))$ measurements and runs in time $O(k^{1-α}N^α)$ for any constant $α> 0$. In this paper, we improve the number of measurements to $O(ε^{-2} k \log(N/k))$, matching the best existing upper bound (attained by super-linear algorithms), and the runtime to $O(k^{1+β}\textrm{poly}(\log N,1/ε))$, with a modest restriction that $ε\leq (\log k/\log N)^γ$, for any constants $β,γ> 0$. When $k\leq \log^c N$ for some $c>0$, the runtime is reduced to $O(k\textrm{poly}(N,1/ε))$. With no restrictions on $ε$, we have an approximation recovery system with $m = O(k/ε\log(N/k)((\log N/\log k)^γ+ 1/ε))$ measurements.

Submitted 7 March, 2017; v1 submitted 7 February, 2014; originally announced February 2014.

ACM Class: F.2.2; E.4

Journal ref: ACM Transactions on Algorithms, Vol. 13, No. 3, pp 32:1--32:26, 2017

arXiv:1401.4428 [pdf, other]

Diffuse Scattering on Graphs

Authors: Anna C. Gilbert, Jeremy G. Hoskins, John C. Schotland

Abstract: We formulate and analyze difference equations on graphs analogous to time-independent diffusion equations arising in the study of diffuse scattering in continuous media. Moreover, we show how to construct solutions in the presence of weak scatterers from the solution to the homogeneous (background) problem using Born series, providing necessary conditions for convergence and demonstrating the process through numerous examples. In addition, we outline a method for finding Green's functions for Cayley graphs for both abelian and non-abelian groups. Finally, we conclude with a discussion of the effects of sparsity on our method and results, outlining the simplifications that can be made provided that the scatterers are weak and well-separated.

Submitted 1 November, 2016; v1 submitted 17 January, 2014; originally announced January 2014.

arXiv:1307.7810 [pdf, ps, other]

Accurate Decoding of Pooled Sequenced Data Using Compressed Sensing

Authors: Denisa Duma, Mary Wootters, Anna C. Gilbert, Hung Q. Ngo, Atri Rudra, Matthew Alpert, Timothy J. Close, Gianfranco Ciardo, Stefano Lonardi

Abstract: In order to overcome the limitations imposed by DNA barcoding when multiplexing a large number of samples in the current generation of high-throughput sequencing instruments, we have recently proposed a new protocol that leverages advances in combinatorial pooling design (group testing) doi:10.1371/journal.pcbi.1003010. We have also demonstrated how this new protocol would enable de novo selective sequencing and assembly of large, highly-repetitive genomes. Here we address the problem of decoding pooled sequenced data obtained from such a protocol. Our algorithm employs a synergistic combination of ideas from compressed sensing and the decoding of error-correcting codes. Experimental results on synthetic data for the rice genome and real data for the barley genome show that our novel decoding algorithm enables significantly higher quality assemblies than the previous approach.

Submitted 30 July, 2013; originally announced July 2013.

Comments: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)

arXiv:1307.1960 [pdf, other]

doi 10.1109/TSP.2014.2302736

Modal Analysis with Compressive Measurements

Authors: Jae Young Park, Michael B. Wakin, Anna C. Gilbert

Abstract: Structural Health Monitoring (SHM) systems are critical for monitoring aging infrastructure (such as buildings or bridges) in a cost-effective manner. Such systems typically involve collections of battery-operated wireless sensors that sample vibration data over time. After the data is transmitted to a central node, modal analysis can be used to detect damage in the structure. In this paper, we propose and study three frameworks for Compressive Sensing (CS) in SHM systems; these methods are intended to minimize power consumption by allowing the data to be sampled and/or transmitted more efficiently. At the central node, all of these frameworks involve a very simple technique for estimating the structure's mode shapes without requiring a traditional CS reconstruction of the vibration signals; all that is needed is to compute a simple Singular Value Decomposition. We provide theoretical justification (including measurement bounds) for each of these techniques based on the equations of motion describing a simplified Multiple-Degree-Of-Freedom (MDOF) system, and we support our proposed techniques using simulations based on synthetic and real data.

Submitted 8 July, 2013; originally announced July 2013.

arXiv:1304.6232 [pdf, other]

L2/L2-foreach sparse recovery with low risk

Authors: Anna C. Gilbert, Hung Q. Ngo, Ely Porat, Atri Rudra, Martin J. Strauss

Abstract: In this paper, we consider the "foreach" sparse recovery problem with failure probability $p$. The goal is to design a distribution over $m \times N$ matrices $Φ$ and a decoding algorithm $\mathcal{A}$ such that for every $\mathbf{x}\in\mathbb{R}^N$, we have the following error guarantee with probability at least $1-p$: \[\|\mathbf{x}-\mathcal{A}(Φ\mathbf{x})\|_2\le C\|\mathbf{x}-\mathbf{x}_k\|_2,\] where $C$ is a constant (ideally arbitrarily close to 1) and $\mathbf{x}_k$ is the best $k$-sparse approximation of $\mathbf{x}$. Much of the sparse recovery or compressive sensing literature has focused on the case of either $p = 0$ or $p = Ω(1)$. We initiate the study of this problem for the entire range of failure probability. Our two main results are as follows: (1) We prove a lower bound on $m$, the number of measurements, of $Ω(k\log(N/k)+\log(1/p))$ for $2^{-Θ(N)}\le p <1$. Cohen, Dahmen, and DeVore prove that this bound is tight. (2) We prove nearly matching upper bounds for sub-linear time decoding. Previous such results addressed only $p = Ω(1)$. Our results and techniques lead to the following corollaries: (i) the first ever sub-linear time decoding $\ell_2/\ell_2$ "forall" sparse recovery system that requires a $\log^γ{N}$ extra factor (for some $γ<1$) over the optimal $O(k\log(N/k))$ number of measurements, and (ii) extensions of the results of Gilbert et al. for information-theoretically bounded adversaries.

Submitted 23 April, 2013; originally announced April 2013.

Comments: 1 figure, extended abstract to appear in ICALP 2013

arXiv:1302.0441 [pdf, ps, other]

doi 10.1088/0266-5611/29/4/045003

A generalization of variable elimination for separable inverse problems beyond least squares

Authors: Paul Shearer, Anna C. Gilbert

Abstract: In linear inverse problems, we have data derived from a noisy linear transformation of some unknown parameters, and we wish to estimate these unknowns from the data. Separable inverse problems are a powerful generalization in which the transformation itself depends on additional unknown parameters and we wish to determine both sets of parameters simultaneously. When separable problems are solved by optimization, convergence can often be accelerated by elimination of the linear variables, a strategy which appears most prominently in the variable projection methods due to Golub, Pereyra, and Kaufman. Existing variable elimination methods require an explicit formula for the optimal value of the linear variables, so they cannot be used in problems with Poisson likelihoods, bound constraints, or other important departures from least squares. To address this limitation, we propose a generalization of variable elimination in which standard optimization methods are modified to behave as though a variable has been eliminated. We verify that this approach is a proper generalization by using it to re-derive several existing variable elimination techniques. We then extend the approach to bound-constrained and Poissonian problems, showing in the process that many of the best features of variable elimination methods can be duplicated in our framework. Tests on difficult exponential sum fitting and blind deconvolution problems indicate that the proposed approach can have significant speed and robustness advantages over standard methods.

Submitted 30 April, 2013; v1 submitted 2 February, 2013; originally announced February 2013.

Comments: 27 pages, submitted

arXiv:1302.0439 [pdf, ps, other]

Correcting Camera Shake by Incremental Sparse Approximation

Authors: Paul Shearer, Anna C. Gilbert, Alfred O. Hero III

Abstract: The problem of deblurring an image when the blur kernel is unknown remains challenging after decades of work. Recently there has been rapid progress on correcting irregular blur patterns caused by camera shake, but there is still much room for improvement. We propose a new blind deconvolution method using incremental sparse edge approximation to recover images blurred by camera shake. We estimate the blur kernel first from only the strongest edges in the image, then gradually refine this estimate by allowing for weaker and weaker edges. Our method competes with the benchmark deblurring performance of the state-of-the-art while being significantly faster and easier to generalize.

Submitted 7 February, 2013; v1 submitted 2 February, 2013; originally announced February 2013.

Comments: 5 pages, 3 figures. Conference submission

arXiv:1211.0361 [pdf, ps, other]

Sketched SVD: Recovering Spectral Features from Compressive Measurements

Authors: Anna C. Gilbert, Jae Young Park, Michael B. Wakin

Abstract: We consider a streaming data model in which $n$ sensors observe individual streams of data, presented in a turnstile model. Our goal is to analyze the singular value decomposition (SVD) of the matrix of data defined implicitly by the stream of updates. Each column $i$ of the data matrix is given by the stream of updates seen at sensor $i$. Our approach is to sketch each column of the matrix, forming a "sketch matrix" $Y$, and then to compute the SVD of the sketch matrix. We show that the singular values and right singular vectors of $Y$ are close to those of $X$, with small relative error. We also believe that this bound is of independent interest in non-streaming and non-distributed data collection settings. Assuming that the data matrix $X$ is of size $N \times n$, then with $m$ linear measurements of each column of $X$, we obtain a smaller matrix $Y$ with dimensions $m \times n$. If $m = O(k ε^{-2} (\log(1/ε) + \log(1/δ)))$, where $k$ denotes the rank of $X$, then with probability at least $1-δ$, the singular values $σ'_j$ of $Y$ satisfy the following relative error result $(1-ε)^{1/2} \le σ'_j/σ_j \le (1+ε)^{1/2}$ as compared to the singular values $σ_j$ of the original matrix $X$. Furthermore, the right singular vectors $v'_j$ of $Y$ satisfy $\|v_j-v'_j\|_2 \le \min\left(\sqrt{2},\ \frac{ε\sqrt{1+ε}}{\sqrt{1-ε}} \max_{i\neq j} \frac{\sqrt{2}σ_iσ_j}{\min_{c\in[-1,1]} |σ_i^2-σ_j^2(1+cε)|}\right)$ as compared to the right singular vectors $v_j$ of $X$.
We apply this result to obtain a streaming graph algorithm to approximate the eigenvalues and eigenvectors of the graph Laplacian in the case where the graph has low rank (many connected components).

Submitted 1 November, 2012; originally announced November 2012.
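
Illustrative sketch (not part of the arXiv listing): the Sketched SVD abstract above describes taking $m$ linear measurements of each column of an $N \times n$ matrix $X$ and comparing the singular values of the resulting $m \times n$ sketch $Y$ with those of $X$. The snippet below is a minimal numerical illustration of that idea only. The Gaussian measurement matrix, the function name `sketched_singular_values`, and all sizes are assumptions made for the example; the abstract does not specify the paper's construction or constants.

```python
import numpy as np


def sketched_singular_values(X, m, seed=0):
    """Compare singular values of X with those of a sketch Y = Phi @ X.

    Phi is an m-by-N Gaussian matrix scaled by 1/sqrt(m). This choice is an
    assumption for illustration; the abstract only says "m linear
    measurements of each column".
    """
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    Phi = rng.standard_normal((m, N)) / np.sqrt(m)
    Y = Phi @ X  # each column of Y is a sketch of the matching column of X
    s_true = np.linalg.svd(X, compute_uv=False)
    s_sketch = np.linalg.svd(Y, compute_uv=False)
    return s_true, s_sketch


# Hypothetical sizes: N = 2000 samples per sensor, n = 30 sensors, rank k = 3.
rng = np.random.default_rng(1)
N, n, k, m = 2000, 30, 3, 400
X = rng.standard_normal((N, k)) @ rng.standard_normal((k, n))

s_true, s_sketch = sketched_singular_values(X, m)
ratios = s_sketch[:k] / s_true[:k]  # sigma'_j / sigma_j for the k nonzero values
print(ratios)
```

With these sizes the printed ratios should sit near 1, which is the qualitative behaviour the relative-error bound in the abstract describes; the exact values depend on the random seed.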

arXiv:1110.3052 [pdf, ps, other]

doi 10.1088/2041-8205/749/1/L8

The First Stray Light Corrected EUV Images of Solar Coronal Holes

Authors: Paul Shearer, Richard A. Frazin, Alfred O. Hero III, Anna C. Gilbert

Abstract: Coronal holes are the source regions of the fast solar wind, which fills most of the solar system volume near the cycle minimum. Removing stray light from extreme ultraviolet (EUV) images of the Sun's corona is of high astrophysical importance, as it is required to make meaningful determinations of temperatures and densities of coronal holes. EUV images tend to be dominated by the component of the stray light due to the long-range scatter caused by microroughness of telescope mirror surfaces, and this component has proven very difficult to measure in pre-flight characterization. In-flight characterization heretofore has proven elusive due to the fact that the detected image is simultaneously nonlinear in two unknown functions: the stray light pattern and the true image which would be seen by an ideal telescope. Using a constrained blind deconvolution technique that takes advantage of known zeros in the true image provided by a fortuitous lunar transit, we have removed the stray light from solar images seen by the EUVI instrument on STEREO-B in all four filter bands (171, 195, 284, and 304 Å). Uncertainty measures of the stray light corrected images, which include the systematic error due to misestimation of the scatter, are provided. It is shown that in EUVI, stray light contributes up to 70% of the emission in coronal holes seen on the solar disk, which has dramatic consequences for diagnostics of temperature and density and therefore estimates of key plasma parameters such as the plasma $β$ and ion-electron collision rates.

Submitted 7 March, 2012; v1 submitted 13 October, 2011; originally announced October 2011.

Comments: Accepted to Astrophysical Journal Letters

arXiv:0912.0229 [pdf, other]

Approximate Sparse Recovery: Optimizing Time and Measurements

Authors: Anna C. Gilbert, Yi Li, Ely Porat, Martin J. Strauss

Abstract: An approximate sparse recovery system consists of parameters $k,N$, an $m$-by-$N$ measurement matrix, $Φ$, and a decoding algorithm, $\mathcal{D}$. Given a vector, $x$, the system approximates $x$ by $\widehat x =\mathcal{D}(Φx)$, which must satisfy $\| \widehat x - x\|_2\le C \|x - x_k\|_2$, where $x_k$ denotes the optimal $k$-term approximation to $x$. For each vector $x$, the system must succeed with probability at least 3/4. Among the goals in designing such systems are minimizing the number $m$ of measurements and the runtime of the decoding algorithm, $\mathcal{D}$. In this paper, we give a system with $m=O(k \log(N/k))$ measurements--matching a lower bound, up to a constant factor--and decoding time $O(k\log^c N)$, matching a lower bound up to $\log(N)$ factors. We also consider the encode time (i.e., the time to multiply $Φ$ by $x$), the time to update measurements (i.e., the time to multiply $Φ$ by a 1-sparse $x$), and the robustness and stability of the algorithm (adding noise before and after the measurements). Our encode and update times are optimal up to $\log(N)$ factors.

Submitted 1 December, 2009; originally announced December 2009.

Journal ref: SIAM J. Comput. 41(2), pp. 436-453, 2012

arXiv:0804.4666 [pdf, ps, other]

Combining geometry and combinatorics: A unified approach to sparse signal recovery

Authors: R. Berinde, A. C. Gilbert, P. Indyk, H. Karloff, M. J. Strauss

Abstract: There are two main algorithmic approaches to sparse signal recovery: geometric and combinatorial. The geometric approach starts with a geometric constraint on the measurement matrix and then uses linear programming to decode information about the signal from its measurements. The combinatorial approach constructs the measurement matrix and a combinatorial decoding algorithm to match. We present a unified approach to these two classes of sparse signal recovery algorithms. The unifying elements are the adjacency matrices of high-quality unbalanced expanders. We generalize the notion of Restricted Isometry Property (RIP), crucial to compressed sensing results for signal recovery, from the Euclidean norm to the l_p norm for p about 1, and then show that unbalanced expanders are essentially equivalent to RIP-p matrices. From known deterministic constructions for such matrices, we obtain new deterministic measurement matrix constructions and algorithms for signal recovery which, compared to previous deterministic algorithms, are superior in either the number of measurements or in noise tolerance.

Submitted 29 April, 2008; originally announced April 2008.

ACM Class: F.2; G.1; G.2

arXiv:cs/0608079 [pdf, ps, other]

Algorithmic linear dimension reduction in the l_1 norm for sparse vectors

Authors: A. C. Gilbert, M. J. Strauss, J. A. Tropp, R. Vershynin

Abstract: This paper develops a new method for recovering m-sparse signals that is simultaneously uniform and quick. We present a reconstruction algorithm whose run time, O(m log^2(m) log^2(d)), is sublinear in the length d of the signal. The reconstruction error is within a logarithmic factor (in m) of the optimal m-term approximation error in l_1. In particular, the algorithm recovers m-sparse signals perfectly and noisy signals are recovered with polylogarithmic distortion. Our algorithm makes O(m log^2 (d)) measurements, which is within a logarithmic factor of optimal. We also present a small-space implementation of the algorithm. These sketching techniques and the corresponding reconstruction algorithms provide an algorithmic dimension reduction in the l_1 norm. In particular, vectors of support m in dimension d can be linearly embedded into O(m log^2 d) dimensions with polylogarithmic distortion. We can reconstruct a vector from its low-dimensional sketch in time O(m log^2(m) log^2(d)). Furthermore, this reconstruction is stable and robust under small perturbations.

Submitted 18 August, 2006; originally announced August 2006.

arXiv:cs/0607098 [pdf, ps, other]

List decoding of noisy Reed-Muller-like codes

Authors: A. R. Calderbank, Anna C. Gilbert, Martin J. Strauss

Abstract: First- and second-order Reed-Muller (RM(1) and RM(2), respectively) codes are two fundamental error-correcting codes which arise in communication as well as in probabilistically-checkable proofs and learning. In this paper, we take the first steps toward extending the quick randomized decoding tools of RM(1) into the realm of quadratic binary and, equivalently, Z_4 codes. Our main algorithmic result is an extension of the RM(1) techniques from Goldreich-Levin and Kushilevitz-Mansour algorithms to the Hankel code, a code between RM(1) and RM(2). That is, given signal s of length N, we find a list that is a superset of all Hankel codewords phi with dot product to s at least (1/sqrt(k)) times the norm of s, in time polynomial in k and log(N). We also give a new and simple formulation of a known Kerdock code as a subcode of the Hankel code. As a corollary, we can list-decode Kerdock, too. Also, we get a quick algorithm for finding a sparse Kerdock approximation. That is, for k small compared with 1/sqrt{N} and for epsilon > 0, we find, in time polynomial in (k log(N)/epsilon), a k-Kerdock-term approximation s~ to s with Euclidean error at most the factor (1+epsilon+O(k^2/sqrt{N})) times that of the best such approximation.

Submitted 2 August, 2006; v1 submitted 20 July, 2006; originally announced July 2006.

ACM Class: E.4; F.2.1

Showing 1–31 of 31 results for author: Gilbert, A C