Showing 1–31 of 31 results for author: Gilbert, A C

  1. arXiv:2403.07929  [pdf, other]

    cs.LG math.NA stat.ML

    Sketching the Heat Kernel: Using Gaussian Processes to Embed Data

    Authors: Anna C. Gilbert, Kevin O'Neill

    Abstract: This paper introduces a novel, non-deterministic method for embedding data in low-dimensional Euclidean space based on computing realizations of a Gaussian process depending on the geometry of the data. This type of embedding first appeared in (Adler et al., 2018) as a theoretical model for a generic manifold in high dimensions. In particular, we take the covariance function of the Gaussian proce…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 28 pages

  2. arXiv:2309.13478  [pdf, other]

    stat.ML cs.LG

    CA-PCA: Manifold Dimension Estimation, Adapted for Curvature

    Authors: Anna C. Gilbert, Kevin O'Neill

    Abstract: The success of algorithms in the analysis of high-dimensional data is often attributed to the manifold hypothesis, which supposes that this data lie on or near a manifold of much lower dimension. It is often useful to determine or estimate the dimension of this manifold before performing dimension reduction, for instance. Existing methods for dimension estimation are calibrated using a flat unit b…

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: 26 pages

    MSC Class: 62H25; 62R30

  3. arXiv:2208.06676  [pdf, other]

    cs.LG

    May the force be with you

    Authors: Yulan Zhang, Anna C. Gilbert, Stefan Steinerberger

    Abstract: Modern methods in dimensionality reduction are dominated by nonlinear attraction-repulsion force-based methods (this includes t-SNE, UMAP, ForceAtlas2, LargeVis, and many more). The purpose of this paper is to demonstrate that all such methods, by design, come with an additional feature that is being automatically computed along the way, namely the vector field associated with these forces. We sho…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: 23 pages, 17 figures

  4. arXiv:2110.11430  [pdf, other]

    cs.CG cs.LG

    How can classical multidimensional scaling go wrong?

    Authors: Rishi Sonthalia, Gregory Van Buskirk, Benjamin Raichel, Anna C. Gilbert

    Abstract: Given a matrix $D$ describing the pairwise dissimilarities of a data set, a common task is to embed the data points into Euclidean space. The classical multidimensional scaling (cMDS) algorithm is a widespread method to do this. However, theoretical analysis of the robustness of the algorithm and an in-depth analysis of its performance on non-Euclidean metrics is lacking. In this paper, we deriv…

    Submitted 28 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: Accepted to NeurIPS 2021

  5. arXiv:2012.03126  [pdf, other]

    cs.CG math.PR

    Dual Regularized Optimal Transport

    Authors: Rishi Sonthalia, Anna C. Gilbert

    Abstract: In this paper, we present a new formulation of unbalanced optimal transport called Dual Regularized Optimal Transport (DROT). We argue that regularizing the dual formulation of optimal transport results in a version of unbalanced optimal transport that leads to sparse solutions and that gives us control over mass creation and destruction. We build intuition behind such control and present theoreti…

    Submitted 5 December, 2020; originally announced December 2020.

  6. arXiv:2007.01346  [pdf, other]

    cs.LG stat.ML

    Spectral Methods for Ranking with Scarce Data

    Authors: Umang Varma, Lalit Jain, Anna C. Gilbert

    Abstract: Given a number of pairwise preferences of items, a common task is to rank all the items. Examples include pairwise movie ratings, New Yorker cartoon caption contests, and many other consumer preferences tasks. What these settings have in common is two-fold: a scarcity of data (it may be costly to get comparisons for all the pairs of items) and additional feature information about the items (e.g.,…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: To appear in Proceedings of Uncertainty in Artificial Intelligence (UAI) 2020

    MSC Class: 68T05

  7. arXiv:2005.03853  [pdf, other]

    cs.LG math.OC stat.ML

    Project and Forget: Solving Large-Scale Metric Constrained Problems

    Authors: Rishi Sonthalia, Anna C. Gilbert

    Abstract: Given a set of dissimilarity measurements amongst data points, determining what metric representation is most "consistent" with the input measurements or the metric that best captures the relevant geometric features of the data is a key step in many machine learning algorithms. Existing methods are restricted to specific kinds of metrics or small problem sizes because of the large number of metric…

    Submitted 26 September, 2022; v1 submitted 8 May, 2020; originally announced May 2020.

  8. arXiv:2005.03847  [pdf, other]

    cs.LG math.MG stat.ML

    Tree! I am no Tree! I am a Low Dimensional Hyperbolic Embedding

    Authors: Rishi Sonthalia, Anna C. Gilbert

    Abstract: Given data, finding a faithful low-dimensional hyperbolic embedding of the data is a key method by which we can extract hierarchical information or learn representative geometric features of the data. In this paper, we explore a new method for learning hyperbolic representations by taking a metric-first approach. Rather than determining the low-dimensional hyperbolic embedding directly, we learn a…

    Submitted 22 October, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

    Comments: Code available at https://github.com/rsonthal/TreeRep

  9. arXiv:1908.08411  [pdf, other]

    cs.DS cs.CG

    Generalized Metric Repair on Graphs

    Authors: Chenglin Fan, Anna C. Gilbert, Benjamin Raichel, Rishi Sonthalia, Gregory Van Buskirk

    Abstract: Many modern data analysis algorithms either assume or are considerably more efficient if the distances between the data points satisfy a metric. These algorithms include metric learning, clustering, and dimension reduction. As real data sets are noisy, distances often fail to satisfy a metric. For this reason, Gilbert and Jain and Fan et al. introduced the closely related sparse metric repair and…

    Submitted 21 August, 2019; originally announced August 2019.

    Comments: arXiv admin note: text overlap with arXiv:1807.08078

  10. arXiv:1903.10875  [pdf, other]

    math.NA

    Nonlinear Iterative Hard Thresholding for Inverse Scattering

    Authors: Anna C. Gilbert, Howard W. Levinson, John C. Schotland

    Abstract: We consider the inverse scattering problem for sparse scatterers. An image reconstruction algorithm is proposed that is based on a nonlinear generalization of iterative hard thresholding. The convergence and error of the method are analyzed by means of coherence estimates and compared to numerical simulations.

    Submitted 22 March, 2019; originally announced March 2019.

    Comments: 30 pages, 10 figures

  11. arXiv:1807.07619  [pdf, other]

    cs.DS

    Generalized Metric Repair on Graphs

    Authors: Anna C. Gilbert, Rishi Sonthalia

    Abstract: Many modern data analysis algorithms either assume that or are considerably more efficient if the distances between the data points satisfy a metric. These algorithms include metric learning, clustering, and dimensionality reduction. Because real data sets are noisy, the similarity measures often fail to satisfy a metric. For this reason, Gilbert and Jain [11] and Fan, et al. [8] introduce the clo…

    Submitted 19 July, 2018; originally announced July 2018.

  12. Unsupervised Metric Learning in Presence of Missing Data

    Authors: Anna C. Gilbert, Rishi Sonthalia

    Abstract: For many machine learning tasks, the input data lie on a low-dimensional manifold embedded in a high dimensional space and, because of this high-dimensional structure, most algorithms are inefficient. The typical solution is to reduce the dimension of the input data using standard dimension reduction algorithms such as ISOMAP, LAPLACIAN EIGENMAPS or LLES. This approach, however, does not always wo…

    Submitted 3 March, 2019; v1 submitted 19 July, 2018; originally announced July 2018.

    Journal ref: 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)

  13. arXiv:1710.10655  [pdf, other]

    stat.ML cs.DS

    If it ain't broke, don't fix it: Sparse metric repair

    Authors: Anna C. Gilbert, Lalit Jain

    Abstract: Many modern data-intensive computational problems either require or benefit from distance or similarity data that adhere to a metric. The algorithms run faster or have better performance guarantees. Unfortunately, in real applications, the data are messy and values are noisy. The distances between the data points are far from satisfying a metric. Indeed, there are a number of different algorithms…

    Submitted 29 October, 2017; originally announced October 2017.

  14. arXiv:1708.00128  [pdf, ps, other]

    physics.optics

    Imaging from the Inside Out: Inverse Scattering with Photoactivated Internal Sources

    Authors: Anna C. Gilbert, Howard W. Levinson, John C. Schotland

    Abstract: We propose a method to reconstruct the optical properties of a scattering medium with subwavelength resolution. The method is based on the solution to the inverse scattering problem with photoactivated internal sources. Numerical simulations of three-dimensional structures demonstrate that a resolution of approximately $λ/25$ is achievable.

    Submitted 31 July, 2017; originally announced August 2017.

  15. arXiv:1706.05916  [pdf, other]

    cs.CR cs.DB

    Local Differential Privacy for Physical Sensor Data and Sparse Recovery

    Authors: Anna C. Gilbert, Audra McMillan

    Abstract: In this work we explore the utility of locally differentially private thermal sensor data. We design a locally differentially private recovery algorithm for the 1-dimensional, discrete heat source location problem and analyse its performance in terms of the Earth Mover Distance error. Our work indicates that it is possible to produce locally private sensor measurements that both keep the exact loc…

    Submitted 23 March, 2018; v1 submitted 30 May, 2017; originally announced June 2017.

    Comments: appeared at CISS 2018

  16. arXiv:1705.08664  [pdf, other]

    stat.ML cs.LG

    Towards Understanding the Invertibility of Convolutional Neural Networks

    Authors: Anna C. Gilbert, Yi Zhang, Kibok Lee, Yuting Zhang, Honglak Lee

    Abstract: Several recent works have empirically observed that Convolutional Neural Nets (CNNs) are (approximately) invertible. To understand this approximate invertibility phenomenon and how to leverage it more effectively, we focus on a theoretical explanation and develop a mathematical model of sparse signal recovery that is consistent with CNNs with random weights. We give an exact connection to a partic…

    Submitted 24 May, 2017; originally announced May 2017.

    Journal ref: IJCAI 2017

  17. Optical tomography on graphs

    Authors: Francis J. Chung, Anna C. Gilbert, Jeremy G. Hoskins, John C. Schotland

    Abstract: We present an algorithm for solving inverse problems on graphs analogous to those arising in diffuse optical tomography for continuous media. In particular, we formulate and analyze a discrete version of the inverse Born series, proving estimates characterizing the domain of convergence, approximation errors, and stability of our approach. We also present a modification which allows additional inf…

    Submitted 10 September, 2016; originally announced September 2016.

  18. arXiv:1404.5190  [pdf, other]

    cs.IT

    Sparse Approximation, List Decoding, and Uncertainty Principles

    Authors: Mahmoud Abo Khamis, Anna C. Gilbert, Hung Q. Ngo, Atri Rudra

    Abstract: We consider list versions of sparse approximation problems, where unlike the existing results in sparse approximation that consider situations with unique solutions, we are interested in multiple solutions. We introduce these problems and present the first combinatorial results on the output list size. These generalize and enhance some of the existing results on threshold phenomenon and uncertaint…

    Submitted 8 August, 2014; v1 submitted 18 April, 2014; originally announced April 2014.

  19. arXiv:1402.1726  [pdf, ps, other]

    cs.DS cs.IT

    For-all Sparse Recovery in Near-Optimal Time

    Authors: Anna C. Gilbert, Yi Li, Ely Porat, Martin J. Strauss

    Abstract: An approximate sparse recovery system in $\ell_1$ norm consists of parameters $k$, $ε$, $N$, an $m$-by-$N$ measurement matrix $Φ$, and a recovery algorithm, $\mathcal{R}$. Given a vector, $\mathbf{x}$, the system approximates $\mathbf{x}$ by $\widehat{\mathbf{x}} = \mathcal{R}(Φ\mathbf{x})$, which must satisfy $\|\widehat{\mathbf{x}}-\mathbf{x}\|_1 \leq (1+ε)\|\mathbf{x}-\mathbf{x}_k\|_1$. We consider the 'for al…

    Submitted 7 March, 2017; v1 submitted 7 February, 2014; originally announced February 2014.

    ACM Class: F.2.2; E.4

    Journal ref: ACM Transactions on Algorithms, Vol. 13, No. 3, pp 32:1--32:26, 2017

  20. arXiv:1401.4428  [pdf, other]

    math.CO cs.DM

    Diffuse Scattering on Graphs

    Authors: Anna C. Gilbert, Jeremy G. Hoskins, John C. Schotland

    Abstract: We formulate and analyze difference equations on graphs analogous to time-independent diffusion equations arising in the study of diffuse scattering in continuous media. Moreover, we show how to construct solutions in the presence of weak scatterers from the solution to the homogeneous (background) problem using Born series, providing necessary conditions for convergence and demonstrating the proc…

    Submitted 1 November, 2016; v1 submitted 17 January, 2014; originally announced January 2014.

  21. arXiv:1307.7810  [pdf, ps, other]

    q-bio.QM cs.CE cs.IT q-bio.GN

    Accurate Decoding of Pooled Sequenced Data Using Compressed Sensing

    Authors: Denisa Duma, Mary Wootters, Anna C. Gilbert, Hung Q. Ngo, Atri Rudra, Matthew Alpert, Timothy J. Close, Gianfranco Ciardo, Stefano Lonardi

    Abstract: In order to overcome the limitations imposed by DNA barcoding when multiplexing a large number of samples in the current generation of high-throughput sequencing instruments, we have recently proposed a new protocol that leverages advances in combinatorial pooling design (group testing) doi:10.1371/journal.pcbi.1003010. We have also demonstrated how this new protocol would enable de novo selective…

    Submitted 30 July, 2013; originally announced July 2013.

    Comments: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)

  22. Modal Analysis with Compressive Measurements

    Authors: Jae Young Park, Michael B. Wakin, Anna C. Gilbert

    Abstract: Structural Health Monitoring (SHM) systems are critical for monitoring aging infrastructure (such as buildings or bridges) in a cost-effective manner. Such systems typically involve collections of battery-operated wireless sensors that sample vibration data over time. After the data is transmitted to a central node, modal analysis can be used to detect damage in the structure. In this paper, we pr…

    Submitted 8 July, 2013; originally announced July 2013.

  23. arXiv:1304.6232  [pdf, other]

    cs.DS

    L2/L2-foreach sparse recovery with low risk

    Authors: Anna C. Gilbert, Hung Q. Ngo, Ely Porat, Atri Rudra, Martin J. Strauss

    Abstract: In this paper, we consider the "foreach" sparse recovery problem with failure probability $p$. The goal is to design a distribution over $m \times N$ matrices $Φ$ and a decoding algorithm $\mathcal{A}$ such that for every $\mathbf{x}\in\mathbb{R}^N$, we have the following error guarantee with probability at least $1-p$: \[\|\mathbf{x}-\mathcal{A}(Φ\mathbf{x})\|_2\le C\|\mathbf{x}-\mathbf{x}_k\|_2,\] where $C$ is a constant (ideally arbitrarily…

    Submitted 23 April, 2013; originally announced April 2013.

    Comments: 1 figure, extended abstract to appear in ICALP 2013

  24. A generalization of variable elimination for separable inverse problems beyond least squares

    Authors: Paul Shearer, Anna C. Gilbert

    Abstract: In linear inverse problems, we have data derived from a noisy linear transformation of some unknown parameters, and we wish to estimate these unknowns from the data. Separable inverse problems are a powerful generalization in which the transformation itself depends on additional unknown parameters and we wish to determine both sets of parameters simultaneously. When separable problems are solved b…

    Submitted 30 April, 2013; v1 submitted 2 February, 2013; originally announced February 2013.

    Comments: 27 pages, submitted

  25. arXiv:1302.0439  [pdf, ps, other]

    cs.CV cs.GR

    Correcting Camera Shake by Incremental Sparse Approximation

    Authors: Paul Shearer, Anna C. Gilbert, Alfred O. Hero III

    Abstract: The problem of deblurring an image when the blur kernel is unknown remains challenging after decades of work. Recently there has been rapid progress on correcting irregular blur patterns caused by camera shake, but there is still much room for improvement. We propose a new blind deconvolution method using incremental sparse edge approximation to recover images blurred by camera shake. We estimate…

    Submitted 7 February, 2013; v1 submitted 2 February, 2013; originally announced February 2013.

    Comments: 5 pages, 3 figures. Conference submission

  26. arXiv:1211.0361  [pdf, ps, other]

    cs.IT cs.DS

    Sketched SVD: Recovering Spectral Features from Compressive Measurements

    Authors: Anna C. Gilbert, Jae Young Park, Michael B. Wakin

    Abstract: We consider a streaming data model in which n sensors observe individual streams of data, presented in a turnstile model. Our goal is to analyze the singular value decomposition (SVD) of the matrix of data defined implicitly by the stream of updates. Each column i of the data matrix is given by the stream of updates seen at sensor i. Our approach is to sketch each column of the matrix, forming a "…

    Submitted 1 November, 2012; originally announced November 2012.

  27. arXiv:1110.3052  [pdf, ps, other]

    astro-ph.SR astro-ph.IM

    The First Stray Light Corrected EUV Images of Solar Coronal Holes

    Authors: Paul Shearer, Richard A. Frazin, Alfred O. Hero III, Anna C. Gilbert

    Abstract: Coronal holes are the source regions of the fast solar wind, which fills most of the solar system volume near the cycle minimum. Removing stray light from extreme ultraviolet (EUV) images of the Sun's corona is of high astrophysical importance, as it is required to make meaningful determinations of temperatures and densities of coronal holes. EUV images tend to be dominated by the component of the…

    Submitted 7 March, 2012; v1 submitted 13 October, 2011; originally announced October 2011.

    Comments: Accepted to Astrophysical Journal Letters

  28. arXiv:0912.0229  [pdf, other]

    cs.DS cs.IT

    Approximate Sparse Recovery: Optimizing Time and Measurements

    Authors: Anna C. Gilbert, Yi Li, Ely Porat, Martin J. Strauss

    Abstract: An approximate sparse recovery system consists of parameters $k,N$, an $m$-by-$N$ measurement matrix, $Φ$, and a decoding algorithm, $\mathcal{D}$. Given a vector, $x$, the system approximates $x$ by $\widehat x =\mathcal{D}(Φx)$, which must satisfy $\| \widehat x - x\|_2\le C \|x - x_k\|_2$, where $x_k$ denotes the optimal $k$-term approximation to $x$. For each vector $x$, the system must succ…

    Submitted 1 December, 2009; originally announced December 2009.

    Journal ref: SIAM J. Comput. 41(2), pp. 436-453, 2012

  29. arXiv:0804.4666  [pdf, ps, other]

    cs.DM cs.DS math.NA

    Combining geometry and combinatorics: A unified approach to sparse signal recovery

    Authors: R. Berinde, A. C. Gilbert, P. Indyk, H. Karloff, M. J. Strauss

    Abstract: There are two main algorithmic approaches to sparse signal recovery: geometric and combinatorial. The geometric approach starts with a geometric constraint on the measurement matrix and then uses linear programming to decode information about the signal from its measurements. The combinatorial approach constructs the measurement matrix and a combinatorial decoding algorithm to match. We present…

    Submitted 29 April, 2008; originally announced April 2008.

    ACM Class: F.2; G.1; G.2

  30. arXiv:cs/0608079  [pdf, ps, other]

    cs.DS

    Algorithmic linear dimension reduction in the l_1 norm for sparse vectors

    Authors: A. C. Gilbert, M. J. Strauss, J. A. Tropp, R. Vershynin

    Abstract: This paper develops a new method for recovering m-sparse signals that is simultaneously uniform and quick. We present a reconstruction algorithm whose run time, O(m log^2(m) log^2(d)), is sublinear in the length d of the signal. The reconstruction error is within a logarithmic factor (in m) of the optimal m-term approximation error in l_1. In particular, the algorithm recovers m-sparse signals p…

    Submitted 18 August, 2006; originally announced August 2006.

  31. arXiv:cs/0607098  [pdf, ps, other]

    cs.DS cs.IT

    List decoding of noisy Reed-Muller-like codes

    Authors: A. R. Calderbank, Anna C. Gilbert, Martin J. Strauss

    Abstract: First- and second-order Reed-Muller (RM(1) and RM(2), respectively) codes are two fundamental error-correcting codes which arise in communication as well as in probabilistically-checkable proofs and learning. In this paper, we take the first steps toward extending the quick randomized decoding tools of RM(1) into the realm of quadratic binary and, equivalently, Z_4 codes. Our main algorithmic re…

    Submitted 2 August, 2006; v1 submitted 20 July, 2006; originally announced July 2006.

    ACM Class: E.4; F.2.1