Showing 1–25 of 25 results for author: Clarkson, K L

  1. arXiv:2301.10352  [pdf, ps, other]

    cs.LG cs.DS

    Capacity Analysis of Vector Symbolic Architectures

    Authors: Kenneth L. Clarkson, Shashanka Ubaru, Elizabeth Yang

    Abstract: Hyperdimensional computing (HDC) is a biologically-inspired framework which represents symbols with high-dimensional vectors, and uses vector operations to manipulate them. The ensemble of a particular vector space and a prescribed set of vector operations (including one addition-like for "bundling" and one outer-product-like for "binding") form a *vector symbolic architecture* (VSA). While VSAs h…

    Submitted 14 February, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

  2. arXiv:2211.15860  [pdf, other]

    cs.LG stat.CO

    Bayesian Experimental Design for Symbolic Discovery

    Authors: Kenneth L. Clarkson, Cristina Cornelio, Sanjeeb Dash, Joao Goncalves, Lior Horesh, Nimrod Megiddo

    Abstract: This study concerns the formulation and application of Bayesian optimal experimental design to symbolic discovery, which is the inference from observational data of predictive models taking general functional forms. We apply constrained first-order methods to optimize an appropriate selection criterion, using Hamiltonian Monte Carlo to sample from the prior. A step for computing the predictive dis…

    Submitted 28 November, 2022; originally announced November 2022.

  3. arXiv:2209.09371  [pdf, other]

    quant-ph cs.LG math.NA

    Topological data analysis on noisy quantum computers

    Authors: Ismail Yunus Akhalwaya, Shashanka Ubaru, Kenneth L. Clarkson, Mark S. Squillante, Vishnu Jejjala, Yang-Hui He, Kugendran Naidoo, Vasileios Kalantzis, Lior Horesh

    Abstract: Topological data analysis (TDA) is a powerful technique for extracting complex and valuable shape-related summaries of high-dimensional data. However, the computational demands of classical algorithms for computing TDA are exorbitant, and quickly become impractical for high-order characteristics. Quantum computers offer the potential of achieving significant speedup for certain computational probl…

    Submitted 19 March, 2024; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: This paper is a follow-up to arXiv:2108.02811 with improved theoretical results and other additional results. This new version presents an improved runtime for the algorithm and fixes an issue present in the previous version

    Journal ref: In the Proceedings of The Twelfth International Conference on Learning Representations (ICLR 2024)

  4. arXiv:2202.05120  [pdf, ps, other]

    cs.DS cs.LG math.NA

    Low-Rank Approximation with $1/ε^{1/3}$ Matrix-Vector Products

    Authors: Ainesh Bakshi, Kenneth L. Clarkson, David P. Woodruff

    Abstract: We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm. Here, given access to a matrix $A$ through matrix-vector products, an accuracy parameter $ε$, and a target rank $k$, the goal is to find a rank-$k$ matrix $Z$ with orthonormal columns such that $\| A(I -ZZ^\top)\|_{S_p} \leq (1+ε)\min_{U^\top U = I_k} \|A(I - U U^\top)\|_{S_p}$, where…

    Submitted 16 June, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: STOC '22

  5. arXiv:2108.02811  [pdf, ps, other]

    quant-ph cs.LG math.NA

    Quantum Topological Data Analysis with Linear Depth and Exponential Speedup

    Authors: Shashanka Ubaru, Ismail Yunus Akhalwaya, Mark S. Squillante, Kenneth L. Clarkson, Lior Horesh

    Abstract: Quantum computing offers the potential of exponential speedups for certain classical computations. Over the last decade, many quantum machine learning (QML) algorithms have been proposed as candidates for such exponential improvements. However, two issues unravel the hope of exponential speedup for some of these QML algorithms: the data-loading problem and, more recently, the stunning dequantizati…

    Submitted 5 August, 2021; originally announced August 2021.

    Comments: 27 pages

  6. arXiv:2107.08090  [pdf, ps, other]

    cs.DS cs.LG

    Near-Optimal Algorithms for Linear Algebra in the Current Matrix Multiplication Time

    Authors: Nadiia Chepurko, Kenneth L. Clarkson, Praneeth Kacham, David P. Woodruff

    Abstract: In the numerical linear algebra community, it was suggested that to obtain nearly optimal bounds for various problems such as rank computation, finding a maximal linearly independent subset of columns (a basis), regression, or low-rank approximation, a natural way would be to resolve the main open question of Nelson and Nguyen (FOCS, 2013). This question is regarding the logarithmic factors in the…

    Submitted 2 November, 2021; v1 submitted 16 July, 2021; originally announced July 2021.

    Comments: Accepted to SODA 2022

  7. arXiv:2102.05758  [pdf, other]

    math.NA

    Sparse graph based sketching for fast numerical linear algebra

    Authors: Dong Hu, Shashanka Ubaru, Alex Gittens, Kenneth L. Clarkson, Lior Horesh, Vassilis Kalantzis

    Abstract: In recent years, a variety of randomized constructions of sketching matrices have been devised, that have been used in fast algorithms for numerical linear algebra problems, such as least squares regression, low-rank approximation, and the approximation of leverage scores. A key property of sketching matrices is that of subspace embedding. In this paper, we study sketching matrices that are obtain…

    Submitted 10 February, 2021; originally announced February 2021.

    Journal ref: ICASSP 2021

  8. arXiv:2101.02158  [pdf, other]

    cs.CL cs.AI

    Order Embeddings from Merged Ontologies using Sketching

    Authors: Kenneth L. Clarkson, Sanjana Sahayaraj

    Abstract: We give a simple, low resource method to produce order embeddings from ontologies. Such embeddings map words to vectors so that order relations on the words, such as hypernymy/hyponymy, are represented in a direct way. Our method uses sketching techniques, in particular countsketch, for dimensionality reduction. We also study methods to merge ontologies, in particular those in medical domains, so…

    Submitted 6 January, 2021; originally announced January 2021.

  9. arXiv:2011.04125  [pdf, ps, other]

    cs.DS cs.LG quant-ph

    Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra

    Authors: Nadiia Chepurko, Kenneth L. Clarkson, Lior Horesh, Honghao Lin, David P. Woodruff

    Abstract: We create classical (non-quantum) dynamic data structures supporting queries for recommender systems and least-squares regression that are comparable to their quantum analogues. De-quantizing such algorithms has received a flurry of attention in recent years; we obtain sharper bounds for these problems. More significantly, we achieve these improvements by arguing that the previous quantum-inspired…

    Submitted 28 June, 2022; v1 submitted 8 November, 2020; originally announced November 2020.

    Comments: Minor edits to exposition

  10. arXiv:2010.06392  [pdf, ps, other]

    math.NA cs.IR stat.ML

    Projection techniques to update the truncated SVD of evolving matrices

    Authors: Vassilis Kalantzis, Georgios Kollias, Shashanka Ubaru, Athanasios N. Nikolakopoulos, Lior Horesh, Kenneth L. Clarkson

    Abstract: This paper considers the problem of updating the rank-k truncated Singular Value Decomposition (SVD) of matrices subject to the addition of new rows and/or columns over time. Such matrix problems represent an important computational kernel in applications such as Latent Semantic Indexing and Recommender Systems. Nonetheless, the proposed framework is purely algebraic and targets general updating p…

    Submitted 13 October, 2020; originally announced October 2020.

    Comments: 13 pages

  11. arXiv:1905.05376  [pdf, ps, other]

    cs.DS cs.LG

    Dimensionality Reduction for Tukey Regression

    Authors: Kenneth L. Clarkson, Ruosong Wang, David P. Woodruff

    Abstract: We give the first dimensionality reduction methods for the overconstrained Tukey regression problem. The Tukey loss function $\|y\|_M = \sum_i M(y_i)$ has $M(y_i) \approx |y_i|^p$ for residual errors $y_i$ smaller than a prescribed threshold $τ$, but $M(y_i)$ becomes constant for errors $|y_i| > τ$. Our results depend on a new structural result, proven constructively, showing that for any $d$-dime…

    Submitted 13 May, 2019; originally announced May 2019.

    Comments: To appear in ICML 2019

  12. arXiv:1902.00995  [pdf, ps, other]

    cs.LG stat.ML

    Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression

    Authors: Michał Dereziński, Kenneth L. Clarkson, Michael W. Mahoney, Manfred K. Warmuth

    Abstract: In experimental design, we are given a large collection of vectors, each with a hidden response value that we assume derives from an underlying linear model, and we wish to pick a small subset of the vectors such that querying the corresponding responses will lead to a good estimator of the model. A classical approach in statistics is to assume the responses are linear, plus zero-mean i.i.d. Gauss…

    Submitted 3 February, 2019; originally announced February 2019.

  13. arXiv:1807.09754  [pdf]

    cs.IR cs.AI

    Data Infrastructure and Approaches for Ontology-Based Drug Repurposing

    Authors: Stephen Boyer, Thomas Griffin, Sarath Swaminathan, Kenneth L. Clarkson, Dmitry Zubarev

    Abstract: We report development of a data infrastructure for drug repurposing that takes advantage of two currently available chemical ontologies. The data infrastructure includes a database of compound-target associations augmented with molecular ontological labels. It also contains two computational tools for prediction of new associations. We describe two drug-repurposing systems: one, Nascent Ontologic…

    Submitted 12 July, 2018; originally announced July 2018.

    Comments: 17 pages

  14. arXiv:1611.03225  [pdf, ps, other]

    cs.DS cs.LG math.NA

    Sharper Bounds for Regularized Data Fitting

    Authors: Haim Avron, Kenneth L. Clarkson, David P. Woodruff

    Abstract: We study matrix sketching methods for regularized variants of linear regression, low rank approximation, and canonical correlation analysis. Our main focus is on sketching techniques which preserve the objective function value for regularized problems, which is an area that has remained largely unexplored. We study regularization both in a fairly broad setting, and in the specific context of the p…

    Submitted 26 June, 2017; v1 submitted 10 November, 2016; originally announced November 2016.

  15. arXiv:1611.03220  [pdf, ps, other]

    math.NA cs.DS cs.LG

    Faster Kernel Ridge Regression Using Sketching and Preconditioning

    Authors: Haim Avron, Kenneth L. Clarkson, David P. Woodruff

    Abstract: Kernel Ridge Regression (KRR) is a simple yet powerful technique for non-parametric regression whose computation amounts to solving a linear system. This system is usually dense and highly ill-conditioned. In addition, the dimensions of the matrix are the same as the number of data points, so direct methods are unrealistic for large-scale datasets. In this paper, we propose a preconditioning techn…

    Submitted 15 July, 2017; v1 submitted 10 November, 2016; originally announced November 2016.

  16. arXiv:1512.04226  [pdf, ps, other]

    cs.CG

    Random Sampling with Removal

    Authors: Kenneth L. Clarkson, Bernd Gärtner, Johannes Lengler, May Szedlak

    Abstract: We study randomized algorithms for constrained optimization, in abstract frameworks that include, in strictly increasing generality: convex programming; LP-type problems; violator spaces; and a setting we introduce, consistent spaces. Such algorithms typically involve a step of finding the optimal solution for a random sample of the constraints. They exploit the condition that, in finite dimension…

    Submitted 31 May, 2019; v1 submitted 14 December, 2015; originally announced December 2015.

  17. arXiv:1510.06073  [pdf, ps, other]

    cs.DS

    Input Sparsity and Hardness for Robust Subspace Approximation

    Authors: Kenneth L. Clarkson, David P. Woodruff

    Abstract: In the subspace approximation problem, we seek a k-dimensional subspace F of R^d that minimizes the sum of p-th powers of Euclidean distances to a given set of n points a_1, ..., a_n in R^d, for p >= 1. More generally than minimizing sum_i dist(a_i,F)^p, we may wish to minimize sum_i M(dist(a_i,F)) for some loss function M(), for example, M-Estimators, which include the Huber and Tukey loss functio…

    Submitted 20 October, 2015; originally announced October 2015.

    Comments: paper appeared in FOCS, 2015

  18. Self-improving Algorithms for Coordinate-Wise Maxima and Convex Hulls

    Authors: Kenneth L. Clarkson, Wolfgang Mulzer, C. Seshadhri

    Abstract: Finding the coordinate-wise maxima and the convex hull of a planar point set are probably the most classic problems in computational geometry. We consider these problems in the self-improving setting. Here, we have $n$ distributions $\mathcal{D}_1, \ldots, \mathcal{D}_n$ of planar points. An input point set $(p_1, \ldots, p_n)$ is generated by taking an independent sample $p_i$ from each…

    Submitted 11 January, 2014; v1 submitted 5 November, 2012; originally announced November 2012.

    Comments: 39 pages, 17 figures; thoroughly revised presentation; preliminary versions appeared at SODA 2010 and SoCG 2012

    Journal ref: SIAM Journal on Computing (SICOMP), 43(2), 2014, pp. 617-653

  19. arXiv:1207.6365  [pdf, other]

    cs.DS

    Low Rank Approximation and Regression in Input Sparsity Time

    Authors: Kenneth L. Clarkson, David P. Woodruff

    Abstract: We design a new distribution over $\poly(r \eps^{-1}) \times n$ matrices $S$ so that for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\norm{SAx}_2 = (1 \pm \eps)\norm{Ax}_2$ simultaneously for all $x \in \mathbb{R}^d$. Such a matrix $S$ is called a \emph{subspace embedding}. Furthermore, $SA$ can be computed in $\nnz(A) + \poly(d \eps^{-1})$ time, where…

    Submitted 5 April, 2013; v1 submitted 26 July, 2012; originally announced July 2012.

    Comments: Included optimization of subspace embedding dimension from (d/eps)^4 to O~(d/eps)^2 in Section 4, by more careful analysis of perfect hashing, and minor improvements to regression / low rank approximation because of it

  20. arXiv:1207.4684  [pdf, other]

    cs.DS stat.ML

    The Fast Cauchy Transform and Faster Robust Linear Regression

    Authors: Kenneth L. Clarkson, Petros Drineas, Malik Magdon-Ismail, Michael W. Mahoney, Xiangrui Meng, David P. Woodruff

    Abstract: We provide fast algorithms for overconstrained $\ell_p$ regression and related problems: for an $n\times d$ input matrix $A$ and vector $b\in\mathbb{R}^n$, in $O(nd\log n)$ time we reduce the problem $\min_{x\in\mathbb{R}^d} \|Ax-b\|_p$ to the same problem with input matrix $\tilde A$ of dimension $s \times d$ and corresponding $\tilde b$ of dimension $s\times 1$. Here, $\tilde A$ and $\tilde b$ a…

    Submitted 5 April, 2014; v1 submitted 19 July, 2012; originally announced July 2012.

    Comments: 48 pages; substantially extended and revised; short version in SODA 2013

  21. Self-improving Algorithms for Coordinate-wise Maxima

    Authors: Kenneth L. Clarkson, Wolfgang Mulzer, C. Seshadhri

    Abstract: Computing the coordinate-wise maxima of a planar point set is a classic and well-studied problem in computational geometry. We give an algorithm for this problem in the \emph{self-improving setting}. We have $n$ (unknown) independent distributions $\cD_1, \cD_2, ..., \cD_n$ of planar points. An input pointset $(p_1, p_2, ..., p_n)$ is generated by taking an independent sample $p_i$ from each…

    Submitted 3 April, 2012; originally announced April 2012.

    Comments: To appear in Symposium of Computational Geometry 2012 (17 pages, 2 figures)

    Journal ref: SIAM Journal on Computing (SICOMP), 43(2), 2014, pp. 617-653

  22. arXiv:1010.4408  [pdf, other]

    cs.LG

    Sublinear Optimization for Machine Learning

    Authors: Kenneth L. Clarkson, Elad Hazan, David P. Woodruff

    Abstract: We give sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard margin SVM, and L2-SVM, for which sublinear-time algorithms were not known before. These new algorithms use a combination…

    Submitted 21 October, 2010; originally announced October 2010.

    Comments: extended abstract appeared in FOCS 2010

  23. arXiv:0909.0537  [pdf, ps, other]

    cs.CG

    On the Set Multi-Cover Problem in Geometric Settings

    Authors: Chandra Chekuri, Kenneth L. Clarkson, Sariel Har-Peled

    Abstract: We consider the set multi-cover problem in geometric settings. Given a set of points P and a collection of geometric shapes (or sets) F, we wish to find a minimum cardinality subset of F such that each point p in P is covered by (contained in) at least d(p) sets. Here d(p) is an integer demand (requirement) for p. When the demands d(p)=1 for all p, this is the standard set cover problem. The set…

    Submitted 2 September, 2009; originally announced September 2009.

  24. arXiv:0907.0884  [pdf, ps, other]

    cs.DS cs.CG

    Self-Improving Algorithms

    Authors: Nir Ailon, Bernard Chazelle, Kenneth L. Clarkson, Ding Liu, Wolfgang Mulzer, C. Seshadhri

    Abstract: We investigate ways in which an algorithm can improve its expected performance by fine-tuning itself automatically with respect to an unknown input distribution D. We assume here that D is of product type. More precisely, suppose that we need to process a sequence I_1, I_2, ... of inputs I = (x_1, x_2, ..., x_n) of some fixed length n, where each x_i is drawn independently from some arbitrary, unk…

    Submitted 18 October, 2010; v1 submitted 5 July, 2009; originally announced July 2009.

    Comments: 26 pages, 8 figures, preliminary versions appeared at SODA 2006 and SoCG 2008. Thorough revision to improve the presentation of the paper

    ACM Class: F.2.2; D.1; F.1.1; I.2.6

    Journal ref: SIAM Journal on Computing (SICOMP), 40(2), 2011, pp. 350-375

  25. arXiv:cs/0501045  [pdf, ps, other]

    cs.CG cs.DS

    Improved Approximation Algorithms for Geometric Set Cover

    Authors: Kenneth L. Clarkson, Kasturi Varadarajan

    Abstract: Given a collection S of subsets of some set U, and M a subset of U, the set cover problem is to find the smallest subcollection C of S such that M is a subset of the union of the sets in C. While the general problem is NP-hard to solve, even approximately, here we consider some geometric special cases, where usually U = R^d. Extending prior results, we show that approximation algorithms with pro…

    Submitted 20 January, 2005; originally announced January 2005.

    ACM Class: F.2.2