Showing 1–31 of 31 results for author: Musco, C

  1. arXiv:2407.04686

    cs.DS math.NA

    Near-optimal hierarchical matrix approximation from matrix-vector products

    Authors: Tyler Chen, Feyza Duman Keles, Diana Halikias, Cameron Musco, Christopher Musco, David Persson

    Abstract: We describe a randomized algorithm for producing a near-optimal hierarchical off-diagonal low-rank (HODLR) approximation to an $n\times n$ matrix $\mathbf{A}$, accessible only through matrix-vector products with $\mathbf{A}$ and $\mathbf{A}^{\mathsf{T}}$. We prove that, for the rank-$k$ HODLR approximation problem, our method achieves a $(1+β)^{\log(n)}$-optimal approximation in expected Frobenius…

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2405.05865

    cs.DS cs.LG math.NA math.OC

    Faster Linear Systems and Matrix Norm Approximation via Multi-level Sketched Preconditioning

    Authors: Michał Dereziński, Christopher Musco, Jiaming Yang

    Abstract: We present a new class of preconditioned iterative methods for solving linear systems of the form $Ax = b$. Our methods are based on constructing a low-rank Nyström approximation to $A$ using sparse random sketching. This approximation is used to construct a preconditioner, which itself is inverted quickly using additional levels of random sketching and preconditioning. We prove that the convergen…

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2404.13757

    cs.DS math.NA

    Sublinear Time Low-Rank Approximation of Toeplitz Matrices

    Authors: Cameron Musco, Kshiteej Sheth

    Abstract: We present a sublinear time algorithm for computing a near optimal low-rank approximation to any positive semidefinite (PSD) Toeplitz matrix $T\in \mathbb{R}^{d\times d}$, given noisy access to its entries. In particular, given entrywise query access to $T+E$ for an arbitrary noise matrix $E\in \mathbb{R}^{d\times d}$, integer rank $k\leq d$, and error parameter $δ>0$, our algorithm runs in time…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: Published in SODA 2024. Updated proofs

  4. arXiv:2402.09379

    cs.DS math.NA

    Fixed-sparsity matrix approximation from matrix-vector products

    Authors: Noah Amsel, Tyler Chen, Feyza Duman Keles, Diana Halikias, Cameron Musco, Christopher Musco

    Abstract: We study the problem of approximating a matrix $\mathbf{A}$ with a matrix that has a fixed sparsity pattern (e.g., diagonal, banded, etc.), when $\mathbf{A}$ is accessed only by matrix-vector products. We describe a simple randomized algorithm that returns an approximation with the given sparsity pattern with Frobenius-norm error at most $(1+\varepsilon)$ times the best possible error. When each r…

    Submitted 26 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  5. arXiv:2311.14023

    math.NA cs.DS

    Algorithm-agnostic low-rank approximation of operator monotone matrix functions

    Authors: David Persson, Raphael A. Meyer, Christopher Musco

    Abstract: Low-rank approximation of a matrix function, $f(A)$, is an important task in computational mathematics. Most methods require direct access to $f(A)$, which is often considerably more expensive than accessing $A$. Persson and Kressner (SIMAX 2023) avoid this issue for symmetric positive semidefinite matrices by proposing funNyström, which first constructs a Nyström approximation to $A$ using subspa…

    Submitted 4 July, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

    MSC Class: 65F15; 65F55; 65F60; 68W25

  6. arXiv:2310.18265

    cs.DS cs.LG math.OC stat.ML

    Structured Semidefinite Programming for Recovering Structured Preconditioners

    Authors: Arun Jambulapati, Jerry Li, Christopher Musco, Kirankumar Shiragur, Aaron Sidford, Kevin Tian

    Abstract: We develop a general framework for finding approximately-optimal preconditioners for solving linear systems. Leveraging this framework we obtain improved runtimes for fundamental preconditioning and linear system solving problems including the following. We give an algorithm which, given positive definite $\mathbf{K} \in \mathbb{R}^{d \times d}$ with $\mathrm{nnz}(\mathbf{K})$ nonzero entries, com…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Merge of arXiv:1812.06295 and arXiv:2008.01722

  7. arXiv:2305.05826

    cs.DS math.NA

    Universal Matrix Sparsifiers and Fast Deterministic Algorithms for Linear Algebra

    Authors: Rajarshi Bhattacharjee, Gregory Dexter, Cameron Musco, Archan Ray, Sushant Sachdeva, David P Woodruff

    Abstract: Let $\mathbf S \in \mathbb R^{n \times n}$ satisfy $\|\mathbf 1-\mathbf S\|_2\le εn$, where $\mathbf 1$ is the all ones matrix and $\|\cdot\|_2$ is the spectral norm. It is well-known that there exists such an $\mathbf S$ with just $O(n/ε^2)$ non-zero entries: we can let $\mathbf S$ be the scaled adjacency matrix of a Ramanujan expander graph. We show that such an $\mathbf S$ yields a $universal$…

    Submitted 12 January, 2024; v1 submitted 9 May, 2023; originally announced May 2023.

    Comments: 41 pages

    ACM Class: F.2.1; G.1.3; G.1.2; G.4; I.1.2

  8. arXiv:2305.02535

    cs.DS math.NA

    On the Unreasonable Effectiveness of Single Vector Krylov Methods for Low-Rank Approximation

    Authors: Raphael A. Meyer, Cameron Musco, Christopher Musco

    Abstract: Krylov subspace methods are a ubiquitous tool for computing near-optimal rank $k$ approximations of large matrices. While "large block" Krylov methods with block size at least $k$ give the best known theoretical guarantees, block size one (a single vector) or a small constant is often preferred in practice. Despite their popularity, we lack theoretical bounds on the performance of such "small bloc…

    Submitted 6 November, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: 41 pages, 7 figures. To appear at SODA 2024

    MSC Class: 65F55 (Primary) 65F15 (Secondary)

    ACM Class: G.1.3; F.2.1

  9. arXiv:2303.03358

    math.NA

    Near-Optimal Approximation of Matrix Functions by the Lanczos Method

    Authors: Noah Amsel, Tyler Chen, Anne Greenbaum, Cameron Musco, Chris Musco

    Abstract: We study the widely used Lanczos method for approximating the action of a matrix function $f(\mathbf{A})$ on a vector $\mathbf{b}$ (Lanczos-FA). For the function $f(\mathbf{A})=\mathbf{A}^{-1}$, it is known that, when $\mathbf{A}$ is positive definite, the $\mathbf{A}$-norm error of Lanczos-FA after $k$ iterations matches the optimal approximation from the Krylov subspace of degree $k$ generated b…

    Submitted 12 December, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    MSC Class: 65F60; 65F50; 68Q25
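For readers unfamiliar with Lanczos-FA, the following is a minimal sketch of the special case $f(x) = 1/x$ mentioned in the abstract above, not the authors' code. It assumes $\mathbf{A}$ is symmetric positive definite and available only through a matrix-vector product; the names `lanczos_fa_inverse` and `dense_matvec` are hypothetical, and the sketch runs in plain floating-point arithmetic without reorthogonalization.

```python
import math

def dense_matvec(A):
    """Wrap a dense list-of-lists matrix as a function x -> A x (illustrative helper)."""
    def matvec(x):
        return [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return matvec

def lanczos_fa_inverse(matvec, b, k):
    """Approximate A^{-1} b by ||b|| * Q * T^{-1} * e_1, where Q and T come from
    at most k Lanczos steps. Assumes A is symmetric positive definite."""
    n = len(b)
    norm_b = math.sqrt(sum(x * x for x in b))
    q = [x / norm_b for x in b]
    q_prev = [0.0] * n
    Q, alphas, betas = [], [], []
    beta = 0.0
    for _ in range(k):
        Q.append(q)
        w = matvec(q)
        alpha = sum(wi * qi for wi, qi in zip(w, q))
        # Three-term recurrence: remove components along q and q_prev.
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        alphas.append(alpha)
        beta = math.sqrt(sum(x * x for x in w))
        if beta < 1e-12:  # Krylov subspace is invariant, so stop early.
            break
        betas.append(beta)
        q_prev, q = q, [x / beta for x in w]
    m = len(alphas)
    # Solve the tridiagonal system T y = e_1 by elimination without pivoting
    # (safe here because T is positive definite whenever A is).
    diag = alphas[:]
    rhs = [1.0] + [0.0] * (m - 1)
    for i in range(1, m):
        factor = betas[i - 1] / diag[i - 1]
        diag[i] -= factor * betas[i - 1]
        rhs[i] -= factor * rhs[i - 1]
    y = [0.0] * m
    y[m - 1] = rhs[m - 1] / diag[m - 1]
    for i in range(m - 2, -1, -1):
        y[i] = (rhs[i] - betas[i] * y[i + 1]) / diag[i]
    # Map back to the original space: x = ||b|| * sum_j y_j q_j.
    return [norm_b * sum(y[j] * Q[j][i] for j in range(m)) for i in range(n)]
```

For a small matrix, `lanczos_fa_inverse(dense_matvec(A), b, k)` with `k` equal to the dimension should reproduce the exact solution of $\mathbf{A}x = \mathbf{b}$ up to rounding; smaller `k` gives the Krylov-subspace approximation whose error the abstract discusses.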

  10. arXiv:2211.11328

    cs.DS math.NA

    Toeplitz Low-Rank Approximation with Sublinear Query Complexity

    Authors: Michael Kapralov, Hannah Lawrence, Mikhail Makarov, Cameron Musco, Kshiteej Sheth

    Abstract: We present a sublinear query algorithm for outputting a near-optimal low-rank approximation to any positive semidefinite Toeplitz matrix $T \in \mathbb{R}^{d \times d}$. In particular, for any integer rank $k \leq d$ and $ε,δ> 0$, our algorithm makes $\tilde{O} \left (k^2 \cdot \log(1/δ) \cdot \text{poly}(1/ε) \right )$ queries to the entries of $T$ and outputs a rank…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted in SODA 2023

  11. arXiv:2208.03268

    cs.DS math.NA

    A Tight Analysis of Hutchinson's Diagonal Estimator

    Authors: Prathamesh Dharangutte, Christopher Musco

    Abstract: Let $\mathbf{A}\in \mathbb{R}^{n\times n}$ be a matrix with diagonal $\text{diag}(\mathbf{A})$ and let $\bar{\mathbf{A}}$ be $\mathbf{A}$ with its diagonal set to all zeros. We show that Hutchinson's estimator run for $m$ iterations returns a diagonal estimate $\tilde{d}\in \mathbb{R}^n$ such that with probability $(1-δ)$,…

    Submitted 6 November, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

    Comments: To appear in SIAM Symposium on Simplicity in Algorithms (SOSA23)
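As a rough illustration of the estimator this abstract analyzes, here is a minimal sketch of Hutchinson's diagonal estimator with random sign (Rademacher) vectors. The choice of Rademacher vectors, the seed handling, and the names `hutchinson_diagonal` and `dense_matvec` are assumptions made for the example, not details taken from the listing.

```python
import random

def dense_matvec(A):
    """Wrap a dense list-of-lists matrix as a function x -> A x (illustrative helper)."""
    def matvec(x):
        return [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return matvec

def hutchinson_diagonal(matvec, n, m, seed=0):
    """Estimate diag(A) from m matrix-vector products.

    Each iteration draws a random sign vector g and accumulates the
    entrywise product g * (A g); entry j has expectation A[j][j].
    """
    rng = random.Random(seed)
    estimate = [0.0] * n
    for _ in range(m):
        g = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        Ag = matvec(g)
        for j in range(n):
            estimate[j] += g[j] * Ag[j]
    return [e / m for e in estimate]
```

The error in entry $j$ comes only from the off-diagonal entries of row $j$, which is why a diagonal matrix is recovered exactly and why the abstract's bound is stated in terms of $\bar{\mathbf{A}}$.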

  12. Low-memory Krylov subspace methods for optimal rational matrix function approximation

    Authors: Tyler Chen, Anne Greenbaum, Cameron Musco, Christopher Musco

    Abstract: We describe a Lanczos-based algorithm for approximating the product of a rational matrix function with a vector. This algorithm, which we call the Lanczos method for optimal rational matrix function approximation (Lanczos-OR), returns the optimal approximation from a given Krylov subspace in a norm depending on the rational function's denominator, and can be computed using the information from a s…

    Submitted 31 May, 2023; v1 submitted 22 February, 2022; originally announced February 2022.

    Journal ref: SIAM Journal on Matrix Analysis and Applications 2023 44:2, 670-692

  13. arXiv:2109.07647

    cs.DS math.NA

    Sublinear Time Eigenvalue Approximation via Random Sampling

    Authors: Rajarshi Bhattacharjee, Gregory Dexter, Petros Drineas, Cameron Musco, Archan Ray

    Abstract: We study the problem of approximating the eigenspectrum of a symmetric matrix $\mathbf A \in \mathbb{R}^{n \times n}$ with bounded entries (i.e., $\|\mathbf A\|_{\infty} \leq 1$). We present a simple sublinear time algorithm that approximates all eigenvalues of $\mathbf{A}$ up to additive error $\pm εn$ using those of a randomly sampled…

    Submitted 21 July, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

    Comments: 58 pages, 4 figures

    MSC Class: F.2.1; G.1.3; G.1.2; G.4; I.1.2

  14. Error bounds for Lanczos-based matrix function approximation

    Authors: Tyler Chen, Anne Greenbaum, Cameron Musco, Christopher Musco

    Abstract: We analyze the Lanczos method for matrix function approximation (Lanczos-FA), an iterative algorithm for computing $f(\mathbf{A}) \mathbf{b}$ when $\mathbf{A}$ is a Hermitian matrix and $\mathbf{b}$ is a given vector. Assuming that $f : \mathbb{C} \rightarrow \mathbb{C}$ is piecewise analytic, we give a framework, based on the Cauchy integral formula, which can be used to derive a priori and a pos…

    Submitted 18 May, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

    Journal ref: SIAM Journal on Matrix Analysis and Applications, 2022, Vol. 43, No. 2 : pp. A3084-A3108

  15. Sublinear Time Spectral Density Estimation

    Authors: Vladimir Braverman, Aditya Krishnan, Christopher Musco

    Abstract: We present a new sublinear time algorithm for approximating the spectral density (eigenvalue distribution) of an $n\times n$ normalized graph adjacency or Laplacian matrix. The algorithm recovers the spectrum up to $ε$ accuracy in the Wasserstein-1 distance in $O(n\cdot \text{poly}(1/ε))$ time given sample access to the graph. This result complements recent work by David Cohen-Steiner, Weihao Kong…

    Submitted 14 April, 2022; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: Accepted to STOC'22

  16. arXiv:2102.08341

    cs.DS cs.LG math.NA

    Faster Kernel Matrix Algebra via Density Estimation

    Authors: Arturs Backurs, Piotr Indyk, Cameron Musco, Tal Wagner

    Abstract: We study fast algorithms for computing fundamental properties of a positive semidefinite kernel matrix $K \in \mathbb{R}^{n \times n}$ corresponding to $n$ points $x_1,\ldots,x_n \in \mathbb{R}^d$. In particular, we consider estimating the sum of kernel matrix entries, along with its top eigenvalue and eigenvector. We show that the sum of matrix entries can be estimated to $1+ε$ relative error i…

    Submitted 17 June, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

  17. arXiv:2010.09649

    cs.DS cs.LG math.NA

    Hutch++: Optimal Stochastic Trace Estimation

    Authors: Raphael A. Meyer, Cameron Musco, Christopher Musco, David P. Woodruff

    Abstract: We study the problem of estimating the trace of a matrix $A$ that can only be accessed through matrix-vector multiplication. We introduce a new randomized algorithm, Hutch++, which computes a $(1 \pm ε)$ approximation to $tr(A)$ for any positive semidefinite (PSD) $A$ using just $O(1/ε)$ matrix-vector products. This improves on the ubiquitous Hutchinson's estimator, which requires $O(1/ε^2)$ matri…

    Submitted 10 June, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: SIAM Symposium on Simplicity in Algorithms (SOSA21)
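For context, the baseline that this abstract compares against can be sketched in a few lines. The code below is plain Hutchinson trace estimation, not Hutch++ itself; it assumes random sign (Rademacher) query vectors, and the names `hutchinson_trace` and `dense_matvec` are hypothetical.

```python
import random

def dense_matvec(A):
    """Wrap a dense list-of-lists matrix as a function x -> A x (illustrative helper)."""
    def matvec(x):
        return [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return matvec

def hutchinson_trace(matvec, n, num_queries, seed=0):
    """Estimate tr(A) using only matrix-vector products with A.

    For a random sign vector g, the quadratic form g^T A g has
    expectation tr(A); averaging over queries reduces the variance.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_queries):
        g = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        Ag = matvec(g)
        total += sum(gi * yi for gi, yi in zip(g, Ag))
    return total / num_queries
```

Each query costs one matrix-vector product, so `num_queries` is the quantity that the abstract's $O(1/ε^2)$ versus $O(1/ε)$ comparison counts.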

  18. arXiv:2008.01722

    math.OC cs.DS cs.LG stat.ML

    Fast and Near-Optimal Diagonal Preconditioning

    Authors: Arun Jambulapati, Jerry Li, Christopher Musco, Aaron Sidford, Kevin Tian

    Abstract: The convergence rates of iterative methods for solving a linear system $\mathbf{A} x = b$ typically depend on the condition number of the matrix $\mathbf{A}$. Preconditioning is a common way of speeding up these methods by reducing that condition number in a computationally inexpensive way. In this paper, we revisit the decades-old problem of how to best improve $\mathbf{A}$'s condition number by…

    Submitted 3 November, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: 46 pages

  19. arXiv:2006.07340

    cs.DS cs.LG math.CA

    Fourier Sparse Leverage Scores and Approximate Kernel Learning

    Authors: Tamás Erdélyi, Cameron Musco, Christopher Musco

    Abstract: We prove new explicit upper bounds on the leverage scores of Fourier sparse functions under both the Gaussian and Laplace measures. In particular, we study $s$-sparse functions of the form $f(x) = \sum_{j=1}^s a_j e^{i λ_j x}$ for coefficients $a_j \in \mathbb{C}$ and frequencies $λ_j \in \mathbb{R}$. Bounding Fourier sparse leverage scores under various measures is of pure mathematical interest i…

    Submitted 7 July, 2021; v1 submitted 12 June, 2020; originally announced June 2020.

  20. arXiv:2004.08434

    cs.DS cs.LG math.NA

    Projection-Cost-Preserving Sketches: Proof Strategies and Constructions

    Authors: Cameron Musco, Christopher Musco

    Abstract: In this note we illustrate how common matrix approximation methods, such as random projection and random sampling, yield projection-cost-preserving sketches, as introduced in [FSS13, CEM+15]. A projection-cost-preserving sketch is a matrix approximation which, for a given parameter $k$, approximately preserves the distance of the target matrix to all $k$-dimensional subspaces. Such sketches have a…

    Submitted 17 April, 2020; originally announced April 2020.

  21. arXiv:1911.01575

    math.OC cs.LG stat.ML

    Importance Sampling via Local Sensitivity

    Authors: Anant Raj, Cameron Musco, Lester Mackey

    Abstract: Given a loss function $F:\mathcal{X} \rightarrow \mathbb{R}^+$ that can be written as the sum of losses over a large set of inputs $a_1,\ldots, a_n$, it is often desirable to approximate $F$ by subsampling the input points. Strong theoretical guarantees require taking into account the importance of each point, measured by how much its individual loss contributes to $F(x)$. Maximizing this importance over…

    Submitted 19 March, 2020; v1 submitted 3 November, 2019; originally announced November 2019.

  22. arXiv:1905.05643

    eess.SP cs.DS cs.LG math.ST

    Sample Efficient Toeplitz Covariance Estimation

    Authors: Yonina C. Eldar, Jerry Li, Cameron Musco, Christopher Musco

    Abstract: We study the sample complexity of estimating the covariance matrix $T$ of a distribution $\mathcal{D}$ over $d$-dimensional vectors, under the assumption that $T$ is Toeplitz. This assumption arises in many signal processing problems, where the covariance between any two measurements only depends on the time or distance between those measurements. We are interested in estimation strategies that ma…

    Submitted 30 October, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

  23. arXiv:1904.09841

    cs.DS cs.LG math.NA

    Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation

    Authors: Cameron Musco, Christopher Musco, David P. Woodruff

    Abstract: In $masked\ low-rank\ approximation$, one is given $A \in \mathbb{R}^{n \times n}$ and binary mask matrix $W \in \{0,1\}^{n \times n}$. The goal is to find a rank-$k$ matrix $L$ for which: $$cost(L) = \sum_{i=1}^{n} \sum_{j = 1}^{n} W_{i,j} \cdot (A_{i,j} - L_{i,j} )^2 \leq OPT + ε\|A\|_F^2 ,$$ where $OPT = \min_{rank-k\ \hat{L}} cost(\hat L)$ and $ε$ is a given error parameter. Depending on the c…

    Submitted 30 November, 2020; v1 submitted 22 April, 2019; originally announced April 2019.

    Comments: ITCS 2021
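To make the objective in this abstract concrete, the following small sketch evaluates $cost(L)$ for given $A$, $W$, and $L$ stored as lists of rows. It only computes the cost; it does not search for a low-rank $L$, and the function name `masked_cost` is a hypothetical choice for illustration.

```python
def masked_cost(A, W, L):
    """Sum of W[i][j] * (A[i][j] - L[i][j])**2 over all entries.

    Entries where the binary mask W is 0 contribute nothing, so L is
    free to take any value there.
    """
    return sum(
        W[i][j] * (A[i][j] - L[i][j]) ** 2
        for i in range(len(A))
        for j in range(len(A[0]))
    )
```

With $W$ set to the all-ones matrix this reduces to the squared Frobenius-norm error $\|A - L\|_F^2$ of ordinary low-rank approximation.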

  24. arXiv:1812.08723

    cs.DS cs.LG eess.SP math.NA

    A Universal Sampling Method for Reconstructing Signals with Simple Fourier Transforms

    Authors: Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, Amir Zandieh

    Abstract: Reconstructing continuous signals from a small number of discrete samples is a fundamental problem across science and engineering. In practice, we are often interested in signals with 'simple' Fourier structure, such as bandlimited, multiband, and Fourier sparse signals. More broadly, any prior knowledge about a signal's Fourier power spectrum can constrain its complexity. Intuitively, signals wit…

    Submitted 20 December, 2018; originally announced December 2018.

  25. arXiv:1804.09893

    cs.LG cs.DS math.NA stat.ML

    Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees

    Authors: Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, Amir Zandieh

    Abstract: Random Fourier features is one of the most popular techniques for scaling up kernel methods, such as kernel ridge regression. However, despite impressive empirical results, the statistical properties of random Fourier features are still not well understood. In this paper we take steps toward filling this gap. Specifically, we approach random Fourier features from a spectral matrix approximation po…

    Submitted 21 May, 2018; v1 submitted 26 April, 2018; originally announced April 2018.

    Comments: An extended abstract of this work appears in the Proceedings of the 34th International Conference on Machine Learning (ICML 2017)

  26. arXiv:1708.07788

    cs.DS math.NA

    Stability of the Lanczos Method for Matrix Function Approximation

    Authors: Cameron Musco, Christopher Musco, Aaron Sidford

    Abstract: The ubiquitous Lanczos method can approximate $f(A)x$ for any symmetric $n \times n$ matrix $A$, vector $x$, and function $f$. In exact arithmetic, the method's error after $k$ iterations is bounded by the error of the best degree-$k$ polynomial uniformly approximating $f(x)$ on the range $[λ_{min}(A), λ_{max}(A)]$. However, despite decades of work, it has been unclear if this powerful guarantee h…

    Submitted 25 August, 2017; originally announced August 2017.

  27. arXiv:1704.04163

    cs.DS cs.LG math.NA

    Spectrum Approximation Beyond Fast Matrix Multiplication: Algorithms and Hardness

    Authors: Cameron Musco, Praneeth Netrapalli, Aaron Sidford, Shashanka Ubaru, David P. Woodruff

    Abstract: Understanding the singular value spectrum of a matrix $A \in \mathbb{R}^{n \times n}$ is a fundamental task in countless applications. In matrix multiplication time, it is possible to perform a full SVD and directly compute the singular values $σ_1,...,σ_n$. However, little is known about algorithms that break this runtime barrier. Using tools from stochastic trace estimation, polynomial approxi…

    Submitted 3 January, 2019; v1 submitted 13 April, 2017; originally announced April 2017.

    Comments: ITCS 2018

  28. arXiv:1704.03371

    cs.DS cs.LG math.NA

    Sublinear Time Low-Rank Approximation of Positive Semidefinite Matrices

    Authors: Cameron Musco, David P. Woodruff

    Abstract: We show how to compute a relative-error low-rank approximation to any positive semidefinite (PSD) matrix in sublinear time, i.e., for any $n \times n$ PSD matrix $A$, in $\tilde O(n \cdot poly(k/ε))$ time we output a rank-$k$ matrix $B$, in factored form, for which $\|A-B\|_F^2 \leq (1+ε)\|A-A_k\|_F^2$, where $A_k$ is the best rank-$k$ approximation to $A$. When $k$ and $1/ε$ are not too large com…

    Submitted 3 January, 2019; v1 submitted 11 April, 2017; originally announced April 2017.

  29. arXiv:1605.08754

    cs.DS cs.LG math.NA math.OC

    Faster Eigenvector Computation via Shift-and-Invert Preconditioning

    Authors: Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

    Abstract: We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $Σ$ -- i.e. computing a unit vector $x$ such that $x^T Σx \ge (1-ε)λ_1(Σ)$: Offline Eigenvector Estimation: Given an explicit $A \in \mathbb{R}^{n \times d}$ with $Σ= A^TA$, we show how to compute an $ε$ approximate top eigenvector in time…

    Submitted 25 May, 2016; originally announced May 2016.

    Comments: Appearing in ICML 2016. Combination of work in arXiv:1509.05647 and arXiv:1510.08896

  30. arXiv:1510.08896

    cs.DS cs.LG math.NA math.OC

    Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation

    Authors: Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

    Abstract: We provide faster algorithms and improved sample complexities for approximating the top eigenvector of a matrix. Offline Setting: Given an $n \times d$ matrix $A$, we show how to compute an $ε$ approximate top eigenvector in time $\tilde O ( [nnz(A) + \frac{d \cdot sr(A)}{gap^2}]\cdot \log 1/ε)$ and $\tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}]\cdot \log1/ε)$. Here $sr(A)$ is…

    Submitted 29 May, 2016; v1 submitted 29 October, 2015; originally announced October 2015.

    Comments: Manuscript outdated. Updated version at arxiv:1605.08754

  31. arXiv:1504.05477

    cs.DS cs.LG math.NA

    Randomized Block Krylov Methods for Stronger and Faster Approximate Singular Value Decomposition

    Authors: Cameron Musco, Christopher Musco

    Abstract: Since being analyzed by Rokhlin, Szlam, and Tygert and popularized by Halko, Martinsson, and Tropp, randomized Simultaneous Power Iteration has become the method of choice for approximate singular value decomposition. It is more accurate than simpler sketching algorithms, yet still converges quickly for any matrix, independently of singular value gaps. After $\tilde{O}(1/ε)$ iterations, it gives a…

    Submitted 30 October, 2015; v1 submitted 21 April, 2015; originally announced April 2015.

    Comments: Neural Information Processing Systems 2015