Search | arXiv e-print repository

Efficient Numerical Integration in Reproducing Kernel Hilbert Spaces via Leverage Scores Sampling

Authors: Antoine Chatalic, Nicolas Schreuder, Ernesto De Vito, Lorenzo Rosasco

Abstract: In this work we consider the problem of numerical integration, i.e., approximating integrals with respect to a target probability measure using only pointwise evaluations of the integrand. We focus on the setting in which the target distribution is only accessible through a set of $n$ i.i.d. observations, and the integrand belongs to a reproducing kernel Hilbert space. We propose an efficient proc… ▽ More In this work we consider the problem of numerical integration, i.e., approximating integrals with respect to a target probability measure using only pointwise evaluations of the integrand. We focus on the setting in which the target distribution is only accessible through a set of $n$ i.i.d. observations, and the integrand belongs to a reproducing kernel Hilbert space. We propose an efficient procedure which exploits a small i.i.d. random subset of $m<n$ samples drawn either uniformly or using approximate leverage scores from the initial observations. Our main result is an upper bound on the approximation error of this procedure for both sampling strategies. It yields sufficient conditions on the subsample size to recover the standard (optimal) $n^{-1/2}$ rate while reducing drastically the number of functions evaluations, and thus the overall computational cost. Moreover, we obtain rates with respect to the number $m$ of evaluations of the integrand which adapt to its smoothness, and match known optimal rates for instance for Sobolev spaces. We illustrate our theoretical findings with numerical experiments on real datasets, which highlight the attractive efficiency-accuracy tradeoff of our method compared to existing randomized and greedy quadrature methods. We note that, the problem of numerical integration in RKHS amounts to designing a discrete approximation of the kernel mean embedding of the target distribution. As a consequence, direct applications of our results also include the efficient computation of maximum mean discrepancies between distributions and the design of efficient kernel-based tests. △ Less

Submitted 22 November, 2023; originally announced November 2023.

Comments: 46 pages, 5 figures. Submitted to JMLR

arXiv:2306.04520 [pdf, other]

Estimating Koopman operators with sketching to provably learn large scale dynamical systems

Authors: Giacomo Meanti, Antoine Chatalic, Vladimir R. Kostic, Pietro Novelli, Massimiliano Pontil, Lorenzo Rosasco

Abstract: The theory of Koopman operators allows to deploy non-parametric machine learning algorithms to predict and analyze complex dynamical systems. Estimators such as principal component regression (PCR) or reduced rank regression (RRR) in kernel spaces can be shown to provably learn Koopman operators from finite empirical observations of the system's time evolution. Scaling these approaches to very lon… ▽ More The theory of Koopman operators allows to deploy non-parametric machine learning algorithms to predict and analyze complex dynamical systems. Estimators such as principal component regression (PCR) or reduced rank regression (RRR) in kernel spaces can be shown to provably learn Koopman operators from finite empirical observations of the system's time evolution. Scaling these approaches to very long trajectories is a challenge and requires introducing suitable approximations to make computations feasible. In this paper, we boost the efficiency of different kernel-based Koopman operator estimators using random projections (sketching). We derive, implement and test the new "sketched" estimators with extensive experiments on synthetic and large-scale molecular dynamics datasets. Further, we establish non asymptotic error bounds giving a sharp characterization of the trade-offs between statistical learning rates and computational efficiency. Our empirical and theoretical analysis shows that the proposed estimators provide a sound and efficient way to learn large scale dynamical systems. In particular our experiments indicate that the proposed estimators retain the same accuracy of PCR or RRR, while being much faster. △ Less

Submitted 30 July, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

Comments: 9 pages, 4 figures, code at https://github.com/Giodiro/NystromKoopman

arXiv:2211.14062 [pdf, other]

M$^2$M: A general method to perform various data analysis tasks from a differentially private sketch

Authors: Florimond Houssiau, Vincent Schellekens, Antoine Chatalic, Shreyas Kumar Annamraju, Yves-Alexandre de Montjoye

Abstract: Differential privacy is the standard privacy definition for performing analyses over sensitive data. Yet, its privacy budget bounds the number of tasks an analyst can perform with reasonable accuracy, which makes it challenging to deploy in practice. This can be alleviated by private sketching, where the dataset is compressed into a single noisy sketch vector which can be shared with the analysts… ▽ More Differential privacy is the standard privacy definition for performing analyses over sensitive data. Yet, its privacy budget bounds the number of tasks an analyst can perform with reasonable accuracy, which makes it challenging to deploy in practice. This can be alleviated by private sketching, where the dataset is compressed into a single noisy sketch vector which can be shared with the analysts and used to perform arbitrarily many analyses. However, the algorithms to perform specific tasks from sketches must be developed on a case-by-case basis, which is a major impediment to their use. In this paper, we introduce the generic moment-to-moment (M$^2$M) method to perform a wide range of data exploration tasks from a single private sketch. Among other things, this method can be used to estimate empirical moments of attributes, the covariance matrix, counting queries (including histograms), and regression models. Our method treats the sketching mechanism as a black-box operation, and can thus be applied to a wide variety of sketches from the literature, widening their ranges of applications without further engineering or privacy loss, and removing some of the technical barriers to the wider adoption of sketches for data exploration under differential privacy. We validate our method with data exploration tasks on artificial and real-world data, and show that it can be used to reliably estimate statistics and train classification models from private sketches. △ Less

Submitted 25 November, 2022; originally announced November 2022.

Comments: Published at the 18th International Workshop on Security and Trust Management (STM 2022)

arXiv:2201.13055 [pdf]

Nyström Kernel Mean Embeddings

Authors: Antoine Chatalic, Nicolas Schreuder, Alessandro Rudi, Lorenzo Rosasco

Abstract: Kernel mean embeddings are a powerful tool to represent probability distributions over arbitrary spaces as single points in a Hilbert space. Yet, the cost of computing and storing such embeddings prohibits their direct use in large-scale settings. We propose an efficient approximation procedure based on the Nyström method, which exploits a small random subset of the dataset. Our main result is an… ▽ More Kernel mean embeddings are a powerful tool to represent probability distributions over arbitrary spaces as single points in a Hilbert space. Yet, the cost of computing and storing such embeddings prohibits their direct use in large-scale settings. We propose an efficient approximation procedure based on the Nyström method, which exploits a small random subset of the dataset. Our main result is an upper bound on the approximation error of this procedure. It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate while reducing computational costs. We discuss applications of this result for the approximation of the maximum mean discrepancy and quadrature rules, and illustrate our theoretical findings with numerical experiments. △ Less

Submitted 15 June, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

Comments: 8 pages

Journal ref: ICML 2022

arXiv:2110.10996 [pdf]

Mean Nyström Embeddings for Adaptive Compressive Learning

Authors: Antoine Chatalic, Luigi Carratino, Ernesto De Vito, Lorenzo Rosasco

Abstract: Compressive learning is an approach to efficient large scale learning based on sketching an entire dataset to a single mean embedding (the sketch), i.e. a vector of generalized moments. The learning task is then approximately solved as an inverse problem using an adapted parametric model. Previous works in this context have focused on sketches obtained by averaging random features, that while univ… ▽ More Compressive learning is an approach to efficient large scale learning based on sketching an entire dataset to a single mean embedding (the sketch), i.e. a vector of generalized moments. The learning task is then approximately solved as an inverse problem using an adapted parametric model. Previous works in this context have focused on sketches obtained by averaging random features, that while universal can be poorly adapted to the problem at hand. In this paper, we propose and study the idea of performing sketching based on data-dependent Nyström approximation. From a theoretical perspective we prove that the excess risk can be controlled under a geometric assumption relating the parametric model used to learn from the sketch and the covariance operator associated to the task at hand. Empirically, we show for k-means clustering and Gaussian modeling that for a fixed sketch size, Nyström sketches indeed outperform those built with random features. △ Less

Submitted 10 February, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

Comments: Accepted to AISTATS 2022. 21 pages, 4 figures

arXiv:2008.01839 [pdf, other]

Sketching Datasets for Large-Scale Learning (long version)

Authors: Rémi Gribonval, Antoine Chatalic, Nicolas Keriven, Vincent Schellekens, Laurent Jacques, Philip Schniter

Abstract: This article considers "compressive learning," an approach to large-scale machine learning where datasets are massively compressed before learning (e.g., clustering, classification, or regression) is performed. In particular, a "sketch" is first constructed by computing carefully chosen nonlinear random features (e.g., random Fourier features) and averaging them over the whole dataset. Parameters… ▽ More This article considers "compressive learning," an approach to large-scale machine learning where datasets are massively compressed before learning (e.g., clustering, classification, or regression) is performed. In particular, a "sketch" is first constructed by computing carefully chosen nonlinear random features (e.g., random Fourier features) and averaging them over the whole dataset. Parameters are then learned from the sketch, without access to the original dataset. This article surveys the current state-of-the-art in compressive learning, including the main concepts and algorithms, their connections with established signal-processing methods, existing theoretical guarantees -- on both information preservation and privacy preservation, and important open problems. △ Less

Submitted 24 June, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

arXiv:1712.02849 [pdf, ps, other]

Sketched Clustering via Hybrid Approximate Message Passing

Authors: Evan Byrne, Antoine Chatalic, Remi Gribonval, Philip Schniter

Abstract: In sketched clustering, a dataset of $T$ samples is first sketched down to a vector of modest size, from which the centroids are subsequently extracted. Advantages include i) reduced storage complexity and ii) centroid extraction complexity independent of $T$. For the sketching methodology recently proposed by Keriven, et al., which can be interpreted as a random sampling of the empirical characte… ▽ More In sketched clustering, a dataset of $T$ samples is first sketched down to a vector of modest size, from which the centroids are subsequently extracted. Advantages include i) reduced storage complexity and ii) centroid extraction complexity independent of $T$. For the sketching methodology recently proposed by Keriven, et al., which can be interpreted as a random sampling of the empirical characteristic function, we propose a sketched clustering algorithm based on approximate message passing. Numerical experiments suggest that our approach is more efficient than the state-of-the-art sketched clustering algorithm "CL-OMPR" (in both computational and sample complexity) and more efficient than k-means++ when $T$ is large. △ Less

Submitted 18 May, 2019; v1 submitted 7 December, 2017; originally announced December 2017.

Showing 1–7 of 7 results for author: Chatalic, A