Showing 1–12 of 12 results for author: Carratino, L

Searching in archive cs.
  1. arXiv:2201.12909  [pdf, other]

    stat.ML cs.LG

    Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times

    Authors: Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco

    Abstract: Computing a Gaussian process (GP) posterior has a computational cost cubic in the number of historical points. A reformulation of the same GP posterior highlights that this complexity mainly depends on how many unique historical points are considered. This can have important implications in active learning settings, where the set of historical points is constructed sequentially by the lear…

    Submitted 30 January, 2022; originally announced January 2022.

  2. arXiv:2201.06314  [pdf, other]

    cs.LG stat.ML

    Efficient Hyperparameter Tuning for Large Scale Kernel Ridge Regression

    Authors: Giacomo Meanti, Luigi Carratino, Ernesto De Vito, Lorenzo Rosasco

    Abstract: Kernel methods provide a principled approach to nonparametric learning. While their basic implementations scale poorly to large problems, recent advances showed that approximate solvers can efficiently handle massive datasets. A shortcoming of these solutions is that hyperparameter tuning is not taken care of, and left for the user to perform. Hyperparameters are crucial in practice and the lack o…

    Submitted 17 January, 2022; originally announced January 2022.

    Comments: 24 pages, 3 figures

  3. arXiv:2110.10996  [pdf]

    stat.ML cs.LG

    Mean Nyström Embeddings for Adaptive Compressive Learning

    Authors: Antoine Chatalic, Luigi Carratino, Ernesto De Vito, Lorenzo Rosasco

    Abstract: Compressive learning is an approach to efficient large scale learning based on sketching an entire dataset to a single mean embedding (the sketch), i.e. a vector of generalized moments. The learning task is then approximately solved as an inverse problem using an adapted parametric model. Previous works in this context have focused on sketches obtained by averaging random features, that while univ…

    Submitted 10 February, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: Accepted to AISTATS 2022. 21 pages, 4 figures

  4. arXiv:2106.12231  [pdf, ps, other]

    stat.ML cs.LG

    ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions

    Authors: Luigi Carratino, Stefano Vigogna, Daniele Calandriello, Lorenzo Rosasco

    Abstract: We introduce ParK, a new large-scale solver for kernel ridge regression. Our approach combines partitioning with random projections and iterative optimization to reduce space and time complexity while provably maintaining the same statistical accuracy. In particular, constructing suitable partitions directly in the feature space rather than in the input space, we promote orthogonality between the…

    Submitted 17 October, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

  5. arXiv:2106.08598  [pdf, other]

    cs.LG stat.ML

    Ada-BKB: Scalable Gaussian Process Optimization on Continuous Domains by Adaptive Discretization

    Authors: Marco Rando, Luigi Carratino, Silvia Villa, Lorenzo Rosasco

    Abstract: Gaussian process optimization is a successful class of algorithms (e.g. GP-UCB) to optimize a black-box function through sequential evaluations. However, for functions with continuous domains, Gaussian process optimization has to rely on either a fixed discretization of the space, or the solution of a non-convex optimization subproblem at each evaluation. The first approach can negatively affect pe…

    Submitted 11 March, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

  6. arXiv:2006.10350  [pdf, other]

    cs.LG stat.ML

    Kernel methods through the roof: handling billions of points efficiently

    Authors: Giacomo Meanti, Luigi Carratino, Lorenzo Rosasco, Alessandro Rudi

    Abstract: Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems, since naïve implementations scale poorly with data size. Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections. Here, we push these efforts further to dev…

    Submitted 26 November, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: 33 pages, 7 figures, NeurIPS 2020

  7. arXiv:2006.06049  [pdf, other]

    cs.LG stat.ML

    On Mixup Regularization

    Authors: Luigi Carratino, Moustapha Cissé, Rodolphe Jenatton, Jean-Philippe Vert

    Abstract: Mixup is a data augmentation technique that creates new examples as convex combinations of training points and labels. This simple technique has been empirically shown to improve the accuracy of many state-of-the-art models in different settings and applications, but the reasons behind this empirical success remain poorly understood. In this paper we take a substantial step in explaining the theoretica…

    Submitted 17 October, 2022; v1 submitted 10 June, 2020; originally announced June 2020.

  8. arXiv:2002.09954  [pdf, other]

    stat.ML cs.LG

    Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification

    Authors: Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco

    Abstract: Gaussian processes (GP) are one of the most successful frameworks to model uncertainty. However, GP optimization (e.g., GP-UCB) suffers from major scalability issues. Experimental time grows linearly with the number of evaluations, unless candidates are selected in batches (e.g., using GP-BUCB) and evaluated in parallel. Furthermore, computational cost is often prohibitive since algorithms such as…

    Submitted 26 February, 2020; v1 submitted 23 February, 2020; originally announced February 2020.

  9. arXiv:1903.05594  [pdf, other]

    stat.ML cs.LG

    Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret

    Authors: Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco

    Abstract: Gaussian processes (GP) are a well studied Bayesian approach for the optimization of black-box functions. Despite their effectiveness in simple problems, GP-based algorithms hardly scale to high-dimensional functions, as their per-iteration time and space cost is at least quadratic in the number of dimensions $d$ and iterations $t$. Given a set of $A$ alternatives to choose from, the overall runti…

    Submitted 27 August, 2019; v1 submitted 13 March, 2019; originally announced March 2019.

    Comments: Accepted at COLT 2019. Corrected typos and improved comparison with existing methods

    Journal ref: Proceedings of Machine Learning Research, vol. 99 (COLT 2019)

  10. arXiv:1810.13258  [pdf, other]

    stat.ML cs.DS cs.LG

    On Fast Leverage Score Sampling and Optimal Learning

    Authors: Alessandro Rudi, Daniele Calandriello, Luigi Carratino, Lorenzo Rosasco

    Abstract: Leverage score sampling provides an appealing way to perform approximate computations for large matrices. Indeed, it allows one to derive faithful approximations with a complexity adapted to the problem at hand. Yet, performing leverage score sampling is a challenge in its own right requiring further approximations. In this paper, we study the problem of leverage score sampling for positive definite…

    Submitted 24 January, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

  11. arXiv:1807.06343  [pdf, other]

    stat.ML cs.LG

    Learning with SGD and Random Features

    Authors: Luigi Carratino, Alessandro Rudi, Lorenzo Rosasco

    Abstract: Sketching and stochastic gradient methods are arguably the most common techniques to derive efficient large scale learning algorithms. In this paper, we investigate their application in the context of nonparametric statistical learning. More precisely, we study the estimator defined by stochastic gradient with mini batches and random features. The latter can be seen as a form of nonlinear sketching…

    Submitted 24 January, 2019; v1 submitted 17 July, 2018; originally announced July 2018.

  12. arXiv:1705.10958  [pdf, ps, other]

    stat.ML cs.LG

    FALKON: An Optimal Large Scale Kernel Method

    Authors: Alessandro Rudi, Luigi Carratino, Lorenzo Rosasco

    Abstract: Kernel methods provide a principled way to perform nonlinear, nonparametric learning. They rely on solid functional analytic foundations and enjoy optimal statistical properties. However, at least in their basic form, they have limited applicability in large scale scenarios because of stringent computational requirements in terms of time and especially memory. In this paper, we take a substantial…

    Submitted 31 January, 2018; v1 submitted 31 May, 2017; originally announced May 2017.

    Comments: NIPS 2017