Showing 1–17 of 17 results for author: Vigogna, S

  1. arXiv:2405.09541  [pdf, other]

    stat.ML cs.LG math.PR

    Spectral complexity of deep neural networks

    Authors: Simmaco Di Lillo, Domenico Marinucci, Michele Salvi, Stefano Vigogna

    Abstract: It is well-known that randomly initialized, push-forward, fully-connected neural networks weakly converge to isotropic Gaussian processes, in the limit where the width of all layers goes to infinity. In this paper, we propose to use the angular power spectrum of the limiting field to characterize the complexity of the network architecture. In particular, we define sequences of random variables ass…

    Submitted 27 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    MSC Class: 68T07; 60G60; 33C55; 62M15

  2. arXiv:2403.08750  [pdf, ps, other]

    stat.ML cs.LG math.FA

    Neural reproducing kernel Banach spaces and representer theorems for deep networks

    Authors: Francesca Bartolucci, Ernesto De Vito, Lorenzo Rosasco, Stefano Vigogna

    Abstract: Studying the function spaces defined by neural networks helps to understand the corresponding learning models and their inductive bias. While in some limits neural networks correspond to function spaces that are reproducing kernel Hilbert spaces, these regimes do not capture the properties of the networks used in practice. In contrast, in this paper we show that deep neural networks define suitabl…

    Submitted 13 March, 2024; originally announced March 2024.

  3. arXiv:2306.16932  [pdf, ps, other]

    math.PR stat.ML

    A Quantitative Functional Central Limit Theorem for Shallow Neural Networks

    Authors: Valentina Cammarota, Domenico Marinucci, Michele Salvi, Stefano Vigogna

    Abstract: We prove a Quantitative Functional Central Limit Theorem for one-hidden-layer neural networks with generic activation function. The rates of convergence that we establish depend heavily on the smoothness of the activation function, and they range from logarithmic in non-differentiable cases such as the Relu to $\sqrt{n}$ for very regular activations. Our main tools are functional versions of the S…

    Submitted 5 July, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

    MSC Class: 60F17; 68T07; 60G60

  4. arXiv:2305.16014  [pdf, other]

    stat.ML cs.AI cs.LG math.ST

    How many samples are needed to leverage smoothness?

    Authors: Vivien Cabannes, Stefano Vigogna

    Abstract: A core principle in statistical learning is that smoothness of target functions allows to break the curse of dimensionality. However, learning a smooth function seems to require enough samples close to one another to get meaningful estimate of high-order derivatives, which would be hard in machine learning problems where the ratio between number of data and input dimension is relatively small. By…

    Submitted 16 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 34 pages, 13 figures

    MSC Class: 68T05

    ACM Class: I.2.6; F.2.2; G.3

    Journal ref: NeurIPS 2023

  5. arXiv:2205.10055  [pdf, other]

    stat.ML cs.AI cs.LG

    A Case of Exponential Convergence Rates for SVM

    Authors: Vivien Cabannes, Stefano Vigogna

    Abstract: Classification is often the first problem described in introductory machine learning classes. Generalization guarantees of classification have historically been offered by Vapnik-Chervonenkis theory. Yet those guarantees are based on intractable algorithms, which has led to the theory of surrogate methods in classification. Guarantees offered by surrogate methods are based on calibration inequalit…

    Submitted 22 May, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: 16 pages, 6 figures

    MSC Class: 68T05

    ACM Class: G.3

    Journal ref: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, 2023, PMLR 206:359-374

  6. arXiv:2202.01773  [pdf, other]

    stat.ML cs.LG

    Multiclass learning with margin: exponential rates with no bias-variance trade-off

    Authors: Stefano Vigogna, Giacomo Meanti, Ernesto De Vito, Lorenzo Rosasco

    Abstract: We study the behavior of error bounds for multiclass classification under suitable margin conditions. For a wide variety of methods we prove that the classification error under a hard-margin condition decreases exponentially fast without any bias-variance trade-off. Different convergence rates can be obtained in correspondence of different margin assumptions. With a self-contained and instructive…

    Submitted 3 February, 2022; originally announced February 2022.

  7. arXiv:2109.09710  [pdf, ps, other]

    stat.ML cs.LG math.FA

    Understanding neural networks with reproducing kernel Banach spaces

    Authors: Francesca Bartolucci, Ernesto De Vito, Lorenzo Rosasco, Stefano Vigogna

    Abstract: Characterizing the function spaces corresponding to neural networks can provide a way to understand their properties. In this paper we discuss how the theory of reproducing kernel Banach spaces can be used to tackle this challenge. In particular, we prove a representer theorem for a wide class of reproducing kernel Banach spaces that admit a suitable integral representation and include one hidden…

    Submitted 26 October, 2021; v1 submitted 20 September, 2021; originally announced September 2021.

  8. arXiv:2106.12231  [pdf, ps, other]

    stat.ML cs.LG

    ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions

    Authors: Luigi Carratino, Stefano Vigogna, Daniele Calandriello, Lorenzo Rosasco

    Abstract: We introduce ParK, a new large-scale solver for kernel ridge regression. Our approach combines partitioning with random projections and iterative optimization to reduce space and time complexity while provably maintaining the same statistical accuracy. In particular, constructing suitable partitions directly in the feature space rather than in the input space, we promote orthogonality between the…

    Submitted 17 October, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

  9. arXiv:2101.05119  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Multiscale regression on unknown manifolds

    Authors: Wenjing Liao, Mauro Maggioni, Stefano Vigogna

    Abstract: We consider the regression problem of estimating functions on $\mathbb{R}^D$ but supported on a $d$-dimensional manifold $ \mathcal{M} \subset \mathbb{R}^D $ with $ d \ll D $. Drawing ideas from multi-resolution analysis and nonlinear approximation, we construct low-dimensional coordinates on $\mathcal{M}$ at multiple scales, and perform multiscale regression by local polynomial fitting. We propos…

    Submitted 13 January, 2021; originally announced January 2021.

  10. arXiv:2006.09870  [pdf, ps, other]

    math.FA stat.ML

    Construction and Monte Carlo estimation of wavelet frames generated by a reproducing kernel

    Authors: Ernesto De Vito, Zeljko Kereta, Valeriya Naumova, Lorenzo Rosasco, Stefano Vigogna

    Abstract: We introduce a construction of multiscale tight frames on general domains. The frame elements are obtained by spectral filtering of the integral operator associated with a reproducing kernel. Our construction extends classical wavelets as well as generalized wavelets on both continuous and discrete non-Euclidean structures such as Riemannian manifolds and weighted graphs. Moreover, it allows to st…

    Submitted 8 March, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

    MSC Class: 42C15; 42C40; 65T60; 46E22; 47A52; 68T05

  11. arXiv:2003.04788  [pdf, other]

    math.ST

    Estimating multi-index models with response-conditional least squares

    Authors: Timo Klock, Alessandro Lanteri, Stefano Vigogna

    Abstract: The multi-index model is a simple yet powerful high-dimensional regression model which circumvents the curse of dimensionality assuming $ \mathbb{E} [ Y | X ] = g(A^\top X) $ for some unknown index space $A$ and link function $g$. In this paper we introduce a method for the estimation of the index space, and study the propagation error of an index space estimate in the regression of the link funct…

    Submitted 3 June, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

    Comments: 30 pages, 13 figures, 1 table

    MSC Class: 62G05; 62G08; 62H99

  12. arXiv:2002.10008  [pdf, other]

    math.ST

    Conditional regression for single-index models

    Authors: Alessandro Lanteri, Mauro Maggioni, Stefano Vigogna

    Abstract: The single-index model is a statistical model for intrinsic regression where responses are assumed to depend on a single yet unknown linear combination of the predictors, allowing to express the regression function as $ \mathbb{E} [ Y | X ] = f ( \langle v , X \rangle ) $ for some unknown \emph{index} vector $v$ and \emph{link} function $f$. Conditional methods provide a simple and effective appro…

    Submitted 27 May, 2022; v1 submitted 23 February, 2020; originally announced February 2020.

    MSC Class: 62G05 (Primary) 62G08; 62H99 (Secondary)

  13. arXiv:1903.06594  [pdf, ps, other]

    math.FA stat.ML

    Monte Carlo wavelets: a randomized approach to frame discretization

    Authors: Zeljko Kereta, Stefano Vigogna, Valeriya Naumova, Lorenzo Rosasco, Ernesto De Vito

    Abstract: In this paper we propose and study a family of continuous wavelets on general domains, and a corresponding stochastic discretization that we call Monte Carlo wavelets. First, using tools from the theory of reproducing kernel Hilbert spaces and associated integral operators, we define a family of continuous wavelets by spectral calculus. Then, we propose a stochastic discretization based on Monte C…

    Submitted 23 October, 2019; v1 submitted 15 March, 2019; originally announced March 2019.

  14. Continuous and discrete frames generated by the evolution flow of the Schrödinger equation

    Authors: Giovanni S. Alberti, Stephan Dahlke, Filippo De Mari, Ernesto De Vito, Stefano Vigogna

    Abstract: We study a family of coherent states, called Schrödingerlets, both in the continuous and discrete setting. They are defined in terms of the Schrödinger equation of a free quantum particle and some of its invariant transformations.

    Submitted 9 December, 2016; v1 submitted 15 October, 2015; originally announced October 2015.

    Comments: 20 pages

    Report number: SAM Reports, 2015-29

    MSC Class: 22D10; 42C40; 42C15

    Journal ref: Anal. Appl. 15, 915, 2017

  15. arXiv:1403.1396  [pdf, ps, other]

    math.FA

    Intrinsic Localization of Anisotropic Frames II: $α$-Molecules

    Authors: Philipp Grohs, Stefano Vigogna

    Abstract: This article is a continuation of the recent paper [Grohs, Intrinsic localization of anisotropic frames, ACHA, 2013], where off-diagonal-decay properties (often referred to as 'localization' in the literature) of Moore-Penrose pseudoinverses of (bi-infinite) matrices are established, whenever the latter possess similar off-diagonal-decay properties. This problem is especially interesting if the ma…

    Submitted 6 March, 2014; originally announced March 2014.

    Comments: 16 pages

    MSC Class: Primary 41AXX; Secondary 41A25; 53B; 22E

  16. arXiv:1402.5833  [pdf, other]

    math.GR

    Geometric classification of semidirect products in the maximal parabolic subgroup of $\operatorname{Sp}(2,\mathbb{R})$

    Authors: Filippo De Mari, Ernesto De Vito, Stefano Vigogna

    Abstract: We classify up to conjugation by $\operatorname{GL}(2,\mathbb{R})$ (more precisely, block diagonal symplectic matrices) all the semidirect products inside the maximal parabolic of $\operatorname{Sp}(2,\mathbb{R})$ by means of an essentially geometric argument. This classification has already been established without geometry, under a stricter notion of equivalence, namely conjugation by arbitrary…

    Submitted 24 February, 2014; originally announced February 2014.

    Comments: 11 pages, 1 figure

  17. arXiv:1402.3917  [pdf, other]

    math.FA

    Coorbit spaces with voice in a Fréchet space

    Authors: Stephan Dahlke, Filippo De Mari, Ernesto De Vito, Demetrio Labate, Gabrielle Steidl, Gerd Teschke, Stefano Vigogna

    Abstract: We set up a new general coorbit space theory for reproducing representations of a locally compact second countable group $G$ that are not necessarily irreducible nor integrable. Our basic assumption is that the kernel associated with the voice transform belongs to a Fréchet space $\mathcal T$ of functions on $G$, which generalizes the classical choice $\mathcal T=L_w^1(G)$. Our basic example is…

    Submitted 17 February, 2014; originally announced February 2014.

    Comments: 52 pages, 1 figures

    MSC Class: 43A15; 42B35; 22D10; 46A04; 46F05