Showing 1–11 of 11 results for author: Urbani, P

Searching in archive cs.

  1. arXiv:2405.10822  [pdf, other]

    cs.LG cond-mat.dis-nn

    Generative modeling through internal high-dimensional chaotic activity

    Authors: Samantha J. Fournier, Pierfrancesco Urbani

    Abstract: Generative modeling aims at producing new datapoints whose statistical properties resemble the ones in a training dataset. In recent years, there has been a burst of machine learning techniques and settings that can achieve this goal with remarkable performances. In most of these settings, one uses the training dataset in conjunction with noise, which is added as a source of statistical variabilit…

    Submitted 17 May, 2024; originally announced May 2024.

  2. arXiv:2309.04788  [pdf, other]

    cs.LG cond-mat.dis-nn

    Stochastic Gradient Descent outperforms Gradient Descent in recovering a high-dimensional signal in a glassy energy landscape

    Authors: Persia Jana Kamali, Pierfrancesco Urbani

    Abstract: Stochastic Gradient Descent (SGD) is an out-of-equilibrium algorithm used extensively to train artificial neural networks. However, very little is known about the extent to which SGD is crucial to the success of this technology and, in particular, how effective it is in optimizing high-dimensional non-convex cost functions as compared to other optimization algorithms such as Gradient Descent (GD).…

    Submitted 18 December, 2023; v1 submitted 9 September, 2023; originally announced September 2023.

    Comments: 5 pages + appendix, 3 figures

  3. arXiv:2112.10852  [pdf, other]

    cond-mat.dis-nn cs.LG stat.ML

    The effective noise of Stochastic Gradient Descent

    Authors: Francesca Mignacco, Pierfrancesco Urbani

    Abstract: Stochastic Gradient Descent (SGD) is the workhorse algorithm of deep learning technology. At each step of the training phase, a mini batch of samples is drawn from the training dataset and the weights of the neural network are adjusted according to the performance on this specific subset of examples. The mini-batch sampling procedure introduces a stochastic dynamics to the gradient descent, with a…

    Submitted 1 June, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

    Comments: 7 pages + appendix, 5 figures

  4. arXiv:2103.04902  [pdf, other]

    cond-mat.dis-nn cs.LG math.ST stat.ML

    Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem

    Authors: Francesca Mignacco, Pierfrancesco Urbani, Lenka Zdeborová

    Abstract: In this paper we investigate how gradient-based algorithms such as gradient descent, (multi-pass) stochastic gradient descent, its persistent variant, and the Langevin algorithm navigate non-convex loss-landscapes and which of them is able to reach the best generalization error at limited sample complexity. We consider the loss landscape of the high-dimensional phase retrieval problem as a prototy…

    Submitted 13 April, 2021; v1 submitted 8 March, 2021; originally announced March 2021.

    Comments: 28 pages, 11 figures

    Journal ref: Mach. Learn.: Sci. Technol. 2 035029 (2021)

  5. arXiv:2102.11755  [pdf, other]

    cond-mat.dis-nn cs.LG stat.ML

    Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems

    Authors: Stefano Sarao Mannelli, Pierfrancesco Urbani

    Abstract: The optimization step in many machine learning problems rarely relies on vanilla gradient descent but it is common practice to use momentum-based accelerated methods. Despite these algorithms being widely applied to arbitrary loss functions, their behaviour in generically non-convex, high dimensional landscapes is poorly understood. In this work, we use dynamical mean field theory techniques to de…

    Submitted 27 October, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

    Comments: To appear in NeurIPS 2021

  6. arXiv:2006.06997  [pdf, other]

    cs.LG cond-mat.dis-nn math.ST stat.ML

    Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

    Authors: Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

    Abstract: Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability to find good minima instead of being trapped in spurious ones remains to a large extent an open problem. Here we focus on gradient flow dynamics for phase retrieval from random measurements. When the ratio of the number of measurements over the input dimensio…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: 9 pages, 5 figures + appendix

    Journal ref: Advances in Neural Information Processing Systems, v22, page 3265--327, 2020

  7. arXiv:2006.06098  [pdf, other]

    cs.LG cond-mat.dis-nn math.ST stat.ML

    Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification

    Authors: Francesca Mignacco, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

    Abstract: We analyze in a closed form the learning dynamics of stochastic gradient descent (SGD) for a single-layer neural network classifying a high-dimensional Gaussian mixture where each cluster is assigned one of two labels. This problem provides a prototype of a non-convex loss landscape with interpolating regimes and a large generalization gap. We define a particular stochastic process for which SGD c…

    Submitted 9 November, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: 8 pages + appendix, 4 figures

    Journal ref: J. Stat. Mech. 2021 124008 & NeurIPS 2020

  8. arXiv:1902.00139  [pdf, other]

    cs.LG cond-mat.dis-nn math.ST stat.ML

    Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models

    Authors: Stefano Sarao Mannelli, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

    Abstract: In this work we analyse quantitatively the interplay between the loss landscape and performance of descent algorithms in a prototypical inference problem, the spiked matrix-tensor model. We study a loss function that is the negative log-likelihood of the model. We analyse the number of local minima at a fixed distance from the signal/spike with the Kac-Rice formula, and locate trivialization of th…

    Submitted 20 January, 2020; v1 submitted 31 January, 2019; originally announced February 2019.

    Comments: 12 pages + appendix, 10 figures. Appears in Proceedings of the International Conference on Machine Learning (ICML 2019)

    Journal ref: International Conference on Machine Learning, 4333-4342 (ICML 2019)

  9. arXiv:1812.09066  [pdf, other]

    cs.LG cond-mat.dis-nn math.ST stat.ML

    Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference

    Authors: Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

    Abstract: Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference. In this work we perform an analytic study of the performances of one of them, the Langevin algorithm, in the context of noisy high-dimensional inference. We employ the Langevin algorithm to sample the posterior probability measure for the spiked matrix-tensor…

    Submitted 13 January, 2020; v1 submitted 21 December, 2018; originally announced December 2018.

    Comments: 11 pages and 5 figures + appendix

    Journal ref: Phys. Rev. X 10, 011057 (2020)

  10. arXiv:1807.01296  [pdf, other]

    cond-mat.dis-nn cond-mat.stat-mech cs.IT math.ST stat.ML

    Approximate Survey Propagation for Statistical Inference

    Authors: Fabrizio Antenucci, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

    Abstract: The approximate message passing (AMP) algorithm has enjoyed considerable attention in the last decade. In this paper we introduce a variant of the AMP algorithm that takes into account the glassy nature of the system under consideration. We call this algorithm approximate survey propagation (ASP) and derive it for a class of low-rank matrix estimation problems. We derive the state evolution for the ASP algor…

    Submitted 3 July, 2018; originally announced July 2018.

    Comments: 37 pages, 14 figures

    Journal ref: J. Stat. Mech. (2019) 023401

  11. arXiv:1805.05857  [pdf, ps, other]

    cond-mat.dis-nn cond-mat.stat-mech cs.IT math.ST stat.ML

    Glassy nature of the hard phase in inference problems

    Authors: Fabrizio Antenucci, Silvio Franz, Pierfrancesco Urbani, Lenka Zdeborová

    Abstract: An algorithmically hard phase was described in a range of inference problems: even if the signal can be reconstructed with a small error from an information theoretic point of view, known algorithms fail unless the noise-to-signal ratio is sufficiently small. This hard phase is typically understood as a metastable branch of the dynamical evolution of message passing algorithms. In this work we stu…

    Submitted 9 January, 2019; v1 submitted 15 May, 2018; originally announced May 2018.

    Comments: 10 pages, 3 figures

    Journal ref: Phys. Rev. X 9, 011020 (2019)