Showing 1–7 of 7 results for author: Cammarota, C

  1. arXiv:2403.02418

    cs.LG cond-mat.dis-nn cond-mat.stat-mech

    From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima

    Authors: Tony Bonnaire, Giulio Biroli, Chiara Cammarota

    Abstract: We investigate the optimization dynamics of gradient descent in a non-convex and high-dimensional setting, with a focus on the phase retrieval problem as a case study for complex loss landscapes. We first study the high-dimensional limit where both the number $M$ and the dimension $N$ of the data are going to infinity at fixed signal-to-noise ratio $\alpha = M/N$. By analyzing how the local curvature ch…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 21 pages, 10 figures

  2. arXiv:2006.06997

    cs.LG cond-mat.dis-nn math.ST stat.ML

    Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

    Authors: Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

    Abstract: Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability of finding good minima instead of being trapped in spurious ones remains to a large extent an open problem. Here we focus on gradient flow dynamics for phase retrieval from random measurements. When the ratio of the number of measurements over the input dimensio…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: 9 pages, 5 figures + appendix

    Journal ref: Advances in Neural Information Processing Systems, v22, page 3265--327, 2020

  3. arXiv:1907.08226

    cs.LG cond-mat.dis-nn math.ST stat.ML

    Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model

    Authors: Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

    Abstract: Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model.…

    Submitted 20 January, 2020; v1 submitted 18 July, 2019; originally announced July 2019.

    Comments: 9 pages, 4 figures + appendix. Appears in Proceedings of the Advances in Neural Information Processing Systems 2019 (NeurIPS 2019)

    Journal ref: Advances in Neural Information Processing Systems, pp. 8676-8686. 2019

  4. arXiv:1906.04148

    cs.SI

    Who has the last word? Understanding How to Sample Online Discussions

    Authors: Gioia Boschi, Anthony P. Young, Sagar Joglekar, Chiara Cammarota, Nishanth Sastry

    Abstract: In online debates individual arguments support or attack each other, leading to some subset of arguments being considered more relevant than others. However, in large discussions readers are often forced to sample a subset of the arguments being put forth. Since such sampling is rarely done in a principled manner, users may not read all the relevant arguments to get a full picture of the debate. T…

    Submitted 10 April, 2021; v1 submitted 10 June, 2019; originally announced June 2019.

    ACM Class: H.5.0; H.5.4; E.1; F.3; G.3

  5. arXiv:1905.12294

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.LG

    How to iron out rough landscapes and get optimal performances: Averaged Gradient Descent and its application to tensor PCA

    Authors: Giulio Biroli, Chiara Cammarota, Federico Ricci-Tersenghi

    Abstract: In many high-dimensional estimation problems the main task consists in minimizing a cost function, which is often strongly non-convex when scanned in the space of parameters to be estimated. A standard solution to flatten the corresponding rough landscape consists in summing the losses associated to different data points and obtain a smoother empirical risk. Here we propose a complementary method…

    Submitted 6 February, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: 23 pages, 16 figures, including Supplementary Material

    Journal ref: J. Phys. A: Math. Theor. 53, 174003 (2020)

  6. arXiv:1812.09066

    cs.LG cond-mat.dis-nn math.ST stat.ML

    Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference

    Authors: Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

    Abstract: Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference. In this work we perform an analytic study of the performances of one of them, the Langevin algorithm, in the context of noisy high-dimensional inference. We employ the Langevin algorithm to sample the posterior probability measure for the spiked matrix-tensor…

    Submitted 13 January, 2020; v1 submitted 21 December, 2018; originally announced December 2018.

    Comments: 11 pages and 5 figures + appendix

    Journal ref: Phys. Rev. X 10, 011057 (2020)

  7. arXiv:1803.06969

    stat.ML cond-mat.dis-nn cs.LG

    Comparing Dynamics: Deep Neural Networks versus Glassy Systems

    Authors: M. Baity-Jesi, L. Sagun, M. Geiger, S. Spigler, G. Ben Arous, C. Cammarota, Y. LeCun, M. Wyart, G. Biroli

    Abstract: We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems. The two main issues we address are (1) the complexity of the loss landscape and of the dynamics within it, and (2) to what extent DNNs share similarities with glassy systems. Our findings, obtained for different architectures and datasets, suggest that dur…

    Submitted 7 June, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

    Comments: 10 pages, 5 figures. Version accepted at ICML 2018

    Journal ref: PMLR 80:324-333, 2018; Republication with DOI (cite this one): J. Stat. Mech. (2019) 124013