High-dimensional SGD aligns with emerging outlier eigenspaces

Authors: Gerard Ben Arous, Reza Gheissari, Jiaoyang Huang, Aukosh Jagannath

Abstract: We rigorously study the joint evolution of training dynamics via stochastic gradient descent (SGD) and the spectra of empirical Hessian and gradient matrices. We prove that in two canonical classification tasks for multi-class high-dimensional mixtures and either 1 or 2-layer neural networks, the SGD trajectory rapidly aligns with emerging low-rank outlier eigenspaces of the Hessian and gradient matrices. Moreover, in multi-layer settings this alignment occurs per layer, with the final layer's outlier eigenspace evolving over the course of training, and exhibiting rank deficiency when the SGD converges to sub-optimal classifiers. This establishes some of the rich predictions that have arisen from extensive numerical studies in the last decade about the spectra of Hessian and information matrices over the course of training in overparametrized networks.

Submitted 4 October, 2023; originally announced October 2023.

Comments: 52 pages, 12 figures

arXiv:2206.04030 [pdf, other]

High-dimensional limit theorems for SGD: Effective dynamics and critical scaling

Authors: Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath

Abstract: We study the scaling limits of stochastic gradient descent (SGD) with constant step-size in the high-dimensional regime. We prove limit theorems for the trajectories of summary statistics (i.e., finite-dimensional functions) of SGD as the dimension goes to infinity. Our approach allows one to choose the summary statistics that are tracked, the initialization, and the step-size. It yields both ballistic (ODE) and diffusive (SDE) limits, with the limit depending dramatically on the former choices. We show a critical scaling regime for the step-size, below which the effective ballistic dynamics matches gradient flow for the population loss, but at which, a new correction term appears which changes the phase diagram. About the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate. We demonstrate our approach on popular examples including estimation for spiked matrix and tensor models and classification via two-layer networks for binary and XOR-type Gaussian mixture models. These examples exhibit surprising phenomena including multimodal timescales to convergence as well as convergence to sub-optimal solutions with probability bounded away from zero from random (e.g., Gaussian) initializations. At the same time, we demonstrate the benefit of overparametrization by showing that the latter probability goes to zero as the second layer width grows.

Submitted 17 August, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

Comments: 43 pages, 11 figures

arXiv:2110.10210 [pdf, other]

Long Random Matrices and Tensor Unfolding

Authors: Gérard Ben Arous, Daniel Zhengyu Huang, Jiaoyang Huang

Abstract: In this paper, we consider the singular values and singular vectors of low rank perturbations of large rectangular random matrices, in the regime the matrix is "long": we allow the number of rows (columns) to grow polynomially in the number of columns (rows). We prove there exists a critical signal-to-noise ratio (depending on the dimensions of the matrix), and the extreme singular values and singular vectors exhibit a BBP type phase transition. As a main application, we investigate the tensor unfolding algorithm for the asymmetric rank-one spiked tensor model, and obtain an exact threshold, which is independent of the procedure of tensor unfolding. If the signal-to-noise ratio is above the threshold, tensor unfolding detects the signals; otherwise, it fails to capture the signals.

Submitted 19 October, 2021; originally announced October 2021.

Comments: 29 pages, 4 figures

arXiv:2106.09768 [pdf, other]

doi 10.1063/5.0070300

Sharp complexity asymptotics and topological trivialization for the (p, k) spiked tensor model

Authors: Antonio Auffinger, Gerard Ben Arous, Zhehua Li

Abstract: We provide O(1) asymptotics for the average number of deep minima of the (p,k) spiked tensor model. We also derive an explicit formula for the limiting ground state energy on the N-dimensional sphere, similar to the work of Jagannath-Lopatto-Miolane. Moreover, when the signal to noise ratio is large enough, the expected number of deep minima is asymptotically finite as N tends to infinity and we determine its limit as the signal-to-noise ratio diverges.

Submitted 17 June, 2021; originally announced June 2021.

Comments: 24 pages, 2 figures

arXiv:2105.05051 [pdf, other]

Landscape complexity beyond invariance and the elastic manifold

Authors: Gérard Ben Arous, Paul Bourgade, Benjamin McKenna

Abstract: This paper characterizes the annealed, topological complexity (both of total critical points and of local minima) of the elastic manifold. This classical model of a disordered elastic system captures point configurations with self-interactions in a random medium. We establish the simple-vs.-glassy phase diagram in the model parameters, with these phases separated by a physical boundary known as the Larkin mass, confirming formulas of Fyodorov and Le Doussal. One essential, dynamical, step of the proof also applies to a general signal-to-noise model of soft spins in an anisotropic well, for which we prove a negative-second-moment threshold distinguishing positive from zero complexity. A universal near-critical behavior appears within this phase portrait, namely quadratic near-critical vanishing of the complexity of total critical points, and cubic near-critical vanishing of the complexity of local minima. These two models serve as a paradigm of complexity calculations for Gaussian landscapes exhibiting few distributional symmetries, i.e. beyond the invariant setting. The two main inputs for the proof are determinant asymptotics for non-invariant random matrices from our companion paper [Ben Arous, Bourgade, McKenna 2021], and the atypical convexity and integrability of the limiting variational problems.

Submitted 11 May, 2021; originally announced May 2021.

Comments: 37 pages, 2 figures

MSC Class: 60B20; 60G15; 82B44

arXiv:2105.05000 [pdf, ps, other]

doi 10.2140/pmp.2022.3.731

Exponential growth of random determinants beyond invariance

Authors: Gérard Ben Arous, Paul Bourgade, Benjamin McKenna

Abstract: We give simple criteria to identify the exponential order of magnitude of the absolute value of the determinant for wide classes of random matrix models, not requiring the assumption of invariance. These include Gaussian matrices with covariance profiles, Wigner matrices and covariance matrices with subexponential tails, Erdős-Rényi and $d$-regular graphs for any polynomial sparsity parameter, and non-mean-field random matrix models, such as random band matrices for any polynomial bandwidth. The proof builds on recent tools, including the theory of the Matrix Dyson Equation as developed in [Ajanki, Erdős, Krüger 2019]. We use these asymptotics as an important input to identify the complexity of classes of Gaussian random landscapes in our companion papers [Ben Arous, Bourgade, McKenna 2021; McKenna 2021].

Submitted 15 April, 2022; v1 submitted 11 May, 2021; originally announced May 2021.

Comments: Application of main result to Wigner matrices actually requires subexponential moments (but see new Appendix B for a related result under weaker assumptions). New results on $d$-regular random graphs. 48 pages

MSC Class: 60B20

Journal ref: Prob. Math. Phys. 3 (2022) 731-789

arXiv:2104.08299 [pdf, ps, other]

Shattering Versus Metastability in Spin Glasses

Authors: Gérard Ben Arous, Aukosh Jagannath

Abstract: Our goal in this work is to better understand the relationship between replica symmetry breaking, shattering, and metastability. To this end, we study the static and dynamic behaviour of spherical pure $p$-spin glasses above the replica symmetry breaking temperature $T_{s}$. In this regime, we find that there are at least two distinct temperatures related to non-trivial behaviour. First we prove that there is a regime of temperatures in which the spherical $p$-spin model exhibits a shattering phase. Our results hold in a regime above but near $T_s$. We then find that metastable states exist up to an even higher temperature $T_{BBM}$ as predicted by Barrat--Burioni--Mézard which is expected to be higher than the phase boundary for the shattering phase $T_d <T_{BBM}$. We develop this work by first developing a Thouless--Anderson--Palmer decomposition which builds on the work of Subag. We then present a series of questions and conjectures regarding the sharp phase boundaries for shattering and slow mixing.

Submitted 16 April, 2021; originally announced April 2021.

arXiv:2008.00690 [pdf, other]

doi 10.1073/pnas.2023719118

Counting equilibria of large complex systems by instability index

Authors: Gérard Ben Arous, Yan V Fyodorov, Boris A Khoruzhenko

Abstract: We consider a nonlinear autonomous system of $N\gg 1$ degrees of freedom randomly coupled by both relaxational ('gradient') and non-relaxational ('solenoidal') random interactions. We show that with increased interaction strength such systems generically undergo an abrupt transition from a trivial phase portrait with a single stable equilibrium into a topologically non-trivial regime of 'absolute instability' where equilibria are on average exponentially abundant, but typically all of them are unstable, unless the dynamics is purely gradient. When interactions increase even further the stable equilibria eventually become on average exponentially abundant unless the interaction is purely solenoidal. We further calculate the mean proportion of equilibria which have a fixed fraction of unstable directions.

Submitted 2 July, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

Comments: Main paper - 7 pages, Supplementary Information - 20 pages. Revised version - minor changes, including adding short discussion of model assumptions

Journal ref: PNAS (2021) Vol. 118 No. 34 e2023719118

arXiv:2006.10689 [pdf, ps, other]

Free Energy Wells and Overlap Gap Property in Sparse PCA

Authors: Gérard Ben Arous, Alexander S. Wein, Ilias Zadik

Abstract: We study a variant of the sparse PCA (principal component analysis) problem in the "hard" regime, where the inference task is possible yet no polynomial-time algorithm is known to exist. Prior work, based on the low-degree likelihood ratio, has conjectured a precise expression for the best possible (sub-exponential) runtime throughout the hard regime. Following instead a statistical physics inspired point of view, we show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem. These free energy wells imply hitting time lower bounds that corroborate the low-degree conjecture: we show that a class of natural MCMC (Markov chain Monte Carlo) methods (with worst-case initialization) cannot solve sparse PCA with less than the conjectured runtime. These lower bounds apply to a wide range of values for two tuning parameters: temperature and sparsity misparametrization. Finally, we prove that the Overlap Gap Property (OGP), a structural property that implies failure of certain local search algorithms, holds in a significant part of the hard regime.

Submitted 18 June, 2020; originally announced June 2020.

Comments: 63 pages. Accepted for presentation at the Conference on Learning Theory (COLT) 2020

arXiv:2003.10409 [pdf, other]

Online stochastic gradient descent on non-convex losses from high-dimensional inference

Authors: Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath

Abstract: Stochastic gradient descent (SGD) is a popular algorithm for optimization problems arising in high-dimensional inference tasks. Here one produces an estimator of an unknown parameter from independent samples of data by iteratively optimizing a loss function. This loss function is random and often non-convex. We study the performance of the simplest version of SGD, namely online SGD, from a random start in the setting where the parameter space is high-dimensional. We develop nearly sharp thresholds for the number of samples needed for consistent estimation as one varies the dimension. Our thresholds depend only on an intrinsic property of the population loss which we call the information exponent. In particular, our results do not assume uniform control on the loss itself, such as convexity or uniform derivative bounds. The thresholds we obtain are polynomial in the dimension and the precise exponent depends explicitly on the information exponent. As a consequence of our results, we find that except for the simplest tasks, almost all of the data is used simply in the initial search phase to obtain non-trivial correlation with the ground truth. Upon attaining non-trivial correlation, the descent is rapid and exhibits law of large numbers type behavior. We illustrate our approach by applying it to a wide set of inference tasks such as phase retrieval, and parameter estimation for generalized linear models, online PCA, and spiked tensor models, as well as to supervised learning for single-layer networks with general activation functions.

Submitted 10 May, 2021; v1 submitted 23 March, 2020; originally announced March 2020.

Comments: final version to appear at Jour. Mach. Learn. Res.

Journal ref: J. Mach. Learn. Res., Vol 22, No.106,1-51(2021)

arXiv:1912.02143 [pdf, other]

Landscape Complexity for the Empirical Risk of Generalized Linear Models

Authors: Antoine Maillard, Gérard Ben Arous, Giulio Biroli

Abstract: We present a method to obtain the average and the typical value of the number of critical points of the empirical risk landscape for generalized linear estimation problems and variants. This represents a substantial extension of previous applications of the Kac-Rice method since it allows to analyze the critical points of high dimensional non-Gaussian random functions. Under a technical hypothesis, we obtain a rigorous explicit variational formula for the annealed complexity, which is the logarithm of the average number of critical points at fixed value of the empirical risk. This result is simplified, and extended, using the non-rigorous Kac-Rice replicated method from theoretical physics. In this way we find an explicit variational formula for the quenched complexity, which is generally different from its annealed counterpart, and allows to obtain the number of critical points for typical instances up to exponential accuracy.

Submitted 18 January, 2023; v1 submitted 4 December, 2019; originally announced December 2019.

Comments: 18 pages and 18 pages appendix. Update to match the published version (v2). Corrections of remaining small typos (v3). Simplification of a technical argument in Appendix A (v4) and clarification of a technical hypothesis (v5)

Journal ref: Proceedings of The First Mathematical and Scientific Machine Learning Conference, PMLR 107:287-327, 2020

arXiv:1901.10025 [pdf, ps, other]

Very rare events for diffusion processes in short time

Authors: Gérard Ben Arous, Jing Wang

Abstract: We study the large deviation estimates for the short time asymptotic behavior of a strongly degenerate diffusion process. Assuming a nilpotent structure of the Lie algebra generated by the driving vector fields, we obtain a graded large deviation principle and prove the existence of those "very rare events". In particular the first grade coincides with the classical Large Deviation Principle.

Submitted 28 January, 2019; originally announced January 2019.

Comments: 24 pages

arXiv:1808.00929 [pdf, other]

doi 10.1007/s00220-019-03649-4

Bounding flows for spherical spin glass dynamics

Authors: Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath

Abstract: We introduce a new approach to studying spherical spin glass dynamics based on differential inequalities for one-time observables. Using this approach, we obtain an approximate phase diagram for the evolution of the energy $H$ and its gradient under Langevin dynamics for spherical $p$-spin models. We then derive several consequences of this phase diagram. For example, at any temperature, uniformly over all starting points, the process must reach and remain in an absorbing region of large negative values of $H$ and large (in norm) gradients in order 1 time. Furthermore, if the process starts in a neighborhood of a critical point of $H$ with negative energy, then both the gradient and energy must increase macroscopically under this evolution, even if this critical point is a saddle with index of order $N$. As a key technical tool, we estimate Sobolev norms of spin glass Hamiltonians, which are of independent interest.

Submitted 24 October, 2019; v1 submitted 2 August, 2018; originally announced August 2018.

Comments: 32 pages, 6 figures

Journal ref: Commun. Math. Phys. 373, 1011-1048 (2020)

arXiv:1808.00921 [pdf, ps, other]

doi 10.1214/19-AOP1415

Algorithmic thresholds for tensor PCA

Authors: Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath

Abstract: We study the algorithmic thresholds for principal component analysis of Gaussian $k$-tensors with a planted rank-one spike, via Langevin dynamics and gradient descent. In order to efficiently recover the spike from natural initializations, the signal to noise ratio must diverge in the dimension. Our proof shows that the mechanism for the success/failure of recovery is the strength of the "curvature" of the spike on the maximum entropy region of the initial data. To demonstrate this, we study the dynamics on a generalized family of high-dimensional landscapes with planted signals, containing the spiked tensor models as specific instances. We identify thresholds of signal-to-noise ratios above which order 1 time recovery succeeds; in the case of the spiked tensor model these match the thresholds conjectured for algorithms such as Approximate Message Passing. Below these thresholds, where the curvature of the signal on the maximal entropy region is weak, we show that recovery from certain natural initializations takes at least stretched exponential time. Our approach combines global regularity estimates for spin glasses with point-wise estimates, to study the recovery problem by a perturbative approach.

Submitted 10 September, 2019; v1 submitted 2 August, 2018; originally announced August 2018.

Comments: 34 pages. The manuscript has been updated to add a proof of what was Conjecture 1 in the first version

Journal ref: Ann. Probab. (2020), Vol. 48, No. 4, 2052-2087

arXiv:1804.10573 [pdf, ps, other]

Geometry and temperature chaos in mixed spherical spin glasses at low temperature - the perturbative regime

Authors: Gérard Ben Arous, Eliran Subag, Ofer Zeitouni

Abstract: We study the Gibbs measure of mixed spherical $p$-spin glass models at low temperature, in (part of) the 1-RSB regime, including, in particular, models close to pure in an appropriate sense. We show that the Gibbs measure concentrates on spherical bands around deep critical points of the (extended) Hamiltonian restricted to the sphere of radius $\sqrt N q_\star$, where $q_\star^2$ is the rightmost point in the support of the overlap distribution. We also show that the relevant critical points are pairwise orthogonal for two different low temperatures. This allows us to explain why temperature chaos occurs for those models, in contrast to the pure spherical models.

Submitted 27 April, 2018; originally announced April 2018.

arXiv:1804.02686 [pdf, other]

doi 10.1103/PhysRevX.9.011003

Complex energy landscapes in spiked-tensor and simple glassy models: ruggedness, arrangements of local minima and phase transitions

Authors: Valentina Ros, Gerard Ben Arous, Giulio Biroli, Chiara Cammarota

Abstract: We study rough high-dimensional landscapes in which an increasingly stronger preference for a given configuration emerges. Such energy landscapes arise in glass physics and inference. In particular we focus on random Gaussian functions, and on the spiked-tensor model and generalizations. We thoroughly analyze the statistical properties of the corresponding landscapes and characterize the associated geometrical phase transitions. In order to perform our study, we develop a framework based on the Kac-Rice method that allows to compute the complexity of the landscape, i.e. the logarithm of the typical number of stationary points and their Hessian. This approach generalizes the one used to compute rigorously the annealed complexity of mean-field glass models. We discuss its advantages with respect to previous frameworks, in particular the thermodynamical replica method which is shown to lead to partially incorrect predictions.

Submitted 24 April, 2018; v1 submitted 8 April, 2018; originally announced April 2018.

Comments: v2 with references added, typos corrected

Journal ref: Phys. Rev. X 9, 011003 (2019)

arXiv:1803.06969 [pdf, other]

doi 10.1088/1742-5468/ab3281

Comparing Dynamics: Deep Neural Networks versus Glassy Systems

Authors: M. Baity-Jesi, L. Sagun, M. Geiger, S. Spigler, G. Ben Arous, C. Cammarota, Y. LeCun, M. Wyart, G. Biroli

Abstract: We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems. The two main issues we address are (1) the complexity of the loss landscape and of the dynamics within it, and (2) to what extent DNNs share similarities with glassy systems. Our findings, obtained for different architectures and datasets, suggest that during the training process the dynamics slows down because of an increasingly large number of flat directions. At large times, when the loss is approaching zero, the system diffuses at the bottom of the landscape. Despite some similarities with the dynamics of mean-field glassy systems, in particular, the absence of barrier crossing, we find distinctive dynamical behaviors in the two cases, showing that the statistical properties of the corresponding loss and energy landscapes are different. In contrast, when the network is under-parametrized we observe a typical glassy behavior, thus suggesting the existence of different phases depending on whether the network is under-parametrized or over-parametrized.

Submitted 7 June, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

Comments: 10 pages, 5 figures. Version accepted at ICML 2018

Journal ref: PMLR 80:324-333, 2018; Republication with DOI (cite this one): J. Stat. Mech. (2019) 124013

arXiv:1711.05424 [pdf, other]

The landscape of the spiked tensor model

Authors: Gerard Ben Arous, Song Mei, Andrea Montanari, Mihai Nica

Abstract: We consider the problem of estimating a large rank-one tensor ${\boldsymbol u}^{\otimes k}\in({\mathbb R}^{n})^{\otimes k}$, $k\ge 3$ in Gaussian noise. Earlier work characterized a critical signal-to-noise ratio $λ_{Bayes}= O(1)$ above which an ideal estimator achieves strictly positive correlation with the unknown vector of interest. Remarkably no polynomial-time algorithm is known that achieved this goal unless $λ\ge C n^{(k-2)/4}$ and even powerful semidefinite programming relaxations appear to fail for $1\ll λ\ll n^{(k-2)/4}$. In order to elucidate this behavior, we consider the maximum likelihood estimator, which requires maximizing a degree-$k$ homogeneous polynomial over the unit sphere in $n$ dimensions. We compute the expected number of critical points and local maxima of this objective function and show that it is exponential in the dimensions $n$, and give exact formulas for the exponential growth rate. We show that (for $λ$ larger than a constant) critical points are either very close to the unknown vector ${\boldsymbol u}$, or are confined in a band of width $Θ(λ^{-1/(k-1)})$ around the maximum circle that is orthogonal to ${\boldsymbol u}$. For local maxima, this band shrinks to be of size $Θ(λ^{-1/(k-2)})$. These `uninformative' local maxima are likely to cause the failure of optimization algorithms.

Submitted 25 January, 2018; v1 submitted 15 November, 2017; originally announced November 2017.

Comments: 40 pages, 20 pdf figures

arXiv:1706.06666 [pdf, ps, other]

Stable limit laws and structure of the scaling function for reaction-diffusion in random environment

Authors: Gérard Ben Arous, Stanislav Molchanov, Alejandro F. Ramírez

Abstract: We prove the emergence of stable fluctuations for reaction-diffusion in random environment with Weibull tails. This completes our work around the quenched to annealed transition phenomenon in this context of reaction diffusion. In [9], we had already considered the model treated here and had studied fully the regimes where the law of large numbers is satisfied and where the fluctuations are Gaussian, but we had left open the regime of stable fluctuations. Our work is based on a spectral approach centered on the classical theory of rank-one perturbations. It illustrates the gradual emergence of the role of the higher peaks of the environments. This approach also allows us to give the delicate exact asymptotics of the normalizing constants needed in the stable limit law.

Submitted 20 June, 2017; originally announced June 2017.

Comments: 42 pages

MSC Class: 82B41; 82B44

arXiv:1705.05883 [pdf, ps, other]

Backbone scaling limits for random walks on random critical trees

Authors: Gérard Ben Arous, Manuel Cabezas, Alexander Fribergh

Abstract: We prove the existence of scaling limits for the projection on the backbone of the random walks on the Incipient Infinite Cluster and the Invasion Percolation Cluster on a regular tree. We treat these projected random walks as randomly trapped random walks (as defined in [BCČR15]) and thus describe these scaling limits as spatially subordinated Brownian motions.

Submitted 15 October, 2021; v1 submitted 16 May, 2017; originally announced May 2017.

Comments: 47 pages

MSC Class: 60F17; 60K37

arXiv:1705.04243 [pdf, ps, other]

doi 10.1007/s00220-018-3152-6

Spectral gap estimates in mean field spin glasses

Authors: Gérard Ben Arous, Aukosh Jagannath

Abstract: We show that mixing for local, reversible dynamics of mean field spin glasses is exponentially slow in the low temperature regime. We introduce a notion of free energy barriers for the overlap, and prove that their existence implies that the spectral gap is exponentially small, and thus that mixing is exponentially slow. We then exhibit sufficient conditions on the equilibrium Gibbs measure which guarantee the existence of these barriers, using the notion of replicon eigenvalue and 2D Guerra Talagrand bounds. We show how these sufficient conditions cover large classes of Ising spin models for reversible nearest-neighbor dynamics and spherical models for Langevin dynamics. Finally, in the case of Ising spins, Panchenko's recent rigorous calculation [79] of the free energy for a system of "two real replica" enables us to prove a quenched LDP for the overlap distribution, which gives us a wider criterion for slow mixing directly related to the Franz-Parisi-Virasoro approach [43,60]. This condition holds in a wider range of temperatures.

Submitted 2 March, 2018; v1 submitted 11 May, 2017; originally announced May 2017.

Comments: revised, 44pp

Journal ref: Commun. Math. Phys. 361, 1-52 (2018)

arXiv:1609.03980 [pdf, ps, other]

Scaling limit for the ant in a simple labyrinth

Authors: Gérard Ben Arous, Manuel Cabezas, Alexander Fribergh

Abstract: We prove that, after suitable rescaling, the simple random walk on the trace of a large critical branching random walk converges to the Brownian motion on the integrated super-Brownian excursion.

Submitted 13 September, 2016; originally announced September 2016.

Comments: 94 pages, 8 figures. arXiv admin note: text overlap with arXiv:1609.03977; text overlap with arXiv:math/0605484 by other authors

arXiv:1609.03977 [pdf, ps, other]

Scaling limit for the ant in high-dimensional labyrinths

Authors: Gérard Ben Arous, Manuel Cabezas, Alexander Fribergh

Abstract: We study here a detailed conjecture regarding one of the most important cases of anomalous diffusion, i.e., the behavior of the "ant in the labyrinth". It is natural to conjecture (see [16] and [8]) that the scaling limit for random walks on large critical random graphs exists in high dimensions, and is universal. This scaling limit is simply the natural Brownian Motion on the Integrated Super-Brownian Excursion. We give here a set of four natural sufficient conditions on the critical graphs and prove that this set of assumptions ensures the validity of this conjecture. The remaining future task is to prove that these sufficient conditions hold for the various classical cases of critical random structures, like the usual Bernoulli bond percolation, oriented percolation, spread-out percolation in high enough dimension. In the companion paper [10], we do precisely that in a first case, the random walk on the trace of a large critical branching random walk. We verify the validity of these sufficient conditions and thus obtain the scaling limit mentioned above, in dimensions larger than 14.

Submitted 13 September, 2016; originally announced September 2016.

Comments: 98 pages, 8 figures. arXiv admin note: text overlap with arXiv:1609.03980; text overlap with arXiv:math/0605484 by other authors

arXiv:1412.6615 [pdf, other]

Explorations on high dimensional landscapes

Authors: Levent Sagun, V. Ugur Guney, Gerard Ben Arous, Yann LeCun

Abstract: Finding minima of a real valued non-convex function over a high dimensional space is a major challenge in science. We provide evidence that some such functions that are defined on high dimensional domains have a narrow band of values whose pre-image contains the bulk of its critical points. This is in contrast with the low dimensional picture in which this band is wide. Our simulations agree with the previous theoretical work on spin glasses that proves the existence of such a band when the dimension of the domain tends to infinity. Furthermore our experiments on teacher-student networks with the MNIST dataset establish a similar phenomenon in deep networks. We finally observe that both the gradient descent and the stochastic gradient descent methods can reach this level within the same number of steps.

Submitted 6 April, 2015; v1 submitted 20 December, 2014; originally announced December 2014.

Comments: 11 pages, 8 figures, workshop contribution at ICLR 2015

arXiv:1412.0233 [pdf, other]

The Loss Surfaces of Multilayer Networks

Authors: Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, Yann LeCun

Abstract: We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of the results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum. The number of local minima outside that band diminishes exponentially with the size of the network. We empirically verify that the mathematical model exhibits similar behavior as the computer simulations, despite the presence of high dependencies in real networks. We conjecture that both simulated annealing and SGD converge to the band of low critical points, and that all critical points found there are local minima of high quality measured by the test error. This emphasizes a major difference between large- and small-size networks where for the latter poor quality local minima have non-zero probability of being recovered. Finally, we prove that recovering the global minimum becomes harder as the network size increases and that it is in practice irrelevant as global minimum often leads to overfitting.

Submitted 21 January, 2015; v1 submitted 30 November, 2014; originally announced December 2014.

arXiv:1406.5076 [pdf, ps, other]

Biased random walks on random graphs

Authors: Gerard Ben Arous, Alexander Fribergh

Abstract: These notes cover one of the topics programmed for the St Petersburg School in Probability and Statistical Physics of June 2012. The aim is to review recent mathematical developments in the field of random walks in random environment. Our main focus will be on directionally transient and reversible random walks on different types of underlying graph structures, such as $\mathbb{Z}$, trees and $\mathbb{Z}^d$ for $d\geq 2$.

Submitted 19 June, 2014; originally announced June 2014.

Comments: Survey based on one of the topics programmed for the St Petersburg School in Probability and Statistical Physics of June 2012. 64 pages, 16 figures

arXiv:1404.3954 [pdf, ps, other]

Smallest Singular Value for Perturbations of Random Permutation Matrices

Authors: Gérard Ben Arous, Kim Dang

Abstract: We take a first small step to extend the validity of Rudelson-Vershynin type estimates to some sparse random matrices, here random permutation matrices. We give lower (and upper) bounds on the smallest singular value of a large random matrix D+M where M is a random permutation matrix, sampled uniformly, and D is diagonal. When D is itself random with i.i.d. terms on the diagonal, we obtain a Rudelson-Vershynin type estimate, using the classical theory of random walks with negative drift.

Submitted 15 April, 2014; originally announced April 2014.

Comments: 33 pages

MSC Class: 15B52; 60B20; 60C05

arXiv:1302.7227 [pdf, ps, other]

doi 10.1214/14-AOP939

Randomly trapped random walks

Authors: Gérard Ben Arous, Manuel Cabezas, Jiří Černý, Roman Royfman

Abstract: We introduce a general model of trapping for random walks on graphs. We give the possible scaling limits of these Randomly Trapped Random Walks on $\mathbb {Z}$. These scaling limits include the well-known fractional kinetics process, the Fontes-Isopi-Newman singular diffusion as well as a new broad class we call spatially subordinated Brownian motions. We give sufficient conditions for convergence and illustrate these on two important examples.

Submitted 29 October, 2015; v1 submitted 28 February, 2013; originally announced February 2013.

Comments: Published at http://dx.doi.org/10.1214/14-AOP939 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOP-AOP939

Journal ref: Annals of Probability 2015, Vol. 43, No. 5, 2405-2457

arXiv:1111.6999 [pdf, ps, other]

A Central Limit Theorem in Many-Body Quantum Dynamics

Authors: Gerard Ben Arous, Kay Kirkpatrick, Benjamin Schlein

Abstract: We study the many body quantum evolution of bosonic systems in the mean field limit. The dynamics is known to be well approximated by the Hartree equation. So far, the available results have the form of a law of large numbers. In this paper we go one step further and we show that the fluctuations around the Hartree evolution satisfy a central limit theorem. Interestingly, the variance of the limiting Gaussian distribution is determined by a time-dependent Bogoliubov transformation describing the dynamics of initial coherent states in a Fock space representation of the system.

Submitted 26 March, 2012; v1 submitted 29 November, 2011; originally announced November 2011.

Comments: 42 pages; some references added

MSC Class: 81U05; 81U30; 81V70; 82C10; 60F05

arXiv:1111.5865 [pdf, ps, other]

A proof of the Lyons-Pemantle-Peres monotonicity conjecture for high biases

Authors: Gerard Ben Arous, Alexander Fribergh, Vladas Sidoravicius

Abstract: The speed $v(β)$ of a $β$-biased random walk on a Galton-Watson tree without leaves is increasing for $β\geq 717$.

Submitted 24 November, 2011; originally announced November 2011.

Comments: 12 pages

MSC Class: 60K37

arXiv:1110.5872 [pdf, ps, other]

doi 10.1214/13-AOP862

Complexity of random smooth functions on the high-dimensional sphere

Authors: Antonio Auffinger, Gerard Ben Arous

Abstract: We analyze the landscape of general smooth Gaussian functions on the sphere in dimension $N$, when $N$ is large. We give an explicit formula for the asymptotic complexity of the mean number of critical points of finite and diverging index at any level of energy and for the mean Euler characteristic of level sets. We then find two possible scenarios for the bottom landscape, one that has a layered structure of critical values and a strong correlation between indexes and critical values and another where even at levels below the limiting ground state energy the mean number of local minima is exponentially large. We end the paper by discussing how these results can be interpreted in the language of spin glasses models.

Submitted 16 December, 2013; v1 submitted 26 October, 2011; originally announced October 2011.

Comments: Published at http://dx.doi.org/10.1214/13-AOP862 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOP-AOP862

Journal ref: Annals of Probability 2013, Vol. 41, No. 6, 4214-4247

arXiv:1106.4387 [pdf, ps, other]

Einstein relation for biased random walk on Galton--Watson trees

Authors: Gerard Ben Arous, Yueyun Hu, Stefano Olla, Ofer Zeitouni

Abstract: We prove the Einstein relation, relating the velocity under a small perturbation to the diffusivity in equilibrium, for certain biased random walks on Galton--Watson trees. This provides the first example where the Einstein relation is proved for motion in random media with arbitrary deep traps.

Submitted 22 December, 2011; v1 submitted 22 June, 2011; originally announced June 2011.

Comments: Minor errors corrected

MSC Class: 60K37; 60J80

arXiv:1106.2108 [pdf, ps, other]

On fluctuations of eigenvalues of random permutation matrices

Authors: Gérard Ben Arous, Kim Dang

Abstract: Smooth linear statistics of random permutation matrices, sampled under a general Ewens distribution, exhibit an interesting non-universality phenomenon. Though they have bounded variance, their fluctuations are asymptotically non-Gaussian but infinitely divisible. The fluctuations are asymptotically Gaussian for less smooth linear statistics for which the variance diverges. The degree of smoothness is measured in terms of the quality of the trapezoidal approximations of the integral of the observable.

Submitted 10 June, 2011; originally announced June 2011.

arXiv:1101.4041 [pdf, ps, other]

Randomly biased walks on subcritical trees

Authors: Gerard Ben Arous, Alan Hammond

Abstract: As a model of trapping by biased motion in random structure, we study the time taken for a biased random walk to return to the root of a subcritical Galton-Watson tree. We do so for trees in which these biases are randomly chosen, independently for distinct edges, according to a law that satisfies a logarithmic non-lattice condition. The mean return time of the walk is in essence given by the total conductance of the tree. We determine the asymptotic decay of this total conductance, finding it to have a pure power-law decay. In the case of the conductance associated to a single vertex at maximal depth in the tree, this asymptotic decay may be analysed by the classical defective renewal theorem, due to the non-lattice edge-bias assumption. However, the derivation of the decay for total conductance requires computing an additional constant multiple outside the power-law that allows for the contribution of all vertices close to the base of the tree. This computation entails a detailed study of a convenient decomposition of the tree, under conditioning on the tree having high total conductance. As such, our principal conclusion may be viewed as a development of renewal theory in the context of random environments. For randomly biased random walk on a supercritical Galton-Watson tree with positive extinction probability, our main results may be regarded as a description of the slowdown mechanism caused by the presence of subcritical trees adjacent to the backbone that may act as traps that detain the walker. Indeed, this conclusion is exploited in \cite{GerardAlan} to obtain a stable limiting law for walker displacement in such a tree.

Submitted 20 January, 2011; originally announced January 2011.

Comments: 61 pages, four figures

arXiv:1010.5190 [pdf, ps, other]

Universality and extremal aging for dynamics of spin glasses on sub-exponential time scales

Authors: G. Ben Arous, O. Gun

Abstract: We consider Random Hopping Time (RHT) dynamics of the Sherrington - Kirkpatrick (SK) model and p-spin models of spin glasses. For any of these models and for any inverse temperature we prove that, on time scales that are sub-exponential in the dimension, the properly scaled clock process (time-change process) of the dynamics converges to an extremal process. Moreover, on these time scales, the system exhibits aging like behavior which we called extremal aging. In other words, the dynamics of these models ages as the random energy model (REM) does. Hence, by extension, this confirms Bouchaud's REM-like trap model as a universal aging mechanism for a wide range of systems which, for the first time, includes the SK model.

Submitted 25 October, 2010; originally announced October 2010.

arXiv:1010.1294 [pdf, ps, other]

doi 10.1214/11-AOP710

Extreme gaps between eigenvalues of random matrices

Authors: Gérard Ben Arous, Paul Bourgade

Abstract: This paper studies the extreme gaps between eigenvalues of random matrices. We give the joint limiting law of the smallest gaps for Haar-distributed unitary matrices and matrices from the Gaussian unitary ensemble. In particular, the kth smallest gap, normalized by a factor $n^{-4/3}$, has a limiting density proportional to $x^{3k-1}e^{-x^3}$. Concerning the largest gaps, normalized by $n/\sqrt{\log n}$, they converge in ${\mathrm{L}}^p$ to a constant for all $p>0$. These results are compared with the extreme gaps between zeros of the Riemann zeta function.

Submitted 24 July, 2013; v1 submitted 6 October, 2010; originally announced October 2010.

Comments: Published at http://dx.doi.org/10.1214/11-AOP710 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOP-AOP710

Journal ref: Annals of Probability 2013, Vol. 41, No. 4, 2648-2681

arXiv:1003.1129 [pdf, other]

Random Matrices and complexity of Spin Glasses

Authors: A. Auffinger, G. Ben Arous, J. Cerny

Abstract: We give an asymptotic evaluation of the complexity of spherical p-spin spin-glass models via random matrix theory. This study enables us to obtain detailed information about the bottom of the energy landscape, including the absolute minimum (the ground state), the other local minima, and describe an interesting layered structure of the low critical values for the Hamiltonians of these models. We also show that our approach allows us to compute the related TAP-complexity and extend the results known in the physics literature. As an independent tool, we prove a LDP for the k-th largest eigenvalue of the GOE, extending the results of Ben Arous, Dembo and Guionnet (2001).

Submitted 7 November, 2011; v1 submitted 4 March, 2010; originally announced March 2010.

Comments: 25 pages, 1 figure, references added

MSC Class: 60G15; 60B20; 82B44

arXiv:0905.2993 [pdf, ps, other]

doi 10.1214/10-AOP550

Current fluctuations for TASEP: A proof of the Prähofer--Spohn conjecture

Authors: Gérard Ben Arous, Ivan Corwin

Abstract: We consider the family of two-sided Bernoulli initial conditions for TASEP which, as the left and right densities ($ρ_-,ρ_+$) are varied, give rise to shock waves and rarefaction fans---the two phenomena which are typical to TASEP. We provide a proof of Conjecture 7.1 of [Progr. Probab. 51 (2002) 185--204] which characterizes the order of and scaling functions for the fluctuations of the height function of two-sided TASEP in terms of the two densities $ρ_-,ρ_+$ and the speed $y$ around which the height is observed. In proving this theorem for TASEP, we also prove a fluctuation theorem for a class of corner growth processes with external sources, or equivalently for the last passage time in a directed last passage percolation model with two-sided boundary conditions: $ρ_-$ and $1-ρ_+$. We provide a complete characterization of the order of and the scaling functions for the fluctuations of this model's last passage time $L(N,M)$ as a function of three parameters: the two boundary/source rates $ρ_-$ and $1-ρ_+$, and the scaling ratio $γ^2=M/N$. The proof of this theorem draws on the results of [Comm. Math. Phys. 265 (2006) 1--44] and extensively on the work of [Ann. Probab. 33 (2005) 1643--1697] on finite rank perturbations of Wishart ensembles in random matrix theory.

Submitted 7 December, 2010; v1 submitted 19 May, 2009; originally announced May 2009.

Comments: Published at http://dx.doi.org/10.1214/10-AOP550 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOP-AOP550

Journal ref: Annals of Probability 2011, Vol. 39, No. 1, 104-138

arXiv:0903.2672 [pdf, ps, other]

Free point processes and free extreme values

Authors: G. Ben Arous, V. Kargin

Abstract: We continue here the study of free extreme values begun in Ben Arous and Voiculescu (2006). We study the convergence of the free point processes associated with free extreme values to a free Poisson random measure (Voiculescu (1998), Barndorff-Nielsen and Thorbjornsen (2005)). We relate this convergence to the free extremal laws introduced in Ben Arous and Voiculescu (2006) and give the limit laws for free order statistics.

Submitted 15 March, 2009; originally announced March 2009.

Comments: 23 pages, to appear in Probability Theory and Related Fields

MSC Class: 46L54; 62G32

Journal ref: Probability Theory and Related Fields (2010), 147, 161-183

arXiv:0712.1914 [pdf, ps, other]

doi 10.1088/1742-5468/2008/04/L04003

Universality of REM-like aging in mean field spin glasses

Authors: G. Ben Arous, A. Bovier, J. Cerny

Abstract: Aging has become the paradigm to describe dynamical behavior of glassy systems, and in particular spin glasses. Trap models have been introduced as simple caricatures of effective dynamics of such systems. In this Letter we show that in a wide class of mean field models and on a wide range of time scales, aging occurs precisely as predicted by the REM-like trap model of Bouchaud and Dean. This is the first rigorous result about aging in mean field models except for the REM and the spherical model.

Submitted 12 December, 2007; originally announced December 2007.

Comments: 4 pages

Journal ref: J. Stat. Mech. (2008) L04003

arXiv:0711.3686 [pdf, ps, other]

Biased random walks on a Galton-Watson tree with leaves

Authors: Gérard Ben Arous, Alexander Fribergh, Nina Gantert, Alan Hammond

Abstract: We consider a biased random walk $X_n$ on a Galton-Watson tree with leaves in the sub-ballistic regime. We prove that there exists an explicit constant $γ= γ(β) \in (0,1)$, depending on the bias $β$, such that $X_n$ is of order $n^γ$. Denoting $Δ_n$ the hitting time of level $n$, we prove that $Δ_n/n^{1/γ}$ is tight. Moreover we show that $Δ_n/n^{1/γ}$ does not converge in law (at least for large values of $β$). We prove that along the sequences $n_λ(k)=\lfloor λβ^{γk}\rfloor$, $Δ_n/n^{1/γ}$ converges to certain infinitely divisible laws. Key tools for the proof are the classical Harris decomposition for Galton-Watson trees, a new variant of regeneration times and the careful analysis of triangular arrays of i.i.d. heavy-tailed random variables.

Submitted 17 November, 2010; v1 submitted 23 November, 2007; originally announced November 2007.

Comments: 49 pages, 2 figures. To appear in Ann. Probab

arXiv:0710.3132 [pdf, ps, other]

Poisson convergence for the largest eigenvalues of Heavy Tailed Random Matrices

Authors: Antonio Auffinger, Gerard Ben Arous, Sandrine Peche

Abstract: We study the statistics of the largest eigenvalues of real symmetric and sample covariance matrices when the entries are heavy tailed. Extending the result obtained by Soshnikov in \cite{Sos1}, we prove that, in the absence of the fourth moment, the top eigenvalues behave, in the limit, as the largest entries of the matrix.

Submitted 7 May, 2008; v1 submitted 16 October, 2007; originally announced October 2007.

Comments: 22 pages, to appear in Annales de l'Institut Henri Poincare

MSC Class: 15A52; 62G32; 60G55

arXiv:0707.2159 [pdf, ps, other]

The spectrum of heavy-tailed random matrices

Authors: Gerard Ben Arous, Alice Guionnet

Abstract: Let $X_N$ be an $N\times N$ random symmetric matrix with independent equidistributed entries. If the law $P$ of the entries has a finite second moment, it was shown by Wigner \cite{wigner} that the empirical distribution of the eigenvalues of $X_N$, once renormalized by $\sqrt{N}$, converges almost surely and in expectation to the so-called semicircular distribution as $N$ goes to infinity. In this paper we study the same question when $P$ is in the domain of attraction of an $α$-stable law. We prove that if we renormalize the eigenvalues by a constant $a_N$ of order $N^{\frac{1}{α}}$, the corresponding spectral distribution converges in expectation towards a law $μ_α$ which only depends on $α$. We characterize $μ_α$ and study some of its properties; it is a heavy-tailed probability measure which is absolutely continuous with respect to Lebesgue measure except possibly on a compact set of capacity zero.

Submitted 14 July, 2007; originally announced July 2007.

MSC Class: 15A52; 60E07

arXiv:0706.2135 [pdf, ps, other]

Universality of the REM for dynamics of mean-field spin glasses

Authors: Gerard Ben Arous, Anton Bovier, Jiri Cerny

Abstract: We consider a version of a Glauber dynamics for a p-spin Sherrington--Kirkpatrick model of a spin glass that can be seen as a time change of simple random walk on the N-dimensional hypercube. We show that, for any p>2 and any inverse temperature β>0, there exist constants g>0, such that for all exponential time scales, $\exp(γN)$, with $γ< g$, the properly rescaled clock process (time-change process), converges to an α-stable subordinator where α=γ/β^2<1. Moreover, the dynamics exhibits aging at these time scales with time-time correlation function converging to the arcsine law of this α-stable subordinator. In other words, up to rescaling, on these time scales (that are shorter than the equilibration time of the system), the dynamics of p-spin models ages in the same way as the REM, and by extension Bouchaud's REM-like trap model, confirming the latter as a universal aging mechanism for a wide range of systems. The SK model (the case p=2) seems to belong to a different universality class.

Submitted 14 June, 2007; originally announced June 2007.

Comments: 30 pages, 1 figure

MSC Class: 82C44; 60K35; 60G70

arXiv:math/0612373 [pdf, ps, other]

A new REM conjecture

Authors: Gerard Ben Arous, Veronique Gayrard, Alexey Kuptsov

Abstract: We introduce here a new universality conjecture for levels of random Hamiltonians, in the same spirit as the local REM conjecture made by S. Mertens and H. Bauke. We establish our conjecture for a wide class of Gaussian and non-Gaussian Hamiltonians, which include the $p$-spin models, the Sherrington-Kirkpatrick model and the number partitioning problem. We prove that our universality result is optimal for the last two models by showing when this universality breaks down.

Submitted 13 December, 2006; originally announced December 2006.

Comments: 34 pages

arXiv:math/0611178 [pdf, ps, other]

Elementary potential theory on the hypercube

Authors: Gerard Ben Arous, Veronique Gayrard

Abstract: This work addresses potential theoretic questions for the standard nearest neighbor random walk on the hypercube $\{-1,+1\}^N$. For a large class of subsets $A\subset\{-1,+1\}^N$ we give precise estimates for the harmonic measure of $A$, the mean hitting time of $A$, and the Laplace transform of this hitting time. In particular, we give precise sufficient conditions for the harmonic measure to be asymptotically uniform, and for the hitting time to be asymptotically exponentially distributed, as $N\to\infty$. Our approach relies on a $d$-dimensional extension of the Ehrenfest urn scheme called lumping and covers the case where $d$ is allowed to diverge with $N$ as long as $d\leq α_0\frac{N}{\log N}$ for some constant $0<α_0<1$.

Submitted 7 November, 2006; originally announced November 2006.

Comments: 99 pages

arXiv:math/0606719 [pdf, ps, other]

doi 10.1214/009117907000000024

Scaling limit for trap models on $\mathbb{Z}^d$

Authors: Gérard Ben Arous, Jiří Černý

Abstract: We give the ``quenched'' scaling limit of Bouchaud's trap model in ${d\ge 2}$. This scaling limit is the fractional-kinetics process, that is the time change of a $d$-dimensional Brownian motion by the inverse of an independent $α$-stable subordinator.

Submitted 27 November, 2007; v1 submitted 28 June, 2006; originally announced June 2006.

Comments: Published in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/009117907000000024

Report number: IMS-AOP-AOP325 MSC Class: 60K37; 60G52; 60F17 (Primary); 82D30 (Secondary)

Journal ref: Annals of Probability 2007, Vol. 35, No. 6, 2356-2384

arXiv:math/0603344 [pdf, ps, other]

Dynamics of trap models

Authors: Gerard Ben Arous, Jiri Cerny

Abstract: These notes cover one of the topics of the class given in the Les Houches Summer School ``Mathematical statistical physics'' in July 2005. The lectures tried to give a summary of the recent mathematical results about the long-time behaviour of dynamics of (mean-field) spin-glasses and other disordered media. We have chosen here to restrict the scope of these notes to the dynamics of trap models only, but to cover this topic in somewhat more depth.

Submitted 14 March, 2006; originally announced March 2006.

Comments: Lecture Notes for Les Houches Summer School "Mathematical statistical physics", 52 pages

arXiv:math/0603340 [pdf, ps, other]

The arcsine law as a universal aging scheme for trap models

Authors: Gerard Ben Arous, Jiri Cerny

Abstract: We give a general proof of aging for trap models using the arcsine law for stable subordinators. This proof is based on abstract conditions on the potential theory of the underlying graph and on the randomness of the trapping landscape. We apply this proof to aging for trap models on large two-dimensional tori and for trap dynamics of the Random Energy Model on a broad range of time scales.

Submitted 14 March, 2006; originally announced March 2006.

Comments: 36 pages

MSC Class: 82D30; 82C41; 60F17

arXiv:math/0510519 [pdf, ps, other]

Transition asymptotics for reaction-diffusion in random media

Authors: Gerard Ben Arous, Stanislav Molchanov, Alejandro F. Ramirez

Abstract: We describe a universal transition mechanism characterizing the passage to an annealed behavior and to a regime where the fluctuations about this behavior are Gaussian, for the long time asymptotics of the empirical average of the expected value of the number of random walks which branch and annihilate on ${\mathbb Z}^d$, with stationary random rates. The random walks are independent, continuous time rate $2dκ$, simple, symmetric, with $κ\ge 0$. A random walk at $x\in{\mathbb Z}^d$, binary branches at rate $v_+(x)$, and annihilates at rate $v_-(x)$. The random environment $w$ has coordinates $w(x)=(v_-(x),v_+(x))$ which are i.i.d. We identify a natural way to describe the annealed-Gaussian transition mechanism under mild conditions on the rates. Indeed, we introduce the exponents $F_θ(t):=\frac{H_1((1+θ)t)-(1+θ)H_1(t)}{θ}$, and assume that $\frac{F_{2θ}(t)-F_θ(t)}{θ\log(κt+e)}\to\infty$ for $|θ|>0$ small enough, where $H_1(t):=\log \langle m(0,t)\rangle$ and $\langle m(0,t)\rangle$ denotes the average of the expected value of the number of particles $m(0,t,w)$ at time $t$ and an environment of rates $w$, given that initially there was only one particle at 0. Then the empirical average of $m(x,t,w)$ over a box of side $L(t)$ has different behaviors: if $L(t)\ge e^{\frac{1}{d} F_ε(t)}$ for some $ε>0$ and large enough $t$, a law of large numbers is satisfied; if $L(t)\ge e^{\frac{1}{d} F_ε(2t)}$ for some $ε>0$ and large enough $t$, a CLT is satisfied. These statements are violated if the reversed inequalities are satisfied for some negative $ε$. Applications to potentials with Weibull, Frechet and double exponential tails are given.

Submitted 4 July, 2007; v1 submitted 24 October, 2005; originally announced October 2005.

Comments: To appear in: Probability and Mathematical Physics: A Volume in Honor of Stanislav Molchanov, Editors - AMS | CRM, (2007)

MSC Class: 82B41;82B44
