
Showing 1–21 of 21 results for author: Gabriel, F

  1. arXiv:2205.15809  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE

    Feature Learning in $L_{2}$-regularized DNNs: Attraction/Repulsion and Sparsity

    Authors: Arthur Jacot, Eugene Golikov, Clément Hongler, Franck Gabriel

    Abstract: We study the loss surface of DNNs with $L_{2}$ regularization. We show that the loss in terms of the parameters can be reformulated into a loss in terms of the layerwise activations $Z_{\ell}$ of the training set. This reformulation reveals the dynamics behind feature learning: each hidden representation $Z_{\ell}$ is optimal w.r.t. an attraction/repulsion problem and interpolates between the…

    Submitted 13 October, 2022; v1 submitted 31 May, 2022; originally announced May 2022.

  2. arXiv:2205.01926  [pdf, ps, other]

    math.PR math.OA

    Freeness of type $B$ and conditional freeness for random matrices

    Authors: Guillaume Cébron, Antoine Dahlqvist, Franck Gabriel

    Abstract: The asymptotic freeness of independent unitarily invariant $N\times N$ random matrices holds in expectation up to $O(N^{-2})$. An already known consequence is the infinitesimal freeness in expectation. We highlight another consequence for unitarily invariant random matrices: the almost sure asymptotic freeness of type $B$. As byproducts, we recover the asymptotic cyclic monotonicity, and we…

    Submitted 4 May, 2022; originally announced May 2022.

  3. arXiv:2106.15933  [pdf, other]

    stat.ML cs.LG

    Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity

    Authors: Arthur Jacot, François Ged, Berfin Şimşek, Clément Hongler, Franck Gabriel

    Abstract: The dynamics of Deep Linear Networks (DLNs) is dramatically affected by the variance $σ^2$ of the parameters at initialization $θ_0$. For DLNs of width $w$, we show a phase transition w.r.t. the scaling $γ$ of the variance $σ^2=w^{-γ}$ as $w\to\infty$: for large variance ($γ<1$), $θ_0$ is very close to a global minimum but far from any saddle point, and for small variance ($γ>1$), $θ_0$ is close t…

    Submitted 31 January, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

  4. arXiv:2102.03044  [pdf, other]

    cs.GT cs.CL cs.LO cs.SI

    Smart Proofs via Smart Contracts: Succinct and Informative Mathematical Derivations via Decentralized Markets

    Authors: Sylvain Carré, Franck Gabriel, Clément Hongler, Gustavo Lacerda, Gloria Capano

    Abstract: Modern mathematics is built on the idea that proofs should be translatable into formal proofs, whose validity is an objective question, decidable by a computer. Yet, in practice, proofs are informal and may omit many details. An agent considers a proof valid if they trust that it could be expanded into a machine-verifiable proof. A proof's validity can thus become a subjective matter and lead to a…

    Submitted 13 October, 2021; v1 submitted 5 February, 2021; originally announced February 2021.

    Comments: 45 pages, 12 figures

    MSC Class: 03F07; 03F20; 91A05; 91A06; 91A07; 91A10; 91A11; 91A24; 91A26; 91A27; 91A28; 91A80; 68N17; 68P05; 68V15; 68V20; 68V30 ACM Class: F.4

  5. arXiv:2006.09796  [pdf, other]

    stat.ML cs.LG math.PR

    Kernel Alignment Risk Estimator: Risk Prediction from Training Data

    Authors: Arthur Jacot, Berfin Şimşek, Francesco Spadaro, Clément Hongler, Franck Gabriel

    Abstract: We study the risk (i.e. generalization error) of Kernel Ridge Regression (KRR) for a kernel $K$ with ridge $λ>0$ and i.i.d. observations. For this, we introduce two objects: the Signal Capture Threshold (SCT) and the Kernel Alignment Risk Estimator (KARE). The SCT $\vartheta_{K,λ}$ is a function of the data distribution: it can be used to identify the components of the data that the KRR predictor…

    Submitted 17 June, 2020; originally announced June 2020.

  6. arXiv:2002.08404  [pdf, other]

    stat.ML cs.LG

    Implicit Regularization of Random Feature Models

    Authors: Arthur Jacot, Berfin Şimşek, Francesco Spadaro, Clément Hongler, Franck Gabriel

    Abstract: Random Feature (RF) models are used as efficient parametric approximations of kernel methods. We investigate, by means of random matrix theory, the connection between Gaussian RF models and Kernel Ridge Regression (KRR). For a Gaussian RF model with $P$ features, $N$ data points, and a ridge $λ$, we show that the average (i.e. expected) RF predictor is close to a KRR predictor with an effective ri…

    Submitted 23 September, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

    Journal ref: Proceedings of the International Conference on Machine Learning, 2020, pp. 7397-7406

  7. arXiv:1910.02875  [pdf, other]

    cs.LG cs.NE stat.ML

    The asymptotic spectrum of the Hessian of DNN throughout training

    Authors: Arthur Jacot, Franck Gabriel, Clément Hongler

    Abstract: The dynamics of DNNs during gradient descent is described by the so-called Neural Tangent Kernel (NTK). In this article, we show that the NTK allows one to gain precise insight into the Hessian of the cost of DNNs. When the NTK is fixed during training, we obtain a full characterization of the asymptotics of the spectrum of the Hessian, at initialization and during training. In the so-called mean-…

    Submitted 10 February, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

  8. arXiv:1907.05715  [pdf, other]

    cs.LG stat.ML

    Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts

    Authors: Arthur Jacot, Franck Gabriel, François Ged, Clément Hongler

    Abstract: We analyze architectural features of Deep Neural Networks (DNNs) using the so-called Neural Tangent Kernel (NTK), which describes the training and generalization of DNNs in the infinite-width setting. In this setting, we show that for fully-connected DNNs, as the depth grows, two regimes appear: "order", where the (scaled) NTK converges to a constant, and "chaos", where it converges to a Kronecker…

    Submitted 22 June, 2020; v1 submitted 11 July, 2019; originally announced July 2019.

  9. arXiv:1902.02884  [pdf, other]

    math.PR math.AP

    Geometric stochastic heat equations

    Authors: Yvain Bruned, Franck Gabriel, Martin Hairer, Lorenzo Zambotti

    Abstract: We consider a natural class of $\mathbf{R}^d$-valued one-dimensional stochastic PDEs driven by space-time white noise that is formally invariant under the action of the diffeomorphism group on $\mathbf{R}^d$. This class contains in particular the KPZ equation, the multiplicative stochastic heat equation, the additive stochastic heat equation, and rough Burgers-type equations. We exhibit a one-para…

    Submitted 7 February, 2021; v1 submitted 7 February, 2019; originally announced February 2019.

    Comments: Accepted version; major changes in Sections 5 and 6

    MSC Class: 60H15; 60L30

  10. arXiv:1901.01608  [pdf, other]

    cond-mat.dis-nn cs.LG

    Scaling description of generalization with number of parameters in deep learning

    Authors: Mario Geiger, Arthur Jacot, Stefano Spigler, Franck Gabriel, Levent Sagun, Stéphane d'Ascoli, Giulio Biroli, Clément Hongler, Matthieu Wyart

    Abstract: Supervised deep learning involves the training of neural networks with a large number $N$ of parameters. For large enough $N$, in the so-called over-parametrized regime, one can essentially fit the training data points. Sparsity-based arguments would suggest that the generalization error increases as $N$ grows past a certain threshold $N^{*}$. Instead, empirical studies have shown that in the over…

    Submitted 8 October, 2019; v1 submitted 6 January, 2019; originally announced January 2019.

    Comments: The clarity of the text has been improved: the section "Related works" has been updated and the section "3.1 Regression task" has been added

  11. arXiv:1809.07545  [pdf, other]

    q-fin.TR math.OC

    Insider Trading with Penalties

    Authors: Sylvain Carré, Pierre Collin-Dufresne, Franck Gabriel

    Abstract: We consider a one-period Kyle (1985) framework where the insider can be subject to a penalty if she trades. We establish existence and uniqueness of equilibrium for virtually any penalty function when noise is uniform. In equilibrium, the demand of the insider and the price functions are in general non-linear and remain analytically tractable because the expected price function is linear. We use t…

    Submitted 20 September, 2018; originally announced September 2018.

  12. arXiv:1806.07572  [pdf, other]

    cs.LG cs.NE math.PR stat.ML

    Neural Tangent Kernel: Convergence and Generalization in Neural Networks

    Authors: Arthur Jacot, Franck Gabriel, Clément Hongler

    Abstract: At initialization, artificial neural networks (ANNs) are equivalent to Gaussian processes in the infinite-width limit, thus connecting them to kernel methods. We prove that the evolution of an ANN during training can also be described by a kernel: during gradient descent on the parameters of an ANN, the network function $f_θ$ (which maps input vectors to output vectors) follows the kernel gradient…

    Submitted 10 February, 2020; v1 submitted 20 June, 2018; originally announced June 2018.

    Journal ref: In Advances in neural information processing systems (pp. 8571-8580) 2018

  13. Large permutation invariant random matrices are asymptotically free over the diagonal

    Authors: Benson Au, Guillaume Cébron, Antoine Dahlqvist, Franck Gabriel, Camille Male

    Abstract: We prove that independent families of permutation invariant random matrices are asymptotically free over the diagonal, both in probability and in expectation, under a uniform boundedness assumption on the operator norm. We can relax the operator norm assumption to an estimate on sums associated to graphs of matrices, further extending the range of applications (for example, to Wigner matrices with…

    Submitted 18 May, 2018; originally announced May 2018.

    Comments: 18 pages, 3 figures

    Journal ref: Ann. Probab. 49 (2021), no. 1, 157-179

  14. arXiv:1802.00521  [pdf, other]

    cs.NI

    Multipath Communication with Finite Sliding Window Network Coding for Ultra-Reliability and Low Latency

    Authors: Frank Gabriel, Anil Kumar Chorppath, Ievgenii Tsokalo, Frank H. P. Fitzek

    Abstract: We use a random linear network coding (RLNC) based scheme for multipath communication in the presence of lossy links with different delay characteristics to obtain ultra-reliability and low latency. A sliding window version of RLNC is proposed where the coded packets are generated using packets in a window size and are inserted among systematic packets in different paths. The packets are scheduled i…

    Submitted 1 February, 2018; originally announced February 2018.

  15. Full-disk Synoptic Observations of the Chromosphere Using H$_α$ Telescope at the Kodaikanal Observatory

    Authors: B. Ravindra, K. Prabhu, K. E. Rangarajan, S. P. Bagare, Singh Jagdev, P. M. M. Kemkar, J. P. Lancelot, K. C. Thulasidharen, F. Gabriel, R. Selvendran

    Abstract: This paper reports on the installation and observations of a new solar telescope installed on 7th October, 2014 at the Kodaikanal Observatory. The telescope is a refractive type equipped with a tunable Lyot H$_α$ filter. A CCD camera of 2k$\times$2k size makes the image of the Sun with a pixel size of 1.21$^{\prime\prime}$ pixel$^{-1}$ with a full field-of-view of 41$^{\prime}$. The telescope is e…

    Submitted 11 August, 2016; originally announced August 2016.

    Comments: Accepted for publication in RAA

  16. The Makeenko-Migdal equation for Yang-Mills theory on compact surfaces

    Authors: Bruce K. Driver, Franck Gabriel, Brian C. Hall, Todd Kemp

    Abstract: We prove the Makeenko-Migdal equation for two-dimensional Euclidean Yang-Mills theory on an arbitrary compact surface, possibly with boundary. In particular, we show that two of the proofs given by the first, third, and fourth authors for the plane case extend essentially without change to compact surfaces.

    Submitted 25 January, 2017; v1 submitted 11 February, 2016; originally announced February 2016.

    Comments: Final version, minor typographical corrections. To appear in Comm. Math. Phys

    MSC Class: 81T40 (primary); 81T13

    Journal ref: Communications in Mathematical Physics, 352 (2017), 967-978

  17. The generalized master fields

    Authors: Guillaume Cébron, Antoine Dahlqvist, Franck Gabriel

    Abstract: The master field is the large $N$ limit of the Yang-Mills measure on the Euclidean plane. It can be viewed as a non-commutative process indexed by paths on the plane. We construct and study generalized master fields, called free planar Markovian holonomy fields, which are versions of the master field where the law of a simple loop can be as general as possible. We prove that those free…

    Submitted 2 January, 2016; originally announced January 2016.

  18. arXiv:1510.01046  [pdf, ps, other]

    math.PR

    A combinatorial theory of random matrices III: random walks on $\mathfrak{S}(N)$, ramified coverings and the $\mathfrak{S}(\infty)$ Yang-Mills measure

    Authors: Franck Gabriel

    Abstract: The aim of this article is to study some asymptotics of a natural model of random ramified coverings on the disk of degree $N$. We prove that the monodromy field, also called the holonomy field, converges in probability to a non-random field as $N$ goes to infinity. In order to do so, we use the fact that the monodromy field of random uniform labelled simple ramified coverings on the disk of degre…

    Submitted 31 October, 2016; v1 submitted 5 October, 2015; originally announced October 2015.

  19. arXiv:1507.02465  [pdf, ps, other]

    math.PR

    Combinatorial theory of permutation-invariant random matrices II: cumulants, freeness and Levy processes

    Authors: Franck Gabriel

    Abstract: The $\mathcal{A}$-tracial algebras are algebras endowed with multi-linear forms, compatible with the product, and indexed by partitions. Using the notion of $\mathcal{A}$-cumulants, we define and study the $\mathcal{A}$-freeness property which generalizes the independence and freeness properties, and some invariance properties which model the invariance by conjugation for random matrices. A centra…

    Submitted 3 November, 2016; v1 submitted 9 July, 2015; originally announced July 2015.

  20. arXiv:1503.02792  [pdf, ps, other]

    math.CO math.PR

    Combinatorial theory of permutation-invariant random matrices I: partitions, geometry and renormalization

    Authors: Franck Gabriel

    Abstract: In this article, we define and study a geometry and an order on the set of partitions of an even number of objects. One of the definitions involves the partition algebra, a structure of algebra on the set of such partitions depending on an integer parameter N. Then we emulate the theory of random matrices in a combinatorial framework: for any parameter N, we introduce a family of linear forms on t…

    Submitted 31 October, 2016; v1 submitted 10 March, 2015; originally announced March 2015.

  21. arXiv:1501.05077  [pdf, ps, other]

    math-ph math.PR

    Planar Markovian Holonomy Fields

    Authors: Franck Gabriel

    Abstract: We study planar random holonomy fields, which are processes indexed by paths on the plane that behave well under the concatenation and orientation-reversing operations on paths. We define the Planar Markovian Holonomy Fields as planar random holonomy fields which satisfy certain independence properties and invariance under area-preserving homeomorphisms. We use the theory of braids in the framework of c…

    Submitted 31 October, 2016; v1 submitted 21 January, 2015; originally announced January 2015.