-
Feature Learning in $L_{2}$-regularized DNNs: Attraction/Repulsion and Sparsity
Authors:
Arthur Jacot,
Eugene Golikov,
Clément Hongler,
Franck Gabriel
Abstract:
We study the loss surface of DNNs with $L_{2}$ regularization. We show that the loss in terms of the parameters can be reformulated into a loss in terms of the layerwise activations $Z_{\ell}$ of the training set. This reformulation reveals the dynamics behind feature learning: each hidden representation $Z_{\ell}$ is optimal w.r.t. an attraction/repulsion problem and interpolates between the input and output representations, keeping as little information from the input as necessary to construct the activation of the next layer. For positively homogeneous non-linearities, the loss can be further reformulated in terms of the covariances of the hidden representations, which takes the form of a partially convex optimization over a convex cone.
This second reformulation allows us to prove a sparsity result for homogeneous DNNs: any local minimum of the $L_{2}$-regularized loss can be achieved with at most $N(N+1)$ neurons in each hidden layer (where $N$ is the size of the training set). We show that this bound is tight by giving an example of a local minimum that requires $N^{2}/4$ hidden neurons. But we also observe numerically that in more traditional settings far fewer than $N^{2}$ neurons are required to reach the minima.
Submitted 13 October, 2022; v1 submitted 31 May, 2022;
originally announced May 2022.
-
Freeness of type $B$ and conditional freeness for random matrices
Authors:
Guillaume Cébron,
Antoine Dahlqvist,
Franck Gabriel
Abstract:
The asymptotic freeness of independent unitarily invariant $N\times N$ random matrices holds in expectation up to $O(N^{-2})$. An already known consequence is the infinitesimal freeness in expectation. We highlight another consequence for unitarily invariant random matrices: the almost sure asymptotic freeness of type $B$. As byproducts, we recover the asymptotic cyclic monotonicity, and we get the asymptotic conditional freeness. In particular, the eigenvector empirical spectral distribution of the sum of two randomly rotated random matrices converges towards the conditionally free convolution. We also show new connections between infinitesimal freeness, freeness of type $B$, conditional freeness, cyclic monotonicity and monotone independence. Finally, we show rigorously that the BBP phase transition for an additive rank-one perturbation of a GUE matrix is a consequence of the asymptotic conditional freeness, and the arguments extend to the study of the outlier eigenvalues of other unitarily invariant ensembles.
Submitted 4 May, 2022;
originally announced May 2022.
-
Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
Authors:
Arthur Jacot,
François Ged,
Berfin Şimşek,
Clément Hongler,
Franck Gabriel
Abstract:
The dynamics of Deep Linear Networks (DLNs) is dramatically affected by the variance $σ^2$ of the parameters at initialization $θ_0$. For DLNs of width $w$, we show a phase transition w.r.t. the scaling $γ$ of the variance $σ^2=w^{-γ}$ as $w\to\infty$: for large variance ($γ<1$), $θ_0$ is very close to a global minimum but far from any saddle point, and for small variance ($γ>1$), $θ_0$ is close to a saddle point and far from any global minimum. While the first case corresponds to the well-studied NTK regime, the second case is less understood. This motivates the study of the case $γ\to +\infty$, where we conjecture a Saddle-to-Saddle dynamics: throughout training, gradient descent visits the neighborhoods of a sequence of saddles, each corresponding to linear maps of increasing rank, until reaching a sparse global minimum. We support this conjecture with a theorem for the dynamics between the first two saddles, as well as some numerical experiments.
Submitted 31 January, 2022; v1 submitted 30 June, 2021;
originally announced June 2021.
-
Smart Proofs via Smart Contracts: Succinct and Informative Mathematical Derivations via Decentralized Markets
Authors:
Sylvain Carré,
Franck Gabriel,
Clément Hongler,
Gustavo Lacerda,
Gloria Capano
Abstract:
Modern mathematics is built on the idea that proofs should be translatable into formal proofs, whose validity is an objective question, decidable by a computer. Yet, in practice, proofs are informal and may omit many details. An agent considers a proof valid if they trust that it could be expanded into a machine-verifiable proof. A proof's validity can thus become a subjective matter and lead to a debate, which may be difficult to settle. Hence, while the concept of valid proof is well-defined, the process to establish validity is itself a complex multi-agent problem.
We introduce the SPRIG protocol. SPRIG allows agents to propose and verify succinct and informative proofs in a decentralized fashion; the trust is established by agents being able to request more details in the proof steps; debates, if they arise, must isolate details of proofs and, if they persist, go down to machine-level details, where they are automatically settled. A structure of bounties and stakes is set to incentivize agents to act in good faith.
We propose a game-theoretic discussion of SPRIG, showing how agents with various types of information interact, leading to a proof tree with an appropriate level of detail and to the invalidation of wrong proofs, and we discuss resilience against various attacks. We then analyze a simplified model, characterize its equilibria and compute the agents' level of trust.
SPRIG is designed to run as a smart contract on a blockchain platform. This allows anonymous agents to participate in the verification debate, and to contribute with their information. The smart contract mediates the interactions, settles debates, and guarantees that bounties and stakes are paid as specified.
SPRIG enables new applications, such as the issuance of bounties for open problems, and the creation of derivatives markets, allowing agents to inject more information pertaining to proofs.
Submitted 13 October, 2021; v1 submitted 5 February, 2021;
originally announced February 2021.
-
Kernel Alignment Risk Estimator: Risk Prediction from Training Data
Authors:
Arthur Jacot,
Berfin Şimşek,
Francesco Spadaro,
Clément Hongler,
Franck Gabriel
Abstract:
We study the risk (i.e. generalization error) of Kernel Ridge Regression (KRR) for a kernel $K$ with ridge $λ>0$ and i.i.d. observations. For this, we introduce two objects: the Signal Capture Threshold (SCT) and the Kernel Alignment Risk Estimator (KARE). The SCT $\vartheta_{K,λ}$ is a function of the data distribution: it can be used to identify the components of the data that the KRR predictor captures, and to approximate the (expected) KRR risk. This then leads to a KRR risk approximation by the KARE $ρ_{K, λ}$, an explicit function of the training data, agnostic of the true data distribution. We phrase the regression problem in a functional setting. The key results then follow from a finite-size analysis of the Stieltjes transform of general Wishart random matrices. Under a natural universality assumption (that the KRR moments depend asymptotically on the first two moments of the observations) we capture the mean and variance of the KRR predictor. We numerically investigate our findings on the Higgs and MNIST datasets for various classical kernels: the KARE gives an excellent approximation of the risk, thus supporting our universality assumption. Using the KARE, one can compare choices of Kernels and hyperparameters directly from the training set. The KARE thus provides a promising data-dependent procedure to select Kernels that generalize well.
Submitted 17 June, 2020;
originally announced June 2020.
-
Implicit Regularization of Random Feature Models
Authors:
Arthur Jacot,
Berfin Şimşek,
Francesco Spadaro,
Clément Hongler,
Franck Gabriel
Abstract:
Random Feature (RF) models are used as efficient parametric approximations of kernel methods. We investigate, by means of random matrix theory, the connection between Gaussian RF models and Kernel Ridge Regression (KRR). For a Gaussian RF model with $P$ features, $N$ data points, and a ridge $λ$, we show that the average (i.e. expected) RF predictor is close to a KRR predictor with an effective ridge $\tilde{λ}$. We show that $\tilde{λ} > λ$ and $\tilde{λ} \searrow λ$ monotonically as $P$ grows, thus revealing the implicit regularization effect of finite RF sampling. We then compare the risk (i.e. test error) of the $\tilde{λ}$-KRR predictor with the average risk of the $λ$-RF predictor and obtain a precise and explicit bound on their difference. Finally, we empirically find an extremely good agreement between the test errors of the average $λ$-RF predictor and $\tilde{λ}$-KRR predictor.
Submitted 23 September, 2020; v1 submitted 19 February, 2020;
originally announced February 2020.
-
The asymptotic spectrum of the Hessian of DNN throughout training
Authors:
Arthur Jacot,
Franck Gabriel,
Clément Hongler
Abstract:
The dynamics of DNNs during gradient descent is described by the so-called Neural Tangent Kernel (NTK). In this article, we show that the NTK allows one to gain precise insight into the Hessian of the cost of DNNs. When the NTK is fixed during training, we obtain a full characterization of the asymptotics of the spectrum of the Hessian, at initialization and during training. In the so-called mean-field limit, where the NTK is not fixed during training, we describe the first two moments of the Hessian at initialization.
Submitted 10 February, 2020; v1 submitted 1 October, 2019;
originally announced October 2019.
-
Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts
Authors:
Arthur Jacot,
Franck Gabriel,
François Ged,
Clément Hongler
Abstract:
We analyze architectural features of Deep Neural Networks (DNNs) using the so-called Neural Tangent Kernel (NTK), which describes the training and generalization of DNNs in the infinite-width setting. In this setting, we show that for fully-connected DNNs, as the depth grows, two regimes appear: "order", where the (scaled) NTK converges to a constant, and "chaos", where it converges to a Kronecker delta. Extreme order slows down training while extreme chaos hinders generalization. Using the scaled ReLU as a nonlinearity, we end up in the ordered regime. In contrast, Layer Normalization brings the network into the chaotic regime. We observe a similar effect for Batch Normalization (BN) applied after the last nonlinearity. We uncover the same order and chaos modes in Deep Deconvolutional Networks (DC-NNs). Our analysis explains the appearance of so-called checkerboard patterns and border artifacts. Moving the network into the chaotic regime prevents checkerboard patterns; we propose a graph-based parametrization which eliminates border artifacts; finally, we introduce a new layer-dependent learning rate to improve the convergence of DC-NNs. We illustrate our findings on DCGANs: the ordered regime leads to a collapse of the generator to a checkerboard mode, which can be avoided by tuning the nonlinearity to reach the chaotic regime. As a result, we are able to obtain good quality samples for DCGANs without BN.
Submitted 22 June, 2020; v1 submitted 11 July, 2019;
originally announced July 2019.
-
Geometric stochastic heat equations
Authors:
Yvain Bruned,
Franck Gabriel,
Martin Hairer,
Lorenzo Zambotti
Abstract:
We consider a natural class of $\mathbf{R}^d$-valued one-dimensional stochastic PDEs driven by space-time white noise that is formally invariant under the action of the diffeomorphism group on $\mathbf{R}^d$. This class contains in particular the KPZ equation, the multiplicative stochastic heat equation, the additive stochastic heat equation, and rough Burgers-type equations. We exhibit a one-parameter family of solution theories with the following properties:
1. For all SPDEs in our class for which a solution was previously available, every solution in our family coincides with the previously constructed solution, whether that was obtained using Itô calculus (additive and multiplicative stochastic heat equation), rough path theory (rough Burgers-type equations), or the Hopf-Cole transform (KPZ equation).
2. Every solution theory is equivariant under the action of the diffeomorphism group, i.e. identities obtained by formal calculations treating the noise as a smooth function are valid.
3. Every solution theory satisfies an analogue of Itô's isometry.
4. The counterterms leading to our solution theories vanish at points where the equation agrees to leading order with the additive stochastic heat equation.
In particular, points 2 and 3 show that, surprisingly, our solution theories enjoy properties analogous to those holding for both the Stratonovich and Itô interpretations of SDEs simultaneously. For the natural noisy perturbation of the harmonic map flow with values in an arbitrary Riemannian manifold, we show that all these solution theories coincide. In particular, this allows us to conjecturally identify the process associated to the Markov extension of the Dirichlet form corresponding to the $L^2$-gradient flow for the Brownian loop measure.
Submitted 7 February, 2021; v1 submitted 7 February, 2019;
originally announced February 2019.
-
Scaling description of generalization with number of parameters in deep learning
Authors:
Mario Geiger,
Arthur Jacot,
Stefano Spigler,
Franck Gabriel,
Levent Sagun,
Stéphane d'Ascoli,
Giulio Biroli,
Clément Hongler,
Matthieu Wyart
Abstract:
Supervised deep learning involves the training of neural networks with a large number $N$ of parameters. For large enough $N$, in the so-called over-parametrized regime, one can essentially fit the training data points. Sparsity-based arguments would suggest that the generalization error increases as $N$ grows past a certain threshold $N^{*}$. Instead, empirical studies have shown that in the over-parametrized regime, generalization error keeps decreasing with $N$. We resolve this paradox through a new framework. We rely on the so-called Neural Tangent Kernel, which connects large neural nets to kernel methods, to show that the initialization causes finite-size random fluctuations $\|f_{N}-\bar{f}_{N}\|\sim N^{-1/4}$ of the neural net output function $f_{N}$ around its expectation $\bar{f}_{N}$. These affect the generalization error $ε_{N}$ for classification: under natural assumptions, it decays to a plateau value $ε_{\infty}$ in a power-law fashion $\sim N^{-1/2}$. This description breaks down at a so-called jamming transition $N=N^{*}$. At this threshold, we argue that $\|f_{N}\|$ diverges. This result leads to a plausible explanation for the cusp in test error known to occur at $N^{*}$. Our results are confirmed by extensive empirical observations on the MNIST and CIFAR image datasets. Our analysis finally suggests that, given a computational envelope, the smallest generalization error is obtained using several networks of intermediate sizes, just beyond $N^{*}$, and averaging their outputs.
Submitted 8 October, 2019; v1 submitted 6 January, 2019;
originally announced January 2019.
-
Insider Trading with Penalties
Authors:
Sylvain Carré,
Pierre Collin-Dufresne,
Franck Gabriel
Abstract:
We consider a one-period Kyle (1985) framework where the insider can be subject to a penalty if she trades. We establish existence and uniqueness of equilibrium for virtually any penalty function when noise is uniform. In equilibrium, the demand of the insider and the price functions are in general non-linear and remain analytically tractable because the expected price function is linear. We use this result to investigate the trade-off between price efficiency and 'fairness': we consider a regulator that wants to minimise post-trade standard deviation for a given level of uninformed traders' losses. The minimisation is over the function space of penalties; for each possible penalty, our existence and uniqueness theorem allows us to define unambiguously the post-trade standard deviation and the uninformed traders' losses that prevail in equilibrium. Optimal penalties are characterized in closed form. They must increase quickly with the magnitude of the insider's order for small orders and become flat for large orders: in cases where the fundamental realizes at very high or very low values, the insider finds it optimal to trade despite the high penalty. Although such trades, if they occur, are costly for liquidity traders, they signal extreme events and therefore incorporate a lot of information into prices. We generalize this result in two directions by imposing a budget constraint on the regulator and considering the cases of either non-pecuniary or pecuniary penalties. In the first case, we establish that optimal penalties are a subset of the previously optimal penalties: the patterns of equilibrium trade volumes and prices are unchanged. In the second case, we also fully characterize the constrained efficient points and penalties and show that new patterns emerge in the demand schedules of the insider trader and the associated price functions.
Submitted 20 September, 2018;
originally announced September 2018.
-
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Authors:
Arthur Jacot,
Franck Gabriel,
Clément Hongler
Abstract:
At initialization, artificial neural networks (ANNs) are equivalent to Gaussian processes in the infinite-width limit, thus connecting them to kernel methods. We prove that the evolution of an ANN during training can also be described by a kernel: during gradient descent on the parameters of an ANN, the network function $f_θ$ (which maps input vectors to output vectors) follows the kernel gradient of the functional cost (which is convex, in contrast to the parameter cost) w.r.t. a new kernel: the Neural Tangent Kernel (NTK). This kernel is central to describing the generalization features of ANNs. While the NTK is random at initialization and varies during training, in the infinite-width limit it converges to an explicit limiting kernel and it stays constant during training. This makes it possible to study the training of ANNs in function space instead of parameter space. Convergence of the training can then be related to the positive-definiteness of the limiting NTK. We prove the positive-definiteness of the limiting NTK when the data is supported on the sphere and the non-linearity is non-polynomial. We then focus on the setting of least-squares regression and show that in the infinite-width limit, the network function $f_θ$ follows a linear differential equation during training. The convergence is fastest along the largest kernel principal components of the input data with respect to the NTK, hence suggesting a theoretical motivation for early stopping. Finally, we study the NTK numerically, observe its behavior for wide networks, and compare it to the infinite-width limit.
Submitted 10 February, 2020; v1 submitted 20 June, 2018;
originally announced June 2018.
-
Large permutation invariant random matrices are asymptotically free over the diagonal
Authors:
Benson Au,
Guillaume Cébron,
Antoine Dahlqvist,
Franck Gabriel,
Camille Male
Abstract:
We prove that independent families of permutation invariant random matrices are asymptotically free over the diagonal, both in probability and in expectation, under a uniform boundedness assumption on the operator norm. We can relax the operator norm assumption to an estimate on sums associated to graphs of matrices, further extending the range of applications (for example, to Wigner matrices with exploding moments and so the sparse regime of the Erdős-Rényi model). The result still holds even if the matrices are multiplied entrywise by bounded random variables (for example, as in the case of matrices with a variance profile and percolation models).
Submitted 18 May, 2018;
originally announced May 2018.
-
Multipath Communication with Finite Sliding Window Network Coding for Ultra-Reliability and Low Latency
Authors:
Frank Gabriel,
Anil Kumar Chorppath,
Ievgenii Tsokalo,
Frank H. P. Fitzek
Abstract:
We use a random linear network coding (RLNC) based scheme for multipath communication in the presence of lossy links with different delay characteristics to obtain ultra-reliability and low latency. A sliding window version of RLNC is proposed where the coded packets are generated using packets in a window size and are inserted among systematic packets in different paths. The packets are scheduled in the paths in a round robin fashion proportional to the data rates. We use finite encoding and decoding window size and do not rely on feedback for closing the sliding window, unlike the previous work. Our implementation of two paths with LTE and WiFi characteristics shows that the proposed sliding window scheme achieves better latency compared to the block RLNC code. It is also shown that the proposed scheme achieves low latency communication through multiple paths compared to the individual paths for bursty traffic by translating the throughput on both the paths into latency gain.
Submitted 1 February, 2018;
originally announced February 2018.
-
Full-disk Synoptic Observations of the Chromosphere Using H$_α$ Telescope at the Kodaikanal Observatory
Authors:
B. Ravindra,
K. Prabhu,
K. E. Rangarajan,
S. P. Bagare,
Singh Jagdev,
P. M. M. Kemkar,
J. P. Lancelot,
K. C. Thulasidharen,
F. Gabriel,
R. Selvendran
Abstract:
This paper reports on the installation and observations of a new solar telescope installed on 7th October, 2014 at the Kodaikanal Observatory. The telescope is a refractive type equipped with a tunable Lyot H$_α$ filter. A CCD camera of 2k$\times$2k size makes the image of the Sun with a pixel size of 1.21$^{\prime\prime}$ pixel$^{-1}$ with a full field-of-view of 41$^{\prime}$. The telescope is equipped with a guiding system which keeps the image of the Sun within a few pixels throughout the observations. The FWHM of the Lyot filter is 0.4 Å and the filter is motorized, capable of scanning the H$_α$ line profile at a smaller step size of 0.01 Å. Partial-disk imaging covering about 10$^{\prime}$ is also possible with the help of a relay lens kept in front of the CCD camera. In this paper, we report the detailed specifications of the telescope, filter unit, its installation, observations and the procedures we have followed to calibrate and align the data. We also present preliminary results with this new full-disk telescope.
Submitted 11 August, 2016;
originally announced August 2016.
-
The Makeenko-Migdal equation for Yang-Mills theory on compact surfaces
Authors:
Bruce K. Driver,
Franck Gabriel,
Brian C. Hall,
Todd Kemp
Abstract:
We prove the Makeenko-Migdal equation for two-dimensional Euclidean Yang-Mills theory on an arbitrary compact surface, possibly with boundary. In particular, we show that two of the proofs given by the first, third, and fourth authors for the plane case extend essentially without change to compact surfaces.
Submitted 25 January, 2017; v1 submitted 11 February, 2016;
originally announced February 2016.
-
The generalized master fields
Authors:
Guillaume Cébron,
Antoine Dahlqvist,
Franck Gabriel
Abstract:
The master field is the large $N$ limit of the Yang-Mills measure on the Euclidean plane. It can be viewed as a non-commutative process indexed by paths on the plane. We construct and study generalized master fields, called free planar Markovian holonomy fields, which are versions of the master field where the law of a simple loop can be as general as possible. We prove that those free planar Markovian holonomy fields can also be seen as the large $N$ limit of some Markovian holonomy fields on the plane with unitary structure group.
Submitted 2 January, 2016;
originally announced January 2016.
-
A combinatorial theory of random matrices III: random walks on $\mathfrak{S}(N)$, ramified coverings and the $\mathfrak{S}(\infty)$ Yang-Mills measure
Authors:
Franck Gabriel
Abstract:
The aim of this article is to study some asymptotics of a natural model of random ramified coverings on the disk of degree $N$. We prove that the monodromy field, also called the holonomy field, converges in probability to a non-random field as $N$ goes to infinity. In order to do so, we use the fact that the monodromy field of random uniform labelled simple ramified coverings on the disk of degree $N$ has the same law as the $\mathfrak{S}(N)$-Yang-Mills measure associated with the random walk by transposition on $\mathfrak{S}(N)$. This allows us to restrict our study to random walks on $\mathfrak{S}(N)$: we prove theorems about asymptotics of random walks on $\mathfrak{S}(N)$ in a new framework based on the geometric study of partitions and the Schur-Weyl-Jones's dualities. In particular, given a sequence of conjugacy classes $(λ_N \subset \mathfrak{S}(N))_{N \in \mathbb{N}}$, we define a notion of convergence for $(λ_N)_{N \in \mathbb{N}}$ which implies the convergence in non-commutative distribution and in $\mathcal{P}$-expectation of the $λ_N$-random walk to a $\mathcal{P}$-free multiplicative Lévy process. This limiting process is shown not to be a free multiplicative Lévy process and we compute its log-cumulant functional. We also give a criterion on $(λ_N)_{N \in \mathbb{N}}$ in order to know if the limit is random or not.
Submitted 31 October, 2016; v1 submitted 5 October, 2015;
originally announced October 2015.
-
Combinatorial theory of permutation-invariant random matrices II: cumulants, freeness and Levy processes
Authors:
Franck Gabriel
Abstract:
The $\mathcal{A}$-tracial algebras are algebras endowed with multi-linear forms, compatible with the product, and indexed by partitions. Using the notion of $\mathcal{A}$-cumulants, we define and study the $\mathcal{A}$-freeness property, which generalizes the independence and freeness properties, and some invariance properties which model the invariance by conjugation for random matrices. A central limit theorem is given in the setting of $\mathcal{A}$-tracial algebras. A generalization of the normalized moments for random matrices is used to define convergence in $\mathcal{A}$-distribution: this allows us to apply the theory of $\mathcal{A}$-tracial algebras to random matrices. This study is deepened with the use of $\mathcal{A}$-finite dimensional cumulants, which are related to dualities such as the Schur-Weyl duality. This gives a unified and simple framework for understanding families of random matrices which are invariant in law under conjugation by any group whose associated tensor category is spanned by partitions; this includes, for example, the unitary groups and the symmetric groups. Among the various by-products, we prove that unitary invariance and convergence in distribution imply convergence in $\mathcal{P}$-distribution. Besides, new notions of strong asymptotic invariance and independence are shown to imply $\mathcal{A}$-freeness. Finally, we prove general theorems about convergence of matrix-valued additive and multiplicative Lévy processes which are invariant in law by conjugation by the symmetric group. Using these results, a unified point of view on the study of matricial Lévy processes is given.
Submitted 3 November, 2016; v1 submitted 9 July, 2015;
originally announced July 2015.
-
Combinatorial theory of permutation-invariant random matrices I: partitions, geometry and renormalization
Authors:
Franck Gabriel
Abstract:
In this article, we define and study a geometry and an order on the set of partitions of an even number of objects. One of the definitions involves the partition algebra, an algebra structure on the set of such partitions depending on an integer parameter N. Then we emulate the theory of random matrices in a combinatorial framework: for any parameter N, we introduce a family of linear forms on the partition algebras which allows us to define a notion of weak convergence similar to the convergence in moments in random matrix theory. A renormalization of the partition algebras allows us to consider the weak convergence as a simple convergence in a fixed space. This leads us to the definition of a deformed partition algebra for any integer parameter N and to the definition of two transforms: the cumulants transform and the exclusive moments transform. Using an improved triangle inequality for the distance defined on partitions, we prove that the deformed partition algebras, endowed with a deformation of the linear forms, converge as N goes to infinity. This result allows us to prove combinatorial properties about geodesics and a convergence theorem for semi-groups of functions on partitions. At the end, we study a sub-algebra of functions on infinite partitions with finite support: a new addition operation and a notion of R-transform are defined. We introduce the set of multiplicative functions, which becomes a Lie group for the new addition and multiplication operations. For each of them, the Lie algebra is studied. The appropriate tools are developed in order to understand the algebraic fluctuations of the moments and cumulants for converging sequences. This allows us to extend all the results we obtained for the zero order of fluctuations to any order.
Submitted 31 October, 2016; v1 submitted 10 March, 2015;
originally announced March 2015.
-
Planar Markovian Holonomy Fields
Authors:
Franck Gabriel
Abstract:
We study planar random holonomy fields, which are processes indexed by paths on the plane that behave well under the concatenation and orientation-reversing operations on paths. We define the Planar Markovian Holonomy Fields as planar random holonomy fields which satisfy some independence properties and invariance under area-preserving homeomorphisms. We use the theory of braids in the framework of classical probability: for finite and infinite random sequences, the notion of invariance by braids is defined and we prove a new version of de Finetti's theorem. This allows us to construct a family of Planar Markovian Holonomy Fields, the Yang-Mills fields, and we prove that any regular Planar Markovian Holonomy Field is a planar Yang-Mills field. This family of planar Yang-Mills fields can be partitioned into three categories according to the degree of symmetry: we study some equivalent conditions in order to classify them. Finally, we recall the notion of Markovian Holonomy Fields and construct a bridge between the planar and non-planar theories. Using the results previously proved in the article, we compute, for any Markovian Holonomy Field, the "law" of any family of contractible loops drawn on a surface.
Submitted 31 October, 2016; v1 submitted 21 January, 2015;
originally announced January 2015.