Search | arXiv e-print repository

Analysing Multi-Task Regression via Random Matrix Theory with Application to Time Series Forecasting

Authors: Romain Ilbert, Malik Tiomoko, Cosme Louart, Ambroise Odonnat, Vasilii Feofanov, Themis Palpanas, Ievgen Redko

Abstract: In this paper, we introduce a novel theoretical framework for multi-task regression, applying random matrix theory to provide precise performance estimations, under high-dimensional, non-Gaussian data distributions. We formulate a multi-task optimization problem as a regularization technique to enable single-task models to leverage multi-task learning information. We derive a closed-form solution… ▽ More In this paper, we introduce a novel theoretical framework for multi-task regression, applying random matrix theory to provide precise performance estimations, under high-dimensional, non-Gaussian data distributions. We formulate a multi-task optimization problem as a regularization technique to enable single-task models to leverage multi-task learning information. We derive a closed-form solution for multi-task optimization in the context of linear models. Our analysis provides valuable insights by linking the multi-task learning performance to various model statistics such as raw data covariances, signal-generating hyperplanes, noise levels, as well as the size and number of datasets. We finally propose a consistent estimation of training and testing errors, thereby offering a robust foundation for hyperparameter optimization in multi-task regression scenarios. Experimental validations on both synthetic and real-world datasets in regression and multivariate time series forecasting demonstrate improvements on univariate models, incorporating our method into the training loss and thus leveraging multivariate information. △ Less

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2001.08370 [pdf, other]

Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures

Authors: Mohamed El Amine Seddik, Cosme Louart, Mohamed Tamaazousti, Romain Couillet

Abstract: This paper shows that deep learning (DL) representations of data produced by generative adversarial nets (GANs) are random vectors which fall within the class of so-called \textit{concentrated} random vectors. Further exploiting the fact that Gram matrices, of the type $G = X^T X$ with $X=[x_1,\ldots,x_n]\in \mathbb{R}^{p\times n}$ and $x_i$ independent concentrated random vectors from a mixture m… ▽ More This paper shows that deep learning (DL) representations of data produced by generative adversarial nets (GANs) are random vectors which fall within the class of so-called \textit{concentrated} random vectors. Further exploiting the fact that Gram matrices, of the type $G = X^T X$ with $X=[x_1,\ldots,x_n]\in \mathbb{R}^{p\times n}$ and $x_i$ independent concentrated random vectors from a mixture model, behave asymptotically (as $n,p\to \infty$) as if the $x_i$ were drawn from a Gaussian mixture, suggests that DL representations of GAN-data can be fully described by their first two statistical moments for a wide range of standard classifiers. Our theoretical findings are validated by generating images with the BigGAN model and across different popular deep representation networks. △ Less

Submitted 21 January, 2020; originally announced January 2020.

arXiv:1702.05419 [pdf, ps, other]

A Random Matrix Approach to Neural Networks

Authors: Cosme Louart, Zhenyu Liao, Romain Couillet

Abstract: This article studies the Gram random matrix model $G=\frac1TΣ^{\rm T}Σ$, $Σ=σ(WX)$, classically found in the analysis of random feature maps and random neural networks, where $X=[x_1,\ldots,x_T]\in{\mathbb R}^{p\times T}$ is a (data) matrix of bounded norm, $W\in{\mathbb R}^{n\times p}$ is a matrix of independent zero-mean unit variance entries, and $σ:{\mathbb R}\to{\mathbb R}$ is a Lipschitz con… ▽ More This article studies the Gram random matrix model $G=\frac1TΣ^{\rm T}Σ$, $Σ=σ(WX)$, classically found in the analysis of random feature maps and random neural networks, where $X=[x_1,\ldots,x_T]\in{\mathbb R}^{p\times T}$ is a (data) matrix of bounded norm, $W\in{\mathbb R}^{n\times p}$ is a matrix of independent zero-mean unit variance entries, and $σ:{\mathbb R}\to{\mathbb R}$ is a Lipschitz continuous (activation) function --- $σ(WX)$ being understood entry-wise. By means of a key concentration of measure lemma arising from non-asymptotic random matrix arguments, we prove that, as $n,p,T$ grow large at the same rate, the resolvent $Q=(G+γI_T)^{-1}$, for $γ>0$, has a similar behavior as that met in sample covariance matrix models, involving notably the moment $Φ=\frac{T}n{\mathbb E}[G]$, which provides in passing a deterministic equivalent for the empirical spectral measure of $G$. Application-wise, this result enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insights into the underlying mechanisms into play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters. △ Less

Submitted 29 June, 2017; v1 submitted 17 February, 2017; originally announced February 2017.

Showing 1–3 of 3 results for author: Louart, C