Search | arXiv e-print repository

High-dimensional Learning with Noisy Labels

Authors: Aymane El Firdoussi, Mohamed El Amine Seddik

Abstract: This paper provides theoretical insights into high-dimensional binary classification with class-conditional noisy labels. Specifically, we study the behavior of a linear classifier with a label noisiness aware loss function, when both the dimension of data $p$ and the sample size $n$ are large and comparable. Relying on random matrix theory by supposing a Gaussian mixture data model, the performan… ▽ More This paper provides theoretical insights into high-dimensional binary classification with class-conditional noisy labels. Specifically, we study the behavior of a linear classifier with a label noisiness aware loss function, when both the dimension of data $p$ and the sample size $n$ are large and comparable. Relying on random matrix theory by supposing a Gaussian mixture data model, the performance of the linear classifier when $p,n\to \infty$ is shown to converge towards a limit, involving scalar statistics of the data. Importantly, our findings show that the low-dimensional intuitions to handle label noise do not hold in high-dimension, in the sense that the optimal classifier in low-dimension dramatically fails in high-dimension. Based on our derivations, we design an optimized method that is shown to be provably more efficient in handling noisy labels in high dimensions. Our theoretical conclusions are further confirmed by experiments on real datasets, where we show that our optimized approach outperforms the considered baselines. △ Less

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2402.10677 [pdf, other]

Performance Gaps in Multi-view Clustering under the Nested Matrix-Tensor Model

Authors: Hugo Lebeau, Mohamed El Amine Seddik, José Henrique de Morais Goulart

Abstract: We study the estimation of a planted signal hidden in a recently introduced nested matrix-tensor model, which is an extension of the classical spiked rank-one tensor model, motivated by multi-view clustering. Prior work has theoretically examined the performance of a tensor-based approach, which relies on finding a best rank-one approximation, a problem known to be computationally hard. A tractabl… ▽ More We study the estimation of a planted signal hidden in a recently introduced nested matrix-tensor model, which is an extension of the classical spiked rank-one tensor model, motivated by multi-view clustering. Prior work has theoretically examined the performance of a tensor-based approach, which relies on finding a best rank-one approximation, a problem known to be computationally hard. A tractable alternative approach consists in computing instead the best rank-one (matrix) approximation of an unfolding of the observed tensor data, but its performance was hitherto unknown. We quantify here the performance gap between these two approaches, in particular by deriving the precise algorithmic threshold of the unfolding approach and demonstrating that it exhibits a BBP-type transition behavior. This work is therefore in line with recent contributions which deepen our understanding of why tensor-based methods surpass matrix-based methods in handling structured tensor data. △ Less

Submitted 16 February, 2024; originally announced February 2024.

arXiv:2310.18717 [pdf, other]

On the Accuracy of Hotelling-Type Asymmetric Tensor Deflation: A Random Tensor Analysis

Authors: Mohamed El Amine Seddik, Maxime Guillaud, Alexis Decurninge, José Henrique de Morais Goulart

Abstract: This work introduces an asymptotic study of Hotelling-type tensor deflation in the presence of noise, in the regime of large tensor dimensions. Specifically, we consider a low-rank asymmetric tensor model of the form $\sum_{i=1}^r β_i{\mathcal{A}}_i + {\mathcal{W}}$ where $β_i\geq 0$ and the ${\mathcal{A}}_i$'s are unit-norm rank-one tensors such that… ▽ More This work introduces an asymptotic study of Hotelling-type tensor deflation in the presence of noise, in the regime of large tensor dimensions. Specifically, we consider a low-rank asymmetric tensor model of the form $\sum_{i=1}^r β_i{\mathcal{A}}_i + {\mathcal{W}}$ where $β_i\geq 0$ and the ${\mathcal{A}}_i$'s are unit-norm rank-one tensors such that $\left| \langle {\mathcal{A}}_i, {\mathcal{A}}_j \rangle \right| \in [0, 1]$ for $i\neq j$ and ${\mathcal{W}}$ is an additive noise term. Assuming that the dominant components are successively estimated from the noisy observation and subsequently subtracted, we leverage recent advances in random tensor theory in the regime of asymptotically large tensor dimensions to analytically characterize the estimated singular values and the alignment of estimated and true singular vectors at each step of the deflation procedure. Furthermore, this result can be used to construct estimators of the signal-to-noise ratios $β_i$ and the alignments between the estimated and true rank-1 signal components. △ Less

Submitted 28 October, 2023; originally announced October 2023.

Comments: Accepted at IEEE CAMSAP 2023. See also companion paper arXiv:2304.10248 for the symmetric case. arXiv admin note: text overlap with arXiv:2211.09004

arXiv:2305.19992 [pdf, ps, other]

A Nested Matrix-Tensor Model for Noisy Multi-view Clustering

Authors: Mohamed El Amine Seddik, Mastane Achab, Henrique Goulart, Merouane Debbah

Abstract: In this paper, we propose a nested matrix-tensor model which extends the spiked rank-one tensor model of order three. This model is particularly motivated by a multi-view clustering problem in which multiple noisy observations of each data point are acquired, with potentially non-uniform variances along the views. In this case, data can be naturally represented by an order-three tensor where the v… ▽ More In this paper, we propose a nested matrix-tensor model which extends the spiked rank-one tensor model of order three. This model is particularly motivated by a multi-view clustering problem in which multiple noisy observations of each data point are acquired, with potentially non-uniform variances along the views. In this case, data can be naturally represented by an order-three tensor where the views are stacked. Given such a tensor, we consider the estimation of the hidden clusters via performing a best rank-one tensor approximation. In order to study the theoretical performance of this approach, we characterize the behavior of this best rank-one approximation in terms of the alignments of the obtained component vectors with the hidden model parameter vectors, in the large-dimensional regime. In particular, we show that our theoretical results allow us to anticipate the exact accuracy of the proposed clustering approach. Furthermore, numerical experiments indicate that leveraging our tensor-based approach yields better accuracy compared to a naive unfolding-based algorithm which ignores the underlying low-rank tensor structure. Our analysis unveils unexpected and non-trivial phase transition phenomena depending on the model parameters, ``interpolating'' between the typical behavior observed for the spiked matrix and tensor models. △ Less

Submitted 31 May, 2023; originally announced May 2023.

arXiv:2304.10248 [pdf, ps, other]

Hotelling Deflation on Large Symmetric Spiked Tensors

Authors: Mohamed El Amine Seddik, José Henrique de Morais Goulart, Maxime Guillaud

Abstract: This paper studies the deflation algorithm when applied to estimate a low-rank symmetric spike contained in a large tensor corrupted by additive Gaussian noise. Specifically, we provide a precise characterization of the large-dimensional performance of deflation in terms of the alignments of the vectors obtained by successive rank-1 approximation and of their estimated weights, assuming non-trivia… ▽ More This paper studies the deflation algorithm when applied to estimate a low-rank symmetric spike contained in a large tensor corrupted by additive Gaussian noise. Specifically, we provide a precise characterization of the large-dimensional performance of deflation in terms of the alignments of the vectors obtained by successive rank-1 approximation and of their estimated weights, assuming non-trivial (fixed) correlations among spike components. Our analysis allows an understanding of the deflation mechanism in the presence of noise and can be exploited for designing more efficient signal estimation methods. △ Less

Submitted 20 April, 2023; originally announced April 2023.

Comments: 4 pages, 1 figure

arXiv:2302.05798 [pdf, ps, other]

Optimizing Orthogonalized Tensor Deflation via Random Tensor Theory

Authors: Mohamed El Amine Seddik, Mohammed Mahfoud, Merouane Debbah

Abstract: This paper tackles the problem of recovering a low-rank signal tensor with possibly correlated components from a random noisy tensor, or so-called spiked tensor model. When the underlying components are orthogonal, they can be recovered efficiently using tensor deflation which consists of successive rank-one approximations, while non-orthogonal components may alter the tensor deflation mechanism,… ▽ More This paper tackles the problem of recovering a low-rank signal tensor with possibly correlated components from a random noisy tensor, or so-called spiked tensor model. When the underlying components are orthogonal, they can be recovered efficiently using tensor deflation which consists of successive rank-one approximations, while non-orthogonal components may alter the tensor deflation mechanism, thereby preventing efficient recovery. Relying on recently developed random tensor tools, this paper deals precisely with the non-orthogonal case by deriving an asymptotic analysis of a parameterized deflation procedure performed on an order-three and rank-two spiked tensor. Based on this analysis, an efficient tensor deflation algorithm is proposed by optimizing the parameter introduced in the deflation mechanism, which in turn is proven to be optimal by construction for the studied tensor model. The same ideas could be extended to more general low-rank tensor models, e.g., higher ranks and orders, leading to more efficient tensor methods with a broader impact on machine learning and beyond. △ Less

Submitted 16 March, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

arXiv:2211.09004 [pdf, other]

On the Accuracy of Hotelling-Type Tensor Deflation: A Random Tensor Analysis

Authors: Mohamed El Amine Seddik, Maxime Guillaud, Alexis Decurninge

Abstract: Leveraging on recent advances in random tensor theory, we consider in this paper a rank-$r$ asymmetric spiked tensor model of the form $\sum_{i=1}^r β_i A_i + W$ where $β_i\geq 0$ and the $A_i$'s are rank-one tensors such that $\langle A_i, A_j \rangle\in [0, 1]$ for $i\neq j$, based on which we provide an asymptotic study of Hotelling-type tensor deflation in the large dimensional regime. Specifi… ▽ More Leveraging on recent advances in random tensor theory, we consider in this paper a rank-$r$ asymmetric spiked tensor model of the form $\sum_{i=1}^r β_i A_i + W$ where $β_i\geq 0$ and the $A_i$'s are rank-one tensors such that $\langle A_i, A_j \rangle\in [0, 1]$ for $i\neq j$, based on which we provide an asymptotic study of Hotelling-type tensor deflation in the large dimensional regime. Specifically, our analysis characterizes the singular values and alignments at each step of the deflation procedure, for asymptotically large tensor dimensions. This can be used to construct consistent estimators of different quantities involved in the underlying problem, such as the signal-to-noise ratios $β_i$ or the alignments between the different signal components $\langle A_i, A_j \rangle$. △ Less

Submitted 16 November, 2022; originally announced November 2022.

arXiv:2112.12348 [pdf, other]

When Random Tensors meet Random Matrices

Authors: Mohamed El Amine Seddik, Maxime Guillaud, Romain Couillet

Abstract: Relying on random matrix theory (RMT), this paper studies asymmetric order-$d$ spiked tensor models with Gaussian noise. Using the variational definition of the singular vectors and values of (Lim, 2005), we show that the analysis of the considered model boils down to the analysis of an equivalent spiked symmetric \textit{block-wise} random matrix, that is constructed from \textit{contractions} of… ▽ More Relying on random matrix theory (RMT), this paper studies asymmetric order-$d$ spiked tensor models with Gaussian noise. Using the variational definition of the singular vectors and values of (Lim, 2005), we show that the analysis of the considered model boils down to the analysis of an equivalent spiked symmetric \textit{block-wise} random matrix, that is constructed from \textit{contractions} of the studied tensor with the singular vectors associated to its best rank-1 approximation. Our approach allows the exact characterization of the almost sure asymptotic singular value and alignments of the corresponding singular vectors with the true spike components, when $\frac{n_i}{\sum_{j=1}^d n_j}\to c_i\in (0, 1)$ with $n_i$'s the tensor dimensions. In contrast to other works that rely mostly on tools from statistical physics to study random tensors, our results rely solely on classical RMT tools such as Stein's lemma. Finally, classical RMT results concerning spiked random matrices are recovered as a particular case. △ Less

Submitted 19 November, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

arXiv:2109.01785 [pdf, other]

Node Feature Kernels Increase Graph Convolutional Network Robustness

Authors: Mohamed El Amine Seddik, Changmin Wu, Johannes F. Lutzeyer, Michalis Vazirgiannis

Abstract: The robustness of the much-used Graph Convolutional Networks (GCNs) to perturbations of their input is becoming a topic of increasing importance. In this paper, the random GCN is introduced for which a random matrix theory analysis is possible. This analysis suggests that if the graph is sufficiently perturbed, or in the extreme case random, then the GCN fails to benefit from the node features. It… ▽ More The robustness of the much-used Graph Convolutional Networks (GCNs) to perturbations of their input is becoming a topic of increasing importance. In this paper, the random GCN is introduced for which a random matrix theory analysis is possible. This analysis suggests that if the graph is sufficiently perturbed, or in the extreme case random, then the GCN fails to benefit from the node features. It is furthermore observed that enhancing the message passing step in GCNs by adding the node feature kernel to the adjacency matrix of the graph structure solves this problem. An empirical study of a GCN utilised for node classification on six real datasets further confirms the theoretical findings and demonstrates that perturbations of the graph structure can result in GCNs performing significantly worse than Multi-Layer Perceptrons run on the node features alone. In practice, adding a node feature kernel to the message passing of perturbed graphs results in a significant improvement of the GCN's performance, thereby rendering it more robust to graph perturbations. Our code is publicly available at:https://github.com/ChangminWu/RobustGCN. △ Less

Submitted 21 February, 2022; v1 submitted 4 September, 2021; originally announced September 2021.

Comments: 16 pages, 5 figures

arXiv:2001.08370 [pdf, other]

Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures

Authors: Mohamed El Amine Seddik, Cosme Louart, Mohamed Tamaazousti, Romain Couillet

Abstract: This paper shows that deep learning (DL) representations of data produced by generative adversarial nets (GANs) are random vectors which fall within the class of so-called \textit{concentrated} random vectors. Further exploiting the fact that Gram matrices, of the type $G = X^T X$ with $X=[x_1,\ldots,x_n]\in \mathbb{R}^{p\times n}$ and $x_i$ independent concentrated random vectors from a mixture m… ▽ More This paper shows that deep learning (DL) representations of data produced by generative adversarial nets (GANs) are random vectors which fall within the class of so-called \textit{concentrated} random vectors. Further exploiting the fact that Gram matrices, of the type $G = X^T X$ with $X=[x_1,\ldots,x_n]\in \mathbb{R}^{p\times n}$ and $x_i$ independent concentrated random vectors from a mixture model, behave asymptotically (as $n,p\to \infty$) as if the $x_i$ were drawn from a Gaussian mixture, suggests that DL representations of GAN-data can be fully described by their first two statistical moments for a wide range of standard classifiers. Our theoretical findings are validated by generating images with the BigGAN model and across different popular deep representation networks. △ Less

Submitted 21 January, 2020; originally announced January 2020.

Showing 1–10 of 10 results for author: Seddik, M E A