
One-Bit Total Variation Denoising over Networks with Applications to Partially Observed Epidemics

Authors: Claire Donnat, Olga Klopp, Nicolas Verzelen

Abstract: This paper introduces a novel approach for epidemic nowcasting and forecasting over networks using total variation (TV) denoising, a method inspired by classical signal processing techniques. Considering a network that models a population as a set of $n$ nodes characterized by their infection statuses $Y_i$ and that represents contacts as edges, we prove the consistency of graph-TV denoising for estimating the underlying infection probabilities $\{p_i\}_{ i \in \{1,\cdots, n\}}$ in the presence of Bernoulli noise. Our results provide an important extension of existing bounds derived in the Gaussian case to the study of binary variables -- an approach hereafter referred to as one-bit total variation denoising. The methodology is further extended to handle incomplete observations, thereby expanding its relevance to various real-world situations where observations over the full graph may not be accessible. Focusing on the context of epidemics, we establish that one-bit total variation denoising enhances both nowcasting and forecasting accuracy in networks, as further evidenced by comprehensive numerical experiments and two real-world examples. The contributions of this paper lie in its theoretical developments, particularly in addressing the incomplete data case, thereby paving the way for more precise epidemic modelling and enhanced surveillance strategies in practical settings.

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2404.17209 [pdf, other]

Generalized multi-view model: Adaptive density estimation under low-rank constraints

Authors: Julien Chhor, Olga Klopp, Alexandre Tsybakov

Abstract: We study the problem of bivariate discrete or continuous probability density estimation under low-rank constraints. For discrete distributions, we assume that the two-dimensional array to estimate is a low-rank probability matrix. In the continuous case, we assume that the density with respect to the Lebesgue measure satisfies a generalized multi-view model, meaning that it is $β$-Hölder and can be decomposed as a sum of $K$ components, each of which is a product of one-dimensional functions. In both settings, we propose estimators that achieve, up to logarithmic factors, the minimax optimal convergence rates under such low-rank constraints. In the discrete case, the proposed estimator is adaptive to the rank $K$. In the continuous case, our estimator converges with the $L_1$ rate $\min((K/n)^{β/(2β+1)}, n^{-β/(2β+2)})$ up to logarithmic factors, and it is adaptive to the unknown support as well as to the smoothness $β$ and to the unknown number of separable components $K$. We present efficient algorithms for computing our estimators.

Submitted 18 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.
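
As a reading aid for the $L_1$ rate quoted in the abstract above, the two terms inside the minimum can be compared directly. The short derivation below is a worked step added here, not taken from the paper, and it ignores the logarithmic factors; it indicates that the low-rank term $(K/n)^{β/(2β+1)}$ is the smaller one when $K \le n^{1/(2β+2)}$.

```latex
\left(\frac{K}{n}\right)^{\frac{\beta}{2\beta+1}} \le n^{-\frac{\beta}{2\beta+2}}
\iff \frac{K}{n} \le n^{-\frac{2\beta+1}{2\beta+2}}
\iff K \le n^{\,1-\frac{2\beta+1}{2\beta+2}} = n^{\frac{1}{2\beta+2}}.
```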

arXiv:2305.00311 [pdf, ps, other]

Change point detection in low-rank VAR processes

Authors: Farida Enikeeva, Olga Klopp, Mathilde Rousselot

Abstract: Vector autoregressive (VAR) models are widely used in multivariate time series analysis for describing the short-time dynamics of the data. The reduced-rank VAR models are of particular interest when dealing with high-dimensional and highly correlated time series. Many results for these models are based on the stationarity assumption, which does not hold in several applications when the data exhibits structural breaks. We consider a low-rank piecewise stationary VAR model with possible changes in the transition matrix of the observed process. We develop a new test for the presence of a change-point in the transition matrix and show its minimax optimality with respect to the dimension and the sample size. Our two-step change-point detection strategy is based on the construction of estimators for the transition matrices and using them in a penalized version of the likelihood ratio test statistic. The effectiveness of the proposed procedure is illustrated on synthetic data.

Submitted 29 April, 2023; originally announced May 2023.

arXiv:2111.03305 [pdf, other]

Optimality of variational inference for stochastic block model with missing links

Authors: Solenne Gaucher, Olga Klopp

Abstract: Variational methods are extremely popular in the analysis of network data. Statistical guarantees obtained for these methods typically provide asymptotic normality for the problem of estimation of global model parameters under the stochastic block model. In the present work, we consider the case of networks with missing links, which is important in applications, and show that the variational approximation to the maximum likelihood estimator converges at the minimax rate. This provides the first minimax optimal and tractable estimator for the problem of parameter estimation for the stochastic block model with missing links. We complement our results with numerical studies of simulated and real networks, which confirm the advantages of this estimator over current methods.

Submitted 5 November, 2021; originally announced November 2021.

arXiv:2107.03684 [pdf, other]

Assigning Topics to Documents by Successive Projections

Authors: Olga Klopp, Maxim Panov, Suzanne Sigalla, Alexandre Tsybakov

Abstract: Topic models provide a useful tool to organize and understand the structure of large corpora of text documents, in particular, to discover hidden thematic structure. Clustering documents from big unstructured corpora into topics is an important task in various areas, such as image analysis, e-commerce, social networks, population genetics. A common approach to topic modeling is to associate each topic with a probability distribution on the dictionary of words and to consider each document as a mixture of topics. Since the number of topics is typically substantially smaller than the size of the corpus and of the dictionary, the methods of topic modeling can lead to a dramatic dimension reduction. In this paper, we study the problem of estimating the topic distribution for each document in the given corpus, that is, we focus on the clustering aspect of the problem. We introduce an algorithm that we call Successive Projection Overlapping Clustering (SPOC) inspired by the Successive Projection Algorithm for separable matrix factorization. This algorithm is simple to implement and computationally fast. We establish theoretical guarantees on the performance of the SPOC algorithm, in particular, near matching minimax upper and lower bounds on its estimation risk. We also propose a new method that estimates the number of topics. We complement our theoretical results with a numerical study on synthetic and semi-synthetic data to analyze the performance of this new algorithm in practice. One of the conclusions is that the error of the algorithm grows at most logarithmically with the size of the dictionary, in contrast to what one observes for Latent Dirichlet Allocation.

Submitted 8 July, 2021; originally announced July 2021.
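
The abstract above names the Successive Projection Algorithm (SPA) as the inspiration for SPOC. The following is a minimal sketch of the classical SPA only, not of SPOC itself, whose details are not given in this listing. It assumes the rows of the input are convex combinations of a few "pure" rows, and the function name `successive_projection` is a hypothetical choice made for this illustration.

```python
import numpy as np

def successive_projection(X, r):
    """Classical SPA sketch: pick r row indices of X that act as 'pure' rows.

    Assumes (for illustration) that every row of X is a convex combination
    of r linearly independent vertex rows that themselves appear in X.
    """
    R = np.array(X, dtype=float)  # residual matrix, updated in place of X
    selected = []
    for _ in range(r):
        norms = np.sum(R ** 2, axis=1)
        j = int(np.argmax(norms))       # row with the largest residual norm
        if norms[j] == 0.0:             # nothing left to select
            break
        selected.append(j)
        u = R[j] / np.sqrt(norms[j])    # unit vector along the selected row
        R = R - np.outer(R @ u, u)      # project all rows off that direction
    return selected

# Toy usage: three pure rows followed by two mixtures of them.
V = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
W = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
              [0.5, 0.5, 0], [1 / 3, 1 / 3, 1 / 3]])
print(successive_projection(W @ V, 3))  # prints [2, 1, 0]
```

In this toy input the selected indices are exactly the three pure rows, found in order of decreasing norm. How SPOC uses such a step on document data is described in the paper, not here.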

arXiv:2106.14470 [pdf, ps, other]

Change-Point Detection in Dynamic Networks with Missing Links

Authors: Farida Enikeeva, Olga Klopp

Abstract: Structural changes occur in dynamic networks quite frequently and their detection is an important question in many situations such as fraud detection or cybersecurity. Real-life networks are often incompletely observed due to individual non-response or network size. In the present paper we consider the problem of change-point detection in a temporal sequence of partially observed networks. The goal is to test whether there is a change in the network parameters. Our approach is based on the Matrix CUSUM test statistic and allows growing size of networks. We show that the proposed test is minimax optimal and robust to missing links. We also demonstrate the good behavior of our approach in practice through a simulation study and a real-data application.

Submitted 28 June, 2021; originally announced June 2021.

arXiv:1911.13122 [pdf, other]

Outliers Detection in Networks with Missing Links

Authors: Solenne Gaucher, Olga Klopp, Geneviève Robin

Abstract: Outliers arise in networks due to different reasons such as fraudulent behavior of malicious users or defects in measurement instruments and can significantly impair network analyses. In addition, real-life networks are likely to be incompletely observed, with missing links due to individual non-response or machine failures. Identifying outliers in the presence of missing links is therefore a crucial problem in network analysis. In this work, we introduce a new algorithm to detect outliers in a network that simultaneously predicts the missing links. The proposed method is statistically sound: we prove that, under fairly general assumptions, our algorithm exactly detects the outliers, and achieves the best known error for the prediction of missing links with polynomial computation cost. It is also computationally efficient: we prove sub-linear convergence of our algorithm. We provide a simulation study which demonstrates the good behavior of the algorithm in terms of outliers detection and prediction of the missing links. We also illustrate the method with an application in epidemiology, and with the analysis of a political Twitter network. The method is freely available as an R package on the Comprehensive R Archive Network.

Submitted 1 December, 2020; v1 submitted 29 November, 2019; originally announced November 2019.

arXiv:1902.10605 [pdf, other]

Maximum Likelihood Estimation of Sparse Networks with Missing Observations

Authors: Solenne Gaucher, Olga Klopp

Abstract: Estimating the matrix of connection probabilities is one of the key questions when studying sparse networks. In this work, we consider networks generated under the sparse graphon model and the inhomogeneous random graph model with missing observations. Using the Stochastic Block Model as a parametric proxy, we bound the risk of the maximum likelihood estimator of network connection probabilities, and show that it is minimax optimal. When risk is measured in Frobenius norm, no estimator running in polynomial time has been shown to attain the minimax optimal rate of convergence for this problem. Thus, maximum likelihood estimation is of particular interest as computationally efficient approximations to it have been proposed in the literature and are often used in practice.

Submitted 27 April, 2021; v1 submitted 27 February, 2019; originally announced February 2019.

Comments: We derive a variational approximation to the maximum likelihood estimator of the connection probabilities. We bound the risk of this tractable estimator

arXiv:1812.08398 [pdf, other]

Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames

Authors: Geneviève Robin, Hoi-To Wai, Julie Josse, Olga Klopp, Éric Moulines

Abstract: Many applications of machine learning involve the analysis of large data frames (matrices collecting heterogeneous measurements, such as binary, numerical or count data, across samples) with missing values. Low-rank models, as studied by Udell et al. [30], are popular in this framework for tasks such as visualization, clustering and missing value imputation. Yet, available methods with statistical guarantees and efficient optimization do not allow explicit modeling of main additive effects such as row and column, or covariate effects. In this paper, we introduce a low-rank interaction and sparse additive effects (LORIS) model which combines matrix regression on a dictionary and low-rank design, to estimate main effects and interactions simultaneously. We provide statistical guarantees in the form of upper bounds on the estimation error of both components. Then, we introduce a mixed coordinate gradient descent (MCGD) method which provably converges sub-linearly to an optimal solution and is computationally efficient for large scale data sets. We show on simulated and survey data that the method has a clear advantage over current practices, which consist in dealing separately with additive effects in a preprocessing step.

Submitted 20 December, 2018; originally announced December 2018.

arXiv:1807.09010 [pdf, other]

Collective Matrix Completion

Authors: Mokhtar Z. Alaya, Olga Klopp

Abstract: Matrix completion aims to reconstruct a data matrix based on observations of a small number of its entries. Usually in matrix completion a single matrix is considered, which can be, for example, a rating matrix in a recommendation system. However, in practical situations, data is often obtained from multiple sources, which results in a collection of matrices rather than a single one. In this work, we consider the problem of collective matrix completion with multiple and heterogeneous matrices, which can be count, binary, continuous, etc. We first investigate the setting where, for each source, the matrix entries are sampled from an exponential family distribution. Then, we relax the assumption of exponential family distribution for the noise and we investigate the distribution-free case. In this setting, we do not assume any specific model for the observations. The estimation procedures are based on minimizing the sum of a goodness-of-fit term and the nuclear norm penalization of the whole collective matrix. We prove that the proposed estimators achieve fast rates of convergence under the two considered settings and we corroborate our results with numerical experiments.

Submitted 21 October, 2019; v1 submitted 24 July, 2018; originally announced July 2018.

arXiv:1806.09734 [pdf, other]

Main effects and interactions in mixed and incomplete data frames

Authors: Geneviève Robin, Olga Klopp, Julie Josse, Éric Moulines, Robert Tibshirani

Abstract: A mixed data frame (MDF) is a table collecting categorical, numerical and count observations. The use of MDF is widespread in statistics and the applications are numerous, from abundance data in ecology to recommender systems. In many cases, an MDF exhibits simultaneously main effects, such as row, column or group effects, and interactions, for which a low-rank model has often been suggested. Although the literature on low-rank approximations is very substantial, with few exceptions, existing methods do not allow incorporating main effects and interactions while providing statistical guarantees. The present work fills this gap. We propose an estimation method which allows simultaneous recovery of the main effects and the interactions. We show that our method is near optimal under conditions which are met in our targeted applications. We also propose an optimization algorithm which provably converges to an optimal solution. Numerical experiments reveal that our method, mimi, performs well when the main effects are sparse and the interaction matrix has low rank. We also show that mimi compares favorably to existing methods, in particular when the main effects are significantly large compared to the interactions, and when the proportion of missing entries is large. The method is available as an R package on the Comprehensive R Archive Network.

Submitted 26 March, 2019; v1 submitted 25 June, 2018; originally announced June 2018.

Comments: 25 pages, 1 figure, 4 tables

arXiv:1707.02090 [pdf, ps, other]

Structured Matrix Estimation and Completion

Authors: Olga Klopp, Yu Lu, Alexandre B. Tsybakov, Harrison H. Zhou

Abstract: We study the problem of matrix estimation and matrix completion under a general framework. This framework includes several important models as special cases such as the Gaussian mixture model, mixed membership model, bi-clustering model and dictionary learning. We consider the optimal convergence rates in a minimax sense for estimation of the signal matrix under the Frobenius norm and under the spectral norm. As a consequence of our general result we obtain minimax optimal rates of convergence for various special models.

Submitted 7 July, 2017; originally announced July 2017.

arXiv:1704.02760 [pdf, ps, other]

Constructing confidence sets for the matrix completion problem

Authors: Alexandra Carpentier, Olga Klopp, Matthias Löffler

Abstract: In the present note we consider the problem of constructing honest and adaptive confidence sets for the matrix completion problem. For the Bernoulli model with known variance of the noise we provide a realizable method for constructing confidence sets that adapt to the unknown rank of the true matrix.

Submitted 10 April, 2017; originally announced April 2017.

arXiv:1703.05101 [pdf, ps, other]

Optimal graphon estimation in cut distance

Authors: Olga Klopp, Nicolas Verzelen

Abstract: Consider the twin problems of estimating the connection probability matrix of an inhomogeneous random graph and the graphon of a W-random graph. We establish the minimax estimation rates with respect to the cut metric for classes of block constant matrices and step function graphons. Surprisingly, our results imply that, from the minimax point of view, the raw data, that is, the adjacency matrix of the observed graph, is already optimal and more involved procedures cannot improve the convergence rates for this metric. This phenomenon contrasts with optimal rates of convergence with respect to other classical distances for graphons such as the $l_1$ or $l_2$ metrics.

Submitted 16 October, 2018; v1 submitted 15 March, 2017; originally announced March 2017.

arXiv:1608.04861 [pdf, ps, other]

Adaptive confidence sets for matrix completion

Authors: Alexandra Carpentier, Olga Klopp, Matthias Löffler, Richard Nickl

Abstract: In the present paper we study the problem of existence of honest and adaptive confidence sets for matrix completion. We consider two statistical models: the trace regression model and the Bernoulli model. In the trace regression model, we show that honest confidence sets that adapt to the unknown rank of the matrix exist even when the error variance is unknown. Contrary to this, we prove that in the Bernoulli model, honest and adaptive confidence sets exist only when the error variance is known a priori. In the course of our proofs we obtain bounds for the minimax rates of certain composite hypothesis testing problems arising in low rank inference.

Submitted 6 February, 2017; v1 submitted 17 August, 2016; originally announced August 2016.

arXiv:1509.00319 [pdf, ps, other]

Estimation of matrices with row sparsity

Authors: O. Klopp, A. B. Tsybakov

Abstract: An increasing number of applications is concerned with recovering a sparse matrix from noisy observations. In this paper, we consider the setting where each row of the unknown matrix is sparse. We establish minimax optimal rates of convergence for estimating matrices with row sparsity. A major focus in the present paper is on the derivation of lower bounds.

Submitted 1 September, 2015; originally announced September 2015.

arXiv:1507.04118 [pdf, ps, other]

Oracle inequalities for network models and sparse graphon estimation

Authors: Olga Klopp, Alexandre B. Tsybakov, Nicolas Verzelen

Abstract: Inhomogeneous random graph models encompass many network models such as stochastic block models and latent position models. We consider the problem of statistical estimation of the matrix of connection probabilities based on the observations of the adjacency matrix of the network. Taking the stochastic block model as an approximation, we construct estimators of network connection probabilities -- the ordinary block constant least squares estimator, and its restricted version. We show that they satisfy oracle inequalities with respect to the block constant oracle. As a consequence, we derive optimal rates of estimation of the probability matrix. Our results cover the important setting of sparse networks. Another consequence consists in establishing upper bounds on the minimax risks for graphon estimation in the $L_2$ norm when the probability matrix is sampled according to a graphon model. These bounds include an additional term accounting for the "agnostic" error induced by the variability of the latent unobserved variables of the graphon model. In this setting, the optimal rates are influenced not only by the bias and variance components as in usual nonparametric problems but also by a third component, which is the agnostic error. The results shed light on the differences between estimation under the empirical loss (the probability matrix estimation) and under the integrated loss (the graphon estimation).

Submitted 13 September, 2017; v1 submitted 15 July, 2015; originally announced July 2015.

Comments: Annals of Statistics, Institute of Mathematical Statistics, 2017

arXiv:1502.00146 [pdf, ps, other]

Matrix completion by singular value thresholding: sharp bounds

Authors: Olga Klopp

Abstract: We consider the matrix completion problem where the aim is to estimate a large data matrix for which only a relatively small random subset of its entries is observed. Iterative thresholding methods are popular approaches to the matrix completion problem. In spite of their empirical success, the theoretical guarantees of such iterative thresholding methods are poorly understood. The goal of this paper is to provide strong theoretical guarantees, similar to those obtained for nuclear-norm penalization methods and one step thresholding methods, for an iterative thresholding algorithm which is a modification of the softImpute algorithm. An important consequence of our result is the exact minimax optimal rates of convergence for the matrix completion problem, which were known until now only up to a logarithmic factor.

Submitted 31 January, 2015; originally announced February 2015.
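
The abstract above refers to iterative thresholding in the style of softImpute. The sketch below shows the plain soft-impute iteration (fill the unobserved entries with the current estimate, then soft-threshold the singular values), which is an assumption made for illustration: the paper analyzes a modification of this algorithm that is not described in this listing. The names `soft_threshold_svd` and `soft_impute`, the penalty `lam`, and the stopping rule are hypothetical choices.

```python
import numpy as np

def soft_threshold_svd(M, lam):
    """Shrink every singular value of M by lam, clipping at zero."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)
    return (U * s_shrunk) @ Vt

def soft_impute(Y, mask, lam, n_iter=200, tol=1e-6):
    """Plain soft-impute sketch for matrix completion.

    Y    : array with observed values (entries outside `mask` are ignored)
    mask : boolean array, True where an entry of Y is observed
    lam  : singular value threshold (illustrative tuning parameter)
    """
    X = np.zeros_like(Y, dtype=float)
    for _ in range(n_iter):
        # Keep observed entries, fill the rest with the current estimate.
        filled = np.where(mask, Y, X)
        X_new = soft_threshold_svd(filled, lam)
        # Stop when the relative change in Frobenius norm is small.
        if np.linalg.norm(X_new - X) <= tol * max(1.0, np.linalg.norm(X)):
            X = X_new
            break
        X = X_new
    return X
```

With a fully observed matrix and `lam=0` the iteration returns the input unchanged, which is a convenient sanity check. Larger values of `lam` push the estimate toward lower rank.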

arXiv:1412.8132 [pdf, ps, other]

Robust Matrix Completion

Authors: Olga Klopp, Karim Lounici, Alexandre B. Tsybakov

Abstract: This paper considers the problem of recovery of a low-rank matrix in the situation when most of its entries are not observed and a fraction of observed entries are corrupted. The observations are noisy realizations of the sum of a low rank matrix, which we wish to recover, with a second matrix having a complementary sparse structure such as element-wise or column-wise sparsity. We analyze a class of estimators obtained by solving a constrained convex optimization problem that combines the nuclear norm and a convex relaxation for a sparse constraint. Our results are obtained for the simultaneous presence of random and deterministic patterns in the sampling scheme. We provide guarantees for recovery of low-rank and sparse components from partial and corrupted observations in the presence of noise and show that the obtained rates of convergence are minimax optimal.

Submitted 4 July, 2016; v1 submitted 28 December, 2014; originally announced December 2014.

arXiv:1412.2632 [pdf, ps, other]

Probabilistic low-rank matrix completion on finite alphabets

Authors: Jean Lafond, Olga Klopp, Eric Moulines, Joseph Salmon

Abstract: The task of reconstructing a matrix given a sample of observed entries is known as the matrix completion problem. It arises in a wide range of problems, including recommender systems, collaborative filtering, dimensionality reduction, image processing, quantum physics or multi-class classification, to name a few. Most works have focused on recovering an unknown real-valued low-rank matrix from randomly sub-sampling its entries. Here, we investigate the case where the observations take a finite number of values, corresponding, for example, to ratings in recommender systems or labels in multi-class classification. We also consider a general sampling scheme (not necessarily uniform) over the matrix entries. The performance of a nuclear-norm penalized estimator is analyzed theoretically. More precisely, we derive bounds for the Kullback-Leibler divergence between the true and estimated distributions. In practice, we have also proposed an efficient algorithm based on lifted coordinate gradient descent in order to tackle potentially high dimensional settings.

Submitted 8 December, 2014; originally announced December 2014.

Comments: arXiv admin note: text overlap with arXiv:1408.6218

Journal ref: NIPS, Dec 2014, Montreal, Canada

arXiv:1408.6218 [pdf, ps, other]

Adaptive Multinomial Matrix Completion

Authors: Olga Klopp, Jean Lafond, Eric Moulines, Joseph Salmon

Abstract: The task of estimating a matrix given a sample of observed entries is known as the matrix completion problem. Most works on matrix completion have focused on recovering an unknown real-valued low-rank matrix from a random sample of its entries. Here, we investigate the case of highly quantized observations when the measurements can take only a small number of values. These quantized outputs are generated according to a probability distribution parametrized by the unknown matrix of interest. This model corresponds, for example, to ratings in recommender systems or labels in multi-class classification. We consider a general, non-uniform, sampling scheme and give theoretical guarantees on the performance of a constrained, nuclear norm penalized maximum likelihood estimator. One important advantage of this estimator is that it does not require knowledge of the rank or an upper bound on the nuclear norm of the unknown matrix and, thus, it is adaptive. We provide lower bounds showing that our estimator is minimax optimal. An efficient algorithm based on lifted coordinate gradient descent is proposed to compute the estimator. A limited Monte-Carlo experiment, using both simulated and real data, is provided to support our claims.

Submitted 26 August, 2014; originally announced August 2014.

arXiv:1312.4087 [pdf, ps, other]

Sparse high-dimensional varying coefficient model: non-asymptotic minimax study

Authors: Olga Klopp, Marianna Pensky

Abstract: The objective of the present paper is to develop a minimax theory for the varying coefficient model in a non-asymptotic setting. We consider a high-dimensional sparse varying coefficient model where only few of the covariates are present and only some of those covariates are time dependent. Our analysis allows the time dependent covariates to have different degrees of smoothness and to be spatially inhomogeneous. We develop the minimax lower bounds for the quadratic risk and construct an adaptive estimator which attains those lower bounds within a constant (if all time-dependent covariates are spatially homogeneous) or logarithmic factor of the number of observations.

Submitted 14 May, 2014; v1 submitted 14 December, 2013; originally announced December 2013.

Comments: 27 pages

MSC Class: 62H12; 62J05; 62C20

arXiv:1211.3394 [pdf, ps, other]

Non-asymptotic approach to varying coefficient model

Authors: Olga Klopp, Marianna Pensky

Abstract: In the present paper we consider the varying coefficient model which represents a useful tool for exploring dynamic patterns in many applications. Existing methods typically provide asymptotic evaluation of precision of estimation procedures under the assumption that the number of observations tends to infinity. In practical applications, however, only a finite number of measurements are available. In the present paper we focus on a non-asymptotic approach to the problem. We propose a novel estimation procedure which is based on recent developments in matrix estimation. In particular, for our estimator, we obtain upper bounds for the mean squared and the pointwise estimation errors. The obtained oracle inequalities are non-asymptotic and hold for finite sample size.

Submitted 6 February, 2013; v1 submitted 14 November, 2012; originally announced November 2012.

arXiv:1203.0108 [pdf, ps, other]

doi 10.3150/12-BEJ486

Noisy low-rank matrix completion with general sampling distribution

Authors: Olga Klopp

Abstract: In the present paper, we consider the problem of matrix completion with noise. Unlike previous works, we consider quite general sampling distribution and we do not need to know or to estimate the variance of the noise. Two new nuclear-norm penalized estimators are proposed, one of them of "square-root" type. We analyse their performance under high-dimensional scaling and provide non-asymptotic bounds on the Frobenius norm error. Up to a logarithmic factor, these performance guarantees are minimax optimal in a number of circumstances.

Submitted 5 February, 2014; v1 submitted 1 March, 2012; originally announced March 2012.

Comments: Published at http://dx.doi.org/10.3150/12-BEJ486 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ486

Journal ref: Bernoulli 2014, Vol. 20, No. 1, 282-303

arXiv:1112.3055 [pdf, other]

High dimensional matrix estimation with unknown variance of the noise

Authors: Olga Klopp, Stéphane Gaiffas

Abstract: We propose a new pivotal method for estimating high-dimensional matrices. Assume that we observe a small set of entries or linear combinations of entries of an unknown matrix $A_0$ corrupted by noise. We propose a new method for estimating $A_0$ which does not rely on the knowledge or an estimation of the standard deviation of the noise $\sigma$. Our estimator achieves, up to a logarithmic factor, optimal rates of convergence under the Frobenius risk and, thus, has the same prediction performance as previously proposed estimators which rely on the knowledge of $\sigma$. Our method is based on the solution of a convex optimization problem which makes it computationally attractive.

Submitted 31 January, 2015; v1 submitted 13 December, 2011; originally announced December 2011.

arXiv:1104.1244 [pdf, ps, other]

Rank penalized estimators for high-dimensional matrices

Authors: Olga Klopp

Abstract: In this paper we consider the trace regression model. Assume that we observe a small set of entries or linear combinations of entries of an unknown matrix $A_0$ corrupted by noise. We propose a new rank penalized estimator of $A_0$. For this estimator we establish general oracle inequality for the prediction error both in probability and in expectation. We also prove upper bounds for the rank of our estimator. Then, we apply our general results to the problems of matrix completion and matrix regression. In these cases our estimator has a particularly simple form: it is obtained by hard thresholding of the singular values of a matrix constructed from the observations.

Submitted 12 September, 2011; v1 submitted 7 April, 2011; originally announced April 2011.

Comments: We added a new section on matrix regression

Showing 1–26 of 26 results for author: Klopp, O