-
Alternating minimization for generalized rank one matrix sensing: Sharp predictions from a random initialization
Authors:
Kabir Aladin Chandrasekher,
Mengqi Lou,
Ashwin Pananjady
Abstract:
We consider the problem of estimating the factors of a rank-$1$ matrix with i.i.d. Gaussian, rank-$1$ measurements that are nonlinearly transformed and corrupted by noise. Considering two prototypical choices for the nonlinearity, we study the convergence properties of a natural alternating update rule for this nonconvex optimization problem starting from a random initialization. We show sharp convergence guarantees for a sample-split version of the algorithm by deriving a deterministic recursion that is accurate even in high-dimensional problems. Notably, while the infinite-sample population update is uninformative and suggests exact recovery in a single step, the algorithm -- and our deterministic prediction -- converges geometrically fast from a random initialization. Our sharp, non-asymptotic analysis also exposes several other fine-grained properties of this problem, including how the nonlinearity and noise level affect convergence behavior.
On a technical level, our results are enabled by showing that the empirical error recursion can be predicted by our deterministic sequence within fluctuations of the order $n^{-1/2}$ when each iteration is run with $n$ observations. Our technique leverages leave-one-out tools originating in the literature on high-dimensional $M$-estimation and provides an avenue for sharply analyzing higher-order iterative algorithms from a random initialization in other high-dimensional optimization problems with random data.
Submitted 20 July, 2022;
originally announced July 2022.
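As a rough illustration of the sample-split alternating update described in the abstract above, the following Python sketch alternates least-squares solves for the two factors, drawing a fresh batch of $n$ observations for every solve and starting from a random initialization. It assumes an identity link (no nonlinearity), which is a simplification and not necessarily either of the two nonlinearities the paper studies. The names `alt_min_rank_one`, `mu_star`, and `nu_star` are hypothetical and do not come from the paper.

```python
import numpy as np


def alt_min_rank_one(d, n, iters, noise_std=0.0, seed=0):
    """Sample-split alternating least squares for
    y_i = <x_i, mu*> <z_i, nu*> + noise.

    Assumption: identity link; the paper's nonlinearities are not modeled.
    Returns the Frobenius error of mu nu^T after each full iteration.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical ground-truth factors, normalized to unit length.
    mu_star = rng.standard_normal(d)
    mu_star /= np.linalg.norm(mu_star)
    nu_star = rng.standard_normal(d)
    nu_star /= np.linalg.norm(nu_star)
    target = np.outer(mu_star, nu_star)

    def fresh_batch():
        # Sample splitting: every update sees n new Gaussian measurements.
        X = rng.standard_normal((n, d))
        Z = rng.standard_normal((n, d))
        y = (X @ mu_star) * (Z @ nu_star) + noise_std * rng.standard_normal(n)
        return X, Z, y

    # Random initialization for nu; mu is produced by the first update.
    nu = rng.standard_normal(d)
    nu /= np.linalg.norm(nu)
    errors = []
    for _ in range(iters):
        # Fix nu, solve the least-squares problem in mu.
        X, Z, y = fresh_batch()
        mu = np.linalg.lstsq((Z @ nu)[:, None] * X, y, rcond=None)[0]
        # Fix mu, solve the least-squares problem in nu.
        X, Z, y = fresh_batch()
        nu = np.linalg.lstsq((X @ mu)[:, None] * Z, y, rcond=None)[0]
        errors.append(float(np.linalg.norm(np.outer(mu, nu) - target)))
    return errors
```

Calling, for example, `alt_min_rank_one(d=10, n=2000, iters=15)` returns the per-iteration errors, which can be plotted on a log scale to inspect the convergence rate for different values of `n`, `d`, and `noise_std`.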
-
Sharp global convergence guarantees for iterative nonconvex optimization: A Gaussian process perspective
Authors:
Kabir Aladin Chandrasekher,
Ashwin Pananjady,
Christos Thrampoulidis
Abstract:
We consider a general class of regression models with normally distributed covariates, and the associated nonconvex problem of fitting these models from data. We develop a general recipe for analyzing the convergence of iterative algorithms for this task from a random initialization. In particular, provided each iteration can be written as the solution to a convex optimization problem satisfying some natural conditions, we leverage Gaussian comparison theorems to derive a deterministic sequence that provides sharp upper and lower bounds on the error of the algorithm with sample-splitting. Crucially, this deterministic sequence accurately captures both the convergence rate of the algorithm and the eventual error floor in the finite-sample regime, and is distinct from the commonly used "population" sequence that results from taking the infinite-sample limit. We apply our general framework to derive several concrete consequences for parameter estimation in popular statistical models including phase retrieval and mixtures of regressions. Provided the sample size scales near-linearly in the dimension, we show sharp global convergence rates for both higher-order algorithms based on alternating updates and first-order algorithms based on subgradient descent. These corollaries, in turn, yield multiple consequences, including: (a) Proof that higher-order algorithms can converge significantly faster than their first-order counterparts (and sometimes super-linearly), even if the two share the same population update and (b) Intricacies in super-linear convergence behavior for higher-order algorithms, which can be nonstandard (e.g., with exponent 3/2) and sensitive to the noise level in the problem. We complement these results with extensive numerical experiments, which show excellent agreement with our theoretical predictions.
Submitted 20 September, 2021;
originally announced September 2021.
-
Finding Planted Cliques in Sublinear Time
Authors:
Jay Mardia,
Hilal Asi,
Kabir Aladin Chandrasekher
Abstract:
We study the planted clique problem in which a clique of size $k$ is planted in an Erdős-Rényi graph $G(n, 1/2)$ and one is interested in recovering this planted clique. It is widely believed that it exhibits a statistical-computational gap when computational efficiency is equated with the existence of polynomial time algorithms. We study this problem under a more fine-grained computational lens and consider the following two questions.
1. Do there exist sublinear time algorithms for recovering the planted clique?
2. What is the smallest running time any algorithm can hope to have?
We show that because of a well-known clique-completion property, very elementary sublinear time recovery algorithms do indeed exist for clique sizes $k = \omega(\sqrt{n})$. This points to a qualitatively stronger statistical-computational gap. The planted clique recovery problem can be solved without even looking at most of the input above the $\Theta(\sqrt{n})$ threshold and cannot be solved by any efficient algorithm below it.
A running time lower bound for the recovery problem follows easily from the results of [RS19], and this implies our recovery algorithms are optimal whenever $k = \Omega(n^{2/3})$. However, for $k = o(n^{2/3})$ there is a gap between our algorithmic upper bound and the information-theoretic lower bound implied by [RS19].
With some caveats, we show stronger detection lower bounds based on the Planted Clique Conjecture for a natural but restricted class of algorithms. The key idea is to relate very fast sublinear time algorithms for detecting large planted cliques to polynomial time algorithms for detecting small planted cliques.
Submitted 17 October, 2022; v1 submitted 24 April, 2020;
originally announced April 2020.
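The clique-completion property mentioned in the abstract above can be illustrated with a small Python sketch: given a modest number of known clique vertices, the remaining clique members are (with high probability) exactly the vertices adjacent to all of them, so only a few rows of the adjacency matrix need to be read. This shows the completion step only and is not the paper's full recovery algorithm; how the seed vertices are found is not shown. The function names `planted_clique_graph` and `complete_clique` are hypothetical.

```python
import numpy as np


def planted_clique_graph(n, k, rng):
    """Sample G(n, 1/2) and plant a clique on k random vertices."""
    upper = np.triu(rng.integers(0, 2, size=(n, n)), 1)
    adj = (upper + upper.T).astype(bool)  # symmetric, no self-loops
    clique = rng.choice(n, size=k, replace=False)
    adj[np.ix_(clique, clique)] = True    # connect all clique pairs
    adj[clique, clique] = False           # restore the empty diagonal
    return adj, set(clique.tolist())


def complete_clique(adj, seed_vertices):
    """Return the seed plus every vertex adjacent to all seed vertices.

    Reads only len(seed_vertices) rows of the adjacency matrix, which is
    sublinear in the n^2-sized input when the seed is small.
    """
    seed = np.asarray(sorted(seed_vertices))
    common = adj[seed].all(axis=0)        # adjacent to every seed vertex
    return set(np.flatnonzero(common).tolist()) | set(seed.tolist())
```

A non-clique vertex is adjacent to all $s$ seed vertices with probability $2^{-s}$, so a seed of size on the order of $\log n$ is enough for the completion to return exactly the planted clique with high probability.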
-
Imputation for High-Dimensional Linear Regression
Authors:
Kabir Aladin Chandrasekher,
Ahmed El Alaoui,
Andrea Montanari
Abstract:
We study high-dimensional regression with missing entries in the covariates. A common strategy in practice is to impute the missing entries with an appropriate substitute and then implement a standard statistical procedure acting as if the covariates were fully observed. Recent literature on this subject proposes instead to design a specific, often complicated or non-convex, algorithm tailored to the case of missing covariates. We investigate a simpler approach where we fill in the missing entries with their conditional mean given the observed covariates. We show that this imputation scheme coupled with standard off-the-shelf procedures such as the LASSO and square-root LASSO retains the minimax estimation rate in the random-design setting where the covariates are i.i.d. sub-Gaussian. We further show that the square-root LASSO remains pivotal in this setting.
It is often the case that the conditional expectation cannot be computed exactly and must be approximated from data. We study two cases where the covariates either follow an autoregressive (AR) process, or are jointly Gaussian with sparse precision matrix. We propose tractable estimators for the conditional expectation and then perform linear regression via LASSO, and show similar estimation rates in both cases. We complement our theoretical results with simulations on synthetic and semi-synthetic examples, illustrating not only the sharpness of our bounds, but also the broader utility of this strategy beyond our theoretical assumptions.
Submitted 24 January, 2020;
originally announced January 2020.
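The conditional-mean imputation step described in the abstract above can be sketched in Python for one concrete case. The sketch assumes zero-mean, jointly Gaussian covariates with a known covariance matrix `sigma`, for which the conditional mean of the missing block given the observed block is $\Sigma_{MO} \Sigma_{OO}^{-1} x_O$. The paper also treats settings where this quantity must be estimated from data, which is not shown here. The function name `impute_conditional_mean` and the use of NaN to mark missing entries are assumptions made for illustration.

```python
import numpy as np


def impute_conditional_mean(X, sigma):
    """Fill NaN entries of each row with E[x_missing | x_observed].

    Assumes rows are zero-mean jointly Gaussian with known covariance sigma.
    """
    X_imp = np.array(X, dtype=float, copy=True)
    for row in X_imp:                      # each row is a view into X_imp
        miss = np.isnan(row)
        if not miss.any():
            continue                       # fully observed row: unchanged
        obs = ~miss
        if not obs.any():
            row[miss] = 0.0                # nothing observed: prior mean
            continue
        # Solve sigma_OO c = x_O, then impute sigma_MO c.
        coef = np.linalg.solve(sigma[np.ix_(obs, obs)], row[obs])
        row[miss] = sigma[np.ix_(miss, obs)] @ coef
    return X_imp
```

The imputed matrix would then be passed, together with the responses, to an off-the-shelf LASSO or square-root LASSO solver as if the covariates had been fully observed.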
-
Maximizing Road Capacity Using Cars that Influence People
Authors:
Daniel A. Lazar,
Kabir Chandrasekher,
Ramtin Pedarsani,
Dorsa Sadigh
Abstract:
The emerging technology enabling autonomy in vehicles has led to a variety of new problems in transportation networks, such as planning and perception for autonomous vehicles. Other works consider social objectives such as decreasing fuel consumption and travel time by platooning. However, these strategies are limited by the actions of the surrounding human drivers. In this paper, we consider proactively achieving these social objectives by influencing human behavior through planned interactions. Our key insight is that we can use these social objectives to design local interactions that influence human behavior to achieve these goals. To this end, we characterize the increase in road capacity afforded by platooning, as well as the vehicle configuration that maximizes road capacity. We present a novel algorithm that uses a low-level control framework to leverage local interactions to optimally rearrange vehicles. We showcase our algorithm using a simulated road shared between autonomous and human-driven vehicles, in which we illustrate the reordering in action.
Submitted 9 October, 2018; v1 submitted 11 July, 2018;
originally announced July 2018.
-
Density Evolution on a Class of Smeared Random Graphs: A Theoretical Framework for Fast MRI
Authors:
Kabir Chandrasekher,
Orhan Ocal,
Kannan Ramchandran
Abstract:
We introduce a new ensemble of random bipartite graphs, which we term the 'smearing ensemble', where each left node is connected to some number of consecutive right nodes. Such graphs arise naturally in the recovery of sparse wavelet coefficients when signal acquisition is in the Fourier domain, such as in magnetic resonance imaging (MRI). Graphs from this ensemble exhibit small, structured cycles with high probability, rendering current techniques for determining iterative decoding thresholds inapplicable. In this paper, we develop a theoretical platform to analyze and evaluate the effects of smearing-based structure. Despite the existence of these small cycles, we derive exact density evolution recurrences for iterative decoding on graphs with smear-length two. Further, we give lower bounds on the performance of a much larger class from the smearing ensemble, and provide numerical experiments showing tight agreement between empirical thresholds and those determined by our bounds. Finally, we describe a system architecture to recover sparse wavelet representations in the MRI setting, giving explicit thresholds on the minimum number of Fourier samples that must be acquired for the $1$-stage Haar wavelet setting. In particular, we show that $K$-sparse $1$-stage Haar wavelet coefficients of an $n$-dimensional signal can be recovered asymptotically from $2.63K$ Fourier domain samples using $\mathcal{O}(K\log{K})$ operations.
Submitted 6 May, 2017;
originally announced May 2017.