Optimized Gradient Tracking for Decentralized Online Learning

Authors: Shivangi Dubey Sharma, Ketan Rajawat

Abstract: This work considers the problem of decentralized online learning, where the goal is to track the optimum of the sum of time-varying functions, distributed across several nodes in a network. The local availability of the functions and their gradients necessitates coordination and consensus among the nodes. We put forth the Generalized Gradient Tracking (GGT) framework that unifies a number of existing approaches, including the state-of-the-art ones. The performance of the proposed GGT algorithm is theoretically analyzed using a novel semidefinite programming-based analysis that yields the desired regret bounds under very general conditions and without requiring the gradient boundedness assumption. The results are applicable to the special cases of GGT, which include various state-of-the-art algorithms as well as new dynamic versions of various classical decentralized algorithms. To further minimize the regret, we consider a condensed version of GGT with only four free parameters. A procedure for offline tuning of these parameters using only the problem parameters is also detailed. The resulting optimized GGT (oGGT) algorithm not only achieves improved dynamic regret bounds, but also outperforms all state-of-the-art algorithms on both synthetic and real-world datasets.

Submitted 13 February, 2024; v1 submitted 10 June, 2023; originally announced June 2023.

Comments: 30 pages, 6 Figures
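
Illustrative note: the gradient-tracking idea that the abstract above generalizes can be sketched in a few lines. What follows is a minimal sketch of classical, static gradient tracking, not the GGT or oGGT algorithm from the paper. The 4-node ring, the mixing matrix `W`, the scalar quadratic losses, the step size `alpha`, and the function name `gradient_tracking` are all assumptions made for the example.

```python
# Classical gradient tracking on a 4-node ring (illustrative; NOT the GGT/oGGT method).
# Assumption: node i holds f_i(x) = 0.5 * (x - b[i])**2, so the minimizer of the sum is mean(b).

def gradient_tracking(b, W, alpha=0.1, iters=300):
    n = len(b)

    def grad(i, xi):
        # Gradient of the local loss f_i at xi.
        return xi - b[i]

    x = [0.0] * n                           # local iterates
    y = [grad(i, x[i]) for i in range(n)]   # trackers start at the local gradients
    for _ in range(iters):
        # Mix with neighbours, then step along the tracked gradient.
        x_new = [sum(W[i][j] * x[j] for j in range(n)) - alpha * y[i]
                 for i in range(n)]
        # Mix the trackers and add the change in the local gradient.
        y = [sum(W[i][j] * y[j] for j in range(n)) + grad(i, x_new[i]) - grad(i, x[i])
             for i in range(n)]
        x = x_new
    return x

# Doubly stochastic mixing matrix for a ring: 1/2 self-weight, 1/4 per neighbour.
W = [[0.50, 0.25, 0.00, 0.25],
     [0.25, 0.50, 0.25, 0.00],
     [0.00, 0.25, 0.50, 0.25],
     [0.25, 0.00, 0.25, 0.50]]
b = [1.0, 2.0, 3.0, 6.0]
x_final = gradient_tracking(b, W)   # every entry approaches mean(b)
```

Each node only combines its neighbours' values and its own gradient, yet all local iterates agree on the minimizer of the sum. The online setting in the paper replaces the fixed `b` with time-varying losses, which this sketch does not attempt.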

arXiv:2305.17283 [pdf, other]

Sharpened Lazy Incremental Quasi-Newton Method

Authors: Aakash Lahoti, Spandan Senapati, Ketan Rajawat, Alec Koppel

Abstract: The problem of minimizing the sum of $n$ functions in $d$ dimensions is ubiquitous in machine learning and statistics. In many applications where the number of observations $n$ is large, it is necessary to use incremental or stochastic methods, as their per-iteration cost is independent of $n$. Of these, Quasi-Newton (QN) methods strike a balance between the per-iteration cost and the convergence rate. Specifically, they exhibit a superlinear rate with $O(d^2)$ cost in contrast to the linear rate of first-order methods with $O(d)$ cost and the quadratic rate of second-order methods with $O(d^3)$ cost. However, existing incremental methods have notable shortcomings: Incremental Quasi-Newton (IQN) only exhibits asymptotic superlinear convergence. In contrast, Incremental Greedy BFGS (IGS) offers explicit superlinear convergence but suffers from poor empirical performance and has a per-iteration cost of $O(d^3)$. To address these issues, we introduce the Sharpened Lazy Incremental Quasi-Newton Method (SLIQN) that achieves the best of both worlds: an explicit superlinear convergence rate, and superior empirical performance at a per-iteration $O(d^2)$ cost. SLIQN features two key changes: first, it incorporates a hybrid strategy of using both classic and greedy BFGS updates, allowing it to empirically outperform both IQN and IGS. Second, it employs a clever constant multiplicative factor along with a lazy propagation strategy, which enables it to have a cost of $O(d^2)$. Additionally, our experiments demonstrate the superiority of SLIQN over other incremental and stochastic Quasi-Newton variants and establish its competitiveness with second-order incremental methods.

Submitted 12 March, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: 36 pages, 2 figures; Accepted to AISTATS 2024
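
Illustrative note: the $O(d^2)$ per-iteration cost quoted above comes from the rank-two structure of the classic BFGS update. The sketch below shows only that textbook update on a small quadratic; it is not the greedy BFGS update nor the SLIQN method. The function name `bfgs_update` and the example matrices are assumptions made for the example.

```python
# Classic BFGS update of a symmetric Hessian approximation B (illustrative; NOT SLIQN).
# B_new = B - (B s)(B s)^T / (s^T B s) + y y^T / (y^T s); every step below is O(d^2).

def bfgs_update(B, s, y):
    d = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(d)) for i in range(d)]  # matrix-vector product
    sBs = sum(s[i] * Bs[i] for i in range(d))
    ys = sum(y[i] * s[i] for i in range(d))
    if ys <= 0 or sBs <= 0:
        raise ValueError("curvature condition violated: need y^T s > 0 and s^T B s > 0")
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys for j in range(d)]
            for i in range(d)]

# Assumed example: a quadratic with Hessian A, so the gradient difference is y = A s.
A = [[2.0, 0.5], [0.5, 1.0]]
B0 = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, -1.0]
y = [sum(A[i][j] * s[j] for j in range(2)) for i in range(2)]
B1 = bfgs_update(B0, s, y)
# The updated matrix satisfies the secant equation B1 s = y.
B1s = [sum(B1[i][j] * s[j] for j in range(2)) for i in range(2)]
```

No matrix inversion or matrix-matrix product is needed, which is why the update scales quadratically rather than cubically in the dimension.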

arXiv:2305.02041 [pdf, other]

Low-complexity subspace-descent over symmetric positive definite manifold

Authors: Yogesh Darmwal, Ketan Rajawat

Abstract: This work puts forth low-complexity Riemannian subspace descent algorithms for the minimization of functions over the symmetric positive definite (SPD) manifold. Different from the existing Riemannian gradient descent variants, the proposed approach utilizes carefully chosen subspaces that allow the update to be written as a product of the Cholesky factor of the iterate and a sparse matrix. The resulting updates avoid costly matrix operations such as matrix exponentiation and dense matrix multiplication, which are generally required in almost all other Riemannian optimization algorithms on the SPD manifold. We further identify a broad class of functions, arising in diverse applications, such as kernel matrix learning, covariance estimation of Gaussian distributions, maximum likelihood parameter estimation of elliptically contoured distributions, and parameter estimation in Gaussian mixture model problems, over which the Riemannian gradients can be calculated efficiently. The proposed uni-directional and multi-directional Riemannian subspace descent variants incur per-iteration complexities of $O(n)$ and $O(n^2)$ respectively, as compared to the $O(n^3)$ or higher complexity incurred by all existing Riemannian gradient descent variants. The superior runtime and low per-iteration complexity of the proposed algorithms are also demonstrated via numerical tests on large-scale covariance estimation and matrix square root problems. A MATLAB implementation is publicly available on GitHub: https://github.com/yogeshd-iitk/subspace_descent_over_SPD_manifold

Submitted 18 December, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

arXiv:2206.08829 [pdf, other]

FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning

Authors: Anis Elgabli, Chaouki Ben Issaid, Amrit S. Bedi, Ketan Rajawat, Mehdi Bennis, Vaneet Aggarwal

Abstract: Newton-type methods are popular in federated learning due to their fast convergence. Still, they suffer from two main issues, namely: low communication efficiency and low privacy due to the requirement of sending Hessian information from clients to the parameter server (PS). In this work, we introduce a novel framework called FedNew in which there is no need to transmit Hessian information from clients to the PS, hence resolving the bottleneck to improve communication efficiency. In addition, FedNew hides the gradient information and results in a privacy-preserving approach compared to the existing state-of-the-art. The core novel idea in FedNew is to introduce a two-level framework, and alternate between updating the inverse Hessian-gradient product using only one alternating direction method of multipliers (ADMM) step and then performing the global model update using Newton's method. Though only one ADMM pass is used to approximate the inverse Hessian-gradient product at each iteration, we develop a novel theoretical approach to show the converging behavior of FedNew for convex problems. Additionally, a significant reduction in communication overhead is achieved by utilizing stochastic quantization. Numerical results using real datasets show the superiority of FedNew compared to existing methods in terms of communication costs.

Submitted 17 June, 2022; originally announced June 2022.

arXiv:2110.11721 [pdf, other]

doi 10.1109/TSP.2023.3234462

Projection-Free Algorithm for Stochastic Bi-level Optimization

Authors: Zeeshan Akhtar, Amrit Singh Bedi, Srujan Teja Thomdapu, Ketan Rajawat

Abstract: This work presents the first projection-free algorithm to solve stochastic bi-level optimization problems, where the objective function depends on the solution of another stochastic optimization problem. The proposed $\textbf{S}$tochastic $\textbf{Bi}$-level $\textbf{F}$rank-$\textbf{W}$olfe ($\textbf{SBFW}$) algorithm can be applied to streaming settings and does not make use of large batches or checkpoints. The sample complexity of SBFW is shown to be $\mathcal{O}(ε^{-3})$ for convex objectives and $\mathcal{O}(ε^{-4})$ for non-convex objectives. Improved rates are derived for the stochastic compositional problem, which is a special case of the bi-level problem, and entails minimizing the composition of two expected-value functions. The proposed $\textbf{S}$tochastic $\textbf{C}$ompositional $\textbf{F}$rank-$\textbf{W}$olfe ($\textbf{SCFW}$) is shown to achieve a sample complexity of $\mathcal{O}(ε^{-2})$ for convex objectives and $\mathcal{O}(ε^{-3})$ for non-convex objectives, at par with the state-of-the-art sample complexities for projection-free algorithms solving single-level problems. We demonstrate the advantage of the proposed methods by solving the problem of matrix completion with denoising and the problem of policy value evaluation in reinforcement learning.

Submitted 3 April, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

Comments: 34 Pages

arXiv:2107.06534 [pdf, other]

doi 10.1109/TSP.2022.3162958

Zeroth and First Order Stochastic Frank-Wolfe Algorithms for Constrained Optimization

Authors: Zeeshan Akhtar, Ketan Rajawat

Abstract: This paper considers stochastic convex optimization problems with two sets of constraints: (a) deterministic constraints on the domain of the optimization variable, which are difficult to project onto; and (b) deterministic or stochastic constraints that admit efficient projection. Problems of this form arise frequently in the context of semidefinite programming as well as when various NP-hard problems are solved approximately via semidefinite relaxation. Since projection onto the first set of constraints is difficult, it becomes necessary to explore projection-free algorithms, such as the stochastic Frank-Wolfe (FW) algorithm. On the other hand, the second set of constraints cannot be handled in the same way, and must be incorporated as an indicator function within the objective function, thereby complicating the application of FW methods. Similar problems have been studied before; however, they suffer from slow convergence rates. This work, equipped with a momentum-based gradient tracking technique, guarantees fast convergence rates on par with the best-known rates for problems without the second set of constraints. Zeroth-order variants of the proposed algorithms are also developed and again improve upon the state-of-the-art rate results. We further propose the novel trimmed FW variants that enjoy the same convergence rates as their classical counterparts, but are empirically shown to require significantly fewer calls to the linear minimization oracle, speeding up the overall algorithm. The efficacy of the proposed algorithms is tested on relevant applications of sparse matrix estimation, clustering via semidefinite relaxation, and the uniform sparsest cut problem.

Submitted 18 March, 2022; v1 submitted 14 July, 2021; originally announced July 2021.
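
Illustrative note: the projection-free idea behind the Frank-Wolfe entries above is that each iteration calls a linear minimization oracle (LMO) instead of a projection. The sketch below is a plain deterministic Frank-Wolfe loop on the probability simplex, not the stochastic, zeroth-order, or trimmed variants proposed in the paper. The objective $0.5\|x - c\|^2$, the target `c`, the step size $2/(k+2)$, and the function name `frank_wolfe_simplex` are assumptions made for the example.

```python
# Deterministic Frank-Wolfe on the probability simplex
# (illustrative; NOT the stochastic or trimmed FW variants from the paper).

def frank_wolfe_simplex(c, iters=1000):
    d = len(c)
    x = [1.0 / d] * d                              # start at the simplex centre
    for k in range(iters):
        g = [x[i] - c[i] for i in range(d)]        # gradient of 0.5 * ||x - c||^2
        j = min(range(d), key=lambda i: g[i])      # LMO: vertex e_j minimising <g, s>
        gamma = 2.0 / (k + 2)                      # standard open-loop step size
        x = [(1.0 - gamma) * x[i] for i in range(d)]
        x[j] += gamma                              # convex combination stays feasible
    return x

c = [0.2, 0.3, 0.5]                                # assumed target inside the simplex
x_fw = frank_wolfe_simplex(c)
obj = 0.5 * sum((x_fw[i] - c[i]) ** 2 for i in range(len(c)))
```

On the simplex the LMO reduces to picking the coordinate with the smallest gradient entry, so every iterate is a convex combination of vertices and no projection is ever computed.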

arXiv:2106.10435 [pdf, other]

STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning

Authors: Prashant Khanduri, Pranay Sharma, Haibo Yang, Mingyi Hong, Jia Liu, Ketan Rajawat, Pramod K. Varshney

Abstract: Federated Learning (FL) refers to the paradigm where multiple worker nodes (WNs) build a joint model by using local data. Despite extensive research, for a generic non-convex FL problem, it is not clear how to choose the WNs' and the server's update directions, the minibatch sizes, and the local update frequency, so that the WNs use the minimum number of samples and communication rounds to achieve the desired solution. This work addresses the above question and considers a class of stochastic algorithms where the WNs perform a few local updates before communication. We show that when both the WN's and the server's directions are chosen based on a stochastic momentum estimator, the algorithm requires $\tilde{\mathcal{O}}(ε^{-3/2})$ samples and $\tilde{\mathcal{O}}(ε^{-1})$ communication rounds to compute an $ε$-stationary solution. To the best of our knowledge, this is the first FL algorithm that achieves such near-optimal sample and communication complexities simultaneously. Further, we show that there is a trade-off curve between local update frequencies and local minibatch sizes, on which the above sample and communication complexities can be maintained. Finally, we show that for the classical FedAvg (a.k.a. Local SGD, which is a momentum-less special case of the STEM), a similar trade-off curve exists, albeit with worse sample and communication complexities. Our insights on this trade-off provide guidelines for choosing the four important design elements for FL algorithms, the update frequency, directions, and minibatch sizes to achieve the best performance.

Submitted 19 June, 2021; originally announced June 2021.

arXiv:2102.01914 [pdf, ps, other]

Optimizing QoS for Erasure-Coded Wireless Data Centers

Authors: Srujan Teja Thomdapu, Ketan Rajawat

Abstract: Cloud computing facilitates the access of applications and data from any location by a distributed storage system. Erasure codes offer a better data replication technique with reduced storage costs for more reliability. This paper considers the erasure-coded data center with multiple servers in a wireless network where each server is equipped with a base station. Latency in the file retrieval process is mainly due to queuing delays at each server. This work puts forth a stochastic optimization framework for obtaining the optimal scheduling policy that maximizes users' quality of service (QoS) while adhering to the latency requirements. We further show that the problem has non-linear functions of expectations in objective and constraints and is impossible to solve with traditional SGD-like algorithms. We propose a new algorithm that addresses the compositional structure in the problem. Further, we show that the proposed algorithm achieves a faster convergence rate than the best-known results. Finally, we test the efficacy of the proposed method in a simulated environment.

Submitted 3 February, 2021; originally announced February 2021.

Comments: This work is accepted for publication in proceedings of IEEE ICC 2021

arXiv:2012.09400 [pdf, ps, other]

Stochastic Compositional Gradient Descent under Compositional Constraints

Authors: Srujan Teja Thomdapu, Harshvardhan, Ketan Rajawat

Abstract: This work studies constrained stochastic optimization problems where the objective and constraint functions are convex and expressed as compositions of stochastic functions. The problem arises in the context of fair classification, fair regression, and the design of queuing systems. Of particular interest is the large-scale setting where an oracle provides the stochastic gradients of the constituent functions, and the goal is to solve the problem with a minimal number of calls to the oracle. Owing to the compositional form, the stochastic gradients provided by the oracle do not yield unbiased estimates of the objective or constraint gradients. Instead, we construct approximate gradients by tracking the inner function evaluations, resulting in a quasi-gradient saddle point algorithm. We prove that the proposed algorithm is guaranteed to find the optimal and feasible solution almost surely. We further establish that the proposed algorithm requires $\mathcal{O}(1/ε^4)$ data samples in order to obtain an $ε$-approximate optimal point while also ensuring zero constraint violation. The result matches the sample complexity of the stochastic compositional gradient descent method for unconstrained problems and improves upon the best-known sample complexity results for the constrained settings. The efficacy of the proposed algorithm is tested on both fair classification and fair regression problems. The numerical results show that the proposed algorithm outperforms the state-of-the-art algorithms in terms of the convergence rate.

Submitted 10 September, 2022; v1 submitted 17 December, 2020; originally announced December 2020.

Comments: A part of this work is submitted in Asilomar Conference on Signals, Systems, and Computers

arXiv:2012.04402 [pdf, other]

A Primal-Dual Framework for Decentralized Stochastic Optimization

Authors: Ketan Rajawat, Chirag Kumar

Abstract: We consider the decentralized convex optimization problem, where multiple agents must cooperatively minimize a cumulative objective function, with each local function expressible as an empirical average of data-dependent losses. State-of-the-art approaches for decentralized optimization rely on gradient tracking, where consensus is enforced via a doubly stochastic mixing matrix. Construction of such mixing matrices is not straightforward and requires coordination even prior to the start of the optimization algorithm. This paper puts forth a primal-dual framework for decentralized stochastic optimization that obviates the need for such doubly stochastic matrices. Instead, dual variables are maintained to track the disagreement between neighbors. The proposed framework is flexible and is used to develop decentralized variants of SAGA, L-SVRG, SVRG++, and SEGA algorithms. Using a unified proof, we establish that the oracle complexity of these decentralized variants is $O(1/ε)$, matching the complexity bounds obtained for the centralized variants. Additionally, we also present a decentralized primal-dual accelerated SVRG algorithm achieving $O(1/\sqrt{ε})$ oracle complexity, again matching the bound for the centralized accelerated SVRG. Numerical tests on the algorithms establish their superior performance as compared to the variance-reduced gradient tracking algorithms.

Submitted 13 December, 2020; v1 submitted 8 December, 2020; originally announced December 2020.

Comments: 31 pages, 6 Figures

arXiv:2011.07142 [pdf, other]

Sparse Representations of Positive Functions via First and Second-Order Pseudo-Mirror Descent

Authors: Abhishek Chakraborty, Ketan Rajawat, Alec Koppel

Abstract: We consider expected risk minimization problems when the range of the estimator is required to be nonnegative, motivated by the settings of maximum likelihood estimation (MLE) and trajectory optimization. To facilitate nonlinear interpolation, we hypothesize that the search space is a Reproducing Kernel Hilbert Space (RKHS). We develop first and second-order variants of stochastic mirror descent employing (i) pseudo-gradients and (ii) complexity-reducing projections. Compressive projection in the first-order scheme is executed via kernel orthogonal matching pursuit (KOMP), which overcomes the fact that the vanilla RKHS parameterization grows unbounded with the iteration index in the stochastic setting. Moreover, pseudo-gradients are needed when gradient estimates for cost are only computable up to some numerical error, which arise in, e.g., integral approximations. Under constant step-size and compression budget, we establish tradeoffs between the radius of convergence of the expected sub-optimality and the projection budget parameter, as well as non-asymptotic bounds on the model complexity. To refine the solution's precision, we develop a second-order extension which employs recursively averaged pseudo-gradient outer-products to approximate the Hessian inverse, whose convergence in mean is established under an additional eigenvalue decay condition on the Hessian of the optimal RKHS element, which is unique to this work. Experiments demonstrate favorable performance on inhomogeneous Poisson Process intensity estimation in practice.

Submitted 3 May, 2022; v1 submitted 13 November, 2020; originally announced November 2020.

arXiv:2010.01360 [pdf, ps, other]

doi 10.1109/TSP.2021.3094971

Practical Precoding via Asynchronous Stochastic Successive Convex Approximation

Authors: Basil M. Idrees, Javed Akhtar, Ketan Rajawat

Abstract: We consider stochastic optimization of a smooth non-convex loss function with a convex non-smooth regularizer. In the online setting, where a single sample of the stochastic gradient of the loss is available at every iteration, the problem can be solved using the proximal stochastic gradient descent (SGD) algorithm and its variants. However, in many problems, especially those arising in communications and signal processing, information beyond the stochastic gradient may be available thanks to the structure of the loss function. Such extra-gradient information is not used by SGD, but has been shown to be useful, for instance in the context of stochastic expectation-maximization, stochastic majorization-minimization, and stochastic successive convex approximation (SCA) approaches. By constructing stochastic strongly convex surrogates of the loss function at every iteration, the stochastic SCA algorithms can exploit the structural properties of the loss function and achieve superior empirical performance as compared to SGD. In this work, we take a closer look at the stochastic SCA algorithm and develop its asynchronous variant which can be used for resource allocation in wireless networks. While the stochastic SCA algorithm is known to converge asymptotically, its iteration complexity has not been well-studied, and is the focus of the current work. The insights obtained from the non-asymptotic analysis allow us to develop a more practical asynchronous variant of the stochastic SCA algorithm which allows the use of surrogates calculated in earlier iterations. We characterize a precise bound on the maximum delay the algorithm can tolerate, while still achieving the same convergence rate. We apply the algorithm to the problem of linear precoding in wireless sensor networks, where it can be implemented at low complexity but is shown to perform well in practice.

Submitted 3 October, 2020; originally announced October 2020.

arXiv:2008.05758 [pdf, other]

doi 10.1109/TSP.2021.3082467

Conservative Stochastic Optimization with Expectation Constraints

Authors: Zeeshan Akhtar, Amrit Singh Bedi, Ketan Rajawat

Abstract: This paper considers stochastic convex optimization problems where the objective and constraint functions involve expectations with respect to the data indices or environmental variables, in addition to deterministic convex constraints on the domain of the variables. Although the setting is generic and arises in different machine learning applications, online and efficient approaches for solving such problems have not been widely studied. Since the underlying data distribution is unknown a priori, a closed-form solution is generally not available, and classical deterministic optimization paradigms are not applicable. State-of-the-art approaches, such as those using the saddle point framework, can ensure that the optimality gap as well as the constraint violation decay as $\mathcal{O}\left(T^{-\frac{1}{2}}\right)$ where $T$ is the number of stochastic gradients. The domain constraints are assumed simple and handled via projection at every iteration. In this work, we propose a novel conservative stochastic optimization algorithm (CSOA) that achieves zero constraint violation and $\mathcal{O}\left(T^{-\frac{1}{2}}\right)$ optimality gap. Further, the projection operation (for scenarios when calculating projection is expensive) in the proposed algorithm can be avoided by considering the conditional gradient or Frank-Wolfe (FW) variant of the algorithm. The state-of-the-art stochastic FW variants achieve an optimality gap of $\mathcal{O}\left(T^{-\frac{1}{3}}\right)$ after $T$ iterations, though these algorithms have not been applied to problems with functional expectation constraints. In this work, we propose the FW-CSOA algorithm that is not only projection-free but also achieves zero constraint violation with $\mathcal{O}\left(T^{-\frac{1}{4}}\right)$ decay of the optimality gap. The efficacy of the proposed algorithms is tested on two relevant problems: fair classification and structured matrix completion.

Submitted 29 May, 2021; v1 submitted 13 August, 2020; originally announced August 2020.

arXiv:2005.11923 [pdf, ps, other]

Dynamic Cache Management In Content Delivery Networks

Authors: Srujan Teja Thomdapu, Palash Katiyar, Ketan Rajawat

Abstract: The importance of content delivery networks (CDN) continues to rise with the exponential increase in the generation and consumption of electronic media. In order to ensure a high quality of experience, CDNs often deploy cache servers that are capable of storing some of the popular files close to the user. Such edge caching solutions not only increase the content availability, but also result in higher download rates and lower latency at the user. Different from the classical eviction-based algorithms, the present work formulates the content placement problem from an optimization perspective and puts forth an online algorithm for the same. In contrast to the existing optimization-based solutions, the proposed algorithm is incremental and incurs very low computation cost, while yielding storage allocations that are provably near-optimal. The proposed algorithm can handle time-varying content popularity, thereby obviating the need for periodically estimating demand distribution. Using synthetic and real IPTV data, we show that the proposed policies outperform all the state-of-the-art caching techniques in terms of various metrics.

Submitted 25 May, 2020; originally announced May 2020.

arXiv:2005.00224 [pdf, ps, other]

Distributed Stochastic Non-Convex Optimization: Momentum-Based Variance Reduction

Authors: Prashant Khanduri, Pranay Sharma, Swatantra Kafle, Saikiran Bulusu, Ketan Rajawat, Pramod K. Varshney

Abstract: In this work, we propose a distributed algorithm for stochastic non-convex optimization. We consider a worker-server architecture where a set of $K$ worker nodes (WNs) in collaboration with a server node (SN) jointly aim to minimize a global, potentially non-convex objective function. The objective function is assumed to be the sum of local objective functions available at each WN, with each node having access to only the stochastic samples of its local objective function. In contrast to the existing approaches, we employ a momentum-based "single loop" distributed algorithm which eliminates the need to compute large batch size gradients to achieve variance reduction. We propose two algorithms, one with "adaptive" and the other with "non-adaptive" learning rates. We show that the proposed algorithms achieve the optimal computational complexity while attaining linear speedup with the number of WNs. Specifically, the algorithms reach an $ε$-stationary point $x_a$ with $\mathbb{E}\| \nabla f(x_a) \| \leq \tilde{O}(K^{-1/3}T^{-1/2} + K^{-1/3}T^{-1/3})$ in $T$ iterations, thereby requiring $\tilde{O}(K^{-1} ε^{-3})$ gradient computations at each WN. Moreover, our approach does not assume identical data distributions across WNs, making the approach general enough for federated learning applications.

Submitted 1 May, 2020; originally announced May 2020.

arXiv:2004.11094 [pdf, other]

Consistent Online Gaussian Process Regression Without the Sample Complexity Bottleneck

Authors: Alec Koppel, Hrusikesha Pradhan, Ketan Rajawat

Abstract: Gaussian processes provide a framework for nonlinear nonparametric Bayesian inference widely applicable across science and engineering. Unfortunately, their computational burden scales cubically with the training sample size, which in the case that samples arrive in perpetuity, approaches infinity. This issue necessitates approximations for use with streaming data, which to date mostly lack convergence guarantees. Thus, we develop the first online Gaussian process approximation that preserves convergence to the population posterior, i.e., asymptotic posterior consistency, while ameliorating its intractable complexity growth with the sample size. We propose an online compression scheme that, following each a posteriori update, fixes an error neighborhood with respect to the Hellinger metric centered at the current posterior, and greedily tosses out past kernel dictionary elements until its boundary is hit. We call the resulting method Parsimonious Online Gaussian Processes (POG). For diminishing error radius, exact asymptotic consistency is preserved (Theorem 1(i)) at the cost of unbounded memory in the limit. On the other hand, for constant error radius, POG converges to a neighborhood of the population posterior (Theorem 1(ii)) but with finite memory at worst determined by the metric entropy of the feature space (Theorem 2). Experimental results are presented on several nonlinear regression problems which illuminate the merits of this approach as compared with alternatives that fix the subspace dimension defining the history of past points.

Submitted 15 July, 2021; v1 submitted 23 April, 2020; originally announced April 2020.

arXiv:2001.01555 [pdf, other]

A Generalized Framework for Autonomous Calibration of Wheeled Mobile Robots

Authors: Mohan Krishna Nutalapati, Lavish Arora, Anway Bose, Ketan Rajawat, Rajesh M Hegde

Abstract: Robotic calibration allows for the fusion of data from multiple sensors such as odometers, cameras, etc., by providing appropriate transformational relationships between the corresponding reference frames. For wheeled robots equipped with exteroceptive sensors, calibration entails learning the motion model of the sensor or the robot in terms of the odometric data, and must generally be performed prior to performing tasks such as simultaneous localization and mapping (SLAM). Within this context, the current trend is to carry out simultaneous calibration of odometry and sensor without the use of any additional hardware. Building upon the existing simultaneous calibration algorithms, we put forth a generalized calibration framework that can not only handle robots operating in 2D with arbitrary or unknown motion models but also handle outliers in an automated manner. We first propose an algorithm based on the alternating minimization framework applicable to two-wheel differential drive. Subsequently, for arbitrary but known drive configurations we put forth an iteratively re-weighted least squares methodology leveraging an intelligent weighing scheme. Different from the existing works, these proposed algorithms require no manual intervention and seamlessly handle outliers that arise due to both systematic and non-systematic errors. Finally, we put forward a novel Gaussian Process-based non-parametric approach for calibrating wheeled robots with arbitrary or unknown drive configurations. Detailed experiments are performed to demonstrate the accuracy, usefulness, and flexibility of the proposed algorithms.

Submitted 6 January, 2020; originally announced January 2020.

Comments: This manuscript has been submitted to 'Elsevier Journal of Robotics and Autonomous Systems' and is under review for possible publication. Based on IROS 2019 conference submission [arXiv:1910.11917]

arXiv:1912.06036 [pdf, ps, other]

Parallel Restarted SPIDER -- Communication Efficient Distributed Nonconvex Optimization with Optimal Computation Complexity

Authors: Pranay Sharma, Swatantra Kafle, Prashant Khanduri, Saikiran Bulusu, Ketan Rajawat, Pramod K. Varshney

Abstract: In this paper, we propose a distributed algorithm for stochastic smooth, non-convex optimization. We assume a worker-server architecture where $N$ nodes, each having $n$ (potentially infinite) number of samples, collaborate with the help of a central server to perform the optimization task. The global objective is to minimize the average of local cost functions available at individual nodes. The proposed approach is a non-trivial extension of the popular parallel-restarted SGD algorithm, incorporating the optimal variance-reduction based SPIDER gradient estimator into it. We prove convergence of our algorithm to a first-order stationary solution. The proposed approach achieves the best known communication complexity $O(ε^{-1})$ along with the optimal computation complexity. For finite-sum problems (finite $n$), we achieve the optimal computation (IFO) complexity $O(\sqrt{Nn}ε^{-1})$. For online problems ($n$ unknown or infinite), we achieve the optimal IFO complexity $O(ε^{-3/2})$. In both cases, we maintain the linear speedup achieved by existing methods. This is a massive improvement over the $O(ε^{-2})$ IFO complexity of the existing approaches. Additionally, our algorithm is general enough to allow non-identical distributions of data across workers, as in the recently proposed federated learning paradigm.

Submitted 6 November, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

arXiv:1910.11917 [pdf, other]

Model Free Calibration of Wheeled Robots Using Gaussian Process

Authors: Mohan Krishna Nutalapati, Lavish Arora, Anway Bose, Ketan Rajawat, Rajesh M Hegde

Abstract: Robotic calibration allows for the fusion of data from multiple sensors such as odometers, cameras, etc., by providing appropriate relationships between the corresponding reference frames. For wheeled robots equipped with camera/lidar along with wheel encoders, calibration entails learning the motion model of the sensor or the robot in terms of the data from the encoders and is generally carried out before performing tasks such as simultaneous localization and mapping (SLAM). This work puts forward a novel Gaussian Process-based non-parametric approach for calibrating wheeled robots with arbitrary or unknown drive configurations. The procedure is more general as it learns the entire sensor/robot motion model in terms of odometry measurements. Different from existing non-parametric approaches, our method relies on measurements from the onboard sensors and hence does not require the ground truth information from external motion capture systems. Alternatively, we propose a computationally efficient approach that relies on the linear approximation of the sensor motion model. Finally, we perform experiments to calibrate robots with un-modelled effects to demonstrate the accuracy, usefulness, and flexibility of the proposed approach.

Submitted 25 October, 2019; originally announced October 2019.

Comments: To be published in International Conference on Intelligent Robots and Systems (IROS), 2019

arXiv:1909.11555 [pdf, other]

Optimally Compressed Nonparametric Online Learning

Authors: Alec Koppel, Amrit Singh Bedi, Ketan Rajawat, Brian M. Sadler

Abstract: Batch training of machine learning models based on neural networks is now well established, whereas to date streaming methods are largely based on linear models. To go beyond linear in the online setting, nonparametric methods are of interest due to their universality and ability to stably incorporate new information via convexity or Bayes' Rule. Unfortunately, when used online, nonparametric methods suffer a "curse of dimensionality" which precludes their use: their complexity scales at least with the time index. We survey online compression tools which bring their memory under control and attain approximate convergence. The asymptotic bias depends on a compression parameter that trades off memory and accuracy. Further, the applications to robotics, communications, economics, and power are discussed, as well as extensions to multi-agent systems.

Submitted 17 January, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

arXiv:1909.05442 [pdf, other]

Nonstationary Nonparametric Online Learning: Balancing Dynamic Regret and Model Parsimony

Authors: Amrit Singh Bedi, Alec Koppel, Ketan Rajawat, Brian M. Sadler

Abstract: An open challenge in supervised learning is \emph{conceptual drift}: a data point begins as classified according to one label, but over time the notion of that label changes. Beyond linear autoregressive models, transfer and meta learning address drift, but require data that is representative of disparate domains at the outset of training. To relax this requirement, we propose a memory-efficient \emph{online} universal function approximator based on compressed kernel methods. Our approach hinges upon viewing non-stationary learning as online convex optimization with dynamic comparators, for which performance is quantified by dynamic regret. Prior works control dynamic regret growth only for linear models. In contrast, we hypothesize actions belong to reproducing kernel Hilbert spaces (RKHS). We propose a functional variant of online gradient descent (OGD) operating in tandem with greedy subspace projections. Projections are necessary to surmount the fact that RKHS functions have complexity proportional to time. For this scheme, we establish sublinear dynamic regret growth in terms of both loss variation and functional path length, and that the memory of the function sequence remains moderate. Experiments demonstrate the usefulness of the proposed technique for online nonlinear regression and classification problems with non-stationary data.

Submitted 11 September, 2019; originally announced September 2019.

arXiv:1907.08969 [pdf, other]

Distributed Inexact Successive Convex Approximation ADMM: Analysis-Part I

Authors: Sandeep Kumar, Ketan Rajawat, Daniel P. Palomar

Abstract: In this two-part work, we propose an algorithmic framework for solving non-convex problems whose objective function is the sum of a number of smooth component functions plus a convex (possibly non-smooth) and/or smooth (possibly non-convex) regularization function. The proposed algorithm incorporates ideas from several existing approaches such as alternating direction method of multipliers (ADMM), successive convex approximation (SCA), distributed and asynchronous algorithms, and inexact gradient methods. Different from a number of existing approaches, however, the proposed framework is flexible enough to incorporate a class of non-convex objective functions, allow distributed operation with and without a fusion center, and include variance reduced methods as special cases. Remarkably, the proposed algorithms are robust to uncertainties arising from random, deterministic, and adversarial sources. Part I of the paper develops two variants of the algorithm under very mild assumptions and establishes first-order convergence rate guarantees. The proof developed here allows for generic errors and delays, paving the way for different variance-reduced, asynchronous, and stochastic implementations, outlined and evaluated in Part II.

Submitted 21 July, 2019; originally announced July 2019.

arXiv:1907.08774 [pdf, ps, other]

Optimal Design of Queuing Systems via Compositional Stochastic Programming

Authors: Srujan Teja Thomdapu, Ketan Rajawat

Abstract: Well-designed queuing systems form the backbone of modern communications, distributed computing, and content delivery architectures. Designs balancing infrastructure costs and user experience indices require tools from teletraffic theory and operations research. A standard approach to designing such systems involves formulating optimization problems that strive to maximize the pertinent utility functions while adhering to quality-of-service and other physical constraints. In many cases, formulating such problems necessitates making simplistic assumptions on arrival and departure processes to keep the problem simple. This work puts forth a stochastic optimization framework for designing queuing systems where the exogenous processes may have arbitrary and unknown distributions. We show that many such queuing design problems can generally be formulated as stochastic optimization problems where the objective and constraint are non-linear functions of expectations. The compositional structure obviates the use of classical stochastic approximation approaches where the stochastic gradients are often required to be unbiased. To this end, a constrained stochastic compositional gradient descent algorithm is proposed that utilizes a tracking step for the expected value functions. The non-asymptotic performance of the proposed algorithm is characterized via its iteration complexity. Numerical tests allow us to validate the theoretical results and demonstrate the efficacy of the proposed algorithm.

Submitted 20 July, 2019; originally announced July 2019.

Comments: This work has been submitted in "IEEE Transactions on Communication (TCOM)"

arXiv:1905.07018 [pdf, other]

Online Learning over Dynamic Graphs via Distributed Proximal Gradient Algorithm

Authors: Rishabh Dixit, Amrit Singh Bedi, Ketan Rajawat

Abstract: We consider the problem of tracking the minimum of a time-varying convex optimization problem over a dynamic graph. Motivated by target tracking and parameter estimation problems in intermittently connected robotic and sensor networks, the goal is to design a distributed algorithm capable of handling non-differentiable regularization penalties. The proposed proximal online gradient descent algorithm is built to run in a fully decentralized manner and utilizes consensus updates over possibly disconnected graphs. The performance of the proposed algorithm is analyzed by developing bounds on its dynamic regret in terms of the cumulative path length of the time-varying optimum. It is shown that as compared to the centralized case, the dynamic regret incurred by the proposed algorithm over $T$ time slots is worse by a factor of $\log(T)$ only, despite the disconnected and time-varying network topology. The empirical performance of the proposed algorithm is tested on the distributed dynamic sparse recovery problem, where it is shown to incur a dynamic regret that is close to that of the centralized algorithm.

Submitted 16 May, 2019; originally announced May 2019.

arXiv:1810.03373 [pdf, ps, other]

Decentralized Multi-Antenna Coded Caching with Cyclic Exchanges

Authors: Srujan Teja Thomdapu, Ketan Rajawat

Abstract: This paper considers a single-cell multi-antenna base station delivering content to multiple cache-enabled single-antenna users. Coding strategies are developed that allow for decentralized placement in the wireless setting. Three different cases, namely max-min multicasting, linear combinations in the complex field, and linear combinations in the finite field, are considered, and closed-form rate expressions are provided that hold with high probability. For the case of max-min fair multicasting delivery, we propose a new coding scheme that is capable of working with only two-user broadcasts. A cyclic-exchange protocol for efficient content delivery is proposed and shown to perform almost as well as the original multi-user broadcast scheme.

Submitted 8 October, 2018; originally announced October 2018.

Comments: Accepted in 56th Annual Allerton Conference 2018 on Communication, Control, and Computing at UIUC, IL, USA

arXiv:1612.07089 [pdf, ps, other]

Stochastic Multidimensional Scaling

Authors: Ketan Rajawat, Sandeep Kumar

Abstract: Multidimensional scaling (MDS) is a popular dimensionality reduction technique that has been widely used for network visualization and cooperative localization. However, the traditional stress minimization formulation of MDS necessitates the use of batch optimization algorithms that are not scalable to large-sized problems. This paper considers an alternative stochastic stress minimization framework that is amenable to incremental and distributed solutions. A novel linear-complexity stochastic optimization algorithm is proposed that is provably convergent and simple to implement. The applicability of the proposed algorithm to localization and visualization tasks is also expounded. Extensive tests on synthetic and real datasets demonstrate the efficacy of the proposed algorithm.

Submitted 21 December, 2016; originally announced December 2016.

arXiv:1606.03665 [pdf, ps, other]

Resource Allocation and Fairness in Wireless Powered Cooperative Cognitive Radio Networks

Authors: Sanket S. Kalamkar, Jeya Pradha Jeyaraj, Adrish Banerjee, Ketan Rajawat

Abstract: We integrate a wireless powered communication network with a cooperative cognitive radio network, where multiple secondary users (SUs) powered wirelessly by a hybrid access point (HAP) help a primary user relay the data. As a reward for the cooperation, the secondary network gains the spectrum access where SUs transmit to HAP using time division multiple access. To maximize the sum-throughput of SUs, we present a secondary sum-throughput optimal resource allocation (STORA) scheme. Under the constraint of meeting target primary rate, the STORA scheme chooses the optimal set of relaying SUs and jointly performs the time and energy allocation for SUs. Specifically, by exploiting the structure of the optimal solution, we find the order in which SUs are prioritized to relay primary data. Since the STORA scheme focuses on the sum-throughput, it becomes inconsiderate towards individual SU throughput, resulting in low fairness. To enhance fairness, we investigate three resource allocation schemes, which are (i) equal time allocation, (ii) minimum throughput maximization, and (iii) proportional time allocation. Simulation results reveal the trade-off between sum-throughput and fairness. The minimum throughput maximization scheme is the fairest one as each SU gets the same throughput, but yields the least SU sum-throughput.

Submitted 12 June, 2016; originally announced June 2016.

Comments: Accepted in IEEE Transactions on Communications

arXiv:1605.00076 [pdf, ps, other]

Asynchronous Optimization Over Heterogeneous Networks via Consensus ADMM

Authors: Sandeep Kumar, Rahul Jain, Ketan Rajawat

Abstract: This paper considers the distributed optimization of a sum of locally observable, non-convex functions. The optimization is performed over a multi-agent networked system, and each local function depends only on a subset of the variables. An asynchronous and distributed alternating direction method of multipliers (ADMM) algorithm that allows the nodes to defer or skip the computation and transmission of updates is proposed in the paper. The proposed algorithm utilizes different approximations in the update step, resulting in proximal and majorized ADMM variants. Both variants are shown to converge to a local minimum, under certain regularity conditions. The proposed asynchronous algorithms are also applied to the problem of cooperative localization in wireless ad hoc networks, where they are shown to outperform the other state-of-the-art localization algorithms.

Submitted 30 April, 2016; originally announced May 2016.

Comments: Submitted to Transactions on signal and information processing over Networks

arXiv:1211.6950 [pdf, ps, other]

doi 10.1109/MSP.2012.2232355

Dynamic Network Cartography

Authors: Gonzalo Mateos, Ketan Rajawat

Abstract: Communication networks have evolved from specialized, research and tactical transmission systems to large-scale and highly complex interconnections of intelligent devices, increasingly becoming more commercial, consumer-oriented, and heterogeneous. Propelled by emergent social networking services and high-definition streaming platforms, network traffic has grown explosively thanks to the advances in processing speed and storage capacity of state-of-the-art communication technologies. As "netizens" demand a seamless networking experience that entails not only higher speeds, but also resilience and robustness to failures and malicious cyber-attacks, ample opportunities for signal processing (SP) research arise. The vision is for ubiquitous smart network devices to enable data-driven statistical learning algorithms for distributed, robust, and online network operation and management, adaptable to the dynamically-evolving network landscape with minimal need for human intervention. The present paper aims at delineating the analytical background and the relevance of SP tools to dynamic network monitoring, introducing the SP readership to the concept of dynamic network cartography -- a framework to construct maps of the dynamic network state in an efficient and scalable manner tailored to large-scale heterogeneous networks.

Submitted 29 November, 2012; originally announced November 2012.

Comments: To appear in the IEEE Signal Processing Magazine - Special Issue on Adaptation and Learning over Complex Networks

arXiv:1204.5507 [pdf, ps, other]

doi 10.1109/TIT.2014.2311802

Dynamic Network Delay Cartography

Authors: Ketan Rajawat, Emiliano Dall'Anese, Georgios B. Giannakis

Abstract: Path delays in IP networks are important metrics, required by network operators for assessment, planning, and fault diagnosis. Monitoring delays of all source-destination pairs in a large network is however challenging and wasteful of resources. The present paper advocates a spatio-temporal Kalman filtering approach to construct network-wide delay maps using measurements on only a few paths. The proposed network cartography framework allows efficient tracking and prediction of delays by relying on both topological as well as historical data. Optimal paths for delay measurement are selected in an online fashion by leveraging the notion of submodularity. The resulting predictor is optimal in the class of linear predictors, and outperforms competing alternatives on real-world datasets.

Submitted 11 November, 2012; v1 submitted 24 April, 2012; originally announced April 2012.

Comments: Part of this paper has been published in the \emph{IEEE Statistical Signal Processing Workshop}, Ann Arbor, MI, Aug. 2012

arXiv:1003.5239 [pdf, ps, other]

doi 10.1109/TNET.2011.2109010

Cross-Layer Designs in Coded Wireless Fading Networks with Multicast

Authors: Ketan Rajawat, Nikolaos Gatsis, Georgios B. Giannakis

Abstract: A cross-layer design along with an optimal resource allocation framework is formulated for wireless fading networks, where the nodes are allowed to perform network coding. The aim is to jointly optimize end-to-end transport layer rates, network code design variables, broadcast link flows, link capacities, average power consumption, and short-term power allocation policies. As in the routing paradigm where nodes simply forward packets, the cross-layer optimization problem with network coding is non-convex in general. It is proved, however, that with network coding, dual decomposition for multicast is optimal so long as the fading at each wireless link is a continuous random variable. This lends itself to provably convergent subgradient algorithms, which not only admit a layered-architecture interpretation but also optimally integrate network coding in the protocol stack. The dual algorithm is also paired with a scheme that yields near-optimal network design variables, namely multicast end-to-end rates, network code design quantities, flows over the broadcast links, link capacities, and average power consumption. Finally, an asynchronous subgradient method is developed, whereby the dual updates at the physical layer can be affordably performed with a certain delay with respect to the resource allocation tasks in upper layers. This attractive feature is motivated by the complexity of the physical layer subproblem, and is an adaptation of the subgradient method suitable for network control.

Submitted 20 August, 2010; v1 submitted 26 March, 2010; originally announced March 2010.

Comments: Accepted in IEEE/ACM Transactions on Networking; revision pending

arXiv:0707.0234 [pdf, ps, other]

Selection Relaying at Low Signal to Noise Ratios

Authors: Ketan Rajawat, Adrish Banerjee

Abstract: Performance of cooperative diversity schemes at Low Signal to Noise Ratios (LSNR) was recently studied by Avestimehr et al. [1], who emphasized the importance of diversity gain over multiplexing gain at low SNRs. It has also been pointed out that continuous energy transfer to the channel is necessary for achieving the max-flow min-cut bound at LSNR. Motivated by this, we propose the use of Selection Decode and Forward (SDF) at LSNR and analyze its performance in terms of the outage probability. We also propose an energy optimization scheme which further brings down the outage probability.

Submitted 2 July, 2007; originally announced July 2007.

Showing 1–32 of 32 results for author: Rajawat, K