-
Sample Complexity of the Linear Quadratic Regulator: A Reinforcement Learning Lens
Authors:
Amirreza Neshaei Moghaddam,
Alex Olshevsky,
Bahman Gharesifard
Abstract:
We provide the first known algorithm that provably achieves $\varepsilon$-optimality within $\widetilde{\mathcal{O}}(1/\varepsilon)$ function evaluations for the discounted discrete-time LQR problem with unknown parameters, without relying on two-point gradient estimates. These estimates are known to be unrealistic in many settings, as they depend on using the exact same initialization, which is to be selected randomly, for two different policies. Our results substantially improve upon the existing literature outside the realm of two-point gradient estimates, which either leads to $\widetilde{\mathcal{O}}(1/\varepsilon^2)$ rates or heavily relies on stability assumptions.
Submitted 18 April, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Convex SGD: Generalization Without Early Stopping
Authors:
Julien Hendrickx,
Alex Olshevsky
Abstract:
We consider the generalization error associated with stochastic gradient descent on a smooth convex function over a compact set. We show the first bound on the generalization error that vanishes when the number of iterations $T$ and the dataset size $n$ go to infinity at arbitrary rates; our bound scales as $\tilde{O}(1/\sqrt{T} + 1/\sqrt{n})$ with step-size $\alpha_t = 1/\sqrt{t}$. In particular, strong convexity is not needed for stochastic gradient descent to generalize well.
Submitted 14 April, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
Distributed TD(0) with Almost No Communication
Authors:
Rui Liu,
Alex Olshevsky
Abstract:
We provide a new non-asymptotic analysis of distributed temporal difference learning with linear function approximation. Our approach relies on ``one-shot averaging,'' where $N$ agents run identical local copies of the TD(0) method and average the outcomes only once at the very end. We demonstrate a version of the linear time speedup phenomenon, where the convergence time of the distributed process is a factor of $N$ faster than the convergence time of TD(0). This is the first result proving benefits from parallelism for temporal difference methods.
Submitted 25 May, 2023;
originally announced May 2023.
-
A Small Gain Analysis of Single Timescale Actor Critic
Authors:
Alex Olshevsky,
Bahman Gharesifard
Abstract:
We consider a version of actor-critic which uses proportional step-sizes and only one critic update with a single sample from the stationary distribution per actor step. We provide an analysis of this method using the small-gain theorem. Specifically, we prove that this method can be used to find a stationary point, and that the resulting sample complexity improves the state of the art for actor-critic methods to $O \left(μ^{-2} ε^{-2} \right)$ to find an $ε$-approximate stationary point where $μ$ is the condition number associated with the critic.
Submitted 25 May, 2023; v1 submitted 4 March, 2022;
originally announced March 2022.
-
Optimal Vaccine Allocation for Pandemic Stabilization
Authors:
Qianqian Ma,
Yang-Yu Liu,
Alex Olshevsky
Abstract:
How to strategically allocate the available vaccines is a crucial issue for pandemic control. In this work, we propose a mathematical framework for optimal stabilizing vaccine allocation, where our goal is to send the infections to zero as soon as possible with a fixed number of vaccine doses. This framework allows us to efficiently compute the optimal vaccine allocation policy for general epidemic spread models including SIS/SIR/SEIR and a new model of COVID-19 transmissions. By fitting the real data in New York State to our framework, we found that the optimal stabilizing vaccine allocation policy suggests giving vaccine priority to locations where there are more susceptible people and where the residents spend more time outside the home. In addition, we found that giving vaccine priority to young adults (20-29) and middle-age adults (20-44) can minimize the cumulative number of infections and deaths. Moreover, we compared our method with five age-stratified strategies in \cite{bubar2021model} based on their epidemic model. We also found that it is better to give vaccine priority to young people to curb the disease and minimize deaths when the basic reproduction number $R_0$ is moderately above one, which describes most of the world during COVID-19. This phenomenon was ignored in \cite{bubar2021model}.
Submitted 9 September, 2021;
originally announced September 2021.
-
Communication-efficient SGD: From Local SGD to One-Shot Averaging
Authors:
Artin Spiridonoff,
Alex Olshevsky,
Ioannis Ch. Paschalidis
Abstract:
We consider speeding up stochastic gradient descent (SGD) by parallelizing it across multiple workers. We assume the same data set is shared among $N$ workers, who can take SGD steps and coordinate with a central server. While it is possible to obtain a linear reduction in the variance by averaging all the stochastic gradients at every step, this requires a lot of communication between the workers and the server, which can dramatically reduce the gains from parallelism. The Local SGD method, proposed and analyzed in the earlier literature, suggests machines should make many local steps between such communications. While the initial analysis of Local SGD showed it needs $Ω( \sqrt{T} )$ communications for $T$ local gradient steps in order for the error to scale proportionately to $1/(NT)$, this has been successively improved in a string of papers, with the state of the art requiring $Ω\left( N \left( \mbox{ poly} (\log T) \right) \right)$ communications. In this paper, we suggest a Local SGD scheme that communicates less overall by communicating less frequently as the number of iterations grows. Our analysis shows that this can achieve an error that scales as $1/(NT)$ with a number of communications that is completely independent of $T$. In particular, we show that $Ω(N)$ communications are sufficient. Empirical evidence suggests this bound is close to tight as we further show that $\sqrt{N}$ or $N^{3/4}$ communications fail to achieve linear speed-up in simulations. Moreover, we show that under mild assumptions, the main of which is twice differentiability on any neighborhood of the optimal solution, one-shot averaging which only uses a single round of communication can also achieve the optimal convergence rate asymptotically.
Submitted 27 October, 2021; v1 submitted 8 June, 2021;
originally announced June 2021.
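As a rough illustration of the one-shot averaging idea mentioned in the abstract above, here is a minimal sketch on a toy problem. It is not the authors' scheme or experimental setup: the objective $f(\theta) = \mathbb{E}[(\theta - z)^2]/2$ with $z \sim N(\mu, 1)$, the $1/t$ step-size, and the function names `local_sgd_mean` and `one_shot_average` are all assumptions chosen for brevity.

```python
import random

def local_sgd_mean(mu, steps, rng):
    """Run SGD on the toy objective f(theta) = E[(theta - z)^2] / 2, z ~ N(mu, 1)."""
    theta = 0.0
    for t in range(1, steps + 1):
        z = rng.gauss(mu, 1.0)   # one noisy sample
        grad = theta - z         # stochastic gradient of the toy objective
        theta -= grad / t        # step-size 1/t (an assumption of this sketch)
    return theta

def one_shot_average(mu, workers, steps, seed=0):
    """Each worker runs SGD independently; results are averaged once at the end."""
    rng = random.Random(seed)
    finals = [local_sgd_mean(mu, steps, rng) for _ in range(workers)]
    return sum(finals) / workers  # the single round of communication

estimate = one_shot_average(mu=3.0, workers=8, steps=2000)
```

On this toy problem each worker's iterate is simply the running mean of its samples, so averaging the final iterates reduces the variance by a factor equal to the number of workers.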
-
Optimal Lockdown for Pandemic Control
Authors:
Qianqian Ma,
Yang-Yu Liu,
Alex Olshevsky
Abstract:
As a common strategy of contagious disease containment, lockdowns will inevitably weaken the economy. The ongoing COVID-19 pandemic underscores the trade-off arising from public health and economic cost. An optimal lockdown policy to resolve this trade-off is highly desired. Here we propose a mathematical framework of pandemic control through an optimal stabilizing non-uniform lockdown, where our goal is to reduce the economic activity as little as possible while decreasing the number of infected individuals at a prescribed rate. This framework allows us to efficiently compute the optimal stabilizing lockdown policy for general epidemic spread models, including both the classical SIS/SIR/SEIR models and a new model of COVID-19 transmissions. We demonstrate the power of this framework by analyzing publicly available data of inter-county travel frequencies to analyze a model of COVID-19 spread in the 62 counties of New York State. We find that an optimal stabilizing lockdown based on epidemic status in April 2020 would have reduced economic activity more stringently outside of New York City compared to within it, even though the epidemic was much more prevalent in New York City at that point. Such a counterintuitive result highlights the intricacies of pandemic control and sheds light on future lockdown policy design.
Submitted 24 January, 2022; v1 submitted 24 October, 2020;
originally announced October 2020.
-
Asymptotic Convergence Rate of Alternating Minimization for Rank One Matrix Completion
Authors:
Rui Liu,
Alex Olshevsky
Abstract:
We study alternating minimization for matrix completion in the simplest possible setting: completing a rank-one matrix from a revealed subset of the entries. We bound the asymptotic convergence rate by the variational characterization of eigenvalues of a reversible consensus problem. This leads to a polynomial upper bound on the asymptotic rate in terms of number of nodes as well as the largest degree of the graph of revealed entries.
Submitted 11 August, 2020;
originally announced August 2020.
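For readers unfamiliar with the setting in the abstract above, the following is a minimal sketch of alternating minimization for rank-one matrix completion. It assumes noiseless positive entries, an all-ones initialization, and that every row and column has at least one revealed entry; the helper name `alt_min_rank_one` and the small example matrix are hypothetical and not taken from the paper.

```python
def alt_min_rank_one(observed, n_rows, n_cols, iters=200):
    """Fit M ~ x y^T from revealed entries given as {(i, j): value}."""
    x = [1.0] * n_rows
    y = [1.0] * n_cols
    for _ in range(iters):
        # Fix y and solve the least-squares problem for each x[i].
        for i in range(n_rows):
            num = sum(v * y[j] for (r, j), v in observed.items() if r == i)
            den = sum(y[j] ** 2 for (r, j) in observed if r == i)
            x[i] = num / den
        # Fix x and solve the least-squares problem for each y[j].
        for j in range(n_cols):
            num = sum(v * x[i] for (i, c), v in observed.items() if c == j)
            den = sum(x[i] ** 2 for (i, c) in observed if c == j)
            y[j] = num / den
    return x, y

# Revealed entries of M = u v^T with u = [1, 2, 3], v = [2, 1, 4];
# the bipartite graph of revealed entries is a connected 6-cycle.
observed = {(0, 0): 2.0, (0, 1): 1.0, (1, 1): 2.0,
            (1, 2): 8.0, (2, 2): 12.0, (2, 0): 6.0}
x, y = alt_min_rank_one(observed, n_rows=3, n_cols=3)
completed = [[x[i] * y[j] for j in range(3)] for i in range(3)]
```

In this sketch the hidden entries are recovered as products `x[i] * y[j]`; the factors themselves are determined only up to a common scaling.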
-
Local SGD With a Communication Overhead Depending Only on the Number of Workers
Authors:
Artin Spiridonoff,
Alex Olshevsky,
Ioannis Ch. Paschalidis
Abstract:
We consider speeding up stochastic gradient descent (SGD) by parallelizing it across multiple workers. We assume the same data set is shared among $n$ workers, who can take SGD steps and coordinate with a central server. Unfortunately, this could require a lot of communication between the workers and the server, which can dramatically reduce the gains from parallelism. The Local SGD method, proposed and analyzed in the earlier literature, suggests machines should make many local steps between such communications. While the initial analysis of Local SGD showed it needs $Ω( \sqrt{T} )$ communications for $T$ local gradient steps in order for the error to scale proportionately to $1/(nT)$, this has been successively improved in a string of papers, with the state-of-the-art requiring $Ω\left( n \left( \mbox{ polynomial in log } (T) \right) \right)$ communications. In this paper, we give a new analysis of Local SGD. A consequence of our analysis is that Local SGD can achieve an error that scales as $1/(nT)$ with only a fixed number of communications independent of $T$: specifically, only $Ω(n)$ communications are required.
Submitted 3 June, 2020;
originally announced June 2020.
-
Asymptotic Network Independence and Step-Size for A Distributed Subgradient Method
Authors:
Alex Olshevsky
Abstract:
We consider whether distributed subgradient methods can achieve a linear speedup over a centralized subgradient method. While it might be hoped that a distributed network of $n$ nodes that can compute $n$ times more subgradients in parallel compared to a single node might, as a result, be $n$ times faster, existing bounds for distributed optimization methods are often consistent with a slowdown rather than a speedup compared to a single node.
We show that a distributed subgradient method has this "linear speedup" property when using a class of square-summable-but-not-summable step-sizes which include $1/t^β$ when $β\in (1/2,1)$; for such step-sizes, we show that after a transient period whose size depends on the spectral gap of the network, the method achieves a performance guarantee that does not depend on the network or the number of nodes. We also show that the same method can fail to have this "asymptotic network independence" property under the optimally decaying step-size $1/\sqrt{t}$ and, as a consequence, can fail to provide a linear speedup compared to a single node with $1/\sqrt{t}$ step-size.
Submitted 25 July, 2020; v1 submitted 14 March, 2020;
originally announced March 2020.
-
On A Relaxation of Time-Varying Actuator Placement
Authors:
Alex Olshevsky
Abstract:
We consider the time-varying actuator placement in continuous time, where the goal is to maximize the trace of the controllability Grammian. A natural relaxation of the problem is to allow the binary $\{0,1\}$ variable indicating whether an actuator is used at a given time to take on values in the closed interval $[0,1]$. We show that all optimal solutions of both the original and the relaxed problems can be given via an explicit formula, and that, as long as the input matrix has no zero columns, the solution sets of the original and relaxed problems coincide.
Submitted 9 April, 2020; v1 submitted 19 December, 2019;
originally announced December 2019.
-
Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning
Authors:
Shi Pu,
Alex Olshevsky,
Ioannis Ch. Paschalidis
Abstract:
We provide a discussion of several recent results which, in certain scenarios, are able to overcome a barrier in distributed stochastic optimization for machine learning. Our focus is the so-called asymptotic network independence property, which is achieved whenever a distributed method executed over a network of $n$ nodes asymptotically converges to the optimal solution at a comparable rate to a centralized method with the same computational power as the entire network. We explain this property through an example involving the training of ML models and sketch a short mathematical analysis for comparing the performance of distributed stochastic gradient descent (DSGD) with centralized stochastic gradient descent (SGD).
Submitted 18 February, 2020; v1 submitted 28 June, 2019;
originally announced June 2019.
-
A Sharp Estimate on the Transient Time of Distributed Stochastic Gradient Descent
Authors:
Shi Pu,
Alex Olshevsky,
Ioannis Ch. Paschalidis
Abstract:
This paper is concerned with minimizing the average of $n$ cost functions over a network in which agents may communicate and exchange information with each other. We consider the setting where only noisy gradient information is available. To solve the problem, we study the distributed stochastic gradient descent (DSGD) method and perform a non-asymptotic convergence analysis. For strongly convex and smooth objective functions, DSGD asymptotically achieves the optimal network independent convergence rate compared to centralized stochastic gradient descent (SGD). Our main contribution is to characterize the transient time needed for DSGD to approach the asymptotic convergence rate, which we show behaves as $K_T=\mathcal{O}\left(\frac{n}{(1-ρ_w)^2}\right)$, where $1-ρ_w$ denotes the spectral gap of the mixing matrix. Moreover, we construct a "hard" optimization problem for which we show the transient time needed for DSGD to approach the asymptotic convergence rate is lower bounded by $Ω\left(\frac{n}{(1-ρ_w)^2} \right)$, implying the sharpness of the obtained result. Numerical experiments demonstrate the tightness of the theoretical results.
Submitted 29 January, 2021; v1 submitted 6 June, 2019;
originally announced June 2019.
-
On the Inapproximability of the Discrete Witsenhausen Problem
Authors:
Alex Olshevsky
Abstract:
We consider a discrete version of the Witsenhausen problem where all random variables are bounded and take on integer values. Our main goal is to understand the complexity of computing good strategies given the distributions for the initial state and second-stage noise as inputs to the problem. Following Papadimitriou and Tsitsiklis [1], who showed that computing the optimal solution is NP-complete, we construct a sequence of problem instances with the initial state uniform over a set of size $n$ and the noise uniform over a set of size at most $n^2$, such that finding a strategy whose cost is a multiplicative $n^{2-ε}$ approximation to the optimal cost is NP-hard for any $ε> 0$.
Submitted 11 April, 2019;
originally announced April 2019.
-
Robust Asynchronous Stochastic Gradient-Push: Asymptotically Optimal and Network-Independent Performance for Strongly Convex Functions
Authors:
Artin Spiridonoff,
Alex Olshevsky,
Ioannis Ch. Paschalidis
Abstract:
We consider the standard model of distributed optimization of a sum of functions $F(\mathbf{z}) = \sum_{i=1}^n f_i(\mathbf{z})$, where node $i$ in a network holds the function $f_i(\mathbf{z})$. We allow for a harsh network model characterized by asynchronous updates, message delays, unpredictable message losses, and directed communication among nodes. In this setting, we analyze a modification of the Gradient-Push method for distributed optimization, assuming that (i) node $i$ is capable of generating gradients of its function $f_i(\mathbf{z})$ corrupted by zero-mean bounded-support additive noise at each step, (ii) $F(\mathbf{z})$ is strongly convex, and (iii) each $f_i(\mathbf{z})$ has Lipschitz gradients. We show that our proposed method asymptotically performs as well as the best bounds on centralized gradient descent that takes steps in the direction of the sum of the noisy gradients of all the functions $f_1(\mathbf{z}), \ldots, f_n(\mathbf{z})$ at each step.
Submitted 29 December, 2019; v1 submitted 9 November, 2018;
originally announced November 2018.
-
Graph-Theoretic Analysis of Belief System Dynamics under Logic Constraints
Authors:
Angelia Nedić,
Alex Olshevsky,
César A. Uribe
Abstract:
Opinion formation cannot be modeled solely as an ideological deduction from a set of principles; rather, repeated social interactions and logic constraints among statements are consequential in the construct of belief systems. We address three basic questions in the analysis of social opinion dynamics: (i) Will a belief system converge? (ii) How long does it take to converge? (iii) Where does it converge? We provide graph-theoretic answers to these questions for a model of opinion dynamics of a belief system with logic constraints. Our results make plain the implicit dependence of the convergence properties of a belief system on the underlying social network and on the set of logic constraints that relate beliefs on different statements. Moreover, we provide an explicit analysis of a variety of commonly used large-scale network models.
Submitted 30 December, 2018; v1 submitted 4 October, 2018;
originally announced October 2018.
-
Deterministic and Randomized Actuator Scheduling With Guaranteed Performance Bounds
Authors:
Milad Siami,
Alex Olshevsky,
Ali Jadbabaie
Abstract:
In this paper, we investigate the problem of actuator selection for linear dynamical systems. We develop a framework to design a sparse actuator schedule for a given large-scale linear system with guaranteed performance bounds using deterministic polynomial-time and randomized approximately linear-time algorithms. First, we introduce systemic controllability metrics for linear dynamical systems that are monotone and homogeneous with respect to the controllability Gramian. We show that several popular and widely used optimization criteria in the literature belong to this class of controllability metrics. Our main result is to provide a polynomial-time actuator schedule that on average selects only a constant number of actuators at each time step, independent of the dimension, to furnish a guaranteed approximation of the controllability metrics in comparison to when all actuators are in use. Our results naturally apply to the dual problem of sensor selection, in which we provide a guaranteed approximation to the observability Gramian. We illustrate the effectiveness of our theoretical findings via several numerical simulations using benchmark examples.
Submitted 3 June, 2020; v1 submitted 1 May, 2018;
originally announced May 2018.
-
Fully Asynchronous Push-Sum With Growing Intercommunication Intervals
Authors:
Alex Olshevsky,
Ioannis Ch. Paschalidis,
Artin Spiridonoff
Abstract:
We propose an algorithm for average consensus over a directed graph which is both fully asynchronous and robust to unreliable communications. We show its convergence to the average, while allowing for slowly growing but potentially unbounded communication failures.
Submitted 23 February, 2018;
originally announced February 2018.
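For context, the classical synchronous push-sum iteration for average consensus on a directed graph can be sketched as follows. This is only the textbook synchronous variant with reliable links, not the asynchronous, failure-tolerant algorithm proposed in the paper; the function name `push_sum`, the equal-split weights, and the example graph are assumptions made for illustration.

```python
def push_sum(values, out_neighbors, rounds=200):
    """Synchronous push-sum; out_neighbors[i] lists the nodes that i sends to."""
    n = len(values)
    x = [float(v) for v in values]  # running numerators
    w = [1.0] * n                   # running weights
    for _ in range(rounds):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            # Node i splits its mass equally among itself and its out-neighbors.
            targets = [i] + list(out_neighbors[i])
            share = 1.0 / len(targets)
            for j in targets:
                new_x[j] += share * x[i]
                new_w[j] += share * w[i]
        x, w = new_x, new_w
    # The ratio x/w at each node estimates the average of the initial values.
    return [xi / wi for xi, wi in zip(x, w)]

# A strongly connected directed graph on four nodes (hypothetical example).
graph = {0: [1], 1: [2], 2: [3, 0], 3: [0]}
estimates = push_sum([1.0, 5.0, 2.0, 8.0], graph)
```

Because each node only needs to know its own out-degree, the mixing is column-stochastic rather than doubly stochastic, and the weights `w` correct for the resulting imbalance.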
-
Minimal Reachability is Hard To Approximate
Authors:
Ali Jadbabaie,
Alexander Olshevsky,
George J. Pappas,
Vasileios Tzoumas
Abstract:
In this note, we consider the problem of choosing which nodes of a linear dynamical system should be actuated so that the state transfer from the system's initial condition to a given final state is possible. Assuming a standard complexity hypothesis, we show that this problem cannot be efficiently solved or approximated in polynomial, or even quasi-polynomial, time.
Submitted 20 February, 2018; v1 submitted 27 October, 2017;
originally announced October 2017.
-
Network Topology and Communication-Computation Tradeoffs in Decentralized Optimization
Authors:
Angelia Nedić,
Alex Olshevsky,
Michael G. Rabbat
Abstract:
In decentralized optimization, nodes cooperate to minimize an overall objective function that is the sum (or average) of per-node private objective functions. Algorithms interleave local computations with communication among all or a subset of the nodes. Motivated by a variety of applications---distributed estimation in sensor networks, fitting models to massive data sets, and distributed control of multi-robot systems, to name a few---significant advances have been made towards the development of robust, practical algorithms with theoretical performance guarantees. This paper presents an overview of recent work in this area. In general, rates of convergence depend not only on the number of nodes involved and the desired level of accuracy, but also on the structure and nature of the network over which nodes communicate (e.g., whether links are directed or undirected, static or time-varying). We survey the state-of-the-art algorithms and their analyses tailored to these different scenarios, highlighting the role of the network topology.
Submitted 15 January, 2018; v1 submitted 25 September, 2017;
originally announced September 2017.
-
Improved Convergence Rates for Distributed Resource Allocation
Authors:
Angelia Nedić,
Alex Olshevsky,
Wei Shi
Abstract:
In this paper, we develop a class of decentralized algorithms for solving a convex resource allocation problem in a network of $n$ agents, where the agent objectives are decoupled while the resource constraints are coupled. The agents communicate over a connected undirected graph, and they want to collaboratively determine a solution to the overall network problem, while each agent only communicates with its neighbors. We first study the connection between the decentralized resource allocation problem and the decentralized consensus optimization problem. Then, using a class of algorithms for solving consensus optimization problems, we propose a novel class of decentralized schemes for solving resource allocation problems in a distributed manner. Specifically, we first propose an algorithm for solving the resource allocation problem with an $o(1/k)$ convergence rate guarantee when the agents' objective functions are generally convex (and possibly nondifferentiable) and per-agent local convex constraints are allowed; we then propose a gradient-based algorithm for solving the resource allocation problem when per-agent local constraints are absent and show that such a scheme can achieve a geometric rate when the objective functions are strongly convex and have Lipschitz continuous gradients. We have also provided a scalability/network dependency analysis. Based on these two algorithms, we have further proposed a gradient projection-based algorithm which can handle smooth objectives and simple constraints more efficiently. Numerical experiments demonstrate the viability and performance of all the proposed algorithms.
Submitted 16 December, 2018; v1 submitted 16 June, 2017;
originally announced June 2017.
-
Distributed Learning for Cooperative Inference
Authors:
Angelia Nedić,
Alex Olshevsky,
César A. Uribe
Abstract:
We study the problem of cooperative inference where a group of agents interact over a network and seek to estimate a joint parameter that best explains a set of observations. Agents do not know the network topology or the observations of other agents. We explore a variational interpretation of the Bayesian posterior density, and its relation to the stochastic mirror descent algorithm, to propose a new distributed learning algorithm. We show that, under appropriate assumptions, the beliefs generated by the proposed algorithm concentrate around the true parameter exponentially fast. We provide explicit non-asymptotic bounds for the convergence rate. Moreover, we develop explicit and computationally efficient algorithms for observation models belonging to exponential families.
Submitted 10 April, 2017;
originally announced April 2017.
-
Distributed Gaussian Learning over Time-varying Directed Graphs
Authors:
Angelia Nedić,
Alex Olshevsky,
César A. Uribe
Abstract:
We present a distributed (non-Bayesian) learning algorithm for the problem of parameter estimation with Gaussian noise. The algorithm is expressed as explicit updates on the parameters of the Gaussian beliefs (i.e. means and precision). We show a convergence rate of $O(1/k)$ with the constant term depending on the number of agents and the topology of the network. Moreover, we show almost sure convergence to the optimal solution of the estimation problem for the general case of time-varying directed graphs.
Submitted 6 December, 2016; v1 submitted 5 December, 2016;
originally announced December 2016.
-
On (Non)Supermodularity of Average Control Energy
Authors:
Alex Olshevsky
Abstract:
Given a linear system, we consider the expected energy to move from the origin to a uniformly random point on the unit sphere as a function of the set of actuated variables. We show this function is not necessarily supermodular, correcting some claims in the existing literature.
Submitted 31 January, 2017; v1 submitted 27 September, 2016;
originally announced September 2016.
-
A Tutorial on Distributed (Non-Bayesian) Learning: Problem, Algorithms and Results
Authors:
Angelia Nedić,
Alex Olshevsky,
César A. Uribe
Abstract:
We overview some results on distributed learning with a focus on a family of recently proposed algorithms known as non-Bayesian social learning. We consider different approaches to the distributed learning problem and its algorithmic solutions for the case of finitely many hypotheses. The original centralized problem is discussed first, followed by a generalization to the distributed setting. The results on convergence and convergence rate are presented for both asymptotic and finite-time regimes. Various extensions are discussed, such as those dealing with directed time-varying networks, Nesterov's acceleration technique, and a continuum of hypotheses.
Submitted 23 September, 2016;
originally announced September 2016.
-
On the geometric convergence rate of distributed economic dispatch/demand response in power networks
Authors:
Thinh T. Doan,
Alex Olshevsky
Abstract:
Motivated by potential applications in power systems, we study the problem of optimizing a sum of $n$ convex functions on dynamic networks of $n$ nodes when each function is known to only a single node. The nodes' variables, while satisfying their local constraints, are coupled through a linear constraint. Our main contribution is to design a fully distributed primal-dual method for this problem. Under some fairly standard assumptions on the objective functions, namely strong convexity and smoothness, we provide an explicit analysis of the convergence rate of our method on different networks. In particular, the nodes' variables converge geometrically to the optimal solution, with the associated convergence time scaling quartically in the number of nodes on any sequence of time-varying undirected graphs satisfying a long-term connectivity condition. Moreover, this convergence time is a constant independent of the number of nodes when the network is a $b$-regular simple graph with $b\geq 3$. Finally, to show the effectiveness of our method, we also simulate a number of case studies on economic dispatch problems and demand response problems in power systems.
Submitted 30 September, 2016; v1 submitted 21 September, 2016;
originally announced September 2016.
-
Geometrically Convergent Distributed Optimization with Uncoordinated Step-Sizes
Authors:
Angelia Nedić,
Alex Olshevsky,
Wei Shi,
César A. Uribe
Abstract:
A recent algorithmic family for distributed optimization, DIGing, has been shown to have geometric convergence over time-varying undirected/directed graphs. Nevertheless, an identical step-size for all agents is needed. In this paper, we study the convergence rates of the Adapt-Then-Combine (ATC) variation of the DIGing algorithm under uncoordinated step-sizes. We show that the ATC variation of the DIGing algorithm converges geometrically fast even if the step-sizes are different among the agents. In addition, our analysis implies that the ATC structure can accelerate convergence compared to the distributed gradient descent (DGD) structure which has been used in the original DIGing algorithm.
Submitted 19 September, 2016;
originally announced September 2016.
-
Fast Algorithms for Distributed Optimization and Hypothesis Testing: A Tutorial
Authors:
Alex Olshevsky
Abstract:
We consider several problems in the field of distributed optimization and hypothesis testing. We show how to obtain convergence times for these problems that scale linearly with the total number of nodes in the network by using a recent linear-time algorithm for the average consensus problem.
Submitted 22 May, 2017; v1 submitted 13 September, 2016;
originally announced September 2016.
-
Achieving Geometric Convergence for Distributed Optimization over Time-Varying Graphs
Authors:
Angelia Nedich,
Alex Olshevsky,
Wei Shi
Abstract:
This paper considers the problem of distributed optimization over time-varying graphs. For the case of undirected graphs, we introduce a distributed algorithm, referred to as DIGing, based on a combination of a distributed inexact gradient method and a gradient tracking technique. The DIGing algorithm uses doubly stochastic mixing matrices and employs fixed step-sizes, yet drives all the agents' iterates to a global and consensual minimizer. When the graphs are directed, in which case the implementation of doubly stochastic mixing matrices is unrealistic, we construct an algorithm that incorporates the push-sum protocol into the DIGing structure, thus obtaining the Push-DIGing algorithm. Push-DIGing uses column stochastic matrices and fixed step-sizes, but it still converges to a global and consensual minimizer. Under the strong convexity assumption, we prove that the algorithms converge at R-linear (geometric) rates as long as the step-sizes do not exceed some upper bounds. We establish explicit estimates for the convergence rates. When the graph is undirected, our analysis shows that DIGing scales polynomially in the number of agents. We also provide some numerical experiments to demonstrate the efficacy of the proposed algorithms and to validate our theoretical findings.
Submitted 20 March, 2017; v1 submitted 11 July, 2016;
originally announced July 2016.
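As a rough illustration of the push-sum protocol that the abstract above incorporates into DIGing, the following Python sketch runs plain push-sum averaging on a small fixed directed graph. The graph, the initial values, and the function name `push_sum` are assumptions made for illustration; this is not the Push-DIGing algorithm itself, only a sketch of the column-stochastic averaging step it builds on.

```python
def push_sum(out_neighbors, values, iterations):
    """Run push-sum averaging and return each node's estimate of the average.

    out_neighbors[i] lists the nodes that node i sends to (a hypothetical
    input format); every node is assumed to also keep a share for itself.
    """
    n = len(values)
    x = [float(v) for v in values]   # running numerators
    y = [1.0] * n                    # running weights
    for _ in range(iterations):
        new_x = [0.0] * n
        new_y = [0.0] * n
        for i in range(n):
            targets = set(out_neighbors[i]) | {i}   # include the self-loop
            share = 1.0 / len(targets)              # split equally by out-degree
            for j in targets:
                new_x[j] += share * x[i]
                new_y[j] += share * y[i]
        x, y = new_x, new_y
    # The ratio x_i / y_i corrects for the imbalance of the directed graph.
    return [xi / yi for xi, yi in zip(x, y)]


# Assumed example: directed cycle 0 -> 1 -> 2 -> 0 plus one extra edge 0 -> 2.
graph = {0: [1, 2], 1: [2], 2: [0]}
estimates = push_sum(graph, [3.0, 6.0, 9.0], iterations=200)
```

Each node only needs its own out-degree to compute `share`, which is why the mixing weights are column stochastic rather than doubly stochastic.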
-
Distributed Learning with Infinitely Many Hypotheses
Authors:
Angelia Nedić,
Alex Olshevsky,
César Uribe
Abstract:
We consider a distributed learning setup where a network of agents sequentially access realizations of a set of random variables with unknown distributions. The network objective is to find a parametrized distribution that best describes their joint observations in the sense of the Kullback-Leibler divergence. Departing from recent efforts in the literature, we analyze the case of countably many hypotheses and the case of a continuum of hypotheses. We provide non-asymptotic bounds for the concentration rate of the agents' beliefs around the correct hypothesis in terms of the number of agents, the network parameters, and the learning abilities of the agents. Additionally, we provide a novel motivation for a general set of distributed non-Bayesian update rules as instances of the distributed stochastic mirror descent algorithm.
Submitted 6 May, 2016;
originally announced May 2016.
-
Eigenvalue Clustering, Control Energy, and Logarithmic Capacity
Authors:
Alex Olshevsky
Abstract:
We prove two bounds showing that if the eigenvalues of a matrix are clustered in a region of the complex plane then the corresponding discrete-time linear system requires significant energy to control. A curious feature of one of our bounds is that the dependence on the region is via its logarithmic capacity, which is a measure of how well a unit of mass may be spread out over the region to minimize a logarithmic potential.
Submitted 23 April, 2016; v1 submitted 31 October, 2015;
originally announced November 2015.
-
Network Independent Rates in Distributed Learning
Authors:
Angelia Nedić,
Alex Olshevsky,
César A. Uribe
Abstract:
We propose a new belief update rule for distributed non-Bayesian learning in time-varying directed graphs, where a group of agents tries to collectively identify a hypothesis that best describes a sequence of observed data. We show that the proposed update rule, inspired by the Push-Sum algorithm, is consistent; moreover, we provide an explicit characterization of its convergence rate. Our main result states that, after a transient time, all agents will concentrate their beliefs at a network-independent rate. Network-independent rates were not available for other consensus-based distributed learning algorithms.
Submitted 28 September, 2015;
originally announced September 2015.
-
Fast Convergence Rates for Distributed Non-Bayesian Learning
Authors:
Angelia Nedić,
Alex Olshevsky,
César A. Uribe
Abstract:
We consider the problem of distributed learning, where a network of agents collectively aim to agree on a hypothesis that best explains a set of distributed observations of conditionally independent random processes. We propose a distributed algorithm and establish consistency, as well as a non-asymptotic, explicit and geometric convergence rate for the concentration of the beliefs around the set of optimal hypotheses. Additionally, if the agents interact over static networks, we provide an improved learning protocol with better scalability with respect to the number of nodes in the network.
Submitted 10 April, 2017; v1 submitted 20 August, 2015;
originally announced August 2015.
-
Scaling laws for consensus protocols subject to noise
Authors:
Ali Jadbabaie,
Alex Olshevsky
Abstract:
We study the performance of discrete-time consensus protocols in the presence of additive noise. When the consensus dynamic corresponds to a reversible Markov chain, we give an exact expression for a weighted version of steady-state disagreement in terms of the stationary distribution and hitting times in an underlying graph. We then show how this result can be used to characterize the noise robustness of a class of protocols for formation control in terms of the Kemeny constant of an underlying graph.
Submitted 7 March, 2017; v1 submitted 31 July, 2015;
originally announced August 2015.
-
Distributed Resource Allocation on Dynamic Networks in Quadratic Time
Authors:
Thinh T. Doan,
Alex Olshevsky
Abstract:
We consider the problem of allocating a fixed amount of resource among nodes in a network when each node suffers a cost which is a convex function of the amount of resource allocated to it. We propose a new deterministic and distributed protocol for this problem. Our main result is that the associated convergence time for the global objective scales quadratically in the number of nodes on any sequence of time-varying undirected graphs satisfying a long-term connectivity condition.
Submitted 12 June, 2016; v1 submitted 28 July, 2015;
originally announced July 2015.
-
Convergence Time of Quantized Metropolis Consensus Over Time-Varying Networks
Authors:
Tamer Basar,
Seyed Rasoul Etesami,
Alex Olshevsky
Abstract:
We consider the quantized consensus problem on undirected time-varying connected graphs with n nodes, and devise a protocol with fast convergence time to the set of consensus points. Specifically, we show that when the edges of each network in a sequence of connected time-varying networks are activated based on Poisson processes with Metropolis rates, the expected convergence time to the set of consensus points is at most O(n^2 log^2 n), where each node performs a constant number of updates per unit time.
Submitted 2 February, 2016; v1 submitted 6 April, 2015;
originally announced April 2015.
-
Linear Time Average Consensus on Fixed Graphs and Implications for Decentralized Optimization and Multi-Agent Control
Authors:
Alex Olshevsky
Abstract:
We describe a protocol for the average consensus problem on any fixed undirected graph whose convergence time scales linearly in the total number of nodes $n$. The protocol is completely distributed, with the exception of requiring all nodes to know the same upper bound $U$ on the total number of nodes which is correct within a constant multiplicative factor.
We next discuss applications of this protocol to problems in multi-agent control connected to the consensus problem. In particular, we describe protocols for formation maintenance and leader-following with convergence times which also scale linearly with the number of nodes.
Finally, we develop a distributed protocol for minimizing an average of (possibly nondifferentiable) convex functions $ (1/n) \sum_{i=1}^n f_i(θ)$, in the setting where only node $i$ in an undirected, connected graph knows the function $f_i(θ)$. Under the same assumption about all nodes knowing $U$, and additionally assuming that the subgradients of each $f_i(θ)$ have absolute values upper bounded by some constant $L$ known to the nodes, we show that after $T$ iterations our protocol has error which is $O(L \sqrt{n/T})$.
Submitted 3 August, 2017; v1 submitted 15 November, 2014;
originally announced November 2014.
-
Nonasymptotic Convergence Rates for Cooperative Learning Over Time-Varying Directed Graphs
Authors:
Angelia Nedić,
Alex Olshevsky,
César A. Uribe
Abstract:
We study the problem of distributed hypothesis testing with a network of agents where some agents repeatedly gain access to information about the correct hypothesis. The group objective is to globally agree on a joint hypothesis that best describes the observed data at all the nodes. We assume that the agents can interact with their neighbors in an unknown sequence of time-varying directed graphs. Following the pioneering work of Jadbabaie, Molavi, Sandroni, and Tahbaz-Salehi, we propose local learning dynamics which combine Bayesian updates at each node with a local aggregation rule of private agent signals. We show that these learning dynamics drive all agents to the set of hypotheses which best explain the data collected at all nodes as long as the sequence of interconnection graphs is uniformly strongly connected. Our main result establishes a non-asymptotic, explicit, geometric convergence rate for the learning dynamic.
Submitted 20 August, 2015; v1 submitted 7 October, 2014;
originally announced October 2014.
-
Minimum Input Selection for Structural Controllability
Authors:
Alex Olshevsky
Abstract:
Given a linear system $\dot{x} = Ax$, where $A$ is an $n \times n$ matrix with $m$ nonzero entries, we consider the problem of finding the smallest set of state variables to affect with an input so that the resulting system is structurally controllable. We further assume we are given a set of "forbidden state variables" $F$ which cannot be affected with an input and which we have to avoid in our selection. Our main result is that this problem can be solved deterministically in $O(n+m \sqrt{n})$ operations.
Submitted 27 September, 2014; v1 submitted 10 July, 2014;
originally announced July 2014.
-
Stochastic Gradient-Push for Strongly Convex Functions on Time-Varying Directed Graphs
Authors:
Angelia Nedic,
Alex Olshevsky
Abstract:
We investigate the convergence rate of the recently proposed subgradient-push method for distributed optimization over time-varying directed graphs. The subgradient-push method can be implemented in a distributed way without requiring knowledge of either the number of agents or the graph sequence; each node is only required to know its out-degree at each time. Our main result is a convergence rate of $O \left((\ln t)/t \right)$ for strongly convex functions with Lipschitz gradients even if only stochastic gradient samples are available; this is asymptotically faster than the $O \left((\ln t)/\sqrt{t} \right)$ rate previously known for (general) convex functions.
Submitted 15 February, 2015; v1 submitted 9 June, 2014;
originally announced June 2014.
-
On symmetric continuum opinion dynamics
Authors:
Julien M. Hendrickx,
Alex Olshevsky
Abstract:
This paper investigates the asymptotic behavior of some common opinion dynamic models in a continuum of agents. We show that as long as the interactions among the agents are symmetric, the distribution of the agents' opinion converges. We also investigate whether convergence occurs in a stronger sense than merely in distribution, namely, whether the opinion of almost every agent converges. We show that while this is not the case in general, it becomes true under plausible assumptions on inter-agent interactions, namely that agents with similar opinions exert a non-negligible pull on each other, or that the interactions are entirely determined by their opinions via a smooth function.
Submitted 10 August, 2016; v1 submitted 2 November, 2013;
originally announced November 2013.
-
Nonuniform Line Coverage from Noisy Scalar Measurements
Authors:
P. Davison,
N. E. Leonard,
A. Olshevsky,
M. Schwemmer
Abstract:
We study the problem of distributed coverage control in a network of mobile agents arranged on a line. The goal is to design distributed dynamics for the agents to achieve optimal coverage positions with respect to a scalar density field that measures the relative importance of each point on the line. Unlike previous work, which has implicitly assumed the agents know this density field, we only assume that each agent can access noisy samples of the field at points close to its current location. We provide a simple randomized protocol wherein every agent samples the scalar field at three nearby points at each step and which guarantees convergence to the optimal positions. We further analyze the convergence time of this protocol and show that, under suitable assumptions, the squared distance to the optimal coverage configuration decays as $O(1/t)$ with the number of iterations $t$, where the constant scales polynomially with the number of agents $n$. We illustrate these results with simulations.
Submitted 21 November, 2014; v1 submitted 15 October, 2013;
originally announced October 2013.
-
On Primitivity of Sets of Matrices
Authors:
Vincent D. Blondel,
Raphael M. Jungers,
Alex Olshevsky
Abstract:
A nonnegative matrix $A$ is called primitive if $A^k$ is positive for some integer $k>0$. A generalization of this concept to finite sets of matrices is as follows: a set of matrices $\mathcal M = \{A_1, A_2, \ldots, A_m \}$ is primitive if $A_{i_1} A_{i_2} \ldots A_{i_k}$ is positive for some indices $i_1, i_2, ..., i_k$. The concept of primitive sets of matrices comes up in a number of problems within the study of discrete-time switched systems. In this paper, we analyze the computational complexity of deciding if a given set of matrices is primitive and we derive bounds on the length of the shortest positive product.
We show that while primitivity is algorithmically decidable, unless $P=NP$ it is not possible to decide primitivity of a matrix set in polynomial time. Moreover, we show that the length of the shortest positive sequence can be superpolynomial in the dimension of the matrices. On the other hand, defining ${\mathcal P}$ to be the set of matrices with no zero rows or columns, we give a simple combinatorial proof of a previously-known characterization of primitivity for matrices in ${\mathcal P}$ which can be tested in polynomial time. This latter observation is related to the well-known 1964 conjecture of Cerny on synchronizing automata; in fact, any bound on the minimal length of a synchronizing word for synchronizing automata immediately translates into a bound on the length of the shortest positive product of a primitive set of matrices in ${\mathcal P}$. In particular, any primitive set of $n \times n$ matrices in ${\mathcal P}$ has a positive product of length $O(n^3)$.
Submitted 15 April, 2015; v1 submitted 4 June, 2013;
originally announced June 2013.
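To make the definition of a primitive set concrete, here is a minimal Python sketch that searches over sign patterns of products, relying on the fact that the sign pattern of a product of nonnegative matrices depends only on the sign patterns of its factors. The function names are hypothetical, and this brute-force search is not the paper's method: it can visit exponentially many patterns, which is in keeping with the hardness result stated in the abstract above.

```python
from collections import deque


def pattern(matrix):
    """Sign pattern of a nonnegative matrix as a tuple of tuples of booleans."""
    return tuple(tuple(entry > 0 for entry in row) for row in matrix)


def bool_product(p, q):
    """Sign pattern of the product of two nonnegative matrices."""
    n = len(p)
    return tuple(
        tuple(any(p[i][k] and q[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def shortest_positive_product(matrices):
    """Length of the shortest entrywise-positive product, or None if none exists."""
    gens = list(dict.fromkeys(pattern(m) for m in matrices))  # drop duplicates
    seen = set(gens)
    queue = deque((g, 1) for g in gens)
    while queue:  # breadth-first search, so patterns are found at minimal length
        p, length = queue.popleft()
        if all(all(row) for row in p):
            return length
        for g in gens:
            q = bool_product(p, g)
            if q not in seen:
                seen.add(q)
                queue.append((q, length + 1))
    return None  # every reachable pattern contains a zero entry


single = shortest_positive_product([[[0, 1], [1, 1]]])
swap_only = shortest_positive_product([[[0, 1], [1, 0]]])
pair = shortest_positive_product([[[1, 1], [0, 1]], [[1, 0], [1, 1]]])
```

The search terminates because there are finitely many sign patterns, which gives one elementary way to see that primitivity is decidable.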
-
Minimal Controllability Problems
Authors:
Alex Olshevsky
Abstract:
Given a linear system, we consider the problem of finding a small set of variables to affect with an input so that the resulting system is controllable. We show that this problem is NP-hard; indeed, we show that even approximating the minimum number of variables that need to be affected within a multiplicative factor of $c \log n$ is NP-hard for some positive $c$. On the positive side, we show it is possible to find sets of variables matching this inapproximability barrier in polynomial time. This can be done by a simple greedy heuristic which sequentially picks variables to maximize the rank increase of the controllability matrix. Experiments on Erdos-Renyi random graphs demonstrate this heuristic almost always succeeds at finding the minimum number of variables.
Submitted 2 May, 2014; v1 submitted 10 April, 2013;
originally announced April 2013.
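The greedy heuristic mentioned in the abstract above can be sketched roughly as follows in Python, assuming each chosen variable $i$ is actuated through the unit input vector $e_i$ and that ties are broken by lowest index. The helper names and the exact-arithmetic rank routine are illustrative assumptions, not the paper's implementation.

```python
from fractions import Fraction


def mat_vec(a, v):
    """Multiply the square matrix a by the vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


def rank(vectors):
    """Rank of a list of vectors via exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            if factor != 0:
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def controllability_rank(a, actuated):
    """Rank of [B, AB, ..., A^(n-1) B] where B has one column e_i per actuated i."""
    n = len(a)
    vectors = []
    for i in actuated:
        v = [int(k == i) for k in range(n)]
        for _ in range(n):
            vectors.append(v)
            v = mat_vec(a, v)
    return rank(vectors)


def greedy_actuators(a):
    """Add variables one at a time, each time maximizing the resulting rank."""
    n = len(a)
    chosen = []
    while controllability_rank(a, chosen) < n:
        candidates = [i for i in range(n) if i not in chosen]
        best = max(candidates, key=lambda i: controllability_rank(a, chosen + [i]))
        chosen.append(best)
    return chosen


# Assumed example: a chain x0 -> x1 -> x2, controllable from x0 alone.
chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
chain_choice = greedy_actuators(chain)
zero_choice = greedy_actuators([[0, 0], [0, 0]])
```

The loop always terminates: while the rank is below $n$, some unit vector lies outside the current controllable subspace, so adding it raises the rank by at least one.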
-
Distributed optimization over time-varying directed graphs
Authors:
Angelia Nedic,
Alex Olshevsky
Abstract:
We consider distributed optimization by a collection of nodes, each having access to its own convex function, whose collective goal is to minimize the sum of the functions. The communications between nodes are described by a time-varying sequence of directed graphs, which is uniformly strongly connected. For such communications, assuming that every node knows its out-degree, we develop a broadcast-based algorithm, termed the subgradient-push, which steers every node to an optimal value under a standard assumption of subgradient boundedness. The subgradient-push requires no knowledge of either the number of agents or the graph sequence to implement. Our analysis shows that the subgradient-push algorithm converges at a rate of $O(\ln(t)/\sqrt{t})$, where the constant depends on the initial values at the nodes, the subgradient norms, and, more interestingly, on both the consensus speed and the imbalances of influence among the nodes.
Submitted 15 March, 2014; v1 submitted 10 March, 2013;
originally announced March 2013.
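The abstract does not spell out the update rule, so the following is only a sketch of a push-sum-style subgradient iteration of the kind it describes: each node splits its values equally among its out-neighbors (using only its own out-degree), rescales by a push-sum weight, and takes a subgradient step with step size 1/sqrt(t). The scalar objectives f_i(z) = |z - a_i|, the alternating graph sequence, and all names are assumptions chosen for illustration.

```python
import math


def abs_subgradient(z, a):
    """A subgradient of f(z) = |z - a| (assumed local objective)."""
    return 1.0 if z > a else (-1.0 if z < a else 0.0)


def ring_with_optional_chord(n, t):
    """Hypothetical time-varying digraph: a directed ring with self-loops,
    plus an extra edge 0 -> 2 at even times. nbrs[j] lists j's out-neighbors."""
    nbrs = [[j, (j + 1) % n] for j in range(n)]
    if t % 2 == 0:
        nbrs[0].append(2 % n)
    return nbrs


def subgradient_push(targets, steps):
    n = len(targets)
    x = [0.0] * n  # values that are pushed to out-neighbors
    y = [1.0] * n  # push-sum weights correcting for unequal influence
    z = [0.0] * n  # de-biased local estimates
    for t in range(1, steps + 1):
        nbrs = ring_with_optional_chord(n, t)
        w = [0.0] * n
        y_next = [0.0] * n
        for j in range(n):
            d = len(nbrs[j])  # node j only needs its own out-degree
            for i in nbrs[j]:
                w[i] += x[j] / d
                y_next[i] += y[j] / d
        y = y_next
        z = [w[i] / y[i] for i in range(n)]
        alpha = 1.0 / math.sqrt(t)
        x = [w[i] - alpha * abs_subgradient(z[i], targets[i]) for i in range(n)]
    return z


# The sum of |z - a_i| over a = [1, 2, 3, 4, 10] is minimized at the median, 3.
estimates = subgradient_push([1.0, 2.0, 3.0, 4.0, 10.0], steps=5000)
```

With these assumed objectives every node's estimate should settle near the median of the targets; the extra edge makes the out-degrees unequal, which is the situation the weights `y` are meant to correct.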
-
Consensus with Ternary Messages
Authors:
Alex Olshevsky
Abstract:
We provide a protocol for real-valued average consensus by networks of agents which exchange only a single message from the ternary alphabet {-1,0,1} between neighbors at each step. Our protocol works on time-varying undirected graphs subject to a connectivity condition, has a worst-case convergence time which is polynomial in the number of agents and the initial values, and requires no global knowledge about the graph topologies on the part of each node to implement except for knowing an upper bound on the degrees of its neighbors.
Submitted 7 February, 2014; v1 submitted 23 December, 2012;
originally announced December 2012.
-
Graph diameter, eigenvalues, and minimum-time consensus
Authors:
Julien M. Hendrickx,
Raphaël M. Jungers,
Alexander Olshevsky,
Guillaume Vankeerberghen
Abstract:
We consider the problem of achieving average consensus in the minimum number of linear iterations on a fixed, undirected graph. We are motivated by the task of deriving lower bounds for consensus protocols and by the so-called "definitive consensus conjecture" which states that for an undirected connected graph G with diameter D there exist D matrices whose nonzero-pattern complies with the edges in G and whose product equals the all-ones matrix. Our first result is a counterexample to the definitive consensus conjecture, which is the first improvement of the diameter lower bound for linear consensus protocols. We then provide some algebraic conditions under which this conjecture holds, which we use to establish that all distance-regular graphs satisfy the definitive consensus conjecture.
Submitted 29 August, 2013; v1 submitted 27 November, 2012;
originally announced November 2012.
-
Cooperative learning in multi-agent systems from intermittent measurements
Authors:
Naomi Ehrich Leonard,
Alex Olshevsky
Abstract:
Motivated by the problem of tracking a direction in a decentralized way, we consider the general problem of cooperative learning in multi-agent systems with time-varying connectivity and intermittent measurements. We propose a distributed learning protocol capable of learning an unknown vector $μ$ from noisy measurements made independently by autonomous nodes. Our protocol is completely distributed and able to cope with the time-varying, unpredictable, and noisy nature of inter-agent communication, and intermittent noisy measurements of $μ$. Our main result bounds the learning speed of our protocol in terms of the size and combinatorial features of the (time-varying) networks connecting the nodes.
Submitted 15 December, 2014; v1 submitted 10 September, 2012;
originally announced September 2012.
-
How to decide consensus? A combinatorial necessary and sufficient condition and a proof that consensus is decidable but NP-hard
Authors:
Vincent Blondel,
Alex Olshevsky
Abstract:
A set of stochastic matrices ${\cal P}$ is a consensus set if for every sequence of matrices $P(1), P(2), \ldots$ whose elements belong to ${\cal P}$ and every initial state $x(0)$, the sequence of states defined by $x(t) = P(t) P(t-1) \cdots P(1) x(0)$ converges to a vector whose entries are all identical. In this paper, we introduce an "avoiding set condition" for compact sets of matrices and prove in our main theorem that this explicit combinatorial condition is both necessary and sufficient for consensus. We show that several of the conditions for consensus proposed in the literature can be directly derived from the avoiding set condition. The avoiding set condition is easy to check with an elementary algorithm, and so our result also establishes that consensus is algorithmically decidable. Direct verification of the avoiding set condition may require more than a polynomial number of operations. This is, however, likely to be the case for any consensus-checking algorithm, since we also prove in this paper that unless $P=NP$, consensus cannot be decided in polynomial time.
Submitted 31 May, 2014; v1 submitted 14 February, 2012;
originally announced February 2012.
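To make the definition of a consensus set in this abstract concrete, the sketch below simulates the recursion x(t) = P(t) P(t-1) ... P(1) x(0) for one randomly drawn sequence from a two-element set. The matrices `P_A` and `P_B` are made-up examples with strictly positive entries, for which every product sequence contracts the spread of the state. This only illustrates the definition on a single sequence; it does not implement the avoiding set condition or decide consensus.

```python
import random

# Two hypothetical row-stochastic matrices with strictly positive entries.
P_A = [[0.5, 0.3, 0.2],
       [0.2, 0.6, 0.2],
       [0.1, 0.1, 0.8]]
P_B = [[0.4, 0.4, 0.2],
       [0.3, 0.3, 0.4],
       [0.25, 0.25, 0.5]]


def apply(P, x):
    """One step x <- P x."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]


def run_sequence(matrices, x0, steps, seed=0):
    """Apply a random sequence P(1), P(2), ... drawn from `matrices`."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        x = apply(rng.choice(matrices), x)
    return x


x0 = [0.0, 5.0, 10.0]
final_state = run_sequence([P_A, P_B], x0, steps=200)
spread = max(final_state) - min(final_state)
```

After enough steps the entries of `final_state` agree to within floating-point error, and the common value lies between the smallest and largest initial entries because each step is a convex combination.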
-
Nonuniform Coverage Control on the Line
Authors:
Naomi Ehrich Leonard,
Alex Olshevsky
Abstract:
This paper investigates control laws allowing mobile, autonomous agents to optimally position themselves on the line for distributed sensing in a nonuniform field. We show that a simple static control law, based only on local measurements of the field by each agent, drives the agents close to the optimal positions after the agents execute in parallel a number of sensing/movement/computation rounds that is essentially quadratic in the number of agents. Further, we exhibit a dynamic control law which, under slightly stronger assumptions on the capabilities and knowledge of each agent, drives the agents close to the optimal positions after the agents execute in parallel a number of sensing/communication/computation/movement rounds that is essentially linear in the number of agents. Crucially, both algorithms are fully distributed and robust to unpredictable loss and addition of agents.
Submitted 7 November, 2012; v1 submitted 3 April, 2011;
originally announced April 2011.