-
Stability of Decentralized Gradient Descent in Open Multi-Agent Systems
Authors:
Julien M. Hendrickx,
Michael G. Rabbat
Abstract:
The aim of decentralized gradient descent (DGD) is to minimize a sum of $n$ functions held by interconnected agents. We study the stability of DGD in open contexts where agents can join or leave the system, resulting each time in the addition or the removal of their function from the global objective. Assuming all functions are smooth, strongly convex, and their minimizers all lie in a given ball, we characterize the sensitivity of the global minimizer of the sum of these functions to the removal or addition of a function and provide bounds in $O\left(\min\left(κ^{0.5}, κ/n^{0.5}, κ^{1.5}/n\right)\right)$ where $κ$ is the condition number. We also show that the states of all agents can be eventually bounded independently of the sequence of arrivals and departures. The magnitude of the bound scales with the importance of the interconnection, which also determines the accuracy of the final solution in the absence of arrival and departure, thus exposing a potential trade-off between accuracy and sensitivity. Our analysis relies on the formulation of DGD as gradient descent on an auxiliary function. The tightness of our results is analyzed using the PESTO Toolbox.
Submitted 11 September, 2020;
originally announced September 2020.
-
Effectiveness of Alter Sampling in Social Networks
Authors:
Naghmeh Momeni,
Michael G. Rabbat
Abstract:
Social networks play a key role in studying various individual and social behaviors. To use social networks in a study, their structural properties must be measured. For offline social networks, the conventional procedure is surveying/interviewing a set of randomly-selected respondents. In many practical applications, inferring the network structure via sampling is prohibitively costly. There are also applications in which it simply fails. For example, for optimal vaccination or employing influential spreaders for public health interventions, we need to efficiently and quickly target well-connected individuals, which random sampling does not accomplish. In a few studies, an alternative sampling scheme (which we dub 'alter sampling') has proven useful. This method simply targets randomly-chosen neighbors of the randomly-selected respondents. A natural question that arises is: to what extent does this method generalize? Is the method suitable for every social network or only the very few ones considered so far? In this paper, we demonstrate the robustness of this method across a wide range of networks with diverse structural properties. The method outperforms random sampling by a large margin for a vast majority of cases. We then propose an estimator to assess the advantage of choosing alter sampling over random sampling in practical scenarios, and demonstrate its accuracy via Monte Carlo simulations on diverse synthetic networks.
Submitted 14 December, 2018; v1 submitted 7 December, 2018;
originally announced December 2018.
-
Network Topology and Communication-Computation Tradeoffs in Decentralized Optimization
Authors:
Angelia Nedić,
Alex Olshevsky,
Michael G. Rabbat
Abstract:
In decentralized optimization, nodes cooperate to minimize an overall objective function that is the sum (or average) of per-node private objective functions. Algorithms interleave local computations with communication among all or a subset of the nodes. Motivated by a variety of applications---distributed estimation in sensor networks, fitting models to massive data sets, and distributed control of multi-robot systems, to name a few---significant advances have been made towards the development of robust, practical algorithms with theoretical performance guarantees. This paper presents an overview of recent work in this area. In general, rates of convergence depend not only on the number of nodes involved and the desired level of accuracy, but also on the structure and nature of the network over which nodes communicate (e.g., whether links are directed or undirected, static or time-varying). We survey the state-of-the-art algorithms and their analyses tailored to these different scenarios, highlighting the role of the network topology.
Submitted 15 January, 2018; v1 submitted 25 September, 2017;
originally announced September 2017.
-
Characterization and Inference of Graph Diffusion Processes from Observations of Stationary Signals
Authors:
Bastien Pasdeloup,
Vincent Gripon,
Grégoire Mercier,
Dominique Pastor,
Michael G. Rabbat
Abstract:
Many tools from the field of graph signal processing exploit knowledge of the underlying graph's structure (e.g., as encoded in the Laplacian matrix) to process signals on the graph. Therefore, in the case when no graph is available, graph signal processing tools cannot be used anymore. Researchers have proposed approaches to infer a graph topology from observations of signals on its nodes. Since the problem is ill-posed, these approaches make assumptions, such as smoothness of the signals on the graph, or sparsity priors. In this paper, we propose a characterization of the space of valid graphs, in the sense that they can explain stationary signals. To simplify the exposition in this paper, we focus here on the case where signals were i.i.d. at some point back in time and were observed after diffusion on a graph. We show that the set of graphs verifying this assumption has a strong connection with the eigenvectors of the covariance matrix, and forms a convex set. Along with a theoretical study in which these eigenvectors are assumed to be known, we consider the practical case when the observations are noisy, and experimentally observe how fast the set of valid graphs converges to the set obtained when the exact eigenvectors are known, as the number of observations grows. To illustrate how this characterization can be used for graph recovery, we present two methods for selecting a particular point in this set under chosen criteria, namely graph simplicity and sparsity. Additionally, we introduce a measure to evaluate how much a graph is adapted to signals under a stationarity assumption. Finally, we evaluate how state-of-the-art methods relate to this framework through experiments on a dataset of temperatures.
Submitted 6 June, 2017; v1 submitted 9 May, 2016;
originally announced May 2016.
-
On Reconstructability of Quadratic Utility Functions from the Iterations in Gradient Methods
Authors:
Farhad Farokhi,
Iman Shames,
Michael G. Rabbat,
Mikael Johansson
Abstract:
In this paper, we consider a scenario where an eavesdropper can read the content of messages transmitted over a network. The nodes in the network are running a gradient algorithm to optimize a quadratic utility function where such a utility optimization is a part of a decision-making process by an administrator. We are interested in understanding the conditions under which the eavesdropper can reconstruct the utility function or a scaled version of it and, as a result, gain insight into the decision-making process. We establish that if the parameter of the gradient algorithm, i.e., the step size, is chosen appropriately, the task of reconstruction becomes practically impossible for a class of Bayesian filters with uniform priors. We establish what step-size rules should be employed to ensure this.
Submitted 17 September, 2015;
originally announced September 2015.
-
Measuring the Generalized Friendship Paradox in Networks with Quality-dependent Connectivity
Authors:
Naghmeh Momeni,
Michael G. Rabbat
Abstract:
The friendship paradox is a sociological phenomenon stating that most people have fewer friends than their friends do. The generalized friendship paradox refers to the same observation for attributes other than degree, and it has been observed in Twitter and scientific collaboration networks. This paper takes an analytical approach to model this phenomenon. We consider a preferential attachment-like network growth mechanism governed by both node degrees and `qualities'. We introduce measures to quantify paradoxes, and contrast the results obtained in our model to those obtained for an uncorrelated network, where the degrees and qualities of adjacent nodes are uncorrelated. We shed light on the effect of the distribution of node qualities on the friendship paradox. We consider both the mean and the median to measure paradoxes, and compare the results obtained by using these two statistics.
Submitted 3 November, 2014;
originally announced November 2014.
-
Generalized Friendship Paradox: An Analytical Approach
Authors:
Babak Fotouhi,
Naghmeh Momeni,
Michael G. Rabbat
Abstract:
The friendship paradox refers to the sociological observation that, while people's assessment of their own popularity is typically self-aggrandizing, in reality they are less popular than their friends. The generalized friendship paradox is the average alter superiority observed empirically in social settings, scientific collaboration networks, as well as online social media. We posit a quality-based network growth model in which the chance for a node to receive new links depends both on its degree and a quality parameter. Nodes are assigned qualities the first time they join the network, and these do not change over time. We analyse the model theoretically, finding expressions for the joint degree-quality distribution and nearest-neighbor distribution. We then demonstrate that this model exhibits both the friendship paradox and the generalized friendship paradox at the network level, regardless of the distribution of qualities. We also show that, in the proposed model, the degree and quality of each node are positively correlated regardless of how node qualities are distributed.
Submitted 2 October, 2014;
originally announced October 2014.
-
Efficient Distributed Online Prediction and Stochastic Optimization with Approximate Distributed Averaging
Authors:
Konstantinos I. Tsianos,
Michael G. Rabbat
Abstract:
We study distributed methods for online prediction and stochastic optimization. Our approach is iterative: in each round nodes first perform local computations and then communicate in order to aggregate information and synchronize their decision variables. Synchronization is accomplished through the use of a distributed averaging protocol. When an exact distributed averaging protocol is used, it is known that the optimal regret bound of $\mathcal{O}(\sqrt{m})$ can be achieved using the distributed mini-batch algorithm of Dekel et al. (2012), where $m$ is the total number of samples processed across the network. We focus on methods using approximate distributed averaging protocols and show that the optimal regret bound can also be achieved in this setting. In particular, we propose a gossip-based optimization method which achieves the optimal regret bound. The amount of communication required depends on the network topology through the second largest eigenvalue of the transition matrix of a random walk on the network. In the setting of stochastic optimization, the proposed gossip-based approach achieves nearly-linear scaling: the optimization error is guaranteed to be no more than $ε$ after $\mathcal{O}(\frac{1}{n ε^2})$ rounds, each of which involves $\mathcal{O}(\log n)$ gossip iterations, when nodes communicate over a well-connected graph. This scaling law is also observed in numerical experiments on a cluster.
Submitted 5 March, 2014; v1 submitted 3 March, 2014;
originally announced March 2014.
-
Degree Correlation in Scale-Free Graphs
Authors:
Babak Fotouhi,
Michael G. Rabbat
Abstract:
We obtain closed form expressions for the expected conditional degree distribution and the joint degree distribution of the linear preferential attachment model for network growth in the steady state. We consider the multiple-destination preferential attachment growth model, where incoming nodes at each timestep attach to $β$ existing nodes, selected by degree-proportional probabilities. By the conditional degree distribution $p(\ell| k)$, we mean the degree distribution of nodes that are connected to a node of degree $k$. By the joint degree distribution $p(k,\ell)$, we mean the proportion of links that connect nodes of degrees $k$ and $\ell$. In addition to this growth model, we consider the shifted-linear preferential growth model and solve for the same quantities, as well as a closed form expression for its steady-state degree distribution.
Submitted 23 August, 2013;
originally announced August 2013.
-
Voter Model with Arbitrary Degree Dependence: Clout, Confidence and Irreversibility
Authors:
Babak Fotouhi,
Michael G. Rabbat
Abstract:
In this paper, we consider the voter model with popularity bias. The influence of each node on its neighbors depends on its degree. We find the consensus probabilities and expected consensus times for each of the states. We also find the fixation probability, which is the probability that a single node whose state differs from every other node imposes its state on the entire system. In addition, we find the expected fixation time. Then two extensions to the model are proposed and the motivations behind them are discussed. The first one is confidence, where in addition to the states of neighbors, nodes take their own state into account at each update. We repeat the calculations for the augmented model and investigate the effects of adding confidence to the model. The second proposed extension is irreversibility, where one of the states is given the property that once nodes adopt it, they cannot switch back. The dynamics of densities, fixation times and consensus times are obtained.
Submitted 23 August, 2013;
originally announced August 2013.
-
A Massively Parallel Associative Memory Based on Sparse Neural Networks
Authors:
Zhe Yao,
Vincent Gripon,
Michael G. Rabbat
Abstract:
Associative memories store content in such a way that the content can be later retrieved by presenting the memory with a small portion of the content, rather than presenting the memory with an address as in more traditional memories. Associative memories are used as building blocks for algorithms within database engines, anomaly detection systems, compression algorithms, and face recognition systems. A classical example of an associative memory is the Hopfield neural network. Recently, Gripon and Berrou have introduced an alternative construction which builds on ideas from the theory of error correcting codes and which greatly outperforms the Hopfield network in capacity, diversity, and efficiency. In this paper we implement a variation of the Gripon-Berrou associative memory on a general purpose graphical processing unit (GPU). The work of Gripon and Berrou proposes two retrieval rules, sum-of-sum and sum-of-max. The sum-of-sum rule uses only matrix-vector multiplication and is easily implemented on the GPU. The sum-of-max rule is much less straightforward to implement because it involves non-linear operations. However, the sum-of-max rule gives significantly better retrieval error rates. We propose a hybrid rule tailored for implementation on a GPU which achieves an 880-fold speedup without sacrificing any accuracy.
Submitted 21 July, 2013; v1 submitted 27 March, 2013;
originally announced March 2013.
-
Network Growth with Arbitrary Initial Conditions: Analytical Results for Uniform and Preferential Attachment
Authors:
Babak Fotouhi,
Michael G. Rabbat
Abstract:
This paper provides time-dependent expressions for the expected degree distribution of a given network that is subject to growth, as a function of time. We consider both uniform attachment, where incoming nodes form links to existing nodes selected uniformly at random, and preferential attachment, where probabilities are assigned proportional to the degrees of the existing nodes. We consider the cases of single and multiple links being formed by each newly-introduced node. The initial conditions are arbitrary, that is, the solution depends on the degree distribution of the initial graph which is the substrate of the growth. Previous work in the literature focuses on the asymptotic state, that is, when the number of nodes added to the initial graph tends to infinity, rendering the effect of the initial graph negligible. Our contribution provides a solution for the expected degree distribution as a function of time, for arbitrary initial conditions. Previous results match our results in the asymptotic limit.
Submitted 12 November, 2013; v1 submitted 3 December, 2012;
originally announced December 2012.
-
Communication/Computation Tradeoffs in Consensus-Based Distributed Optimization
Authors:
Konstantinos I. Tsianos,
Sean Lawlor,
Michael G. Rabbat
Abstract:
We study the scalability of consensus-based distributed optimization algorithms by considering two questions: How many processors should we use for a given problem, and how often should they communicate when communication is not free? Central to our analysis is a problem-specific value $r$ which quantifies the communication/computation tradeoff. We show that organizing the communication among nodes as a $k$-regular expander graph (Reingold, Vadhan, and Wigderson, 2002) yields speedups, while when all pairs of nodes communicate (as in a complete graph), there is an optimal number of processors that depends on $r$. Surprisingly, a speedup can be obtained, in terms of the time to reach a fixed level of accuracy, by communicating less and less frequently as the computation progresses. Experiments on a real cluster solving metric learning and non-smooth convex minimization tasks demonstrate strong agreement between theory and practice.
Submitted 5 September, 2012;
originally announced September 2012.
-
Broadcast Gossip Algorithms for Consensus on Strongly Connected Digraphs
Authors:
Wu Shaochuan,
Michael G. Rabbat
Abstract:
We study a general framework for broadcast gossip algorithms which use companion variables to solve the average consensus problem. Each node maintains an initial state and a companion variable. Iterative updates are performed asynchronously whereby one random node broadcasts its current state and companion variable and all other nodes receiving the broadcast update their state and companion variable. We provide conditions under which this scheme is guaranteed to converge to a consensus solution, where all nodes have the same limiting values, on any strongly connected directed graph. Under stronger conditions, which are reasonable when the underlying communication graph is undirected, we guarantee that the consensus value is equal to the average, both in expectation and in the mean-squared sense. Our analysis uses tools from non-negative matrix theory and perturbation theory. The perturbation results rely on a parameter being sufficiently small. We characterize the allowable upper bound as well as the optimal setting for the perturbation parameter as a function of the network topology, and this allows us to characterize the worst-case rate of convergence. Simulations illustrate that, in comparison to existing broadcast gossip algorithms, the approaches proposed in this paper have the advantage that they simultaneously can be guaranteed to converge to the average consensus and they converge in a small number of broadcasts.
Submitted 31 August, 2012; v1 submitted 23 August, 2012;
originally announced August 2012.
-
The Effect of Exogenous Inputs and Defiant Agents on Opinion Dynamics with Local and Global Interactions
Authors:
Babak Fotouhi,
Michael G. Rabbat
Abstract:
Most conventional models for opinion dynamics account mainly for a fully local influence, where myopic agents decide their actions after they interact with other agents that are adjacent to them. For example, in the case of social interactions, this includes family, friends, and other strong social ties. The model proposed in this contribution embodies a global influence as well, where by global we mean that each node also observes a sample of the average behavior of the entire population (in the social example, people observe other people on the streets, subway, and other social venues). We consider a case where nodes have dichotomous states (examples include elections with two major parties, whether or not to adopt a new technology or product, and any yes/no opinion such as in voting on a referendum). The dynamics of states on a network with arbitrary degree distribution are studied. For a given initial condition, we find the probability to reach consensus on each state and the expected time to reach consensus. The effect of an exogenous bias on the average orientation of the system is investigated, to model mass media. To do so, we add an external field to the model that favors one of the states over the other. This field interferes with the regular decision process of each node and creates a constant probability to lean towards one of the states. We solve for the average state of the system as a function of time for given initial conditions. Then anti-conformists (stubborn nodes who never revise their states) are added to the network, in an effort to circumvent the external bias. We find necessary conditions on the number of these defiant nodes required to cancel the effect of the external bias. Our analysis is based on a mean field approximation of the agent opinions.
Submitted 30 November, 2012; v1 submitted 15 August, 2012;
originally announced August 2012.
-
Dynamics of Influence on Hierarchical Structures
Authors:
Babak Fotouhi,
Michael G. Rabbat
Abstract:
Dichotomous spin dynamics on a pyramidal hierarchical structure (the Bethe lattice) are studied. The system embodies a number of classes, where a class comprises nodes that are equidistant from the root (head node). Weighted links exist between nodes from the same and different classes. The spin (hereafter, state) of the head node is fixed. We solve for the dynamics of the system for different boundary conditions. We find necessary conditions so that the classes eventually repudiate or acquiesce in the state imposed by the head node. The results indicate that to reach unanimity across the hierarchy, it suffices that the bottom-most class adopts the same state as the head node. Then the rest of the hierarchy will inevitably comply. This also sheds light on the importance of mass media as a means of synchronization between the top-most and bottom-most classes. Surprisingly, in the case of discord between the head node and the bottom-most classes, the average state over all nodes inclines towards that of the bottom-most class regardless of the link weights and intra-class configurations. Hence the role of the bottom-most class is signified.
Submitted 10 June, 2013; v1 submitted 31 July, 2012;
originally announced July 2012.
-
Migration in a Small World: A Network Approach to Modeling Immigration Processes
Authors:
Babak Fotouhi,
Michael G. Rabbat
Abstract:
Existing theories of migration focus on either the micro- or the macroscopic behavior of populations; that is, either the average behavior of the entire population is modeled directly, or decisions of individuals are modeled directly. In this work, we seek to bridge these two perspectives by modeling individual agents' decisions to migrate while accounting for the social network structure that binds individuals into a population. Pecuniary considerations combined with the decisions of peers are the primary elements of the model, being the main driving forces of migration. People of the home country are modeled as nodes on a small-world network. A dichotomous state is associated with each node, indicating whether it emigrates to the destination country or it stays in the home country. We characterize the emigration rate in terms of the relative welfare and population of the home and destination countries. The time evolution and the steady-state fraction of emigrants are also derived.
Submitted 24 July, 2012;
originally announced July 2012.
-
The Impact of Communication Delays on Distributed Consensus Algorithms
Authors:
Konstantinos I. Tsianos,
Michael G. Rabbat
Abstract:
We study the effect of communication delays on distributed consensus algorithms. Two ways to model delays on a network are presented. The first model assumes that each link delivers messages with a fixed (constant) amount of delay, and the second model is more realistic, allowing for i.i.d. time-varying bounded delays. In contrast to previous work studying the effects of delays on consensus algorithms, the models studied here allow for a node to receive multiple messages from the same neighbor in one iteration. The analysis of the fixed delay model shows that convergence to a consensus is guaranteed and the rate of convergence is reduced by no more than a factor O(B^2) where B is the maximum delay on any link. For the time-varying delay model we also give a convergence proof which, for row-stochastic consensus protocols, is not a trivial consequence of ergodic matrix products. In both delay models, the consensus value is no longer the average, even if the original protocol was an averaging protocol. For this reason, we propose the use of a different consensus algorithm called Push-Sum [Kempe et al. 2003]. We model delays in the Push-Sum framework and show that convergence to the average consensus is guaranteed. This suggests that Push-Sum might be a better choice from a practical standpoint.
Submitted 24 July, 2012;
originally announced July 2012.
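
The Push-Sum protocol named in the abstract above can be illustrated with a minimal sketch. It shows only the basic synchronous, delay-free version on a small directed graph; the delay models analyzed in the paper are not reproduced here. The function name `push_sum`, the example graph, and the equal-split weighting rule are assumptions chosen for illustration.

```python
import numpy as np

def push_sum(values, out_neighbors, num_iters=200):
    """Synchronous Push-Sum sketch on a directed graph (no delays).

    Each node i keeps a running sum s[i] and a weight w[i]. In every
    iteration it splits both equally among itself and its out-neighbors.
    The ratio s[i] / w[i] is the node's estimate of the average.
    """
    n = len(values)
    s = np.array(values, dtype=float)
    w = np.ones(n)
    for _ in range(num_iters):
        new_s = np.zeros(n)
        new_w = np.zeros(n)
        for i in range(n):
            # Hypothetical weighting rule: equal shares to self and out-neighbors.
            targets = [i] + list(out_neighbors[i])
            share = 1.0 / len(targets)
            for j in targets:
                new_s[j] += share * s[i]
                new_w[j] += share * w[i]
        s, w = new_s, new_w
    return s / w

# Illustrative strongly connected directed graph on four nodes.
values = [1.0, 2.0, 3.0, 10.0]
out_neighbors = {0: [1], 1: [2], 2: [3, 0], 3: [0]}
estimates = push_sum(values, out_neighbors)
```

Because every node distributes all of its mass, the totals of `s` and `w` are conserved, which is why the ratio can recover the average even though the weights are not doubly stochastic.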
-
Distributed Strongly Convex Optimization
Authors:
Konstantinos I. Tsianos,
Michael G. Rabbat
Abstract:
A lot of effort has been invested into characterizing the convergence rates of gradient-based algorithms for non-linear convex optimization. Recently, motivated by large datasets and problems in machine learning, the interest has shifted towards distributed optimization. In this work we present a distributed algorithm for strongly convex constrained optimization. Each node in a network of n computers converges to the optimum of a strongly convex, L-Lipschitz continuous, separable objective at a rate O(log (sqrt(n) T) / T) where T is the number of iterations. This rate is achieved in the online setting where the data is revealed one at a time to the nodes, and in the batch setting where each node has access to its full local dataset from the start. The same convergence rate is achieved in expectation when the subgradients used at each node are corrupted with additive zero-mean noise.
Submitted 19 July, 2012; v1 submitted 12 July, 2012;
originally announced July 2012.
-
Multiscale Gossip for Efficient Decentralized Averaging in Wireless Packet Networks
Authors:
Konstantinos I. Tsianos,
Michael G. Rabbat
Abstract:
This paper describes and analyzes a hierarchical gossip algorithm for solving the distributed average consensus problem in wireless sensor networks. The network is recursively partitioned into subnetworks. Initially, nodes at the finest scale gossip to compute local averages. Then, using geographic routing to enable gossip between nodes that are not directly connected, these local averages are progressively fused up the hierarchy until the global average is computed. We show that the proposed hierarchical scheme with $k$ levels of hierarchy is competitive with state-of-the-art randomized gossip algorithms, in terms of message complexity, achieving $ε$-accuracy with high probability after $O\big(n \log \log n \log \frac{kn}ε \big)$ messages. Key to our analysis is the way in which the network is recursively partitioned. We find that the optimal scaling law is achieved when subnetworks at scale $j$ contain $O(n^{(2/3)^j})$ nodes; then the message complexity at any individual scale is $O(n \log \frac{kn}ε)$, and the total number of scales in the hierarchy grows slowly, as $Θ(\log \log n)$. Another important consequence of hierarchical construction is that the longest distance over which messages are exchanged is $O(n^{1/3})$ hops (at the highest scale), and most messages (at lower scales) travel shorter distances. In networks that use link-level acknowledgements, this results in less congestion and resource usage by reducing message retransmissions. Simulations illustrate that the proposed scheme is more message-efficient than existing state-of-the-art randomized gossip algorithms based on averaging along paths.
Submitted 27 February, 2012; v1 submitted 9 November, 2010;
originally announced November 2010.
-
Gossip Algorithms for Distributed Signal Processing
Authors:
Alexandros G. Dimakis,
Soummya Kar,
Jose M. F. Moura,
Michael G. Rabbat,
Anna Scaglione
Abstract:
Gossip algorithms are attractive for in-network processing in sensor networks because they do not require any specialized routing, there is no bottleneck or single point of failure, and they are robust to unreliable wireless network conditions. Recently, there has been a surge of activity in the computer science, control, signal processing, and information theory communities, developing faster and more robust gossip algorithms and deriving theoretical performance guarantees. This article presents an overview of recent work in the area. We describe convergence rate results, which are related to the number of transmitted messages and thus the amount of energy consumed in the network for gossiping. We discuss issues related to gossiping over wireless links, including the effects of quantization and noise, and we illustrate the use of gossip algorithms for canonical signal processing tasks including distributed estimation, source localization, and compression.
Submitted 27 March, 2010;
originally announced March 2010.
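
As a rough illustration of the kind of algorithm surveyed in the entry above, the following sketch implements basic randomized pairwise gossip for averaging: at each round one edge is picked at random and its two endpoints replace their values with the pairwise mean. The function name `pairwise_gossip`, the ring topology, and the round count are assumptions for illustration and are not taken from the article.

```python
import random

def pairwise_gossip(values, edges, num_rounds, seed=0):
    """Randomized pairwise gossip sketch.

    In each round a uniformly random edge (i, j) is activated and both
    endpoints replace their values with the average of the pair. The sum
    of all values is preserved, so the nodes drift toward the global mean.
    """
    rng = random.Random(seed)
    x = list(values)
    for _ in range(num_rounds):
        i, j = rng.choice(edges)
        avg = 0.5 * (x[i] + x[j])
        x[i] = avg
        x[j] = avg
    return x

# Illustrative five-node ring; the true average of the values is 6.0.
values = [4.0, 8.0, 0.0, 12.0, 6.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
result = pairwise_gossip(values, edges, num_rounds=2000)
```

No routing or central coordinator is involved: each exchange uses only one link, which reflects the robustness property the abstract highlights.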
-
Optimization and Analysis of Distributed Averaging with Short Node Memory
Authors:
Boris N. Oreshkin,
Mark J. Coates,
Michael G. Rabbat
Abstract:
In this paper, we demonstrate, both theoretically and by numerical examples, that adding a local prediction component to the update rule can significantly improve the convergence rate of distributed averaging algorithms. We focus on the case where the local predictor is a linear combination of the node's two previous values (i.e., two memory taps), and our update rule computes a combination of the predictor and the usual weighted linear combination of values received from neighbouring nodes. We derive the optimal mixing parameter for combining the predictor with the neighbors' values, and carry out a theoretical analysis of the improvement in convergence rate that can be obtained using this acceleration methodology. For a chain topology on n nodes, this leads to a factor of n improvement over the one-step algorithm, and for a two-dimensional grid, our approach achieves a factor of n^(1/2) improvement, in terms of the number of iterations required to reach a prescribed level of accuracy.
Submitted 5 February, 2010; v1 submitted 20 March, 2009;
originally announced March 2009.
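
One possible reading of the update rule described in the abstract above is sketched below. It is not the paper's exact scheme: the predictor here is a plain linear extrapolation of each node's two previous values, and the mixing parameter `alpha` is an arbitrary illustrative value, not the optimal one derived in the paper. The helper names `chain_weights` and `two_tap_averaging` are likewise assumptions.

```python
import numpy as np

def chain_weights(n):
    """Symmetric, doubly stochastic weights for a chain of n nodes (weight 1/3 per edge)."""
    W = np.zeros((n, n))
    for i in range(n - 1):
        W[i, i + 1] = W[i + 1, i] = 1.0 / 3.0
    W += np.diag(1.0 - W.sum(axis=1))
    return W

def two_tap_averaging(x0, W, alpha, num_iters):
    """Distributed averaging with a two-tap local predictor (assumed form).

    predictor = 2 * x(t) - x(t-1)   # assumed linear extrapolation of own history
    x(t+1)    = alpha * predictor + (1 - alpha) * W @ x(t)
    """
    x_prev = np.array(x0, dtype=float)
    x = x_prev.copy()  # start with x(-1) = x(0) so the sum of values is preserved
    for _ in range(num_iters):
        predictor = 2.0 * x - x_prev
        mixed = W @ x
        x, x_prev = alpha * predictor + (1.0 - alpha) * mixed, x
    return x

x0 = [0.0, 1.0, 2.0, 3.0, 4.0]
result = two_tap_averaging(x0, chain_weights(5), alpha=0.5, num_iters=500)
```

The predictor uses only values the node already stores, so the per-iteration communication is the same as in the one-step algorithm; only the local memory grows by one extra value per node.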