-
On the Effect of Bounded Rationality in Electricity Markets
Authors:
Lihui Yi,
Ermin Wei
Abstract:
Nash equilibrium is a common solution concept that captures strategic interaction in electricity market analysis. However, it requires a fundamental but impractical assumption that all market participants are fully rational, implying unlimited computational resources and cognitive abilities. To tackle the limitation, level-k reasoning is proposed and studied to model the bounded rational behaviors. In this paper, we consider a Cournot competition in electricity markets with two suppliers, both following level-k reasoning. One is a self-interested firm and the other serves as a benevolent social planner. First, we observe that the optimal strategy of the social planner corresponds to a particular rationality level, where being either less or more rational may both result in reduced social welfare. We then investigate the effect of bounded rationality on social welfare performance and find that it can largely deviate from that at the Nash equilibrium point. From the perspective of the social planner, we characterize optimal, expectation maximizing and robust maximin strategies, when having access to different information. Finally, by designing its utility function, we find that social welfare is better off if the social planner cooperates with or fights the self-interested firm. Numerical experiments further demonstrate and validate our findings.
Submitted 11 July, 2024; v1 submitted 29 April, 2024;
originally announced April 2024.
-
Efficient and Generalizable Certified Unlearning: A Hessian-free Recollection Approach
Authors:
Xinbao Qiao,
Meng Zhang,
Ming Tang,
Ermin Wei
Abstract:
Machine unlearning strives to uphold the data owners' right to be forgotten by enabling models to selectively forget specific data. Recent advances suggest precomputing and storing statistics extracted from second-order information and implementing unlearning through Newton-style updates. However, the theoretical analysis of these works often depends on restrictive assumptions of convexity and smoothness, and the required operations on the Hessian matrix are extremely costly. As a result, applying these works to high-dimensional models becomes challenging. In this paper, we propose an efficient Hessian-free certified unlearning method. We propose to maintain a statistical vector for each data point, computed through an affine stochastic recursion approximation of the difference between retrained and learned models. Our analysis does not involve inverting the Hessian and thus can be extended to non-convex non-smooth objectives. Under the same assumptions, we demonstrate that the proposed method improves on state-of-the-art theoretical studies in terms of generalization, unlearning guarantee, deletion capacity, and computation/storage complexity, and we show that the unlearned model of our proposed approach is close to or the same as the retrained model. Based on the strategy of recollecting statistics for forgetting data, we develop an algorithm that achieves near-instantaneous unlearning as it only requires a vector addition operation. Experiments demonstrate that the proposed scheme surpasses existing results by orders of magnitude in terms of time/storage costs, while also enhancing accuracy.
Submitted 3 June, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
A Stochastic Quasi-Newton Method for Non-convex Optimization with Non-uniform Smoothness
Authors:
Zhenyu Sun,
Ermin Wei
Abstract:
Classical convergence analyses for optimization algorithms rely on the widely-adopted uniform smoothness assumption. However, recent experimental studies have demonstrated that many machine learning problems exhibit non-uniform smoothness, meaning the smoothness factor is a function of the model parameter instead of a universal constant. In particular, it has been observed that the smoothness grows with respect to the gradient norm along the training trajectory. Motivated by this phenomenon, the recently introduced $(L_0, L_1)$-smoothness is a more general notion, compared to traditional $L$-smoothness, that captures such positive relationship between smoothness and gradient norm. Under this type of non-uniform smoothness, existing literature has designed stochastic first-order algorithms by utilizing gradient clipping techniques to obtain the optimal $\mathcal{O}(ε^{-3})$ sample complexity for finding an $ε$-approximate first-order stationary solution. Nevertheless, the studies of quasi-Newton methods are still lacking. Considering higher accuracy and more robustness for quasi-Newton methods, in this paper we propose a fast stochastic quasi-Newton method when there exists non-uniformity in smoothness. Leveraging gradient clipping and variance reduction, our algorithm can achieve the best-known $\mathcal{O}(ε^{-3})$ sample complexity and enjoys convergence speedup with simple hyperparameter tuning. Our numerical experiments show that our proposed algorithm outperforms the state-of-the-art approaches.
Submitted 22 March, 2024;
originally announced March 2024.
-
Strategic Data Revocation in Federated Unlearning
Authors:
Ningning Ding,
Ermin Wei,
Randall Berry
Abstract:
By allowing users to erase their data's impact on federated learning models, federated unlearning protects users' right to be forgotten and data privacy. Despite a burgeoning body of research on federated unlearning's technical feasibility, there is a paucity of literature investigating the considerations behind users' requests for data revocation. This paper proposes a non-cooperative game framework to study users' data revocation strategies in federated unlearning. We prove the existence of a Nash equilibrium. However, users' best response strategies are coupled via model performance and unlearning costs, which makes the equilibrium computation challenging. We obtain the Nash equilibrium by establishing its equivalence with a much simpler auxiliary optimization problem. We also summarize users' multi-dimensional attributes into a single-dimensional metric and derive the closed-form characterization of an equilibrium, when users' unlearning costs are negligible. Moreover, we compare the cases of allowing and forbidding partial data revocation in federated unlearning. Interestingly, the results reveal that allowing partial revocation does not necessarily increase users' data contributions or payoffs due to the game structure. Additionally, we demonstrate that positive externalities may exist between users' data revocation decisions when users incur unlearning costs, while this is not the case when their unlearning costs are negligible.
Submitted 6 December, 2023; v1 submitted 2 December, 2023;
originally announced December 2023.
-
Effective properties of composites with open boundary conditions
Authors:
Guo-Qing Gu,
En-Bo Wei
Abstract:
A primitive problem of predicting the effective properties of composites is open boundary conditions. In this paper, Eshelby's transformation field method is developed to solve the open boundary problem of two-phase composites having arbitrary inclusion geometry. The transformation fields are introduced in the composite system to cope with the complicated interface boundary-value problem of the composite. Furthermore, the open boundary problem is solved by Hermite polynomial, which is used to express the transformation fields and the perturbation fields. As an example, the formulas of Eshelby's method are derived for calculating the effective dielectric property of two-dimensional isotropic dielectric composites having open boundary conditions. The validity of the method is verified by comparing the effective responses estimated by the method with the exact solutions of dilute limit. The results show that the method is valid to solve the open boundary problem of composites having complex geometric inclusions.
Submitted 28 November, 2023;
originally announced November 2023.
-
Incentivized Federated Learning and Unlearning
Authors:
Ningning Ding,
Zhenyu Sun,
Ermin Wei,
Randall Berry
Abstract:
To protect users' right to be forgotten in federated learning, federated unlearning aims at eliminating the impact of leaving users' data on the global learned model. The current research in federated unlearning mainly concentrated on developing effective and efficient unlearning techniques. However, the issue of incentivizing valuable users to remain engaged and preventing their data from being unlearned is still under-explored, yet important to the unlearned model performance. This paper focuses on the incentive issue and develops an incentive mechanism for federated learning and unlearning. We first characterize the leaving users' impact on the global model accuracy and the required communication rounds for unlearning. Building on these results, we propose a four-stage game to capture the interaction and information updates during the learning and unlearning process. A key contribution is to summarize users' multi-dimensional private information into one-dimensional metrics to guide the incentive design. We further investigate whether allowing federated unlearning is beneficial to the server and users, compared to a scenario without unlearning. Interestingly, users usually have a larger total payoff in the scenario with higher costs, due to the server's excess incentives under information asymmetry. The numerical results demonstrate the necessity of unlearning incentives for retaining valuable leaving users, and also show that our proposed mechanisms decrease the server's cost by up to 53.91% compared to state-of-the-art benchmarks.
Submitted 1 December, 2023; v1 submitted 23 August, 2023;
originally announced August 2023.
-
Supply Function Equilibrium in Networked Electricity Markets
Authors:
Yuanzhang Xiao,
Chaithanya Bandi,
Ermin Wei
Abstract:
We study deregulated power markets with strategic power suppliers. In deregulated markets, each supplier submits its supply function (i.e., the amount of electricity it is willing to produce at various prices) to the independent system operator (ISO), who based on the submitted supply functions, dispatches the suppliers to clear the market with minimal total generation cost. If all suppliers reported their true marginal cost functions as supply functions, the market outcome would be efficient (i.e., the total generation cost is minimized). However, when suppliers are strategic and aim to maximize their own profits, the reported supply functions are not necessarily the true marginal cost functions, and the resulting market outcome may be inefficient. The efficiency loss depends crucially on the topology of the underlying transmission network. This paper provides an analytical upper bound of the efficiency loss due to strategic suppliers, and proves that the bound is tight under a large class of transmission networks (i.e., weakly cyclic networks). Our upper bound sheds light on how the efficiency loss depends on the transmission network topology (e.g., the degrees of nodes, the admittances and flow limits of transmission lines).
Submitted 22 August, 2023;
originally announced August 2023.
-
Optimal EV Charging Decisions Considering Charging Rate Characteristics and Congestion Effects
Authors:
Lihui Yi,
Ermin Wei
Abstract:
With the rapid growth in the demand for plug-in electric vehicles (EVs), the corresponding charging infrastructures are expanding. These charging stations are located at various places and with different congestion levels. EV drivers face an important decision in choosing which charging station to go to in order to reduce their overall time costs. However, existing literature either assumes a flat charging rate and hence overlooks the physical characteristics of an EV battery where charging rate is typically reduced as the battery charges, or ignores the effect of other drivers on an EV's decision making process. In this paper, we consider both the predetermined exogenous wait cost and the endogenous congestion induced by other drivers' strategic decisions, and propose a differential equation based approach to find the optimal strategies. We analytically characterize the equilibrium strategies and find that co-located EVs may make different decisions depending on the charging rate and/or remaining battery levels. Through numerical experiments, we investigate the impact of charging rate characteristics, modeling parameters and the consideration of endogenous congestion levels on the optimal charging decisions. Finally, we conduct numerical studies on real-world data and find that some EV users with slower charging rates may benefit from the participation of fast-charging EVs.
Submitted 11 August, 2023;
originally announced August 2023.
-
Exact Community Recovery in the Geometric SBM
Authors:
Julia Gaudio,
Xiaochun Niu,
Ermin Wei
Abstract:
We study the problem of exact community recovery in the Geometric Stochastic Block Model (GSBM), where each vertex has an unknown community label as well as a known position, generated according to a Poisson point process in $\mathbb{R}^d$. Edges are formed independently conditioned on the community labels and positions, where vertices may only be connected by an edge if they are within a prescribed distance of each other. The GSBM thus favors the formation of dense local subgraphs, which commonly occur in real-world networks, a property that makes the GSBM qualitatively very different from the standard Stochastic Block Model (SBM). We propose a linear-time algorithm for exact community recovery, which succeeds down to the information-theoretic threshold, confirming a conjecture of Abbe, Baccelli, and Sankararaman. The algorithm involves two phases. The first phase exploits the density of local subgraphs to propagate estimated community labels among sufficiently occupied subregions, and produces an almost-exact vertex labeling. The second phase then refines the initial labels using a Poisson testing procedure. Thus, the GSBM enjoys local to global amplification just as the SBM, with the advantage of admitting an information-theoretically optimal, linear-time algorithm.
Submitted 5 January, 2024; v1 submitted 20 July, 2023;
originally announced July 2023.
-
Eshelby's method for unidirectional periodic composites
Authors:
Guo-Qing Gu,
En-Bo Wei
Abstract:
Open boundary conditions are always used in investigating the effective properties of composites. In this paper, Eshelby's transformation field method is developed to deal with the effective response of unidirectional periodic composites having an open boundary. In the method, Hermite polynomials are used to cope with the open boundary conditions of the perturbation fields induced by the inclusions. The transformation fields are introduced in the composite system to meet the interface conditions between complex structure inclusions and matrix. As an example, Eshelby's method is used to estimate the effective responses of two-dimensional unidirectional periodic dielectric composites having an open boundary. The validity is verified by comparing the effective responses calculated by the method with the exact solutions of dilute limit. It is shown that the method is valid to solve the open boundary problem of unidirectional periodic composites having complex geometric inclusions.
Submitted 14 June, 2023;
originally announced June 2023.
-
Understanding Generalization of Federated Learning via Stability: Heterogeneity Matters
Authors:
Zhenyu Sun,
Xiaochun Niu,
Ermin Wei
Abstract:
Generalization performance is a key metric in evaluating machine learning models when applied to real-world applications. Good generalization indicates the model can predict unseen data correctly when trained under a limited number of data. Federated learning (FL), which has emerged as a popular distributed learning framework, allows multiple devices or clients to train a shared model without violating privacy requirements. While the existing literature has studied extensively the generalization performances of centralized machine learning algorithms, similar analysis in the federated settings is either absent or with very restrictive assumptions on the loss functions. In this paper, we aim to analyze the generalization performances of federated learning by means of algorithmic stability, which measures the change of the output model of an algorithm when perturbing one data point. Three widely-used algorithms are studied, including FedAvg, SCAFFOLD, and FedProx, under convex and non-convex loss functions. Our analysis shows that the generalization performances of models trained by these three algorithms are closely related to the heterogeneity of clients' datasets as well as the convergence behaviors of the algorithms. Particularly, in the i.i.d. setting, our results recover the classical results of stochastic gradient descent (SGD).
Submitted 6 June, 2023;
originally announced June 2023.
-
Collaborative Multi-Agent Video Fast-Forwarding
Authors:
Shuyue Lan,
Zhilu Wang,
Ermin Wei,
Amit K. Roy-Chowdhury,
Qi Zhu
Abstract:
Multi-agent applications have recently gained significant popularity. In many computer vision tasks, a network of agents, such as a team of robots with cameras, could work collaboratively to perceive the environment for efficient and accurate situation awareness. However, these agents often have limited computation, communication, and storage resources. Thus, reducing resource consumption while still providing an accurate perception of the environment becomes an important goal when deploying multi-agent systems. To achieve this goal, we identify and leverage the overlap among different camera views in multi-agent systems for reducing the processing, transmission and storage of redundant/unimportant video frames. Specifically, we have developed two collaborative multi-agent video fast-forwarding frameworks in distributed and centralized settings, respectively. In these frameworks, each individual agent can selectively process or skip video frames at adjustable paces based on multiple strategies via reinforcement learning. Multiple agents then collaboratively sense the environment via either 1) a consensus-based distributed framework called DMVF that periodically updates the fast-forwarding strategies of agents by establishing communication and consensus among connected neighbors, or 2) a centralized framework called MFFNet that utilizes a central controller to decide the fast-forwarding strategies for agents based on collected data. We demonstrate the efficacy and efficiency of our proposed frameworks on a real-world surveillance video dataset VideoWeb and a new simulated driving dataset CarlaSim, through extensive simulations and deployment on an embedded platform with TCP communication. We show that compared with other approaches in the literature, our frameworks achieve better coverage of important frames, while significantly reducing the number of frames processed at each agent.
Submitted 27 May, 2023;
originally announced May 2023.
-
PrimeTime: A Finite-Time Consensus Protocol for Open Networks
Authors:
Henry W. Abrahamson,
Ermin Wei
Abstract:
In distributed problems where consensus between agents is required but average consensus is not desired, it can be necessary for each agent to know not only the data of each other agent in the network, but also the origin of each piece of data before consensus can be reached. However, transmitting large tables of data with IDs can cause the size of an agent's message to increase dramatically, while truncating down to fewer pieces of data to keep the message size small can lead to problems with the speed of achieving consensus. Also, many existing consensus protocols are not robust against agents leaving and entering the network. We introduce PrimeTime, a novel communication protocol that exploits the properties of prime numbers to quickly and efficiently share small integer data across an open network. For sufficiently small networks or small integer data, we show that messages formed by PrimeTime require fewer bits than messages formed by simply tabularizing the data and IDs to be transmitted.
Submitted 3 April, 2023;
originally announced April 2023.
-
Attrition-Aware Adaptation for Multi-Agent Patrolling
Authors:
Anthony Goeckner,
Xinliang Li,
Ermin Wei,
Qi Zhu
Abstract:
Multi-agent patrolling is a key problem in a variety of domains such as intrusion detection, area surveillance, and policing which involves repeated visits by a group of agents to specified points in an environment. While the problem is well-studied, most works do not provide performance guarantees and either do not consider agent attrition or impose significant communication requirements to enable adaptation. In this work, we present the Adaptive Heuristic-based Patrolling Algorithm, which is capable of adaptation to agent loss using minimal communication by taking advantage of Voronoi partitioning, and which meets guaranteed performance bounds. Additionally, we provide new centralized and distributed mathematical programming formulations of the patrolling problem, analyze the properties of Voronoi partitioning, and finally, show the value of our adaptive heuristic algorithm by comparison with various benchmark algorithms using a realistic simulation environment based on the Robot Operating System (ROS) 2.
Submitted 12 January, 2024; v1 submitted 3 April, 2023;
originally announced April 2023.
-
DISH: A Distributed Hybrid Optimization Method Leveraging System Heterogeneity
Authors:
Xiaochun Niu,
Ermin Wei
Abstract:
We study distributed optimization problems over multi-agent networks, including consensus and network flow problems. Existing distributed methods neglect the heterogeneity among agents' computational capabilities, limiting their effectiveness. To address this, we propose DISH, a distributed hybrid method that leverages system heterogeneity. DISH allows agents with higher computational capabilities or lower computational costs to perform local Newton-type updates while others adopt simpler gradient-type updates. Notably, DISH covers existing methods like EXTRA, DIGing, and ESOM-0 as special cases. To analyze DISH's performance with general update directions, we formulate distributed problems as minimax problems and introduce GRAND (gradient-related ascent and descent) and its alternating version, Alt-GRAND, for solving these problems. GRAND generalizes DISH to centralized minimax settings, accommodating various descent ascent update directions, including gradient-type, Newton-type, scaled gradient, and other general directions, within acute angles to the partial gradients. Theoretical analysis establishes global sublinear and linear convergence rates for GRAND and Alt-GRAND in strongly-convex-nonconcave and strongly-convex-PL settings, providing linear rates for DISH. In addition, we derive the local superlinear convergence of Newton-based variations of GRAND in centralized settings. Numerical experiments validate the effectiveness of our methods.
Submitted 1 August, 2023; v1 submitted 5 December, 2022;
originally announced December 2022.
-
Age-Dependent Differential Privacy
Authors:
Meng Zhang,
Ermin Wei,
Randall Berry,
Jianwei Huang
Abstract:
The proliferation of real-time applications has motivated extensive research on analyzing and optimizing data freshness in the context of \textit{age of information}. However, classical frameworks of privacy (e.g., differential privacy (DP)) have overlooked the impact of data freshness on privacy guarantees, which may lead to unnecessary accuracy loss when trying to achieve meaningful privacy guarantees in time-varying databases. In this work, we introduce \textit{age-dependent DP}, taking into account the underlying stochastic nature of a time-varying database. In this new framework, we establish a connection between classical DP and age-dependent DP, based on which we characterize the impact of data staleness and temporal correlation on privacy guarantees. Our characterization demonstrates that \textit{aging}, i.e., using stale data inputs and/or postponing the release of outputs, can be a new strategy to protect data privacy in addition to noise injection in the traditional DP framework. Furthermore, to generalize our results to a multi-query scenario, we present a sequential composition result for age-dependent DP under any publishing and aging policies. We then characterize the optimal tradeoffs between privacy risk and utility and show how this can be achieved. Finally, case studies show that to achieve a target of an arbitrarily small privacy risk in a single-query case, combining aging and noise injection only leads to a bounded accuracy loss, whereas using noise injection only (as in the benchmark case of DP) will lead to an unbounded accuracy loss.
Submitted 3 September, 2022;
originally announced September 2022.
-
Recent Developments in Security-Constrained AC Optimal Power Flow: Overview of Challenge 1 in the ARPA-E Grid Optimization Competition
Authors:
Ignacio Aravena,
Daniel K. Molzahn,
Shixuan Zhang,
Cosmin G. Petra,
Frank E. Curtis,
Shenyinying Tu,
Andreas Wächter,
Ermin Wei,
Elizabeth Wong,
Amin Gholami,
Kaizhao Sun,
Xu Andy Sun,
Stephen T. Elbert,
Jesse T. Holzer,
Arun Veeramany
Abstract:
The optimal power flow problem is central to many tasks in the design and operation of electric power grids. This problem seeks the minimum cost operating point for an electric power grid while satisfying both engineering requirements and physical laws describing how power flows through the electric network. By additionally considering the possibility of component failures and using an accurate AC power flow model of the electric network, the security-constrained AC optimal power flow (SC-AC-OPF) problem is of paramount practical relevance. To assess recent progress in solution algorithms for SC-AC-OPF problems and spur new innovations, the U.S. Department of Energy's Advanced Research Projects Agency--Energy (ARPA-E) organized Challenge 1 of the Grid Optimization (GO) competition. This paper describes the SC-AC-OPF problem formulation used in the competition, overviews historical developments and the state of the art in SC-AC-OPF algorithms, discusses the competition, and summarizes the algorithms used by the top three teams in Challenge 1 of the GO Competition (Teams gollnlp, GO-SNIP, and GMI-GO).
Submitted 15 June, 2022;
originally announced June 2022.
-
DISH: A Distributed Hybrid Primal-Dual Optimization Framework to Utilize System Heterogeneity
Authors:
Xiaochun Niu,
Ermin Wei
Abstract:
We consider solving distributed consensus optimization problems over multi-agent networks. Current distributed methods fail to capture the heterogeneity among agents' local computation capacities. We propose DISH as a distributed hybrid primal-dual algorithmic framework to handle and utilize system heterogeneity. Specifically, DISH allows those agents with higher computational capabilities or cheaper computational costs to implement Newton-type updates locally, while other agents can adopt the much simpler gradient-type updates. We show that DISH is a general framework and includes EXTRA, DIGing, and ESOM-0 as special cases. Moreover, when all agents take both primal and dual Newton-type updates, DISH approximates Newton's method by estimating both primal and dual Hessians. Theoretically, we show that DISH achieves a linear (Q-linear) convergence rate to the exact optimal solution for strongly convex functions, regardless of agents' choices of gradient-type and Newton-type updates. Finally, we perform numerical studies to demonstrate the efficacy of DISH in practice. To the best of our knowledge, DISH is the first hybrid method allowing heterogeneous local updates for distributed consensus optimization under general network topology with provable convergence and rate guarantees.
Submitted 7 June, 2022;
originally announced June 2022.
-
A Communication-efficient Algorithm with Linear Convergence for Federated Minimax Learning
Authors:
Zhenyu Sun,
Ermin Wei
Abstract:
In this paper, we study a large-scale multi-agent minimax optimization problem, which models many interesting applications in statistical learning and game theory, including Generative Adversarial Networks (GANs). The overall objective is a sum of agents' private local objective functions. We first analyze an important special case, the empirical minimax problem, where the overall objective approximates a true population minimax risk by statistical samples. We provide generalization bounds for learning with this objective through Rademacher complexity analysis. Then, we focus on the federated setting, where agents can perform local computation and communicate with a central server. Most existing federated minimax algorithms either require communication per iteration or lack performance guarantees, with the exception of Local Stochastic Gradient Descent Ascent (SGDA), a multiple-local-update descent ascent algorithm which guarantees convergence under a diminishing stepsize. By analyzing Local SGDA under the ideal condition of no gradient noise, we show that generally it cannot guarantee exact convergence with constant stepsizes and thus suffers from slow rates of convergence. To tackle this issue, we propose FedGDA-GT, an improved Federated (Fed) Gradient Descent Ascent (GDA) method based on Gradient Tracking (GT). When local objectives are Lipschitz smooth and strongly-convex-strongly-concave, we prove that FedGDA-GT converges linearly with a constant stepsize to a global $ε$-approximation solution with $\mathcal{O}(\log (1/ε))$ rounds of communication, which matches the time complexity of the centralized GDA method. Finally, we numerically show that FedGDA-GT outperforms Local SGDA.
Submitted 6 June, 2023; v1 submitted 2 June, 2022;
originally announced June 2022.
-
Comprehensive Performance Analysis of Homomorphic Cryptosystems for Practical Data Processing
Authors:
Vasily Sidorov,
Ethan Yi Fan Wei,
Wee Keong Ng
Abstract:
Oblivious data processing has been an on and off topic for the last decade or so. It provides great opportunities for secure data management and processing, especially in the cloud. At the same time, modern computing resources seem to be affordable enough to allow for practical use of homomorphic cryptography. Yet, the availability of products that offer practical homomorphic data processing is extremely scarce. As part of a project aimed at developing a practical homomorphic data management platform, we have conducted an extensive study of homomorphic cryptosystems' performance, the results of which are presented in this work. For this work we chose the following five cryptosystems: fully homomorphic HElib and SEAL, somewhat fully homomorphic PyAono, and partially homomorphic Paillier and ElGamal. In the discussion of the aggregated results, we suggest that partially homomorphic cryptosystems could be used today in certain practical applications, whereas time has not yet come for the fully homomorphic ones.
Submitted 7 February, 2022;
originally announced February 2022.
-
A Decomposition Algorithm for Large-Scale Security-Constrained AC Optimal Power Flow
Authors:
Frank E. Curtis,
Daniel K. Molzahn,
Shenyinying Tu,
Andreas Wächter,
Ermin Wei,
Elizabeth Wong
Abstract:
A decomposition algorithm for solving large-scale security-constrained AC optimal power flow problems is presented. The formulation considered is the one used in the ARPA-E Grid Optimization (GO) Competition, Challenge 1, held from November 2018 through October 2019. The techniques found to be most effective in terms of performance in the challenge are presented, including strategies for contingency selection, fast contingency evaluation, handling complementarity constraints, avoiding issues related to degeneracy, and exploiting parallelism. The results of numerical experiments are provided to demonstrate the effectiveness of the proposed techniques as compared to alternative strategies.
Submitted 4 October, 2021;
originally announced October 2021.
-
Faithful Edge Federated Learning: Scalability and Privacy
Authors:
Meng Zhang,
Ermin Wei,
Randall Berry
Abstract:
Federated learning enables machine learning algorithms to be trained over a network of multiple decentralized edge devices without requiring the exchange of local datasets. Successfully deploying federated learning requires ensuring that agents (e.g., mobile devices) faithfully execute the intended algorithm, which has been largely overlooked in the literature. In this study, we first use risk bounds to analyze how the key feature of federated learning, unbalanced and non-i.i.d. data, affects agents' incentives to voluntarily participate and obediently follow traditional federated learning algorithms. To be more specific, our analysis reveals that agents with less typical data distributions and relatively more samples are more likely to opt out of or tamper with federated learning algorithms. To this end, we formulate the first faithful implementation problem of federated learning and design two faithful federated learning mechanisms which satisfy economic properties, scalability, and privacy. Further, the time complexity of computing all agents' payments in the number of agents is $\mathcal{O}(1)$. First, we design a Faithful Federated Learning (FFL) mechanism which approximates the Vickrey-Clarke-Groves (VCG) payments via an incremental computation. We show that it achieves (probably approximate) optimality, faithful implementation, voluntary participation, and some other economic properties (such as budget balance). Second, by partitioning agents into several subsets, we present a scalable VCG mechanism approximation. We further design a scalable and Differentially Private FFL (DP-FFL) mechanism, the first differentially private faithful mechanism, that maintains the economic properties. Our mechanism enables one to make three-way performance tradeoffs among privacy, the iterations needed, and payment accuracy loss.
Submitted 26 October, 2021; v1 submitted 30 June, 2021;
originally announced June 2021.
-
FedHybrid: A Hybrid Primal-Dual Algorithm Framework for Federated Optimization
Authors:
Xiaochun Niu,
Ermin Wei
Abstract:
We consider a multi-agent consensus optimization problem over a server-client (federated) network, where all clients are connected to a central server. Current distributed algorithms fail to capture the heterogeneity in clients' local computation capacities. Motivated by the generalized Method of Multipliers in centralized optimization, we derive an approximate Newton-type primal-dual method with a practical distributed implementation by utilizing the server-client topology. Then we propose a new primal-dual algorithm framework FedHybrid that allows different clients to perform various types of updates. Specifically, each client can choose to perform either gradient-type or Newton-type updates. We propose a novel analysis framework for primal-dual methods and obtain a linear convergence rate of FedHybrid for strongly convex functions, regardless of clients' choices of gradient-type or Newton-type updates. Numerical studies are provided to demonstrate the efficacy of our method in practice. To the best of our knowledge, this is the first hybrid algorithmic framework allowing heterogeneous local updates for distributed consensus optimization with a provable convergence and rate guarantee.
Submitted 12 July, 2021; v1 submitted 2 June, 2021;
originally announced June 2021.
-
On the Convergence of NEAR-DGD for Nonconvex Optimization with Second Order Guarantees
Authors:
Charikleia Iakovidou,
Ermin Wei
Abstract:
We consider the setting where the nodes of an undirected, connected network collaborate to solve a shared objective modeled as the sum of smooth functions. We assume that each summand is privately known by a unique node. NEAR-DGD is a distributed first order method which permits adjusting the amount of communication between nodes relative to the amount of computation performed locally in order to balance convergence accuracy and total application cost. In this work, we generalize the convergence properties of a variant of NEAR-DGD from the strongly convex to the nonconvex case. Under mild assumptions, we show convergence to minimizers of a custom Lyapunov function. Moreover, we demonstrate that the gap between those minimizers and the second order stationary solutions of the original problem can become arbitrarily small depending on the choice of algorithm parameters. Finally, we accompany our theoretical analysis with a numerical experiment to evaluate the empirical performance of NEAR-DGD in the nonconvex setting.
Submitted 9 April, 2021; v1 submitted 25 March, 2021;
originally announced March 2021.
-
S-NEAR-DGD: A Flexible Distributed Stochastic Gradient Method for Inexact Communication
Authors:
Charikleia Iakovidou,
Ermin Wei
Abstract:
We present and analyze a stochastic distributed method (S-NEAR-DGD) that can tolerate inexact computation and inaccurate information exchange to alleviate the problems of costly gradient evaluations and bandwidth-limited communication in large-scale systems. Our method is based on a class of flexible, distributed first order algorithms that allow for the trade-off of computation and communication to best accommodate the application setting. We assume that all the information exchange between nodes is subject to random distortion and that only stochastic approximations of the true gradients are available. Our theoretical results prove that the proposed algorithm converges linearly in expectation to a neighborhood of the optimal solution for strongly convex objective functions with Lipschitz gradients. We characterize the dependence of this neighborhood on algorithm and network parameters, the quality of the communication channel and the precision of the stochastic gradient approximations used. Finally, we provide numerical results to evaluate the empirical performance of our method.
Submitted 29 January, 2021;
originally announced February 2021.
-
Distributed Multi-agent Video Fast-forwarding
Authors:
Shuyue Lan,
Zhilu Wang,
Amit K. Roy-Chowdhury,
Ermin Wei,
Qi Zhu
Abstract:
In many intelligent systems, a network of agents collaboratively perceives the environment for better and more efficient situation awareness. As these agents often have limited resources, it could be greatly beneficial to identify the content overlapping among camera views from different agents and leverage it for reducing the processing, transmission and storage of redundant/unimportant video frames. This paper presents a consensus-based distributed multi-agent video fast-forwarding framework, named DMVF, that fast-forwards multi-view video streams collaboratively and adaptively. In our framework, each camera view is addressed by a reinforcement learning based fast-forwarding agent, which periodically chooses from multiple strategies to selectively process video frames and transmits the selected frames at adjustable paces. During every adaptation period, each agent communicates with a number of neighboring agents, evaluates the importance of the selected frames from itself and those from its neighbors, refines such evaluation together with other agents via a system-wide consensus algorithm, and uses such evaluation to decide their strategy for the next period. Compared with approaches in the literature on a real-world surveillance video dataset VideoWeb, our method significantly improves the coverage of important frames and also reduces the number of frames processed in the system.
Submitted 10 August, 2020;
originally announced August 2020.
-
Learning to Price Vehicle Service with Unknown Demand
Authors:
Haoran Yu,
Ermin Wei,
Randall A. Berry
Abstract:
It can be profitable for vehicle service providers to set service prices based on users' travel demand on different origin-destination pairs. The prior studies on the spatial pricing of vehicle service rely on the assumption that providers know users' demand. In this paper, we study a monopolistic provider who initially does not know users' demand and needs to learn it over time by observing the users' responses to the service prices. We design a pricing and vehicle supply policy, considering the tradeoff between exploration (i.e., learning the demand) and exploitation (i.e., maximizing the provider's short-term payoff). Considering that the provider needs to ensure the vehicle flow balance at each location, its pricing and supply decisions for different origin-destination pairs are tightly coupled. This makes it challenging to theoretically analyze the performance of our policy. We analyze the gap between the provider's expected time-average payoffs under our policy and a clairvoyant policy, which makes decisions based on complete information of the demand. We prove that after running our policy for D days, the loss in the expected time-average payoff can be at most O((ln D)^0.5 D^(-0.25)), which decays to zero as D approaches infinity.
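To get a feel for how slowly the stated bound decays, the following minimal Python sketch evaluates the shape $(\ln D)^{0.5} D^{-0.25}$ for a few horizons. The function name `regret_bound` is hypothetical, and the constant hidden by the O(.) notation is not given in the abstract, so it is simply set to 1 here; the numbers show only the trend, not the actual loss.

```python
import math


def regret_bound(days: int) -> float:
    """Shape of the bound (ln D)^0.5 * D^(-0.25); the O(.) constant is assumed to be 1."""
    if days < 2:
        raise ValueError("days must be at least 2 so that ln(D) is positive")
    return math.sqrt(math.log(days)) * days ** -0.25


if __name__ == "__main__":
    # Print the bound shape for increasingly long horizons D.
    for d in (10, 100, 10_000, 1_000_000):
        print(f"D = {d:>9,d}: bound shape = {regret_bound(d):.4f}")
```

The expression is strictly decreasing once $\ln D > 2$ (roughly $D \ge 8$), because the $D^{-0.25}$ factor eventually dominates the slowly growing $(\ln D)^{0.5}$ factor.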
Submitted 7 July, 2020;
originally announced July 2020.
-
Optimal and Quantized Mechanism Design for Fresh Data Acquisition
Authors:
Meng Zhang,
Ahmed Arafa,
Ermin Wei,
Randall A. Berry
Abstract:
The proliferation of real-time applications has spurred much interest in data freshness, captured by the {\it age-of-information} (AoI) metric. When strategic data sources have private market information, a fundamental economic challenge is how to incentivize them to acquire fresh data and optimize the age-related performance. In this work, we consider an information update system in which a destination acquires, and pays for, fresh data updates from multiple sources. The destination incurs an age-related cost, modeled as a general increasing function of the AoI. Each source is strategic and incurs a sampling cost, which is its private information and may not be truthfully reported to the destination. The destination decides on the price of updates, when to get them, and who should generate them, based on the sources' reported sampling costs. We show that a benchmark that naively trusts the sources' reports can lead to an arbitrarily bad outcome compared to the case where sources truthfully report. To tackle this issue, we design an optimal (economic) mechanism for timely information acquisition following Myerson's seminal work. To this end, our proposed optimal mechanism minimizes the sum of the destination's age-related cost and its payment to the sources, while ensuring that the sources truthfully report their private information and will voluntarily participate in the mechanism. However, finding the optimal mechanisms may suffer from \textit{prohibitively expensive computational overheads} as it involves solving a nonlinear infinite-dimensional optimization problem. We further propose a quantized version of the optimal mechanism that achieves asymptotic optimality, maintains the other economic properties, and enables one to trade off optimality against computational overheads.
Submitted 15 December, 2020; v1 submitted 28 June, 2020;
originally announced June 2020.
-
On the Convergence of Nested Decentralized Gradient Methods with Multiple Consensus and Gradient Steps
Authors:
Albert S. Berahas,
Raghu Bollapragada,
Ermin Wei
Abstract:
In this paper, we consider minimizing a sum of local convex objective functions in a distributed setting, where the cost of communication and/or computation can be expensive. We extend and generalize the analysis for a class of nested gradient-based distributed algorithms (NEAR-DGD; Berahas, Bollapragada, Keskar and Wei, 2018) to account for multiple gradient steps at every iteration. We show the effect of performing multiple gradient steps on the rate of convergence and on the size of the neighborhood of convergence, and prove R-Linear convergence to the exact solution with a fixed number of gradient steps and increasing number of consensus steps. We test the performance of the generalized method on quadratic functions and show the effect of multiple consensus and gradient steps in terms of iterations, number of gradient evaluations, number of communications and cost.
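The nested structure described above (a local gradient step followed by several consensus rounds, with the number of rounds growing over time) can be illustrated on scalar quadratics. The sketch below is a simplified illustration and not the authors' implementation: it takes a single gradient step per iteration rather than multiple, and the function name `near_dgd`, the local objectives $f_i(x) = 0.5\, a_i (x - b_i)^2$, the mixing matrix `W`, and the step size are all assumptions chosen for the example.

```python
def near_dgd(a, b, W, alpha=0.1, iterations=100):
    """Nested gradient/consensus iteration on f_i(x) = 0.5 * a_i * (x - b_i)^2.

    a, b  : per-agent curvature and target (hypothetical example data)
    W     : doubly stochastic mixing matrix as a list of rows
    alpha : constant step size, assumed smaller than 1 / max(a)
    """
    n = len(a)
    x = [0.0] * n  # every agent starts from zero
    for k in range(1, iterations + 1):
        # One local gradient step per agent (simplification: a single step).
        y = [x[i] - alpha * a[i] * (x[i] - b[i]) for i in range(n)]
        # k consensus rounds: the number of rounds increases with the iteration count.
        for _ in range(k):
            y = [sum(W[i][j] * y[j] for j in range(n)) for i in range(n)]
        x = y
    return x


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [1.0, -1.0, 2.0]
    # Fully connected 3-agent network with a symmetric, doubly stochastic W.
    W = [[0.5, 0.25, 0.25],
         [0.25, 0.5, 0.25],
         [0.25, 0.25, 0.5]]
    x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)  # minimizer of the sum
    print("agents:", near_dgd(a, b, W), "optimum:", x_star)
```

In this toy setup the minimizer of the summed objective is $\sum_i a_i b_i / \sum_i a_i$, and because the consensus error shrinks as the number of rounds grows, all agents' iterates approach that common value rather than a neighborhood of it.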
Submitted 7 July, 2021; v1 submitted 31 May, 2020;
originally announced June 2020.
-
A Two-Stage Decomposition Approach for AC Optimal Power Flow
Authors:
Shenyinying Tu,
Andreas Waechter,
Ermin Wei
Abstract:
The alternating current optimal power flow (AC-OPF) problem is critical to power system operations and planning, but it is generally hard to solve due to its nonconvex and large-scale nature. This paper proposes a scalable decomposition approach in which the power network is decomposed into a master network and a number of subnetworks, where each network has its own AC-OPF subproblem. This formulates a two-stage optimization problem and requires only a small amount of communication between the master and subnetworks. The key contribution is a smoothing technique that renders the response of a subnetwork differentiable with respect to the input from the master problem, utilizing properties of the barrier problem formulation that naturally arises when subproblems are solved by a primal-dual interior-point algorithm. Consequently, existing efficient nonlinear programming solvers can be used for both the master problem and the subproblems. The advantage of this framework is that speedup can be obtained by processing the subnetworks in parallel, and it has convergence guarantees under reasonable assumptions. The formulation is readily extended to instances with stochastic subnetwork loads. Numerical results show favorable performance and illustrate the scalability of the algorithm which is able to solve instances with more than 11 million buses.
Submitted 10 June, 2020; v1 submitted 18 February, 2020;
originally announced February 2020.
-
FlexPD: A Flexible Framework Of First-Order Primal-Dual Algorithms for Distributed Optimization
Authors:
Fatemeh Mansoori,
Ermin Wei
Abstract:
In this paper, we study the problem of minimizing a sum of convex objective functions, which are locally available to agents in a network. Distributed optimization algorithms make it possible for the agents to cooperatively solve the problem through local computations and communications with neighbors. Lagrangian-based distributed optimization algorithms have received significant attention in recent years, due to their exact convergence property. However, many of these algorithms have slow convergence or are expensive to execute. In this paper, we develop a flexible framework of first-order primal-dual algorithms (FlexPD), which allows for multiple primal steps per iteration. This framework includes three algorithms, FlexPD-F, FlexPD-G, and FlexPD-C that can be used for various applications with different computation and communication limitations. For strongly convex and Lipschitz gradient objective functions, we establish linear convergence of our proposed framework to the optimal solution. Simulation results confirm the superior performance of our framework compared to the existing methods.
Submitted 30 March, 2020; v1 submitted 12 December, 2019;
originally announced December 2019.
-
Monetizing Mobile Data via Data Rewards
Authors:
Haoran Yu,
Ermin Wei,
Randall A. Berry
Abstract:
Most mobile network operators generate revenues by directly charging users for data plan subscriptions. Some operators now also offer users data rewards to incentivize them to watch mobile ads, which enables the operators to collect payments from advertisers and create new revenue streams. In this work, we analyze and compare two data rewarding schemes: a Subscription-Aware Rewarding (SAR) scheme and a Subscription-Unaware Rewarding (SUR) scheme. Under the SAR scheme, only the subscribers of the operators' data plans are eligible for the rewards; under the SUR scheme, all users are eligible for the rewards (e.g., the users who do not subscribe to the data plans can still get SIM cards and receive data rewards by watching ads). We model the interactions among an operator, users, and advertisers by a two-stage Stackelberg game, and characterize their equilibrium strategies under both the SAR and SUR schemes. We show that the SAR scheme can lead to more subscriptions and a higher operator revenue from the data market, while the SUR scheme can lead to better ad viewership and a higher operator revenue from the ad market. We further show that the operator's optimal choice between the two schemes is sensitive to the users' data consumption utility function and the operator's network capacity. We provide some counter-intuitive insights. For example, when each user has a logarithmic utility function, the operator should apply the SUR scheme (i.e., reward both subscribers and non-subscribers) if and only if it has a small network capacity.
Submitted 27 November, 2019;
originally announced November 2019.
-
Nested Distributed Gradient Methods with Stochastic Computation Errors
Authors:
Charikleia Iakovidou,
Ermin Wei
Abstract:
In this work, we consider the problem of a network of agents collectively minimizing a sum of convex functions. The agents in our setting can only access their local objective functions and exchange information with their immediate neighbors. Motivated by applications where computation is imperfect, including, but not limited to, empirical risk minimization (ERM) and online learning, we assume that only noisy estimates of the local gradients are available. To tackle this problem, we adapt a class of Nested Distributed Gradient methods (NEAR-DGD) to the stochastic gradient setting. These methods have minimal storage requirements, are communication aware and perform well in settings where gradient computation is costly, while communication is relatively inexpensive. We investigate the convergence properties of our method under standard assumptions for stochastic gradients, i.e. unbiasedness and bounded variance. Our analysis indicates that our method converges to a neighborhood of the optimal solution with a linear rate for local strongly convex functions and appropriate constant steplengths. We also show that distributed optimization with stochastic gradients achieves a noise reduction effect similar to mini-batching, which scales favorably with network size. Finally, we present numerical results to demonstrate the effectiveness of our method.
Submitted 29 September, 2019;
originally announced September 2019.
-
Investment in EV charging spots for parking
Authors:
Brendan Badia,
Randall Berry,
Ermin Wei
Abstract:
As demand for electric vehicles (EVs) is expanding, meeting the need for charging infrastructure, especially in urban areas, has become a critical issue. One method of adding charging stations is to install them at parking spots. This increases the value of these spots to EV drivers needing to charge their vehicles. However, there is a cost to constructing these spots and such spots may preclude drivers not needing to charge from using them, reducing the parking options for such drivers. We look at two models for how decisions surrounding investment in charging stations on existing parking spots may be undertaken. First, we analyze two firms who compete over installing stations under government set mandates or subsidies. Given the cost of constructing spots and the competitiveness of the markets, we find it is ambiguous whether setting higher mandates or higher subsidies for spot construction leads to better aggregate outcomes. Second, we look at a system operator who faces uncertainty on the size of the EV market. If they are risk neutral, we find a relatively small change in the uncertainty of the EV market can lead to large changes in the optimal charging capacity.
Submitted 22 April, 2019;
originally announced April 2019.
-
A General Framework of Exact Primal-Dual First Order Algorithms for Distributed Optimization
Authors:
Fatemeh Mansoori,
Ermin Wei
Abstract:
We study the problem of minimizing a sum of local convex objective functions over a network of processors/agents. This problem naturally calls for distributed optimization algorithms, in which the agents cooperatively solve the problem through local computations and communications with neighbors. While many of the existing distributed algorithms with constant stepsize can only converge to a neighborhood of the optimal solution, some recent methods based on the augmented Lagrangian and the method of multipliers can achieve exact convergence with a fixed stepsize. However, these methods either suffer from slow convergence speed or require minimization at each iteration. In this work, we develop a class of distributed first order primal-dual methods, which allows for multiple primal steps per iteration. This general framework makes it possible to control the trade-off between the performance and the execution complexity in primal-dual algorithms. We show that for strongly convex and Lipschitz gradient objective functions, this class of algorithms converges linearly to the optimal solution under appropriate constant stepsize choices. Simulation results confirm the superior performance of our algorithm compared to existing methods.
Submitted 29 March, 2019;
originally announced March 2019.
-
Nested Distributed Gradient Methods with Adaptive Quantized Communication
Authors:
Albert S. Berahas,
Charikleia Iakovidou,
Ermin Wei
Abstract:
In this paper, we consider minimizing a sum of local convex objective functions in a distributed setting, where communication can be costly. We propose and analyze a class of nested distributed gradient methods with adaptive quantized communication (NEAR-DGD+Q). We show the effect of performing multiple quantized communication steps on the rate of convergence and on the size of the neighborhood of convergence, and prove R-Linear convergence to the exact solution with increasing number of consensus steps and adaptive quantization. We test the performance of the method, as well as some practical variants, on quadratic functions, and show the effects of multiple quantized communication steps in terms of iterations/gradient evaluations, communication and cost.
Submitted 26 August, 2019; v1 submitted 18 March, 2019;
originally announced March 2019.
-
Analyzing Location-Based Advertising for Vehicle Service Providers Using Effective Resistances
Authors:
Haoran Yu,
Ermin Wei,
Randall A. Berry
Abstract:
Vehicle service providers can display commercial ads in their vehicles based on passengers' origins and destinations to create a new revenue stream. In this work, we study a vehicle service provider who can generate different ad revenues when displaying ads on different arcs (i.e., origin-destination pairs). The provider needs to ensure the vehicle flow balance at each location, which makes it challenging to analyze the provider's vehicle assignment and pricing decisions for different arcs. For example, the provider's price for its service on an arc depends on the ad revenues on other arcs as well as on the arc in question. To tackle the problem, we show that the traffic network corresponds to an electrical network. When the effective resistance between two locations is small, there are many paths between the two locations and the provider can easily route vehicles between them. We characterize the dependence of an arc's optimal price on any other arc's ad revenue using the effective resistances between these two arcs' origins and destinations. Furthermore, we study the provider's optimal selection of advertisers when it can only display ads for a limited number of advertisers. If each advertiser has one target arc for advertising, the provider should display ads for the advertiser whose target arc has a small effective resistance. We investigate the performance of our advertiser selection strategy based on a real-world dataset.
Submitted 6 February, 2019;
originally announced February 2019.
-
A Fast Distributed Asynchronous Newton-Based Optimization Algorithm
Authors:
Fatemeh Mansoori,
Ermin Wei
Abstract:
One of the most important problems in the field of distributed optimization is the problem of minimizing a sum of local convex objective functions over a networked system. Most of the existing work in this area focuses on developing distributed algorithms in a synchronous setting under the presence of a central clock, where the agents need to wait for the slowest one to finish the update before proceeding to the next iterate. Asynchronous distributed algorithms remove the need for a central coordinator, reduce the synchronization wait, and allow some agents to compute faster and execute more iterations. In the asynchronous setting, the only known algorithms for solving this problem could achieve either linear or sublinear rate of convergence. In this work, we build upon the existing literature to develop and analyze an asynchronous Newton-based method to solve a penalized version of the problem. We show that this algorithm guarantees almost sure convergence with global linear and local quadratic rate in expectation. Numerical studies confirm superior performance of our algorithm against other asynchronous methods.
Submitted 4 January, 2019;
originally announced January 2019.
-
Hierarchical Approaches for Reinforcement Learning in Parameterized Action Space
Authors:
Ermo Wei,
Drew Wicke,
Sean Luke
Abstract:
We explore Deep Reinforcement Learning in a parameterized action space. Specifically, we investigate how to achieve sample-efficient end-to-end training in these tasks. We propose a new compact architecture for the tasks where the parameter policy is conditioned on the output of the discrete action policy. We also propose two new methods based on the state-of-the-art algorithms Trust Region Policy Optimization (TRPO) and Stochastic Value Gradient (SVG) to train such an architecture. We demonstrate that these methods outperform the state of the art method, Parameterized Action DDPG, on test domains.
Submitted 23 October, 2018;
originally announced October 2018.
-
A General Sensitivity Analysis Approach for Demand Response Optimizations
Authors:
Ding Xiang,
Ermin Wei
Abstract:
It is well-known that demand response can improve the system efficiency as well as lower consumers' (prosumers') electricity bills. However, it is not clear how we can either qualitatively identify the prosumer with the most impact potential or quantitatively estimate each prosumer's contribution to the total social welfare improvement when additional resource capacity/flexibility is introduced to the system with demand response, such as allowing net-selling behavior. In this work, we build upon existing literature on the electricity market, which consists of price-taking prosumers each with various appliances, an electric utility company and a social welfare optimizing distribution system operator, to design a general sensitivity analysis approach (GSAA) that can estimate the potential of each consumer's contribution to the social welfare when given more resource capacity. GSAA is based on the existence of an efficient competitive equilibrium, which we establish in the paper. When prosumers' utility functions are quadratic, GSAA can give a closed-form characterization of the social welfare improvement based on duality analysis. Furthermore, we extend GSAA to a general convex setting, i.e., utility functions with strong convexity and Lipschitz continuous gradient. Even without knowing the specific forms of the utility functions, we can derive upper and lower bounds on the social welfare improvement potential of each prosumer when extra resource is introduced. For both settings, several applications and numerical examples are provided, including extending the AC comfort zone, the ability of EVs to discharge, and net selling. The estimation results show that GSAA can be used to decide how to allocate potentially limited market resources in the most impactful way.
Submitted 7 October, 2018;
originally announced October 2018.
-
Large-Scale Spectrum Allocation for Cellular Networks via Sparse Optimization
Authors:
Binnan Zhuang,
Dongning Guo,
Ermin Wei,
Michael L. Honig
Abstract:
This paper studies joint spectrum allocation and user association in large heterogeneous cellular networks. The objective is to maximize some network utility function based on given traffic statistics collected over a slow timescale, conceived to be seconds to minutes. A key challenge is scalability: interference across cells creates dependencies across the entire network, making the optimization problem computationally challenging as the size of the network becomes large. A suboptimal solution is presented, which performs well in networks consisting of one hundred access points (APs) serving several hundred user devices. This is achieved by optimizing over local overlapping neighborhoods, defined by interference conditions, and by exploiting the sparsity of a globally optimal solution. Specifically, with a total of $k$ user devices in the entire network, it suffices to divide the spectrum into $k$ segments, where each segment is mapped to a particular set, or pattern, of active APs within each local neighborhood. The problem is then to find a mapping of segments to patterns, and to optimize the widths of the segments. A convex relaxation is proposed for this, which relies on a re-weighted $\ell_1$ approximation of an $\ell_0$ constraint, and is used to enforce the mapping of a unique pattern to each spectrum segment. A distributed implementation based on alternating direction method of multipliers (ADMM) is also proposed. Numerical comparisons with benchmark schemes show that the proposed method achieves a substantial increase in achievable throughput and/or reduction in the average packet delay.
Submitted 9 September, 2018;
originally announced September 2018.
-
Multiagent Soft Q-Learning
Authors:
Ermo Wei,
Drew Wicke,
David Freelan,
Sean Luke
Abstract:
Policy gradient methods are often applied to reinforcement learning in continuous multiagent games. These methods perform local search in the joint-action space, and as we show, they are susceptible to a game-theoretic pathology known as relative overgeneralization. To resolve this issue, we propose Multiagent Soft Q-learning, which can be seen as the analogue of applying Q-learning to continuous controls. We compare our method to MADDPG, a state-of-the-art approach, and show that our method achieves better coordination in multiagent cooperative tasks, converging to better local optima in the joint action space.
Submitted 25 April, 2018;
originally announced April 2018.
-
Balancing Communication and Computation in Distributed Optimization
Authors:
Albert S. Berahas,
Raghu Bollapragada,
Nitish Shirish Keskar,
Ermin Wei
Abstract:
Methods for distributed optimization have received significant attention in recent years owing to their wide applicability in various domains. A distributed optimization method typically consists of two key components: communication and computation. More specifically, at every iteration (or every several iterations) of a distributed algorithm, each node in the network requires some form of information exchange with its neighboring nodes (communication) and the computation step related to a (sub)-gradient (computation). The standard way of judging an algorithm via only the number of iterations overlooks the complexity associated with each iteration. Moreover, various applications deploying distributed methods may prefer a different composition of communication and computation.
Motivated by this discrepancy, in this work we propose an adaptive cost framework which adjusts the cost measure depending on the features of various applications. We present a flexible algorithmic framework, where communication and computation steps are explicitly decomposed to enable algorithm customization for various applications. We apply this framework to the well-known distributed gradient descent (DGD) method, and show that the resulting customized algorithms, which we call DGD$^t$, NEAR-DGD$^t$ and NEAR-DGD$^+$, compare favorably to their base algorithms, both theoretically and empirically. The proposed NEAR-DGD$^+$ algorithm is an exact first-order method where the communication and computation steps are nested, and when the number of communication steps is adaptively increased, the method converges to the optimal solution. We test the performance and illustrate the flexibility of the methods, as well as practical variants, on quadratic functions and classification problems that arise in machine learning, in terms of iterations, gradient evaluations, communications and the proposed cost framework.
Submitted 31 May, 2018; v1 submitted 9 September, 2017;
originally announced September 2017.
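The nesting of computation and communication steps described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the scalar quadratic objectives, the 4-node ring with equal weights of 1/3, the step size, and the names `mix`, `nested_dgd` and `exact_minimizer` are all assumptions chosen to keep the example self-contained. Each iteration takes one local gradient step and then runs a configurable number of neighbor-averaging rounds.

```python
# Illustrative sketch only: scalar quadratics on a 4-node ring.
# All names and numeric values are assumptions, not taken from the paper.

def mix(values, rounds):
    """Run `rounds` averaging steps on a ring (weights 1/3 for self, left, right)."""
    n = len(values)
    for _ in range(rounds):
        values = [
            (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3.0
            for i in range(n)
        ]
    return values


def nested_dgd(a, b, step, iterations, consensus_rounds):
    """Minimize sum_i 0.5 * a[i] * (x - b[i])**2 with nested steps.

    consensus_rounds(k) gives the number of communication rounds that
    follow the k-th local gradient step (k starts at 1).
    """
    x = [0.0] * len(a)
    for k in range(1, iterations + 1):
        # Computation: each agent steps along its own local gradient.
        y = [x[i] - step * a[i] * (x[i] - b[i]) for i in range(len(a))]
        # Communication: agents average with their ring neighbors.
        x = mix(y, consensus_rounds(k))
    return x


def exact_minimizer(a, b):
    """Closed-form minimizer of the summed quadratics."""
    return sum(ai * bi for ai, bi in zip(a, b)) / sum(a)


a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 2.0, 3.0, 4.0]
fixed = nested_dgd(a, b, 0.2, 60, lambda k: 1)    # one round per iteration
growing = nested_dgd(a, b, 0.2, 60, lambda k: k)  # k rounds at iteration k
print(exact_minimizer(a, b), fixed, growing)
```

In this toy setting, a single communication round per iteration leaves the agents near but not at the common minimizer, while a growing number of rounds drives every agent toward it, which matches the qualitative behavior the abstract reports for an adaptively increased number of communication steps.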
-
Superlinearly Convergent Asynchronous Distributed Network Newton Method
Authors:
Fatemeh Mansoori,
Ermin Wei
Abstract:
The problem of minimizing a sum of local convex objective functions over a networked system captures many important applications and has received much attention in the distributed optimization field. Most existing work focuses on the development of fast distributed algorithms under the presence of a central clock. The only known algorithms with convergence guarantees for this problem in an asynchronous setup could achieve either a sublinear rate under a totally asynchronous setting or a linear rate under a partially asynchronous setting (with bounded delay). In this work, we build upon existing literature to develop and analyze an asynchronous Newton based approach for solving a penalized version of the problem. We show that this algorithm converges almost surely with global linear rate and local superlinear rate in expectation. Numerical studies confirm superior performance against other existing asynchronous methods.
Submitted 8 January, 2019; v1 submitted 10 May, 2017;
originally announced May 2017.
-
On Perfect Matchings in Matching Covered Graphs
Authors:
Jinghua He,
Erling Wei,
Dong Ye,
Shaohui Zhai
Abstract:
Let $G$ be a matching-covered graph, i.e., every edge is contained in a perfect matching. An edge subset $X$ of $G$ is feasible if there exist two perfect matchings $M_1$ and $M_2$ such that $|M_1\cap X|\not\equiv |M_2\cap X| \pmod 2$. Lukot'ka and Rollová proved that an edge subset $X$ of a regular bipartite graph is not feasible if and only if $X$ is switching-equivalent to $\emptyset$, and they further asked whether a non-feasible set of a regular graph of class 1 is always switching-equivalent to either $\emptyset$ or $E(G)$. Two edges of $G$ are equivalent to each other if every perfect matching $M$ of $G$ either contains both of them or contains none of them. An equivalent class of $G$ is an edge subset $K$ with at least two edges such that the edges of $K$ are mutually equivalent. An equivalent class is not a feasible set. Lovász proved that an equivalent class of a brick has size 2. In this paper, we show that, for every integer $k\ge 3$, there exist infinitely many $k$-regular graphs of class 1 with an arbitrarily large equivalent class $K$ such that $K$ is not switching-equivalent to either $\emptyset$ or $E(G)$, which provides a negative answer to the problem proposed by Lukot'ka and Rollová. Further, we characterize bipartite graphs with an equivalent class, and characterize matching-covered bipartite graphs of which every edge is removable.
Submitted 17 March, 2017; v1 submitted 15 March, 2017;
originally announced March 2017.
-
Scalable Spectrum Allocation for Large Networks Based on Sparse Optimization
Authors:
Binnan Zhuang,
Dongning Guo,
Ermin Wei,
Michael L. Honig
Abstract:
Joint allocation of spectrum and user association is considered for a large cellular network. The objective is to optimize a network utility function such as average delay given traffic statistics collected over a slow timescale. A key challenge is scalability: given $n$ Access Points (APs), there are $O(2^n)$ ways in which the APs can share the spectrum. The number of variables is reduced from $O(2^n)$ to $O(nk)$, where $k$ is the number of users, by optimizing over local overlapping neighborhoods, defined by interference conditions, and by exploiting the existence of sparse solutions in which the spectrum is divided into $k+1$ segments. We reformulate the problem by optimizing the assignment of subsets of active APs to those segments. An $\ell_0$ constraint enforces a one-to-one mapping of subsets to spectrum, and an iterative (reweighted $\ell_1$) algorithm is used to find an approximate solution. Numerical results for a network with 100 APs serving several hundred users show the proposed method achieves a substantial increase in total throughput relative to benchmark schemes.
Submitted 18 February, 2017;
originally announced February 2017.
-
Scalable Spectrum Allocation and User Association in Networks with Many Small Cells
Authors:
Binnan Zhuang,
Dongning Guo,
Ermin Wei,
Michael L. Honig
Abstract:
A scalable framework is developed to allocate radio resources across a large number of densely deployed small cells with given traffic statistics on a slow timescale. Joint user association and spectrum allocation is first formulated as a convex optimization problem by dividing the spectrum among all possible transmission patterns of active access points (APs). To improve scalability with the number of APs, the problem is reformulated using local patterns of interfering APs. To maintain global consistency among local patterns, inter-cluster interaction is characterized as hyper-edges in a hyper-graph with nodes corresponding to neighborhoods of APs. A scalable solution is obtained by iteratively solving a convex optimization problem for bandwidth allocation with reduced complexity and constructing a global spectrum allocation using hyper-graph coloring. Numerical results demonstrate the proposed solution for a network with 100 APs and several hundred user equipments. For a given quality of service (QoS), the proposed scheme can increase the network capacity several fold compared to assigning each user to the strongest AP with full-spectrum reuse.
Submitted 12 January, 2017;
originally announced January 2017.
-
On the O(1/k) Convergence of Asynchronous Distributed Alternating Direction Method of Multipliers
Authors:
Ermin Wei,
Asuman Ozdaglar
Abstract:
We consider a network of agents that are cooperatively solving a global optimization problem, where the objective function is the sum of privately known local objective functions of the agents and the decision variables are coupled via linear constraints. Recent literature focused on special cases of this formulation and studied their distributed solution through either subgradient based methods with O(1/sqrt(k)) rate of convergence (where k is the iteration number) or Alternating Direction Method of Multipliers (ADMM) based methods, which require a synchronous implementation and a globally known order on the agents. In this paper, we present a novel asynchronous ADMM based distributed method for the general formulation and show that it converges at the rate O(1/k).
Submitted 31 July, 2013;
originally announced July 2013.
-
A Distributed Newton Method for Network Utility Maximization
Authors:
Ermin Wei,
Asuman Ozdaglar,
Ali Jadbabaie
Abstract:
Most existing work uses dual decomposition and subgradient methods to solve Network Utility Maximization (NUM) problems in a distributed manner, which suffer from a slow rate of convergence. This work develops an alternative distributed Newton-type fast converging algorithm for solving network utility maximization problems with self-concordant utility functions. By using novel matrix splitting techniques, both primal and dual updates for the Newton step can be computed using iterative schemes in a decentralized manner with limited information exchange. Similarly, the stepsize can be obtained via an iterative consensus-based averaging scheme. We show that even when the Newton direction and the stepsize in our method are computed within some error (due to finite truncation of the iterative schemes), the resulting objective function value still converges superlinearly to an explicitly characterized error neighborhood. Simulation results demonstrate significant convergence rate improvement of our algorithm relative to the existing subgradient methods based on dual decomposition.
Submitted 22 April, 2011; v1 submitted 14 May, 2010;
originally announced May 2010.
-
Localization of electric field distribution in graded core-shell metamaterials
Authors:
En-Bo Wei,
K. W. Yu
Abstract:
The local electric field distribution has been investigated in a core-shell cylindrical metamaterial structure under the illumination of a uniform incident optical field. The structure consists of a homogeneous dielectric core and a shell of graded metal-dielectric metamaterial, embedded in a uniform matrix. In the quasi-static limit, the permittivity of the metamaterial is given by the graded Drude model. The local electric potentials and hence the electric fields have been derived exactly and analytically in terms of hyper-geometric functions. Our results show that the peak of the electric field inside the cylindrical shell can be confined in a desired position by varying the frequency of the optical field and the parameters of the graded profiles. Thus, by fabricating graded metamaterials, it is possible to control the electric field distribution spatially. We offer an intuitive explanation for the gradation-controlled electric field distribution.
Submitted 7 November, 2009;
originally announced November 2009.