-
Policy Gradient Methods for the Cost-Constrained LQR: Strong Duality and Global Convergence
Authors:
Feiran Zhao,
Keyou You
Abstract:
In safety-critical applications, reinforcement learning (RL) needs to consider safety constraints. However, a theoretical understanding of constrained RL for continuous control is largely absent. As a case study, this paper presents a cost-constrained LQR formulation, where a number of LQR costs with user-defined penalty matrices are subject to constraints. To solve it, we propose a policy gradient (PG) primal-dual method to find an optimal state feedback gain. Despite the non-convexity of the cost-constrained LQR problem, we provide a constructive proof for strong duality and a geometric interpretation of an optimal multiplier set. By proving that the concave dual function is Lipschitz smooth, we further provide convergence guarantees for the PG primal-dual method. Finally, we perform simulations to validate our theoretical findings.
Submitted 6 June, 2024;
originally announced June 2024.
-
Pointsoup: High-Performance and Extremely Low-Decoding-Latency Learned Geometry Codec for Large-Scale Point Cloud Scenes
Authors:
Kang You,
Kai Liu,
Li Yu,
Pan Gao,
Dandan Ding
Abstract:
Despite considerable progress in point cloud geometry compression, effectively compressing large-scale scenes with sparse surfaces remains a challenge. Another key challenge lies in reducing decoding latency, a crucial requirement in real-world applications. In this paper, we propose Pointsoup, an efficient learning-based geometry codec that simultaneously attains high performance and extremely low decoding latency. Inspired by the conventional Trisoup codec, a point model-based strategy is devised to characterize local surfaces. Specifically, skin features are embedded from local windows via an attention-based encoder, and dilated windows are introduced as cross-scale priors to infer the distribution of quantized features in parallel. During decoding, features undergo fast refinement, followed by a folding-based point generator that reconstructs point coordinates at fairly high speed. Experiments show that Pointsoup achieves state-of-the-art performance on multiple benchmarks with significantly lower decoding complexity, i.e., up to 90$\sim$160$\times$ faster than the G-PCCv23 Trisoup decoder on a comparatively low-end platform (e.g., one RTX 2080Ti). Furthermore, it offers variable-rate control with a single neural model (2.9MB), which is attractive for industrial practitioners.
Submitted 21 April, 2024;
originally announced April 2024.
-
Harnessing Data for Accelerating Model Predictive Control by Constraint Removal
Authors:
Zhinan Hou,
Feiran Zhao,
Keyou You
Abstract:
Model predictive control (MPC) solves a receding-horizon optimization problem in real-time, which can be computationally demanding when there are thousands of constraints. To accelerate online computation of MPC, we utilize data to adaptively remove the constraints while maintaining the MPC policy unchanged. Specifically, we design the removal rule based on the Lipschitz continuity of the MPC policy. This removal rule can use the information of historical data according to the Lipschitz constant and the distance between the current state and historical states. In particular, we provide the explicit expression for calculating the Lipschitz constant by the model parameters. Finally, simulations are performed to validate the effectiveness of the proposed method.
Submitted 27 March, 2024;
originally announced March 2024.
-
Data-Enabled Policy Optimization for Direct Adaptive Learning of the LQR
Authors:
Feiran Zhao,
Florian Dörfler,
Alessandro Chiuso,
Keyou You
Abstract:
Direct data-driven design methods for the linear quadratic regulator (LQR) mainly use offline or episodic data batches, and their online adaptation has been acknowledged as an open problem. In this paper, we propose a direct adaptive method to learn the LQR from online closed-loop data. First, we propose a new policy parameterization based on the sample covariance to formulate a direct data-driven LQR problem, which is shown to be equivalent to the certainty-equivalence LQR with optimal non-asymptotic guarantees. Second, we design a novel data-enabled policy optimization (DeePO) method to directly update the policy, where the gradient is explicitly computed using only a batch of persistently exciting (PE) data. Third, we establish its global convergence via a projected gradient dominance property. Importantly, we efficiently use DeePO to adaptively learn the LQR by performing only one-step projected gradient descent per sample of the closed-loop system, which also leads to an explicit recursive update of the policy. Under PE inputs and for bounded noise, we show that the average regret of the LQR cost is upper-bounded by two terms signifying a sublinear decrease in time $\mathcal{O}(1/\sqrt{T})$ plus a bias scaling inversely with signal-to-noise ratio (SNR), which are independent of the noise statistics. Finally, we perform simulations to validate the theoretical results and demonstrate the computational and sample efficiency of our method.
Submitted 19 April, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Data-enabled Policy Optimization for the Linear Quadratic Regulator
Authors:
Feiran Zhao,
Florian Dörfler,
Keyou You
Abstract:
Policy optimization (PO), an essential approach of reinforcement learning for a broad range of system classes, requires significantly more system data than indirect (identification-followed-by-control) methods or behavioral-based direct methods even in the simplest linear quadratic regulator (LQR) problem. In this paper, we take an initial step towards bridging this gap by proposing the data-enabled policy optimization (DeePO) method, which requires only a finite number of sufficiently exciting data to iteratively solve the LQR problem via PO. Based on a data-driven closed-loop parameterization, we are able to directly compute the policy gradient from a batch of persistently exciting data. Next, we show that the nonconvex PO problem satisfies a projected gradient dominance property by relating it to an equivalent convex program, leading to the global convergence of DeePO. Moreover, we apply regularization methods to enhance certainty-equivalence and robustness of the resulting controller and show an implicit regularization property. Finally, we perform simulations to validate our results.
Submitted 15 September, 2023; v1 submitted 31 March, 2023;
originally announced March 2023.
-
Standoff Tracking Using DNN-Based MPC with Implementation on FPGA
Authors:
Fei Dong,
Xingchen Li,
Keyou You,
Shiji Song
Abstract:
This work studies the standoff tracking problem to drive an unmanned aerial vehicle (UAV) to slide on a desired circle over a moving target at a constant height. We propose a novel Lyapunov guidance vector (LGV) field with tunable convergence rates for the UAV's trajectory planning and a deep neural network (DNN)-based model predictive control (MPC) scheme to track the reference trajectory. Then, we show how to collect samples for training the DNN offline and design an integral module (IM) to refine the tracking performance of our DNN-based MPC. Moreover, a hardware-in-the-loop (HIL) simulation with an FPGA@200MHz demonstrates that our method is a valid alternative to embedded implementations of MPC for complex systems and applications in which directly solving the MPC optimization problem is impossible.
Submitted 21 December, 2022;
originally announced December 2022.
-
Globally Convergent Policy Gradient Methods for Linear Quadratic Control of Partially Observed Systems
Authors:
Feiran Zhao,
Xingyun Fu,
Keyou You
Abstract:
While the optimization landscape of policy gradient methods has been recently investigated for partially observed linear systems in terms of both static output feedback and dynamical controllers, existing results only provide convergence guarantees to stationary points. In this paper, we propose a new policy parameterization for partially observed linear systems, using a past input-output trajectory of finite length as feedback. We show that the solution set to the parameterized optimization problem is a matrix space, which is invariant to similarity transformation. By proving a gradient dominance property, we show the global convergence of policy gradient methods. Moreover, we observe that the gradient is orthogonal to the solution set, revealing an explicit relation between the resulting solution and the initial policy. Finally, we perform simulations to validate our theoretical results.
Submitted 22 April, 2023; v1 submitted 8 November, 2022;
originally announced November 2022.
-
Optimal $(0,1)$-Matrix Completion with Majorization Ordered Objectives (To the memory of Pravin Varaiya)
Authors:
Yanfang Mo,
Wei Chen,
Keyou You,
Li Qiu
Abstract:
We propose and examine two optimal $(0,1)$-matrix completion problems with majorization ordered objectives. They elevate the seminal study by Gale and Ryser from feasibility to optimality in partial order programming (POP), referring to optimization with partially ordered objectives. We showcase their applications in electric vehicle charging, portfolio optimization, and secure data storage. Solving such integer POP (iPOP) problems is challenging because of the possible non-comparability among objective values and the integer requirements. Nevertheless, we prove the essential uniqueness of all optimal objective values and identify two particular ones for each of the two inherently symmetric iPOP problems. Furthermore, for every optimal objective value, we decompose the construction of an associated optimal $(0,1)$-matrix into a series of sorting processes, respectively agreeing with the rule of thumb "peak shaving" or "valley filling." We show that the resulting algorithms have linear time complexities and verify their empirical efficiency via numerical simulations compared to the standard order-preserving method for POP.
Submitted 9 September, 2022;
originally announced September 2022.
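The feasibility question that the abstract above attributes to Gale and Ryser can be sketched in a few lines. The snippet below is only an illustration of the classical Gale-Ryser condition (a $(0,1)$-matrix with given row and column sums exists if and only if the totals agree and the sorted column sums are majorized by the conjugate of the row sums). It does not reproduce the optimal-completion algorithms of the paper, and the function names `conjugate` and `gale_ryser_feasible` are hypothetical.

```python
def conjugate(row_sums, n_cols):
    """Conjugate sequence: entry j counts the rows whose sum exceeds j."""
    return [sum(1 for r in row_sums if r > j) for j in range(n_cols)]


def gale_ryser_feasible(row_sums, col_sums):
    """Return True iff a (0,1)-matrix with these row and column sums exists.

    Illustrative helper (hypothetical name), based on the classical
    Gale-Ryser majorization condition.
    """
    if any(r < 0 for r in row_sums) or any(c < 0 for c in col_sums):
        return False
    if sum(row_sums) != sum(col_sums):
        return False
    cols = sorted(col_sums, reverse=True)
    conj = conjugate(row_sums, len(cols))
    partial_cols = 0
    partial_conj = 0
    # Every prefix sum of the sorted column sums must be dominated by the
    # corresponding prefix sum of the conjugate row-sum sequence.
    for c, r_star in zip(cols, conj):
        partial_cols += c
        partial_conj += r_star
        if partial_cols > partial_conj:
            return False
    return True
```

For instance, row sums `[2, 1]` with column sums `[1, 1, 1]` are feasible, whereas row sums `[3, 0]` with column sums `[2, 1]` are not, because a row cannot contain three ones in a two-column matrix.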
-
Minimum Input Design for Direct Data-driven Property Identification of Unknown Linear Systems
Authors:
Shubo Kang,
Keyou You
Abstract:
In a direct data-driven approach, this paper studies the property identification (ID) problem to analyze whether an unknown linear system has a property of interest, e.g., stabilizability and structural properties. In sharp contrast to the model-based analysis, we approach it by directly using the input and state feedback data of the unknown system. Via a new concept of sufficient richness of input sectional data, we first establish the necessary and sufficient condition for the minimum input design to excite the system for property ID. Specifically, the input sectional data is sufficiently rich for property ID if and only if it spans a linear subspace that contains a property-dependent minimum linear subspace, any basis of which can also be easily used to form the minimum excitation input. Interestingly, we show that many structural properties can be identified with the minimum input, which is nevertheless unable to identify the explicit system model. Overall, our results rigorously quantify the advantages of the direct data-driven analysis over the model-based analysis for linear systems in terms of data efficiency.
Submitted 29 August, 2022;
originally announced August 2022.
-
IPDAE: Improved Patch-Based Deep Autoencoder for Lossy Point Cloud Geometry Compression
Authors:
Kang You,
Pan Gao,
Qing Li
Abstract:
The point cloud is a crucial representation of 3D content and has been widely used in many areas such as virtual reality, mixed reality, and autonomous driving. As the number of points in the data grows, efficiently compressing point clouds becomes a challenging problem. In this paper, we propose a set of significant improvements to patch-based point cloud compression, i.e., a learnable context model for entropy coding, octree coding for sampling centroid points, and an integrated compression and training process. In addition, we propose an adversarial network to improve the uniformity of points during reconstruction. Our experiments show that the improved patch-based autoencoder outperforms the state-of-the-art in terms of rate-distortion performance, on both sparse and large-scale point clouds. More importantly, our method can maintain a short compression time while ensuring the reconstruction quality.
Submitted 4 August, 2022;
originally announced August 2022.
-
Convergence and Sample Complexity of Policy Gradient Methods for Stabilizing Linear Systems
Authors:
Feiran Zhao,
Xingyun Fu,
Keyou You
Abstract:
System stabilization via policy gradient (PG) methods has drawn increasing attention in both control and machine learning communities. In this paper, we study their convergence and sample complexity for stabilizing linear time-invariant systems in terms of the number of system rollouts. Our analysis is built upon a discounted linear quadratic regulator (LQR) method which alternately updates the policy and the discount factor of the LQR problem. First, we propose an explicit rule to adaptively adjust the discount factor by exploring the stability margin of a linear control policy. Then, we establish the sample complexity of PG methods for stabilization, which only adds a coefficient logarithmic in the spectral radius of the state matrix to that for solving the LQR problem with a prior stabilizing policy. Finally, we perform simulations to validate our theoretical findings and demonstrate the effectiveness of our method on a class of nonlinear systems.
Submitted 14 September, 2023; v1 submitted 28 May, 2022;
originally announced May 2022.
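The baseline that the abstract above compares against, policy gradient for the LQR started from a stabilizing gain, can be illustrated on a scalar system. The sketch below is a simplification and not the discounted method of the paper: it assumes a scalar model $x_{t+1} = a x_t + b u_t$ with $u_t = -k x_t$, a unit initial state, an exact (model-based) gradient instead of sampled rollouts, and hypothetical function names and parameter values.

```python
import math


def lqr_cost(k, a, b, q, r):
    """Infinite-horizon cost sum_t (q + r k^2) x_t^2 with x_0 = 1 (assumed)."""
    c = a - b * k  # closed-loop pole
    if abs(c) >= 1.0:
        return math.inf  # the gain is not stabilizing, so the cost diverges
    return (q + r * k * k) / (1.0 - c * c)


def lqr_gradient(k, a, b, q, r):
    """Exact derivative of lqr_cost with respect to the gain k."""
    c = a - b * k
    p = lqr_cost(k, a, b, q, r)
    return 2.0 * (r * k - b * c * p) / (1.0 - c * c)


def policy_gradient(k0, a, b, q, r, step=0.05, iters=500):
    """Plain gradient descent on the gain, started from a stabilizing k0."""
    k = k0
    for _ in range(iters):
        k -= step * lqr_gradient(k, a, b, q, r)
    return k


def riccati_gain(a, b, q, r, iters=1000):
    """Reference optimal gain from the scalar Riccati recursion."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)
```

With the illustrative values `a = 1.2`, `b = 1.0`, `q = r = 1.0`, the call `policy_gradient(1.0, 1.2, 1.0, 1.0, 1.0)` starts from the stabilizing gain `k0 = 1.0` and can be compared against `riccati_gain(1.2, 1.0, 1.0, 1.0)`. The step size is hand-picked for this example; a gain that leaves the stabilizing set would make the cost infinite, which is the difficulty the discounting idea is meant to address.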
-
Data-driven Control of Unknown Linear Systems via Quantized Feedback
Authors:
Feiran Zhao,
Xingchen Li,
Keyou You
Abstract:
Control using quantized feedback is a fundamental approach to system synthesis with limited communication capacity. In this paper, we address the stabilization problem for unknown linear systems with logarithmically quantized feedback, via a direct data-driven control method. By leveraging a recently developed matrix S-lemma, we prove a necessary and sufficient condition for the existence of a common stabilizing controller for all possible dynamics consistent with data, in the form of a linear matrix inequality. Moreover, we formulate a semi-definite program to compute the coarsest quantization density. By establishing its connections to unstable eigenvalues of the state matrix, we further prove a necessary rank condition on the data for quantized feedback stabilization. Finally, we validate our theoretical results by numerical examples.
Submitted 10 March, 2022;
originally announced March 2022.
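The logarithmic quantizer mentioned in the abstract above can be sketched as follows. This is the standard textbook construction rather than code from the paper: the levels are $\pm u_0 \rho^i$ for a density parameter $\rho \in (0,1)$, and the relative quantization error is bounded by the sector bound $\delta = (1-\rho)/(1+\rho)$. The function names and the default `u0 = 1.0` are assumptions made for illustration.

```python
import math


def sector_bound(rho):
    """Relative error bound delta = (1 - rho) / (1 + rho) of the quantizer."""
    return (1.0 - rho) / (1.0 + rho)


def log_quantize(v, rho, u0=1.0):
    """Map v to the level u0 * rho**i whose cell contains |v| (odd in v).

    The cell of level u_i is u_i / (1 + delta) < v <= u_i / (1 - delta).
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in (0, 1)")
    if v == 0.0:
        return 0.0
    if v < 0.0:
        return -log_quantize(-v, rho, u0)
    delta = sector_bound(rho)
    # Largest integer i with u0 * rho**i >= v * (1 - delta).
    i = math.floor(math.log(v * (1.0 - delta) / u0) / math.log(rho))
    return u0 * rho ** i
```

A smaller `rho` gives a coarser quantizer with fewer levels per decade and a larger `delta`, which is why the coarsest density that still permits stabilization is the quantity of interest.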
-
An Exact Method for the Daily Package Shipment Problem with Outsourcing
Authors:
Zhuolin Wang,
Rongping Zhu,
Jian-Ya Ding,
Yu Yang,
Keyou You
Abstract:
The package shipment problem requires the optimal co-design of paths for both packages and a heterogeneous fleet in a transit center network (TCN). Instances arising from the package delivery industry in China usually involve more than ten thousand origin-destination (OD) pairs and have to be solved daily within an hour. Motivated by the fact that there is no interaction among different origin centers due to their competitive relationship, we propose a novel two-layer localized package shipment on a TCN (LPS-TCN) model that exploits outsourcing for cost saving. Consequently, the original problem breaks into a set of much smaller shipment problems, each of which has hundreds of OD pairs and is subsequently modelled as a mixed integer program (MIP). Since the LPS-TCN model is proved to be strongly NP-hard and contains tens of thousands of feasible paths, an off-the-shelf MIP solver cannot produce a reliable solution in a practically acceptable amount of time. We develop a column generation based algorithm that iteratively adds "profitable" paths and further enhance it with problem-specific cutting planes and variable bound tightening techniques. Computational experiments on realistic instances from a major Chinese package express company demonstrate that the LPS-TCN model can yield solutions that bring daily economic cost reduction up to 1 million CNY for the whole TCN. In addition, our proposed algorithm solves the LPS-TCN model substantially faster than CPLEX, one of the state-of-the-art commercial MIP solvers.
Submitted 7 February, 2022;
originally announced February 2022.
-
Learning Stabilizing Controllers of Linear Systems via Discount Policy Gradient
Authors:
Feiran Zhao,
Xingyun Fu,
Keyou You
Abstract:
Stability is one of the most fundamental requirements for systems synthesis. In this paper, we address the stabilization problem for unknown linear systems via policy gradient (PG) methods. We leverage a key feature of PG for Linear Quadratic Regulator (LQR), i.e., it drives the policy away from the boundary of the unstabilizing region along the descent direction, provided that the initial policy has finite cost. To this end, we discount the LQR cost with a factor; by adaptively increasing this factor, gradient descent leads the policy to the stabilizing set while maintaining a finite cost. Based on the Lyapunov theory, we design an update rule for the discount factor which can be directly computed from data, rendering our method purely model-free. Compared to recent work (Perdomo et al., 2021), our algorithm allows the policy to be updated only once for each discount factor. Moreover, the number of sampled trajectories and simulation time for gradient descent is significantly reduced to $\mathcal{O}(\log(1/ε))$ for the desired accuracy $ε$. Finally, we conduct simulations on both small-scale and large-scale examples to show the efficiency of our discount PG method.
Submitted 16 December, 2021;
originally announced December 2021.
-
Multi-period facility location and capacity planning under $\infty$-Wasserstein joint chance constraints in humanitarian logistics
Authors:
Zhuolin Wang,
Keyou You,
Zhengli Wang,
Kanglin Liu
Abstract:
The key to post-disaster humanitarian logistics (PD-HL) is to build a good facility location and capacity planning (FLCP) model for delivering relief supplies to affected areas in time. To fully exploit the historical PD data, this paper adopts the data-driven distributionally robust (DR) approach and proposes a novel multi-period FLCP model under the $\infty$-Wasserstein joint chance constraints (MFLCP-W). Specifically, we sequentially decide locations from a candidate set to build facilities with supply capacities, which are expanded if more economical, and use a finite number of historical demand samples in chance constraints to ensure a high probability of on-time delivery. To solve the MFLCP-W model, we equivalently reformulate it as a mixed integer second-order cone program and then solve it by designing an effective outer approximation algorithm with two tailored valid cuts. Finally, a case study under hurricane threats shows that MFLCP-W outperforms its counterparts in terms of cost and service quality, and that our algorithm converges significantly faster than the commercial solver CPLEX 12.8 with a better optimality gap.
Submitted 29 November, 2021;
originally announced November 2021.
-
RSS-based Multiple Sources Localization with Unknown Log-normal Shadow Fading
Authors:
Yueyan Chu,
Wenbin Guo,
Kangyong You,
Lei Zhao,
Tao Peng,
Wenbo Wang
Abstract:
Multi-source localization based on received signal strength (RSS) has drawn great interest in wireless sensor networks. However, the shadow fading term caused by obstacles cannot be separated from the received signal, which leads to severe errors in the location estimates. In this paper, we approximate the log-normal sum distribution through the Fenton-Wilkinson method to formulate a non-convex maximum likelihood (ML) estimator with an unknown shadow fading factor. To overcome the difficulty in solving the non-convex problem, we propose a novel algorithm to estimate the locations of sources. Specifically, the region is first divided into $N$ grids, and the multi-source localization is converted into a sparse recovery problem so that we can obtain the sparse solution. Then we utilize the K-means clustering method to obtain the rough locations of the off-grid sources as the initial feasible point of the ML estimator. Finally, an iterative refinement of the estimated locations is proposed by dynamic updating of the localization dictionary. The proposed algorithm can efficiently approach a superior local optimal solution of the ML estimator. Simulation results show that the proposed method has a promising localization performance and improves the robustness for multi-source localization in unknown shadow fading environments. Moreover, the proposed method reduces the computational complexity from $O(K^3N^3)$ to $O(N^3)$.
Submitted 20 October, 2021;
originally announced October 2021.
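The Fenton-Wilkinson step named in the abstract above replaces a sum of independent log-normal variables by a single log-normal with the same mean and variance. The sketch below shows only this moment-matching step, under the assumptions that the terms are independent and parameterized on the natural-log scale (RSS models are often stated in dB, which would require a unit conversion first). The function name `fenton_wilkinson` is hypothetical, and the localization algorithm itself is not reproduced.

```python
import math


def fenton_wilkinson(mus, sigmas):
    """Moment-matched log-normal parameters for a sum of log-normals.

    Each term is exp(N(mu_i, sigma_i^2)); the terms are assumed independent.
    Returns (mu, sigma) of the single log-normal with the same mean and
    variance as the sum.
    """
    mean = sum(math.exp(m + 0.5 * s * s) for m, s in zip(mus, sigmas))
    var = sum(
        (math.exp(s * s) - 1.0) * math.exp(2.0 * m + s * s)
        for m, s in zip(mus, sigmas)
    )
    sigma_sq = math.log(1.0 + var / (mean * mean))
    mu = math.log(mean) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)
```

For a single term the function returns the input parameters unchanged, which is a convenient sanity check. For several terms, `math.exp(mu + 0.5 * sigma**2)` equals the sum of the individual means by construction.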
-
Patch-Based Deep Autoencoder for Point Cloud Geometry Compression
Authors:
Kang You,
Pan Gao
Abstract:
The ever-growing use of 3D applications makes point cloud compression unprecedentedly important. In this paper, we propose a patch-based compression process using deep learning, focusing on lossy point cloud geometry compression. Unlike existing point cloud compression networks, which apply feature extraction and reconstruction on the entire point cloud, we divide the point cloud into patches and compress each patch independently. In the decoding process, we finally assemble the decompressed patches into a complete point cloud. In addition, we train our network by a patch-to-patch criterion, i.e., use the local reconstruction loss for optimization, to approximate the global reconstruction optimality. Our method outperforms the state-of-the-art in terms of rate-distortion performance, especially at low bitrates. Moreover, the proposed compression process is guaranteed to generate the same number of points as the input. The network model of this method can be easily applied to other point cloud reconstruction problems, such as upsampling.
Submitted 18 October, 2021;
originally announced October 2021.
-
Multiple Sources Localization with Sparse Recovery under Log-normal Shadow Fading
Authors:
Yueyan Chu,
Kangyong You,
Wenbin Guo
Abstract:
Localization based on received signal strength (RSS) has drawn great interest in the wireless sensor network (WSN). In this paper, we investigate the RSS-based multi-sources localization problem with unknown transmitted power under shadow fading. The log-normal shadowing effect is approximated through Fenton-Wilkinson (F-W) method and maximum likelihood estimation is adopted to optimize the RSS-based multiple sources localization problem. Moreover, we exploit a sparse recovery and weighted average of candidates (SR-WAC) based method to set up an initiation, which can efficiently approach a superior local optimal solution. It is shown from the simulation results that the proposed method has a much higher localization accuracy and outperforms the other
Submitted 31 March, 2021;
originally announced May 2021.
-
Innovation Compression for Communication-efficient Distributed Optimization with Linear Convergence
Authors:
Jiaqi Zhang,
Keyou You,
Lihua Xie
Abstract:
Information compression is essential to reduce communication cost in distributed optimization over peer-to-peer networks. This paper proposes a communication-efficient linearly convergent distributed (COLD) algorithm to solve strongly convex optimization problems. By compressing innovation vectors, which are the differences between decision vectors and their estimates, COLD is able to achieve linear convergence for a class of $δ$-contracted compressors. We explicitly quantify how the compression affects the convergence rate and show that COLD matches the rate of its uncompressed version. To accommodate a wider class of compressors that includes the binary quantizer, we further design a novel dynamical scaling mechanism and obtain the linearly convergent Dyna-COLD. Importantly, our results strictly improve existing results for the quantized consensus problem. Numerical experiments demonstrate the advantages of both algorithms under different compressors.
Submitted 14 May, 2021;
originally announced May 2021.
-
Global Convergence of Policy Gradient Primal-dual Methods for Risk-constrained LQRs
Authors:
Feiran Zhao,
Keyou You,
Tamer Başar
Abstract:
While the techniques in optimal control theory are often model-based, the policy optimization (PO) approach directly optimizes the performance metric of interest. Even though it has been an essential approach for reinforcement learning problems, there is little theoretical understanding on its performance. In this paper, we focus on the risk-constrained linear quadratic regulator (RC-LQR) problem via the PO approach, which requires addressing a challenging non-convex constrained optimization problem. To solve it, we first build on our earlier result that an optimal policy has a time-invariant affine structure to show that the associated Lagrangian function is coercive, locally gradient dominated and has local Lipschitz continuous gradient, based on which we establish strong duality. Then, we design policy gradient primal-dual methods with global convergence guarantees in both model-based and sample-based settings. Finally, we use samples of system trajectories in simulations to validate our methods.
Submitted 21 November, 2022; v1 submitted 10 April, 2021;
originally announced April 2021.
-
Infinite-horizon Risk-constrained Linear Quadratic Regulator with Average Cost
Authors:
Feiran Zhao,
Keyou You,
Tamer Basar
Abstract:
The behaviour of a stochastic dynamical system may be largely influenced by those low-probability, yet extreme events. To address such occurrences, this paper proposes an infinite-horizon risk-constrained Linear Quadratic Regulator (LQR) framework with time-average cost. In addition to the standard LQR objective, the average one-stage predictive variance of the state penalty is constrained to lie within a user-specified level. By leveraging the duality, its optimal solution is first shown to be stationary and affine in the state, i.e., $u(x,λ^*) = -K(λ^*)x + l(λ^*)$, where $λ^*$ is an optimal multiplier, used to address the risk constraint. Then, we establish the stability of the resulting closed-loop system. Furthermore, we propose a primal-dual method with sublinear convergence rate to find an optimal policy $u(x,λ^*)$. Finally, a numerical example is provided to demonstrate the effectiveness of the proposed framework and the primal-dual method.
Submitted 29 March, 2021;
originally announced March 2021.
-
A Distributed Implementation of Steady-State Kalman Filter
Authors:
Jiaqi Yan,
Xu Yang,
Yilin Mo,
Keyou You
Abstract:
This paper studies distributed state estimation in a sensor network, where $m$ sensors are deployed to infer the $n$-dimensional state of a linear time-invariant (LTI) Gaussian system. By a lossless decomposition of the optimal steady-state Kalman filter, we show that the problem of distributed estimation can be reformulated as synchronization of homogeneous linear systems. Based on this decomposition, a distributed estimator is proposed, where each sensor node runs a local filter using only its own measurement and fuses the local estimate of each node with a consensus algorithm. We show that the average of the estimates from all sensors coincides with the optimal Kalman estimate. Numerical examples are provided to illustrate the performance of the proposed scheme.
Submitted 21 April, 2022; v1 submitted 26 January, 2021;
originally announced January 2021.
-
Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator
Authors:
Feiran Zhao,
Keyou You
Abstract:
Risk-aware control, though with promise to tackle unexpected events, requires a known exact dynamical model. In this work, we propose a model-free framework to learn a risk-aware controller with a focus on the linear system. We formulate it as a discrete-time infinite-horizon LQR problem with a state predictive variance constraint. To solve it, we parameterize the policy with a feedback gain pair and leverage primal-dual methods to optimize it by solely using data. We first study the optimization landscape of the Lagrangian function and establish the strong duality in spite of its non-convex nature. Alongside, we find that the Lagrangian function enjoys an important local gradient dominance property, which is then exploited to develop a convergent random search algorithm to learn the dual function. Furthermore, we propose a primal-dual algorithm with global convergence to learn the optimal policy-multiplier pair. Finally, we validate our results via simulations.
Submitted 30 May, 2021; v1 submitted 21 November, 2020;
originally announced November 2020.
-
Minimax Q-learning Control for Linear Systems Using the Wasserstein Metric
Authors:
Feiran Zhao,
Keyou You
Abstract:
Stochastic optimal control usually requires an explicit dynamical model with probability distributions, which are difficult to obtain in practice. In this work, we consider the linear quadratic regulator (LQR) problem of unknown linear systems and adopt a Wasserstein penalty to address the distribution uncertainty of additive stochastic disturbances. By constructing an equivalent deterministic game of the penalized LQR problem, we propose a Q-learning method with convergence guarantees to learn an optimal minimax controller.
Submitted 16 January, 2023; v1 submitted 13 October, 2020;
originally announced October 2020.
-
Stacked 1D convolutional networks for end-to-end small footprint voice trigger detection
Authors:
Takuya Higuchi,
Mohammad Ghasemzadeh,
Kisun You,
Chandra Dhir
Abstract:
We propose a stacked 1D convolutional neural network (S1DCNN) for end-to-end small footprint voice trigger detection in a streaming scenario. Voice trigger detection is an important speech application, with which users can activate their devices by simply saying a keyword or phrase. Due to privacy and latency reasons, a voice trigger detection system should run on an always-on processor on device. Therefore, having small memory and compute cost is crucial for a voice trigger detection system. Recently, singular value decomposition filters (SVDFs) have been used for end-to-end voice trigger detection. The SVDFs approximate a fully-connected layer with a low rank approximation, which reduces the number of model parameters. In this work, we propose S1DCNN as an alternative approach for end-to-end small-footprint voice trigger detection. An S1DCNN layer consists of a 1D convolution layer followed by a depth-wise 1D convolution layer. We show that the SVDF can be expressed as a special case of the S1DCNN layer. Experimental results show that the S1DCNN achieves a 19.0% relative false reject ratio (FRR) reduction with a similar model size and a similar time delay compared to the SVDF. By using longer time delays, the S1DCNN further improves the FRR by up to 12.2% relative.
Submitted 7 August, 2020;
originally announced August 2020.
-
The Isoline Tracking in Unknown Scalar Fields with Concentration Feedback
Authors:
Fei Dong,
Keyou You
Abstract:
The isoline tracking of this work is concerned with the control design for a sensing vehicle to track a desired isoline of an unknown scalar field. To this end, we propose a simple PI-like controller for a Dubins vehicle in GPS-denied environments. Our key idea lies in the design of a novel sliding-surface-based error in the standard PI controller. For the circular field, we show that the P-like controller can globally regulate the vehicle to the desired isoline with a steady-state error that can be arbitrarily reduced by increasing the P gain, and is eliminated by the PI-like controller. For any smooth field, the P-like controller is able to achieve local regulation. Then, it is extended to the cases of a single-integrator vehicle and a double-integrator vehicle, respectively. Finally, the effectiveness and advantages of our approaches are validated via simulations on fixed-wing UAV and quadrotor simulators.
Submitted 15 July, 2020;
originally announced July 2020.
-
Coordinate-free Isoline Tracking in Unknown 2-D Scalar Fields
Authors:
Fei Dong,
Keyou You
Abstract:
The isoline tracking of this work is concerned with the control design for a sensing robot to track a given isoline of an unknown 2-D scalar field. To this end, we propose a coordinate-free controller with a simple PI-like form using only the concentration feedback for a Dubins robot, which is particularly useful in GPS-denied environments. The key idea lies in the novel design of a sliding-surface-based error term in the standard PI controller. Interestingly, we also prove that the tracking error can be reduced by increasing the proportional gain, and is eliminated for circular fields with a non-zero integral gain. The effectiveness of our controller is validated via simulations by using a fixed-wing UAV on the real dataset of the concentration distribution of PM 2.5 in Handan, China.
Submitted 27 March, 2020;
originally announced March 2020.
-
Distributed Adaptive Newton Methods with Global Superlinear Convergence
Authors:
Jiaqi Zhang,
Keyou You,
Tamer Başar
Abstract:
This paper considers the distributed optimization problem where each node of a peer-to-peer network minimizes a finite sum of objective functions by communicating with its neighboring nodes. In sharp contrast to the existing literature where the fastest distributed algorithms converge either with a global linear or a local superlinear rate, we propose a distributed adaptive Newton (DAN) algorithm with a global quadratic convergence rate. Our key idea lies in the design of a finite-time set-consensus method with Polyak's adaptive stepsize. Moreover, we introduce a low-rank matrix approximation (LA) technique to compress the innovation of Hessian matrix so that each node only needs to transmit message of dimension $\mathcal{O}(p)$ (where $p$ is the dimension of decision vectors) per iteration, which is essentially the same as that of first-order methods. Nevertheless, the resulting DAN-LA converges to an optimal solution with a global superlinear rate. Numerical experiments on logistic regression problems are conducted to validate their advantages over existing methods.
Submitted 14 January, 2022; v1 submitted 18 February, 2020;
originally announced February 2020.
-
Second-order Conic Programming Approach for Wasserstein Distributionally Robust Two-stage Linear Programs
Authors:
Zhuolin Wang,
Keyou You,
Shiji Song,
Yuli Zhang
Abstract:
This paper proposes a second-order conic programming (SOCP) approach to solve distributionally robust two-stage stochastic linear programs over 1-Wasserstein balls. We start from the case with distribution uncertainty only in the objective function and exactly reformulate it as an SOCP problem. Then, we study the case with distribution uncertainty only in constraints, and show that such a robust program is generally NP-hard as it involves a norm maximization problem over a polyhedron. However, it is reduced to an SOCP problem if the extreme points of the polyhedron are given a priori. This motivates the design of a constraint generation algorithm with provable convergence to approximately solve the NP-hard problem. In sharp contrast to the existing literature, the distribution achieving the worst-case cost is given as an "empirical" distribution by simply perturbing each sample for both cases. Finally, experiments illustrate the advantages of the proposed model in terms of the out-of-sample performance and the computational complexity.
Submitted 28 May, 2020; v1 submitted 16 February, 2020;
originally announced February 2020.
-
Coordinate-free Circumnavigation of a Moving Target via a PD-like Controller
Authors:
Fei Dong,
Keyou You,
Lihua Xie,
Qinglei Hu
Abstract:
This paper proposes a coordinate-free controller for a nonholonomic vehicle to circumnavigate a fully-actuated moving target by using range-only measurements. If the range rate is available, our Proportional Derivative (PD)-like controller has a simple structure as the standard PD controller, except the design of an additive constant bias and a saturation function in the error feedback. We show that if the target is stationary, the vehicle asymptotically encloses the target with a predefined radius at an exponential convergence rate, i.e., an exact circumnavigation pattern can be completed. For a moving target, the circumnavigation error converges to a small region whose size is shown proportional to the maneuverability of the target, e.g., the maximum linear speed and acceleration. Moreover, we design a second-order sliding mode (SOSM) filter to estimate the range rate and show that the SOSM filter can recover the range rate in a finite time. Finally, the effectiveness and advantages of our controller are validated via both numerical simulations and real experiments.
Submitted 13 November, 2021; v1 submitted 16 February, 2020;
originally announced February 2020.
-
Parametric Sparse Bayesian Dictionary Learning for Multiple Sources Localization with Propagation Parameters Uncertainty and Nonuniform Noise
Authors:
Kangyong You,
Wenbin Guo,
Tao Peng,
Yueliang Liu,
Peiliang Zuo,
Wenbo Wang
Abstract:
Received signal strength (RSS) based source localization is popular due to its simplicity and low cost. However, this method is highly dependent on the propagation model, which is not easy to capture in practice. Moreover, most existing works only consider the single source and the identical measurement noise scenario, while in practice multiple co-channel sources may transmit simultaneously, and the measurement noise tends to be nonuniform. In this paper, we study the multiple co-channel sources localization (MSL) problem under unknown nonuniform noise, while jointly estimating the parametric propagation model. Specifically, we model the MSL problem as being parameterized by the unknown source locations and propagation parameters, and then reformulate it as a joint parametric sparsifying dictionary learning (PSDL) and sparse signal recovery (SSR) problem which is solved under the framework of sparse Bayesian learning with iterative parametric dictionary approximation. Furthermore, multiple snapshot measurements are utilized to improve the localization accuracy, and the Cramer-Rao lower bound (CRLB) is derived to analyze the theoretical estimation error bound. In comparisons with the state-of-the-art sparsity-based MSL algorithms as well as the CRLB, extensive simulations show the importance of jointly inferring the propagation parameters, and highlight the effectiveness and superiority of the proposed method.
Submitted 22 December, 2019; v1 submitted 18 November, 2019;
originally announced November 2019.
-
Graph Learning for Spatiotemporal Signals with Long- and Short-Term Characterization
Authors:
Yueliang Liu,
Wenbin Guo,
Kangyong You,
Lei Zhao,
Tao Peng,
Wenbo Wang
Abstract:
Mining natural associations from high-dimensional spatiotemporal signals plays an important role in various fields including biology, climatology, and financial analysis. However, most existing works have mainly studied time-independent signals without considering the correlations of spatiotemporal signals that achieve high learning accuracy. This paper aims to learn graphs that better reflect underlying data relations by leveraging the long- and short-term characteristics of spatiotemporal signals. First, a spatiotemporal signal model is presented that considers both spatial and temporal relations. In particular, we integrate a low-rank representation and a Gaussian Markov process to describe the temporal correlations. Then, the graph learning problem is formulated as a joint low-rank component estimation and graph Laplacian inference. Accordingly, we propose a low rank and spatiotemporal smoothness-based graph learning method (GL-LRSS), which introduces a spatiotemporal smoothness prior into time-vertex signal analysis. By jointly exploiting the low rank of long-time observations and the smoothness of short-time observations, the overall learning performance can be effectively improved. Experiments on both synthetic and real-world datasets demonstrate substantial improvements in the learning accuracy of the proposed method over the state-of-the-art low-rank component estimation and graph learning methods.
Submitted 6 December, 2020; v1 submitted 18 November, 2019;
originally announced November 2019.
-
Suspension Regulation of Medium-low-speed Maglev Trains via Deep Reinforcement Learning
Authors:
Feiran Zhao,
Keyou You,
Shiji Song,
Wenyue Zhang,
Laisheng Tong
Abstract:
The suspension regulation is critical to the operation of medium-low-speed maglev trains (mlsMTs). Due to the uncertain environment, strong disturbances and high nonlinearity of the system dynamics, this problem cannot be well solved by most of the model-based controllers. In this paper, we propose a model-free controller by reformulating it as a continuous-state, continuous-action Markov decision process (MDP) with unknown transition probabilities. With the deterministic policy gradient and neural network approximation, we design reinforcement learning (RL) algorithms to solve the MDP and obtain a state-feedback controller by using sampled data from the suspension system. To further improve its performance, we adopt a double Q-learning scheme for learning the regulation controller. We illustrate that the proposed controllers outperform the existing PID controller with a real dataset from the mlsMT in Changsha, China, and are even comparable to model-based controllers, which assume that the complete information of the model is known, via simulations.
Submitted 8 May, 2020; v1 submitted 28 October, 2019;
originally announced October 2019.
-
Distributed Dual Gradient Tracking for Resource Allocation in Unbalanced Networks
Authors:
Jiaqi Zhang,
Keyou You,
Kai Cai
Abstract:
This paper proposes a distributed dual gradient tracking algorithm (DDGT) to solve resource allocation problems over an unbalanced network, where each node in the network holds a private cost function and computes the optimal resource by interacting only with its neighboring nodes. Our key idea is the novel use of the distributed push-pull gradient algorithm (PPG) to solve the dual problem of the resource allocation problem. To study the convergence of the DDGT, we first establish the sublinear convergence rate of PPG for non-convex objective functions, which advances the existing results on PPG as they require the strong-convexity of objective functions. Then we show that the DDGT converges linearly for strongly convex and Lipschitz smooth cost functions, and sublinearly without the Lipschitz smoothness. Finally, experimental results suggest that DDGT outperforms existing algorithms.
Submitted 23 August, 2020; v1 submitted 22 September, 2019;
originally announced September 2019.
-
Decentralized Stochastic Gradient Tracking for Non-convex Empirical Risk Minimization
Authors:
Jiaqi Zhang,
Keyou You
Abstract:
This paper studies a decentralized stochastic gradient tracking (DSGT) algorithm for non-convex empirical risk minimization problems over a peer-to-peer network of nodes, which is in sharp contrast to the existing DSGT only for convex problems. To ensure exact convergence and handle the variance among decentralized datasets, each node performs a stochastic gradient (SG) tracking step by using a mini-batch of samples, where the batch size is designed to be proportional to the size of the local dataset. We explicitly evaluate the convergence rate of DSGT with respect to the number of iterations in terms of algebraic connectivity of the network, mini-batch size, gradient variance, etc. Under certain conditions, we further show that DSGT has a network independence property in the sense that the network topology only affects the convergence rate up to a constant factor. Hence, the convergence rate of DSGT can be comparable to the centralized SGD method. Moreover, a linear speedup of DSGT with respect to the number of nodes is achievable for some scenarios. Numerical experiments for neural networks and logistic regression problems on CIFAR-10 finally illustrate the advantages of DSGT.
Submitted 28 August, 2020; v1 submitted 6 September, 2019;
originally announced September 2019.
-
Optimization-based Control for Bearing-only Target Search with a Mobile Vehicle
Authors:
Zhuo Li,
Keyou You,
Shiji Song,
Anke Xue
Abstract:
This work aims to design an optimization-based controller for a discrete-time Dubins vehicle to approach a target with unknown position as fast as possible by only using bearing measurements. To this end, we propose a bi-objective optimization problem, which jointly considers the performance of estimating the unknown target position and controlling the mobile vehicle to a known position, and then adopt a weighted sum method with normalization to solve it. The controller is given based on the solution of the optimization problem together with a least-squares estimate of the target position. Moreover, the controller does not need the vehicle's global position information. Finally, simulation results are included to validate the effectiveness of the proposed controller.
Submitted 1 August, 2019;
originally announced August 2019.
-
Target Encirclement with any Smooth Pattern Using Range-only Measurements
Authors:
Fei Dong,
Keyou You,
Shiji Song
Abstract:
This paper proposes a coordinate-free controller to drive a mobile robot to encircle a target at unknown position by only using range measurements. Different from the existing works, a backstepping-based controller is proposed to encircle the target with zero steady-state error for any desired smooth pattern. Moreover, we show its asymptotic exponential convergence under a fixed set of control parameters, which are independent of the initial distance to the target. The effectiveness and advantages of the proposed controller are validated via simulations.
Submitted 18 June, 2019;
originally announced June 2019.
-
Flight Control for UAV Loitering Over a Ground Target with Unknown Maneuver
Authors:
Fei Dong,
Keyou You,
Jiaqi Zhang
Abstract:
This paper proposes a flight controller for an unmanned aerial vehicle (UAV) to loiter over a ground moving target (GMT). We are concerned with the scenario that the stochastically time-varying maneuver of the GMT is unknown to the UAV, which renders it challenging to estimate the GMT's motion state. Assuming that the state of the GMT is available, we first design a discrete-time Lyapunov vector field for the loitering guidance and then design a discrete-time integral sliding mode control (ISMC) to track the guidance commands. By modeling the maneuver process as a finite-state Markov chain, we propose a Rao-Blackwellised particle filter (RBPF), which only requires a small number of particles, to simultaneously estimate the motion state and the maneuver of the GMT with a camera or radar sensor. Then, we apply the principle of certainty equivalence to the ISMC and obtain the flight controller for completing the loitering task. Finally, the effectiveness and advantages of our controller are validated via simulations.
Submitted 19 October, 2019; v1 submitted 17 June, 2019;
originally announced June 2019.
-
Cooperative Source Seeking via Networked Multi-vehicle Systems
Authors:
Zhuo Li,
Keyou You,
Shiji Song
Abstract:
This paper studies the cooperative source seeking problem via a networked multi-vehicle system. In contrast to existing literature, the multi-vehicle system is controlled to the source position that maximizes aggregated multiple unknown scalar fields and each sensor-enabled vehicle only samples measurements of one scalar field. Thus, a single vehicle is unable to localize the source and has to cooperate with its neighboring vehicles. By jointly exploiting the ideas of the consensus algorithm and the stochastic extremum seeking (ES), this paper proposes novel distributed stochastic ES controllers, which are gradient-free and do not need any absolute information, such that the multi-vehicle system simultaneously approaches the source position. The effectiveness of the proposed controllers is proved for quadratic scalar fields. Finally, illustrative examples are included to validate the theoretical results.
Submitted 9 January, 2020; v1 submitted 22 January, 2019;
originally announced January 2019.
-
Range-based Coordinate Alignment for Cooperative Mobile Sensor Network Localization
Authors:
Keyou You,
Qizhu Chen,
Pei Xie,
Shiji Song
Abstract:
This paper studies a coordinate alignment problem for cooperative mobile sensor network localization with range-based measurements. The network consists of target nodes, each of which has access only to position information in a local fixed coordinate frame, and anchor nodes with GPS position information. To localize target nodes, we aim to align their coordinate frames, which leads to a non-convex optimization problem over the rotation group $\text{SO}(3)$. Then, we reformulate it as an optimization problem with a convex objective function over spherical surfaces. We explicitly design both iterative and recursive algorithms for localizing a target node with an anchor node, and extend them to the case with multiple target nodes. Finally, the advantages of our algorithms over those in the literature are validated via simulations.
Submitted 22 February, 2020; v1 submitted 10 December, 2018;
originally announced December 2018.
-
Parallel Optimal Control for Cooperative Automation of Large-scale Connected Vehicles via ADMM
Authors:
Zhitao Wang,
Yang Zheng,
Shengbo Eben Li,
Keyou You,
Keqiang Li
Abstract:
This paper proposes a parallel optimization algorithm for cooperative automation of large-scale connected vehicles. The task of cooperative automation is formulated as a centralized optimization problem taking the whole decision space of all vehicles into account. Considering the uncertainty of the environment, the problem is solved in a receding horizon fashion. Then, we employ the alternating direction method of multipliers (ADMM) to solve the centralized optimization in a parallel way, which scales more favorably to large-scale instances. In addition, a Taylor series expansion is used to linearize the nonconvex constraints that arise from the coupled collision avoidance constraints among interacting vehicles. Simulations with two typical traffic scenes for multiple vehicles demonstrate the effectiveness and efficiency of our method.
Submitted 31 July, 2018;
originally announced July 2018.
-
Bayesian Filtering with Unknown Sensor Measurement Losses
Authors:
Jiaqi Zhang,
Keyou You,
Lihua Xie
Abstract:
This work studies the state estimation problem of a stochastic nonlinear system with unknown sensor measurement losses. If the estimator knows the sensor measurement losses of a linear Gaussian system, the minimum variance estimate is easily computed by the celebrated intermittent Kalman filter (IKF). However, this will no longer be the case when the measurement losses are unknown and/or the system is nonlinear or non-Gaussian. By exploiting the binary property of the measurement loss process and the IKF, we design three suboptimal filters for the state estimation, i.e., BKF-I, BKF-II and RBPF. The BKF-I is based on the MAP estimator of the measurement loss process and the BKF-II is derived by estimating the conditional loss probability. The RBPF is a particle filter based algorithm which marginalizes out the loss process to increase the efficiency of particles. All the proposed filters can be easily implemented in recursive forms. Finally, a linear system, a target tracking system and a quadrotor's path control problem are included to illustrate their effectiveness, and show the tradeoff between computational complexity and estimation accuracy of the proposed filters.
Submitted 8 May, 2020; v1 submitted 24 January, 2018;
originally announced January 2018.
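For readers unfamiliar with the intermittent Kalman filter mentioned in the abstract above, the following is a minimal scalar sketch of its known-loss case: the measurement update is simply skipped whenever the measurement is lost. It is not the paper's BKF-I, BKF-II or RBPF, which address the case where the losses are unknown. The function name `ikf_step` and the model parameters `a`, `c`, `q`, `r` are assumptions chosen for illustration.

```python
def ikf_step(x_est, p_est, y, received, a=1.0, c=1.0, q=0.1, r=0.5):
    """One step of a scalar Kalman filter with a known loss indicator.

    Hypothetical model: x_next = a*x + w, y = c*x + v,
    with Var(w) = q and Var(v) = r. `received` is True when the
    measurement y arrived and False when it was lost.
    """
    # Time update (always performed).
    x_pred = a * x_est
    p_pred = a * p_est * a + q
    if not received:
        # Measurement lost: keep the prediction as the estimate.
        return x_pred, p_pred
    # Measurement update (only when the measurement arrived).
    gain = p_pred * c / (c * p_pred * c + r)
    x_new = x_pred + gain * (y - c * x_pred)
    p_new = (1.0 - gain * c) * p_pred
    return x_new, p_new


# Usage: run the filter over a short record with two lost measurements.
measurements = [1.0, None, 1.2, None, 0.9]
x, p = 0.0, 1.0
for y in measurements:
    x, p = ikf_step(x, p, y, received=y is not None)
```

When a measurement is lost the error variance grows by the process noise, and it shrinks again once a measurement is received.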
-
Distributed Discrete-time Optimization in Multi-agent Networks Using only Sign of Relative State
Authors:
Jiaqi Zhang,
Keyou You,
Tamer Başar
Abstract:
This paper proposes distributed discrete-time algorithms to cooperatively solve an additive cost optimization problem in multi-agent networks. The striking feature lies in the use of only the sign of relative state information between neighbors, which substantially differentiates our algorithms from others in the existing literature. We first interpret the proposed algorithms in terms of the penalty method in optimization theory and then perform non-asymptotic analysis to study convergence for static network graphs. Compared with the celebrated distributed subgradient algorithms, which use the exact relative state information, the convergence speed is essentially not affected by the loss of information. We also study how noise in the relative state information and randomly activated graphs affect the performance of our algorithms. Finally, we validate the theoretical results on a class of distributed quantile regression problems.
Submitted 10 December, 2018; v1 submitted 25 September, 2017;
originally announced September 2017.
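The idea in the abstract above, that agents exchange only the sign of their relative states, can be illustrated with a small simulation. This is a sketch under stated assumptions and not the authors' exact algorithm: each agent is assumed to hold a hypothetical scalar cost f_i(x) = 0.5 (x - c_i)^2, the graph is complete, the penalty weight is 5, and the step size 1/(k + 10) is an arbitrary diminishing choice. Under these assumptions the minimizer of the summed cost is the mean of the c_i.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def sign_consensus_descent(targets, penalty=5.0, steps=20000):
    """Penalty-style subgradient iteration using only signs of relative states.

    Hypothetical setup: agent i holds f_i(x) = 0.5 * (x - targets[i])**2
    and can see only sign(x_j - x_i) for every other agent j.
    """
    n = len(targets)
    x = [0.0] * n
    for k in range(steps):
        rho = 1.0 / (k + 10)  # diminishing step size (assumed)
        new_x = []
        for i in range(n):
            # Sum of signs of relative states over all other agents.
            pull = sum(sign(x[j] - x[i]) for j in range(n) if j != i)
            grad = x[i] - targets[i]  # local gradient of f_i
            new_x.append(x[i] + rho * (penalty * pull - grad))
        x = new_x
    return x


estimates = sign_consensus_descent([1.0, 2.0, 6.0])
```

In this sketch the sign terms cancel in the network average, so the average follows a plain gradient step on the summed cost, while the sign terms themselves drive the agents toward agreement.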
-
Distributed Algorithms for Robust Convex Optimization via the Scenario Approach
Authors:
Keyou You,
Roberto Tempo,
Pei Xie
Abstract:
This paper proposes distributed algorithms to solve robust convex optimization (RCO) when the constraints are affected by nonlinear uncertainty. We adopt a scenario approach by randomly sampling the uncertainty set. To facilitate the computational task, instead of using a single centralized processor to obtain a "global solution" of the scenario problem (SP), we resort to multiple interconnected processors that are distributed among different nodes of a network to simultaneously solve the SP. Then, we propose a primal-dual sub-gradient algorithm and a random projection algorithm to distributedly solve the SP over undirected and directed graphs, respectively. Both algorithms are given in an explicit recursive form with simple iterations, which are especially suited for processors with limited computational capability. We show that, if the underlying graph is strongly connected, each node asymptotically computes a common optimal solution to the SP with a convergence rate $O(1/(\sum_{t=1}^{k} \zeta^t))$ where $\{\zeta^t\}$ is a sequence of appropriately decreasing stepsizes. That is, the RCO is effectively solved in a distributed way. The relations with the existing literature on robust convex programs are thoroughly discussed and an example of robust system identification is included to validate the effectiveness of our distributed algorithms.
Submitted 14 January, 2018; v1 submitted 19 July, 2016;
originally announced July 2016.
-
Likelihood Ratio Based Scheduler for Secure Detection in Cyber Physical Systems
Authors:
Jian-Ya Ding,
Keyou You,
Shiji Song,
Cheng Wu
Abstract:
This paper is concerned with a binary detection problem over a non-secure network. To satisfy the communication rate constraint and to defend against possible cyber attacks, which are modeled as deceptive signals injected into the network, a likelihood ratio based (LRB) scheduler is designed on the sensor side to smartly select sensor measurements for transmission. By exploring the scheduler, some sensor measurements are successfully retrieved from the attacked data at the decision center. We show that even under a moderate communication rate constraint of secure networks, an optimal LRB scheduler can achieve an asymptotic detection performance comparable to that of the standard N-P test using the full set of measurements, and is strictly better than the random scheduler. For non-secure networks, the LRB scheduler can also maintain the detection functionality but suffers graceful performance degradation under different attack intensities. Finally, we perform simulations to validate our theoretical results.
Submitted 26 July, 2015;
originally announced July 2015.
-
Distributed Algorithms for Computation of Centrality Measures in Complex Networks
Authors:
Keyou You,
Roberto Tempo,
Li Qiu
Abstract:
This paper is concerned with distributed computation of several commonly used centrality measures in complex networks. In particular, we propose deterministic algorithms, which converge in finite time, for the distributed computation of the degree, closeness and betweenness centrality measures in directed graphs. Regarding eigenvector centrality, we consider the PageRank problem as its typical variant, and design distributed randomized algorithms to compute PageRank for both fixed and time-varying graphs. A key feature of the proposed algorithms is that they do not require knowledge of the network size, which can be simultaneously estimated at every node, and that they are clock-free. To address the PageRank problem of time-varying graphs, we introduce the novel concept of persistent graph, which eliminates the effect of spamming nodes. Moreover, we prove that these algorithms converge almost surely and in the sense of $L^p$. Finally, the effectiveness of the proposed algorithms is illustrated via extensive simulations using a classical benchmark.
Submitted 29 May, 2016; v1 submitted 7 July, 2015;
originally announced July 2015.