Search | arXiv e-print repository

Policy Gradient Methods for the Cost-Constrained LQR: Strong Duality and Global Convergence

Abstract: In safety-critical applications, reinforcement learning (RL) needs to consider safety constraints. However, theoretical understandings of constrained RL for continuous control are largely absent. As a case study, this paper presents a cost-constrained LQR formulation, where a number of LQR costs with user-defined penalty matrices are subject to constraints. To solve it, we propose a policy gradient primal-dual method to find an optimal state feedback gain. Despite the non-convexity of the cost-constrained LQR problem, we provide a constructive proof for strong duality and a geometric interpretation of an optimal multiplier set. By proving that the concave dual function is Lipschitz smooth, we further provide convergence guarantees for the PG primal-dual method. Finally, we perform simulations to validate our theoretical findings.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2404.13550 [pdf, other]

Pointsoup: High-Performance and Extremely Low-Decoding-Latency Learned Geometry Codec for Large-Scale Point Cloud Scenes

Authors: Kang You, Kai Liu, Li Yu, Pan Gao, Dandan Ding

Abstract: Despite considerable progress being achieved in point cloud geometry compression, there still remains a challenge in effectively compressing large-scale scenes with sparse surfaces. Another key challenge lies in reducing decoding latency, a crucial requirement in real-world application. In this paper, we propose Pointsoup, an efficient learning-based geometry codec that attains high-performance and extremely low-decoding-latency simultaneously. Inspired by conventional Trisoup codec, a point model-based strategy is devised to characterize local surfaces. Specifically, skin features are embedded from local windows via an attention-based encoder, and dilated windows are introduced as cross-scale priors to infer the distribution of quantized features in parallel. During decoding, features undergo fast refinement, followed by a folding-based point generator that reconstructs point coordinates with fairly fast speed. Experiments show that Pointsoup achieves state-of-the-art performance on multiple benchmarks with significantly lower decoding complexity, i.e., up to 90$\sim$160$\times$ faster than the G-PCCv23 Trisoup decoder on a comparatively low-end platform (e.g., one RTX 2080Ti). Furthermore, it offers variable-rate control with a single neural model (2.9MB), which is attractive for industrial practitioners.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2403.19126 [pdf, other]

Harnessing Data for Accelerating Model Predictive Control by Constraint Removal

Authors: Zhinan Hou, Feiran Zhao, Keyou You

Abstract: Model predictive control (MPC) solves a receding-horizon optimization problem in real-time, which can be computationally demanding when there are thousands of constraints. To accelerate online computation of MPC, we utilize data to adaptively remove the constraints while maintaining the MPC policy unchanged. Specifically, we design the removal rule based on the Lipschitz continuity of the MPC policy. This removal rule can use the information of historical data according to the Lipschitz constant and the distance between the current state and historical states. In particular, we provide the explicit expression for calculating the Lipschitz constant by the model parameters. Finally, simulations are performed to validate the effectiveness of the proposed method.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2401.14871 [pdf, other]

Data-Enabled Policy Optimization for Direct Adaptive Learning of the LQR

Authors: Feiran Zhao, Florian Dörfler, Alessandro Chiuso, Keyou You

Abstract: Direct data-driven design methods for the linear quadratic regulator (LQR) mainly use offline or episodic data batches, and their online adaptation has been acknowledged as an open problem. In this paper, we propose a direct adaptive method to learn the LQR from online closed-loop data. First, we propose a new policy parameterization based on the sample covariance to formulate a direct data-driven LQR problem, which is shown to be equivalent to the certainty-equivalence LQR with optimal non-asymptotic guarantees. Second, we design a novel data-enabled policy optimization (DeePO) method to directly update the policy, where the gradient is explicitly computed using only a batch of persistently exciting (PE) data. Third, we establish its global convergence via a projected gradient dominance property. Importantly, we efficiently use DeePO to adaptively learn the LQR by performing only one-step projected gradient descent per sample of the closed-loop system, which also leads to an explicit recursive update of the policy. Under PE inputs and for bounded noise, we show that the average regret of the LQR cost is upper-bounded by two terms signifying a sublinear decrease in time $\mathcal{O}(1/\sqrt{T})$ plus a bias scaling inversely with signal-to-noise ratio (SNR), which are independent of the noise statistics. Finally, we perform simulations to validate the theoretical results and demonstrate the computational and sample efficiency of our method.

Submitted 19 April, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

arXiv:2303.17958 [pdf, other]

Data-enabled Policy Optimization for the Linear Quadratic Regulator

Authors: Feiran Zhao, Florian Dörfler, Keyou You

Abstract: Policy optimization (PO), an essential approach of reinforcement learning for a broad range of system classes, requires significantly more system data than indirect (identification-followed-by-control) methods or behavioral-based direct methods even in the simplest linear quadratic regulator (LQR) problem. In this paper, we take an initial step towards bridging this gap by proposing the data-enabled policy optimization (DeePO) method, which requires only a finite number of sufficiently exciting data to iteratively solve the LQR problem via PO. Based on a data-driven closed-loop parameterization, we are able to directly compute the policy gradient from a batch of persistently exciting data. Next, we show that the nonconvex PO problem satisfies a projected gradient dominance property by relating it to an equivalent convex program, leading to the global convergence of DeePO. Moreover, we apply regularization methods to enhance certainty-equivalence and robustness of the resulting controller and show an implicit regularization property. Finally, we perform simulations to validate our results.

Submitted 15 September, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

Comments: Accepted in IEEE CDC 2023

arXiv:2212.10945 [pdf, other]

Standoff Tracking Using DNN-Based MPC with Implementation on FPGA

Authors: Fei Dong, Xingchen Li, Keyou You, Shiji Song

Abstract: This work studies the standoff tracking problem to drive an unmanned aerial vehicle (UAV) to slide on a desired circle over a moving target at a constant height. We propose a novel Lyapunov guidance vector (LGV) field with tunable convergence rates for the UAV's trajectory planning and a deep neural network (DNN)-based model predictive control (MPC) scheme to track the reference trajectory. Then, we show how to collect samples for training the DNN offline and design an integral module (IM) to refine the tracking performance of our DNN-based MPC. Moreover, the hardware-in-the-loop (HIL) simulation with an FPGA@200MHz demonstrates that our method is a valid alternative to embedded implementations of MPC for addressing complex systems and applications which is impossible for directly solving the MPC optimization problems.

Submitted 21 December, 2022; originally announced December 2022.

arXiv:2211.04051 [pdf, other]

Globally Convergent Policy Gradient Methods for Linear Quadratic Control of Partially Observed Systems

Authors: Feiran Zhao, Xingyun Fu, Keyou You

Abstract: While the optimization landscape of policy gradient methods has been recently investigated for partially observed linear systems in terms of both static output feedback and dynamical controllers, they only provide convergence guarantees to stationary points. In this paper, we propose a new policy parameterization for partially observed linear systems, using a past input-output trajectory of finite length as feedback. We show that the solution set to the parameterized optimization problem is a matrix space, which is invariant to similarity transformation. By proving a gradient dominance property, we show the global convergence of policy gradient methods. Moreover, we observe that the gradient is orthogonal to the solution set, revealing an explicit relation between the resulting solution and the initial policy. Finally, we perform simulations to validate our theoretical results.

Submitted 22 April, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: To appear at IFAC World Congress 2023

arXiv:2209.04373 [pdf, other]

Optimal $(0,1)$-Matrix Completion with Majorization Ordered Objectives (To the memory of Pravin Varaiya)

Authors: Yanfang Mo, Wei Chen, Keyou You, Li Qiu

Abstract: We propose and examine two optimal $(0,1)$-matrix completion problems with majorization ordered objectives. They elevate the seminal study by Gale and Ryser from feasibility to optimality in partial order programming (POP), referring to optimization with partially ordered objectives. We showcase their applications in electric vehicle charging, portfolio optimization, and secure data storage. Solving such integer POP (iPOP) problems is challenging because of the possible non-comparability among objective values and the integer requirements. Nevertheless, we prove the essential uniqueness of all optimal objective values and identify two particular ones for each of the two inherently symmetric iPOP problems. Furthermore, for every optimal objective value, we decompose the construction of an associated optimal~$(0,1)$-matrix into a series of sorting processes, respectively agreeing with the rule of thumb "peak shaving" or "valley filling." We show that the resulting algorithms have linear time complexities and verify their empirical efficiency via numerical simulations compared to the standard order-preserving method for POP.

Submitted 9 September, 2022; originally announced September 2022.

Comments: 16pages, 6 figures

arXiv:2208.13454 [pdf, other]

Minimum Input Design for Direct Data-driven Property Identification of Unknown Linear Systems

Authors: Shubo Kang, Keyou You

Abstract: In a direct data-driven approach, this paper studies the {\em property identification(ID)} problem to analyze whether an unknown linear system has a property of interest, e.g., stabilizability and structural properties. In sharp contrast to the model-based analysis, we approach it by directly using the input and state feedback data of the unknown system. Via a new concept of sufficient richness of input sectional data, we first establish the necessary and sufficient condition for the minimum input design to excite the system for property ID. Specifically, the input sectional data is sufficiently rich for property ID {\em if and only if} it spans a linear subspace that contains a property dependent minimum linear subspace, any basis of which can also be easily used to form the minimum excitation input. Interestingly, we show that many structural properties can be identified with the minimum input that is however unable to identify the explicit system model. Overall, our results rigorously quantify the advantages of the direct data-driven analysis over the model-based analysis for linear systems in terms of data efficiency.

Submitted 29 August, 2022; originally announced August 2022.

arXiv:2208.02519 [pdf]

IPDAE: Improved Patch-Based Deep Autoencoder for Lossy Point Cloud Geometry Compression

Authors: Kang You, Pan Gao, Qing Li

Abstract: Point cloud is a crucial representation of 3D contents, which has been widely used in many areas such as virtual reality, mixed reality, autonomous driving, etc. With the boost of the number of points in the data, how to efficiently compress point cloud becomes a challenging problem. In this paper, we propose a set of significant improvements to patch-based point cloud compression, i.e., a learnable context model for entropy coding, octree coding for sampling centroid points, and an integrated compression and training process. In addition, we propose an adversarial network to improve the uniformity of points during reconstruction. Our experiments show that the improved patch-based autoencoder outperforms the state-of-the-art in terms of rate-distortion performance, on both sparse and large-scale point clouds. More importantly, our method can maintain a short compression time while ensuring the reconstruction quality.

Submitted 4 August, 2022; originally announced August 2022.

Comments: 12 pages

arXiv:2205.14335 [pdf, other]

Convergence and Sample Complexity of Policy Gradient Methods for Stabilizing Linear Systems

Authors: Feiran Zhao, Xingyun Fu, Keyou You

Abstract: System stabilization via policy gradient (PG) methods has drawn increasing attention in both control and machine learning communities. In this paper, we study their convergence and sample complexity for stabilizing linear time-invariant systems in terms of the number of system rollouts. Our analysis is built upon a discounted linear quadratic regulator (LQR) method which alternatively updates the policy and the discount factor of the LQR problem. Firstly, we propose an explicit rule to adaptively adjust the discount factor by exploring the stability margin of a linear control policy. Then, we establish the sample complexity of PG methods for stabilization, which only adds a coefficient logarithmic in the spectral radius of the state matrix to that for solving the LQR problem with a prior stabilizing policy. Finally, we perform simulations to validate our theoretical findings and demonstrate the effectiveness of our method on a class of nonlinear systems.

Submitted 14 September, 2023; v1 submitted 28 May, 2022; originally announced May 2022.

arXiv:2203.05245 [pdf, other]

Data-driven Control of Unknown Linear Systems via Quantized Feedback

Authors: Feiran Zhao, Xingchen Li, Keyou You

Abstract: Control using quantized feedback is a fundamental approach to system synthesis with limited communication capacity. In this paper, we address the stabilization problem for unknown linear systems with logarithmically quantized feedback, via a direct data-driven control method. By leveraging a recently developed matrix S-lemma, we prove a sufficient and necessary condition for the existence of a common stabilizing controller for all possible dynamics consistent with data, in the form of a linear matrix inequality. Moreover, we formulate semi-definite programming to solve the coarsest quantization density. By establishing its connections to unstable eigenvalues of the state matrix, we further prove a necessary rank condition on the data for quantized feedback stabilization. Finally, we validate our theoretical results by numerical examples.

Submitted 10 March, 2022; originally announced March 2022.

Comments: To appear at the 4th Annual Conference on Learning for Dynamics and Control

arXiv:2202.03614 [pdf, other]

An Exact Method for the Daily Package Shipment Problem with Outsourcing

Authors: Zhuolin Wang, Rongping Zhu, Jian-Ya Ding, Yu Yang, Keyou You

Abstract: The package shipment problem requires to optimally co-design paths for both packages and a heterogeneous fleet in a transit center network (TCN). Instances arising from the package delivery industry in China usually involve more than ten thousand origin-destination (OD) pairs and have to be solved daily within an hour. Motivated by the fact that there is no interaction among different origin centers due to their competitive relationship, we propose a novel two-layer localized package shipment on a TCN (LPS-TCN) model that exploits outsourcing for cost saving. Consequently, the original problem breaks into a set of much smaller shipment problems, each of which has hundreds of OD pairs and is subsequently modelled as a mixed integer program (MIP). Since the LPS-TCN model is proved to be Strongly NP-hard and contains tens of thousands of feasible paths, an off-the-shelf MIP solver cannot produce a reliable solution in a practically acceptable amount of time. We develop a column generation based algorithm that iteratively adds "profitable" paths and further enhance it by problem-specific cutting planes and variable bound tightening techniques. Computational experiments on realistic instances from a major Chinese package express company demonstrate that the LPS-TCN model can yield solutions that bring daily economic cost reduction up to 1 million CNY for the whole TCN. In addition, our proposed algorithm solves the LPS-TCN model substantially faster than CPLEX, one of the state-of-the-art commercial MIP solvers.

Submitted 7 February, 2022; originally announced February 2022.

arXiv:2112.09294 [pdf, other]

Learning Stabilizing Controllers of Linear Systems via Discount Policy Gradient

Authors: Feiran Zhao, Xingyun Fu, Keyou You

Abstract: Stability is one of the most fundamental requirements for systems synthesis. In this paper, we address the stabilization problem for unknown linear systems via policy gradient (PG) methods. We leverage a key feature of PG for Linear Quadratic Regulator (LQR), i.e., it drives the policy away from the boundary of the unstabilizing region along the descent direction, provided with an initial policy with finite cost. To this end, we discount the LQR cost with a factor, by adaptively increasing which gradient leads the policy to the stabilizing set while maintaining a finite cost. Based on the Lyapunov theory, we design an update rule for the discount factor which can be directly computed from data, rendering our method purely model-free. Compared to recent work \citep{perdomo2021stabilizing}, our algorithm allows the policy to be updated only once for each discount factor. Moreover, the number of sampled trajectories and simulation time for gradient descent is significantly reduced to $\mathcal{O}(\log(1/ε))$ for the desired accuracy $ε$. Finally, we conduct simulations on both small-scale and large-scale examples to show the efficiency of our discount PG method.

Submitted 16 December, 2021; originally announced December 2021.

Comments: Submitted to L4DC 2022
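
Illustrative note on the discounted LQR idea in the abstract above: a minimal, model-based sketch of why discounting gives a finite cost to a policy that is not yet stabilizing. It is not the paper's model-free algorithm or its discount update rule. It relies on two standard facts: a discount factor `gamma` is equivalent to scaling the system matrices by `sqrt(gamma)`, and for `u = -K x` the LQR cost gradient has the closed form `2[(R + B'PB)K - B'PA] Sigma`. The function name, the example matrices, and the choice `Sigma0 = I` are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov


def discounted_lqr_cost_and_grad(A, B, Q, R, K, gamma=1.0, Sigma0=None):
    """Cost and gradient of the gamma-discounted LQR cost for u = -K x.

    Hypothetical helper for illustration only; assumes the model (A, B) is known.
    """
    n = A.shape[0]
    Sigma0 = np.eye(n) if Sigma0 is None else Sigma0
    # Discounting by gamma is equivalent to scaling A and B by sqrt(gamma).
    Ag, Bg = np.sqrt(gamma) * A, np.sqrt(gamma) * B
    Acl = Ag - Bg @ K
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0:
        raise ValueError("discounted cost is infinite for this K and gamma")
    # P solves P = Acl' P Acl + Q + K' R K (value matrix of the policy).
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    # Sigma solves Sigma = Acl Sigma Acl' + Sigma0 (accumulated state covariance).
    Sigma = solve_discrete_lyapunov(Acl, Sigma0)
    cost = np.trace(P @ Sigma0)
    grad = 2.0 * ((R + Bg.T @ P @ Bg) @ K - Bg.T @ P @ Ag) @ Sigma
    return cost, grad


# Example: A is unstable, yet K = 0 has a finite cost once gamma is small enough.
A = np.array([[1.1, 0.2], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
K0 = np.zeros((1, 2))
cost0, grad0 = discounted_lqr_cost_and_grad(A, B, Q, R, K0, gamma=0.8)
```

With `gamma = 1.0` the same call raises `ValueError`, because `K = 0` does not stabilize `A`. Increasing `gamma` gradually while taking gradient steps is the general pattern the abstract describes.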

arXiv:2111.15057 [pdf, other]

Multi-period facility location and capacity planning under $\infty$-Wasserstein joint chance constraints in humanitarian logistics

Authors: Zhuolin Wang, Keyou You, Zhengli Wang, Kanglin Liu

Abstract: The key of the post-disaster humanitarian logistics (PD-HL) is to build a good facility location and capacity planning (FLCP) model for delivering relief supplies to affected areas in time. To fully exploit the historical PD data, this paper adopts the data-driven distributionally robust (DR) approach and proposes a novel multi-period FLCP model under the $\infty$-Wasserstein joint chance constraints (MFLCP-W). Specifically, we sequentially decide locations from a candidate set to build facilities with supply capacities, which are expanded if more economical, and use a finite number of historical demand samples in chance constraints to ensure a high probability of on-time delivery. To solve the MFLCP-W model, we equivalently reformulate it as a mixed integer second-order cone program and then solve it by designing an effective outer approximation algorithm with two tailored valid cuts. Finally, a case study under hurricane threats shows that MFLCP-W outperforms its counterparts in the terms of the cost and service quality, and that our algorithm converges significantly faster than the commercial solver CPLEX 12.8 with a better optimality gap.

Submitted 29 November, 2021; originally announced November 2021.

arXiv:2110.10435 [pdf, other]

RSS-based Multiple Sources Localization with Unknown Log-normal Shadow Fading

Authors: Yueyan Chu, Wenbin Guo, Kangyong You, Lei Zhao, Tao Peng, Wenbo Wang

Abstract: Multi-source localization based on received signal strength (RSS) has drawn great interest in wireless sensor networks. However, the shadow fading term caused by obstacles cannot be separated from the received signal, which leads to severe error in location estimate. In this paper, we approximate the log-normal sum distribution through Fenton-Wilkinson method to formulate a non-convex maximum likelihood (ML) estimator with unknown shadow fading factor. In order to overcome the difficulty in solving the non-convex problem, we propose a novel algorithm to estimate the locations of sources. Specifically, the region is divided into $N$ grids firstly, and the multi-source localization is converted into a sparse recovery problem so that we can obtain the sparse solution. Then we utilize the K-means clustering method to obtain the rough locations of the off-grid sources as the initial feasible point of the ML estimator. Finally, an iterative refinement of the estimated locations is proposed by dynamic updating of the localization dictionary. The proposed algorithm can efficiently approach a superior local optimal solution of the ML estimator. It is shown from the simulation results that the proposed method has a promising localization performance and improves the robustness for multi-source localization in unknown shadow fading environments. Moreover, the proposed method provides a better computational complexity from $O(K^3N^3)$ to $O(N^3)$.

Submitted 20 October, 2021; originally announced October 2021.

Comments: 11 pages, 10 figures. arXiv admin note: substantial text overlap with arXiv:2105.15097
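
Illustrative note on the Fenton-Wilkinson method named in the abstract above: the method approximates a sum of independent log-normal variables by a single log-normal whose mean and variance match those of the sum. The sketch below shows only this moment-matching step, under the assumption that each term is `exp(Y_i)` with `Y_i ~ N(mu_i, sigma_i^2)` in natural-log units. Shadow fading is usually specified in dB, so a unit conversion would be needed in practice. The function name is hypothetical, and nothing here reproduces the paper's estimator.

```python
import math


def fenton_wilkinson(mus, sigmas):
    """Match the first two moments of a sum of independent log-normals.

    Returns (mu, sigma) of the approximating log-normal, natural-log units.
    """
    # Mean of the sum: each term contributes exp(mu_i + sigma_i^2 / 2).
    mean = sum(math.exp(m + 0.5 * s * s) for m, s in zip(mus, sigmas))
    # Variance of the sum: independent terms, so the variances add.
    var = sum((math.exp(s * s) - 1.0) * math.exp(2.0 * m + s * s)
              for m, s in zip(mus, sigmas))
    sigma2 = math.log(1.0 + var / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


# Example: two received-power terms with different strengths and spreads.
mu_z, sigma_z = fenton_wilkinson([0.0, 1.0], [0.5, 0.8])
```

For a single term the function returns the input parameters unchanged, which is a quick sanity check on the formulas.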

arXiv:2110.09109 [pdf, other]

doi 10.1145/3469877.3490611

Patch-Based Deep Autoencoder for Point Cloud Geometry Compression

Authors: Kang You, Pan Gao

Abstract: The ever-increasing 3D application makes the point cloud compression unprecedentedly important and needed. In this paper, we propose a patch-based compression process using deep learning, focusing on the lossy point cloud geometry compression. Unlike existing point cloud compression networks, which apply feature extraction and reconstruction on the entire point cloud, we divide the point cloud into patches and compress each patch independently. In the decoding process, we finally assemble the decompressed patches into a complete point cloud. In addition, we train our network by a patch-to-patch criterion, i.e., use the local reconstruction loss for optimization, to approximate the global reconstruction optimality. Our method outperforms the state-of-the-art in terms of rate-distortion performance, especially at low bitrates. Moreover, the compression process we proposed can guarantee to generate the same number of points as the input. The network model of this method can be easily applied to other point cloud reconstruction problems, such as upsampling.

Submitted 18 October, 2021; originally announced October 2021.

Comments: Accepted to ACM Multimedia Asia (MMAsia '21)

arXiv:2105.15097 [pdf, other]

Multiple Sources Localization with Sparse Recovery under Log-normal Shadow Fading

Authors: Yueyan Chu, Kangyong You, Wenbin Guo

Abstract: Localization based on received signal strength (RSS) has drawn great interest in the wireless sensor network (WSN). In this paper, we investigate the RSS-based multi-sources localization problem with unknown transmitted power under shadow fading. The log-normal shadowing effect is approximated through Fenton-Wilkinson (F-W) method and maximum likelihood estimation is adopted to optimize the RSS-based multiple sources localization problem. Moreover, we exploit a sparse recovery and weighted average of candidates (SR-WAC) based method to set up an initiation, which can efficiently approach a superior local optimal solution. It is shown from the simulation results that the proposed method has a much higher localization accuracy and outperforms the other

Submitted 31 March, 2021; originally announced May 2021.

arXiv:2105.06697 [pdf, other]

Innovation Compression for Communication-efficient Distributed Optimization with Linear Convergence

Authors: Jiaqi Zhang, Keyou You, Lihua Xie

Abstract: Information compression is essential to reduce communication cost in distributed optimization over peer-to-peer networks. This paper proposes a communication-efficient linearly convergent distributed (COLD) algorithm to solve strongly convex optimization problems. By compressing innovation vectors, which are the differences between decision vectors and their estimates, COLD is able to achieve linear convergence for a class of $δ$-contracted compressors. We explicitly quantify how the compression affects the convergence rate and show that COLD matches the same rate of its uncompressed version. To accommodate a wider class of compressors that includes the binary quantizer, we further design a novel dynamical scaling mechanism and obtain the linearly convergent Dyna-COLD. Importantly, our results strictly improve existing results for the quantized consensus problem. Numerical experiments demonstrate the advantages of both algorithms under different compressors.

Submitted 14 May, 2021; originally announced May 2021.

Comments: 14 pages

arXiv:2104.04901 [pdf, other]

Global Convergence of Policy Gradient Primal-dual Methods for Risk-constrained LQRs

Authors: Feiran Zhao, Keyou You, Tamer Başar

Abstract: While the techniques in optimal control theory are often model-based, the policy optimization (PO) approach directly optimizes the performance metric of interest. Even though it has been an essential approach for reinforcement learning problems, there is little theoretical understanding on its performance. In this paper, we focus on the risk-constrained linear quadratic regulator (RC-LQR) problem via the PO approach, which requires addressing a challenging non-convex constrained optimization problem. To solve it, we first build on our earlier result that an optimal policy has a time-invariant affine structure to show that the associated Lagrangian function is coercive, locally gradient dominated and has local Lipschitz continuous gradient, based on which we establish strong duality. Then, we design policy gradient primal-dual methods with global convergence guarantees in both model-based and sample-based settings. Finally, we use samples of system trajectories in simulations to validate our methods.

Submitted 21 November, 2022; v1 submitted 10 April, 2021; originally announced April 2021.

arXiv:2103.15363 [pdf, other]

Infinite-horizon Risk-constrained Linear Quadratic Regulator with Average Cost

Authors: Feiran Zhao, Keyou You, Tamer Basar

Abstract: The behaviour of a stochastic dynamical system may be largely influenced by those low-probability, yet extreme events. To address such occurrences, this paper proposes an infinite-horizon risk-constrained Linear Quadratic Regulator (LQR) framework with time-average cost. In addition to the standard LQR objective, the average one-stage predictive variance of the state penalty is constrained to lie within a user-specified level. By leveraging the duality, its optimal solution is first shown to be stationary and affine in the state, i.e., $u(x,λ^*) = -K(λ^*)x + l(λ^*)$, where $λ^*$ is an optimal multiplier, used to address the risk constraint. Then, we establish the stability of the resulting closed-loop system. Furthermore, we propose a primal-dual method with sublinear convergence rate to find an optimal policy $u(x,λ^*)$. Finally, a numerical example is provided to demonstrate the effectiveness of the proposed framework and the primal-dual method.

Submitted 29 March, 2021; originally announced March 2021.

Comments: Submitted to IEEE CDC 2021

arXiv:2101.10689 [pdf, other]

A Distributed Implementation of Steady-State Kalman Filter

Authors: Jiaqi Yan, Xu Yang, Yilin Mo, Keyou You

Abstract: This paper studies the distributed state estimation in sensor network, where $m$ sensors are deployed to infer the $n$-dimensional state of a linear time-invariant (LTI) Gaussian system. By a lossless decomposition of optimal steady-state Kalman filter, we show that the problem of distributed estimation can be reformulated as synchronization of homogeneous linear systems. Based on such decomposition, a distributed estimator is proposed, where each sensor node runs a local filter using only its own measurement and fuses the local estimate of each node with a consensus algorithm. We show that the average of the estimate from all sensors coincides with the optimal Kalman estimate. Numerical examples are provided in the end to illustrate the performance of the proposed scheme.

Submitted 21 April, 2022; v1 submitted 26 January, 2021; originally announced January 2021.

arXiv:2011.10931 [pdf, other]

Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator

Authors: Feiran Zhao, Keyou You

Abstract: Risk-aware control, though with promise to tackle unexpected events, requires a known exact dynamical model. In this work, we propose a model-free framework to learn a risk-aware controller with a focus on the linear system. We formulate it as a discrete-time infinite-horizon LQR problem with a state predictive variance constraint. To solve it, we parameterize the policy with a feedback gain pair and leverage primal-dual methods to optimize it by solely using data. We first study the optimization landscape of the Lagrangian function and establish the strong duality in spite of its non-convex nature. Alongside, we find that the Lagrangian function enjoys an important local gradient dominance property, which is then exploited to develop a convergent random search algorithm to learn the dual function. Furthermore, we propose a primal-dual algorithm with global convergence to learn the optimal policy-multiplier pair. Finally, we validate our results via simulations.

Submitted 30 May, 2021; v1 submitted 21 November, 2020; originally announced November 2020.

Comments: To appear in the Annual Conference on Learning for Dynamics and Control (L4DC) 2021

arXiv:2010.06794 [pdf, other]

Minimax Q-learning Control for Linear Systems Using the Wasserstein Metric

Authors: Feiran Zhao, Keyou You

Abstract: Stochastic optimal control usually requires an explicit dynamical model with probability distributions, which are difficult to obtain in practice. In this work, we consider the linear quadratic regulator (LQR) problem of unknown linear systems and adopt a Wasserstein penalty to address the distribution uncertainty of additive stochastic disturbances. By constructing an equivalent deterministic game of the penalized LQR problem, we propose a Q-learning method with convergence guarantees to learn an optimal minimax controller.

Submitted 16 January, 2023; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: Accepted by Automatica, to appear in 2023

arXiv:2008.03405 [pdf, other]

Stacked 1D convolutional networks for end-to-end small footprint voice trigger detection

Authors: Takuya Higuchi, Mohammad Ghasemzadeh, Kisun You, Chandra Dhir

Abstract: We propose a stacked 1D convolutional neural network (S1DCNN) for end-to-end small footprint voice trigger detection in a streaming scenario. Voice trigger detection is an important speech application, with which users can activate their devices by simply saying a keyword or phrase. Due to privacy and latency reasons, a voice trigger detection system should run on an always-on processor on device. Therefore, having small memory and compute cost is crucial for a voice trigger detection system. Recently, singular value decomposition filters (SVDFs) have been used for end-to-end voice trigger detection. The SVDFs approximate a fully-connected layer with a low rank approximation, which reduces the number of model parameters. In this work, we propose S1DCNN as an alternative approach for end-to-end small-footprint voice trigger detection. An S1DCNN layer consists of a 1D convolution layer followed by a depth-wise 1D convolution layer. We show that the SVDF can be expressed as a special case of the S1DCNN layer. Experimental results show that the S1DCNN achieves a 19.0% relative false reject ratio (FRR) reduction with a similar model size and a similar time delay compared to the SVDF. By using longer time delays, the S1DCNN further improves the FRR by up to 12.2% relative.

Submitted 7 August, 2020; originally announced August 2020.

Comments: Accepted to INTERSPEECH 2020

arXiv:2007.07733 [pdf, other]

doi 10.1016/j.automatica.2021.109779

The Isoline Tracking in Unknown Scalar Fields with Concentration Feedback

Authors: Fei Dong, Keyou You

Abstract: The isoline tracking of this work is concerned with the control design for a sensing vehicle to track a desired isoline of an unknown scalar field. To this end, we propose a simple PI-like controller for a Dubins vehicle in the GPS-denied environments. Our key idea lies in the design of a novel sliding surface based error in the standard PI controller. For the circular field, we show that the P-like controller can globally regulate the vehicle to the desired isoline with the steady-state error that can be arbitrarily reduced by increasing the P gain, and is eliminated by the PI-like controller. For any smoothing field, the P-like controller is able to achieve the local regulation. Then, it is extended to the cases of a single-integrator vehicle and a double-integrator vehicle, respectively. Finally, the effectiveness and advantages of our approaches are validated via simulations on the fixed-wing UAV and quadrotor simulators.

Submitted 15 July, 2020; originally announced July 2020.

arXiv:2003.12684 [pdf, other]

Coordinate-free Isoline Tracking in Unknown 2-D Scalar Fields

Authors: Fei Dong, Keyou You

Abstract: The isoline tracking of this work is concerned with the control design for a sensing robot to track a given isoline of an unknown 2-D scalar field. To this end, we propose a coordinate-free controller with a simple PI-like form using only the concentration feedback for a Dubins robot, which is particularly useful in GPS-denied environments. The key idea lies in the novel design of a sliding surface based error term in the standard PI controller. Interestingly, we also prove that the tracking error can be reduced by increasing the proportional gain, and is eliminated for circular fields with a non-zero integral gain. The effectiveness of our controller is validated via simulations by using a fixed-wing UAV on the real dataset of the concentration distribution of PM 2.5 in Handan, China.

Submitted 27 March, 2020; originally announced March 2020.

Comments: 6 pages, 3 figures

arXiv:2002.07378 [pdf, other]

Distributed Adaptive Newton Methods with Global Superlinear Convergence

Authors: Jiaqi Zhang, Keyou You, Tamer Başar

Abstract: This paper considers the distributed optimization problem where each node of a peer-to-peer network minimizes a finite sum of objective functions by communicating with its neighboring nodes. In sharp contrast to the existing literature where the fastest distributed algorithms converge either with a global linear or a local superlinear rate, we propose a distributed adaptive Newton (DAN) algorithm with a global quadratic convergence rate. Our key idea lies in the design of a finite-time set-consensus method with Polyak's adaptive stepsize. Moreover, we introduce a low-rank matrix approximation (LA) technique to compress the innovation of Hessian matrix so that each node only needs to transmit message of dimension $\mathcal{O}(p)$ (where $p$ is the dimension of decision vectors) per iteration, which is essentially the same as that of first-order methods. Nevertheless, the resulting DAN-LA converges to an optimal solution with a global superlinear rate. Numerical experiments on logistic regression problems are conducted to validate their advantages over existing methods.

Submitted 14 January, 2022; v1 submitted 18 February, 2020; originally announced February 2020.

Comments: Accepted to Automatica as regular paper. 13 pages

arXiv:2002.06751 [pdf, other]

Second-order Conic Programming Approach for Wasserstein Distributionally Robust Two-stage Linear Programs

Authors: Zhuolin Wang, Keyou You, Shiji Song, Yuli Zhang

Abstract: This paper proposes a second-order conic programming (SOCP) approach to solve distributionally robust two-stage stochastic linear programs over 1-Wasserstein balls. We start from the case with distribution uncertainty only in the objective function and exactly reformulate it as an SOCP problem. Then, we study the case with distribution uncertainty only in constraints, and show that such a robust program is generally NP-hard as it involves a norm maximization problem over a polyhedron. However, it is reduced to an SOCP problem if the extreme points of the polyhedron are given as a prior. This motivates us to design a constraint generation algorithm with provable convergence to approximately solve the NP-hard problem. In sharp contrast to the existing literature, the distribution achieving the worst-case cost is given as an "empirical" distribution by simply perturbing each sample for both cases. Finally, experiments illustrate the advantages of the proposed model in terms of the out-of-sample performance and the computational complexity.

Submitted 28 May, 2020; v1 submitted 16 February, 2020; originally announced February 2020.

arXiv:2002.06507 [pdf, other]

doi 10.1109/TAES.2021.3127858

Coordinate-free Circumnavigation of a Moving Target via a PD-like Controller

Authors: Fei Dong, Keyou You, Lihua Xie, Qinglei Hu

Abstract: This paper proposes a coordinate-free controller for a nonholonomic vehicle to circumnavigate a fully-actuated moving target by using range-only measurements. If the range rate is available, our Proportional Derivative (PD)-like controller has a simple structure as the standard PD controller, except the design of an additive constant bias and a saturation function in the error feedback. We show that if the target is stationary, the vehicle asymptotically encloses the target with a predefined radius at an exponential convergence rate, i.e., an exact circumnavigation pattern can be completed. For a moving target, the circumnavigation error converges to a small region whose size is shown proportional to the maneuverability of the target, e.g., the maximum linear speed and acceleration. Moreover, we design a second-order sliding mode (SOSM) filter to estimate the range rate and show that the SOSM filter can recover the range rate in a finite time. Finally, the effectiveness and advantages of our controller are validated via both numerical simulations and real experiments.

Submitted 13 November, 2021; v1 submitted 16 February, 2020; originally announced February 2020.

Comments: 13 pages,17 figures

arXiv:1911.08021 [pdf, other]

doi 10.1109/TSP.2020.3009875

Parametric Sparse Bayesian Dictionary Learning for Multiple Sources Localization with Propagation Parameters Uncertainty and Nonuniform Noise

Authors: Kangyong You, Wenbin Guo, Tao Peng, Yueliang Liu, Peiliang Zuo, Wenbo Wang

Abstract: Received signal strength (RSS) based source localization method is popular due to its simplicity and low cost. However, this method is highly dependent on the propagation model which is not easy to be captured in practice. Moreover, most existing works only consider the single source and the identical measurement noise scenario, while in practice multiple co-channel sources may transmit simultaneously, and the measurement noise tends to be nonuniform. In this paper, we study the multiple co-channel sources localization (MSL) problem under unknown nonuniform noise, while jointly estimating the parametric propagation model. Specifically, we model the MSL problem as being parameterized by the unknown source locations and propagation parameters, and then reformulate it as a joint parametric sparsifying dictionary learning (PSDL) and sparse signal recovery (SSR) problem which is solved under the framework of sparse Bayesian learning with iterative parametric dictionary approximation. Furthermore, multiple snapshot measurements are utilized to improve the localization accuracy, and the Cramer-Rao lower bound (CRLB) is derived to analyze the theoretical estimation error bound. Comparing with the state-of-the-art sparsity-based MSL algorithms as well as CRLB, extensive simulations show the importance of jointly inferring the propagation parameters, and highlight the effectiveness and superiority of the proposed method.

Submitted 22 December, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

Comments: 12 pages, 9 figures

arXiv:1911.08018 [pdf, other]

doi 10.1109/TSIPN.2020.3038475

Graph Learning for Spatiotemporal Signals with Long- and Short-Term Characterization

Authors: Yueliang Liu, Wenbin Guo, Kangyong You, Lei Zhao, Tao Peng, Wenbo Wang

Abstract: Mining natural associations from high-dimensional spatiotemporal signals plays an important role in various fields including biology, climatology, and financial analysis. However, most existing works have mainly studied time-independent signals without considering the correlations of spatiotemporal signals that achieve high learning accuracy. This paper aims to learn graphs that better reflect underlying data relations by leveraging the long- and short-term characteristics of spatiotemporal signals. First, a spatiotemporal signal model is presented that considers both spatial and temporal relations. In particular, we integrate a low-rank representation and a Gaussian Markov process to describe the temporal correlations. Then, the graph learning problem is formulated as a joint low-rank component estimation and graph Laplacian inference. Accordingly, we propose a low rank and spatiotemporal smoothness-based graph learning method (GL-LRSS), which introduces a spatiotemporal smoothness prior into time-vertex signal analysis. By jointly exploiting the low rank of long-time observations and the smoothness of short-time observations, the overall learning performance can be effectively improved. Experiments on both synthetic and real-world datasets demonstrate substantial improvements in the learning accuracy of the proposed method over the state-of-the-art low-rank component estimation and graph learning methods.

Submitted 6 December, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

Comments: 13 pages, 6 figures

Journal ref: IEEE Transactions on Signal and Information Processing over Networks, vol 6, pp. 699-713, 2020

arXiv:1910.12491 [pdf, other]

Suspension Regulation of Medium-low-speed Maglev Trains via Deep Reinforcement Learning

Authors: Feiran Zhao, Keyou You, Shiji Song, Wenyue Zhang, Laisheng Tong

Abstract: The suspension regulation is critical to the operation of medium-low-speed maglev trains (mlsMTs). Due to uncertain environment, strong disturbances and high nonlinearity of the system dynamics, this problem cannot be well solved by most of the model-based controllers. In this paper, we propose a model-free controller by reformulating it as a continuous-state, continuous-action Markov decision process (MDP) with unknown transition probabilities. With the deterministic policy gradient and neural network approximation, we design reinforcement learning (RL) algorithms to solve the MDP and obtain a state-feedback controller by using sampled data from the suspension system. To further improve its performance, we adopt a double Q-learning scheme for learning the regulation controller. We illustrate that the proposed controllers outperform the existing PID controller with a real dataset from the mlsMT in Changsha, China and are even comparable to model-based controllers, which assume that the complete information of the model is known, via simulations.

Submitted 8 May, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

Comments: 12 pages, 15 figures

arXiv:1909.09937 [pdf, other]

Distributed Dual Gradient Tracking for Resource Allocation in Unbalanced Networks

Authors: Jiaqi Zhang, Keyou You, Kai Cai

Abstract: This paper proposes a distributed dual gradient tracking algorithm (DDGT) to solve resource allocation problems over an unbalanced network, where each node in the network holds a private cost function and computes the optimal resource by interacting only with its neighboring nodes. Our key idea is the novel use of the distributed push-pull gradient algorithm (PPG) to solve the dual problem of the resource allocation problem. To study the convergence of the DDGT, we first establish the sublinear convergence rate of PPG for non-convex objective functions, which advances the existing results on PPG as they require the strong-convexity of objective functions. Then we show that the DDGT converges linearly for strongly convex and Lipschitz smooth cost functions, and sublinearly without the Lipschitz smoothness. Finally, experimental results suggest that DDGT outperforms existing algorithms.

Submitted 23 August, 2020; v1 submitted 22 September, 2019; originally announced September 2019.

Comments: Accepted by IEEE Transactions on Signal Processing. This version fixed some typos in the accepted version

arXiv:1909.02712 [pdf, other]

Decentralized Stochastic Gradient Tracking for Non-convex Empirical Risk Minimization

Authors: Jiaqi Zhang, Keyou You

Abstract: This paper studies a decentralized stochastic gradient tracking (DSGT) algorithm for non-convex empirical risk minimization problems over a peer-to-peer network of nodes, which is in sharp contrast to the existing DSGT only for convex problems. To ensure exact convergence and handle the variance among decentralized datasets, each node performs a stochastic gradient (SG) tracking step by using a mini-batch of samples, where the batch size is designed to be proportional to the size of the local dataset. We explicitly evaluate the convergence rate of DSGT with respect to the number of iterations in terms of algebraic connectivity of the network, mini-batch size, gradient variance, etc. Under certain conditions, we further show that DSGT has a network independence property in the sense that the network topology only affects the convergence rate up to a constant factor. Hence, the convergence rate of DSGT can be comparable to the centralized SGD method. Moreover, a linear speedup of DSGT with respect to the number of nodes is achievable for some scenarios. Numerical experiments for neural networks and logistic regression problems on CIFAR-10 finally illustrate the advantages of DSGT.

Submitted 28 August, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

Comments: This paper has been revised and theoretical results are improved

arXiv:1908.00380 [pdf, other]

Optimization-based Control for Bearing-only Target Search with a Mobile Vehicle

Authors: Zhuo Li, Keyou You, Shiji Song, Anke Xue

Abstract: This work aims to design an optimization-based controller for a discrete-time Dubins vehicle to approach a target with unknown position as fast as possible by only using bearing measurements. To this end, we propose a bi-objective optimization problem, which jointly considers the performance of estimating the unknown target position and controlling the mobile vehicle to a known position, and then adopt a weighted sum method with normalization to solve it. The controller is given based on the solution of the optimization problem in ties with a least-square estimate of the target position. Moreover, the controller does not need the vehicle's global position information. Finally, simulation results are included to validate the effectiveness of the proposed controller.

Submitted 1 August, 2019; originally announced August 2019.

arXiv:1906.07416 [pdf, other]

doi 10.1016/j.automatica.2020.108932

Target Encirclement with any Smooth Pattern Using Range-only Measurements

Authors: Fei Dong, Keyou You, Shiji Song

Abstract: This paper proposes a coordinate-free controller to drive a mobile robot to encircle a target at unknown position by only using range measurements. Different from the existing works, a backstepping based controller is proposed to encircle the target with zero steady-state error for any desired smooth pattern. Moreover, we show its asymptotic exponential convergence under a fixed set of control parameters, which are independent of the initial distance to the target. The effectiveness and advantages of the proposed controller are validated via simulations.

Submitted 18 June, 2019; originally announced June 2019.

arXiv:1906.07000 [pdf, other]

doi 10.1109/TCST.2019.2948915

Flight Control for UAV Loitering Over a Ground Target with Unknown Maneuver

Authors: Fei Dong, Keyou You, Jiaqi Zhang

Abstract: This paper proposes a flight controller for an unmanned aerial vehicle (UAV) to loiter over a ground moving target (GMT). We are concerned with the scenario that the stochastically time-varying maneuver of the GMT is unknown to the UAV, which renders it challenging to estimate the GMT's motion state. Assuming that the state of the GMT is available, we first design a discrete-time Lyapunov vector field for the loitering guidance and then design a discrete-time integral sliding mode control (ISMC) to track the guidance commands. By modeling the maneuver process as a finite-state Markov chain, we propose a Rao-Blackwellised particle filter (RBPF), which only requires a small number of particles, to simultaneously estimate the motion state and the maneuver of the GMT with a camera or radar sensor. Then, we apply the principle of certainty equivalence to the ISMC and obtain the flight controller for completing the loitering task. Finally, the effectiveness and advantages of our controller are validated via simulations.

Submitted 19 October, 2019; v1 submitted 17 June, 2019; originally announced June 2019.

Comments: 12 pages, 14 figures. Accepted by IEEE Transactions on Control Systems Technology

arXiv:1901.07716 [pdf, other]

Cooperative Source Seeking via Networked Multi-vehicle Systems

Authors: Zhuo Li, Keyou You, Shiji Song

Abstract: This paper studies the cooperative source seeking problem via a networked multi-vehicle system. In contrast to existing literature, the multi-vehicle system is controlled to the source position that maximizes aggregated multiple unknown scalar fields and each sensor-enabled vehicle only samples measurements of one scalar field. Thus, a single vehicle is unable to localize the source and has to cooperate with its neighboring vehicles. By jointly exploiting the ideas of the consensus algorithm and the stochastic extremum seeking (ES), this paper proposes novel distributed stochastic ES controllers, which are gradient-free and do not need any absolute information, such that the multi-vehicle system simultaneously approaches the source position. The effectiveness of the proposed controllers is proved for quadratic scalar fields. Finally, illustrative examples are included to validate the theoretical results.

Submitted 9 January, 2020; v1 submitted 22 January, 2019; originally announced January 2019.

arXiv:1812.04201 [pdf, other]

Range-based Coordinate Alignment for Cooperative Mobile Sensor Network Localization

Authors: Keyou You, Qizhu Chen, Pei Xie, Shiji Song

Abstract: This paper studies a coordinate alignment problem for cooperative mobile sensor network localization with range-based measurements. The network consists of target nodes, each of which has only access position information in a local fixed coordinate frame, and anchor nodes with GPS position information. To localize target nodes, we aim to align their coordinate frames, which leads to a non-convex optimization problem over a rotation group $\text{SO}(3)$. Then, we reformulate it as an optimization problem with a convex objective function over spherical surfaces. We explicitly design both iterative and recursive algorithms for localizing a target node with an anchor node, and extend to the case with multiple target nodes. Finally, the advantages of our algorithms against the literature are validated via simulations.

Submitted 22 February, 2020; v1 submitted 10 December, 2018; originally announced December 2018.

arXiv:1807.11874 [pdf, ps, other]

Parallel Optimal Control for Cooperative Automation of Large-scale Connected Vehicles via ADMM

Authors: Zhitao Wang, Yang Zheng, Shengbo Eben Li, Keyou You, Keqiang Li

Abstract: This paper proposes a parallel optimization algorithm for cooperative automation of large-scale connected vehicles. The task of cooperative automation is formulated as a centralized optimization problem taking the whole decision space of all vehicles into account. Considering the uncertainty of the environment, the problem is solved in a receding horizon fashion. Then, we employ the alternating direction method of multipliers (ADMM) to solve the centralized optimization in a parallel way, which scales more favorably to large-scale instances. Also, Taylor series is used to linearize nonconvex constraints caused by coupling collision avoidance constraints among interactive vehicles. Simulations with two typical traffic scenes for multiple vehicles demonstrate the effectiveness and efficiency of our method.

Submitted 31 July, 2018; originally announced July 2018.

arXiv:1801.07945 [pdf, other]

Bayesian Filtering with Unknown Sensor Measurement Losses

Authors: Jiaqi Zhang, Keyou You, Lihua Xie

Abstract: This work studies the state estimation problem of a stochastic nonlinear system with unknown sensor measurement losses. If the estimator knows the sensor measurement losses of a linear Gaussian system, the minimum variance estimate is easily computed by the celebrated intermittent Kalman filter (IKF). However, this will no longer be the case when the measurement losses are unknown and/or the system is nonlinear or non-Gaussian. By exploiting the binary property of the measurement loss process and the IKF, we design three suboptimal filters for the state estimation, i.e., BKF-I, BKF-II and RBPF. The BKF-I is based on the MAP estimator of the measurement loss process and the BKF-II is derived by estimating the conditional loss probability. The RBPF is a particle filter based algorithm which marginalizes out the loss process to increase the efficiency of particles. All the proposed filters can be easily implemented in recursive forms. Finally, a linear system, a target tracking system and a quadrotor's path control problem are included to illustrate their effectiveness, and show the tradeoff between computational complexity and estimation accuracy of the proposed filters.

Submitted 8 May, 2020; v1 submitted 24 January, 2018; originally announced January 2018.

Comments: Accepted in IEEE Transactions on Control of Network Systems. Parts of the results appear in the 6th IFAC Workshop on Distributed Estimation and Control in Networked Systems NECSYS 2016

arXiv:1709.08360 [pdf, other]

doi 10.1109/TAC.2018.2884998

Distributed Discrete-time Optimization in Multi-agent Networks Using only Sign of Relative State

Authors: Jiaqi Zhang, Keyou You, Tamer Başar

Abstract: This paper proposes distributed discrete-time algorithms to cooperatively solve an additive cost optimization problem in multi-agent networks. The striking feature lies in the use of only the sign of relative state information between neighbors, which substantially differentiates our algorithms from others in the existing literature. We first interpret the proposed algorithms in terms of the penalty method in optimization theory and then perform non-asymptotic analysis to study convergence for static network graphs. Compared with the celebrated distributed subgradient algorithms, which however use the exact relative state information, the convergence speed is essentially not affected by the loss of information. We also study how introducing noise into the relative state information and randomly activated graphs affect the performance of our algorithms. Finally, we validate the theoretical results on a class of distributed quantile regression problems.

Submitted 10 December, 2018; v1 submitted 25 September, 2017; originally announced September 2017.

Comments: Part of this work has been presented in American Control Conference (ACC) 2018, first version posted on arxiv on Sep. 2017, IEEE Transactions on Automatic Control, 2018

arXiv:1607.05507 [pdf, other]

Distributed Algorithms for Robust Convex Optimization via the Scenario Approach

Authors: Keyou You, Roberto Tempo, Pei Xie

Abstract: This paper proposes distributed algorithms to solve robust convex optimization (RCO) when the constraints are affected by nonlinear uncertainty. We adopt a scenario approach by randomly sampling the uncertainty set. To facilitate the computational task, instead of using a single centralized processor to obtain a "global solution" of the scenario problem (SP), we resort to multiple interconnected processors that are distributed among different nodes of a network to simultaneously solve the SP. Then, we propose a primal-dual sub-gradient algorithm and a random projection algorithm to distributedly solve the SP over undirected and directed graphs, respectively. Both algorithms are given in an explicit recursive form with simple iterations, which are especially suited for processors with limited computational capability. We show that, if the underlying graph is strongly connected, each node asymptotically computes a common optimal solution to the SP with a convergence rate $O(1/(\sum_{t=1}^kζ^t))$ where $\{ζ^t\}$ is a sequence of appropriately decreasing stepsizes. That is, the RCO is effectively solved in a distributed way. The relations with the existing literature on robust convex programs are thoroughly discussed and an example of robust system identification is included to validate the effectiveness of our distributed algorithms.

Submitted 14 January, 2018; v1 submitted 19 July, 2016; originally announced July 2016.

Comments: 15 pages, 4 figures

arXiv:1507.07277 [pdf, other]

Likelihood Ratio Based Scheduler for Secure Detection in Cyber Physical Systems

Authors: Jian-Ya Ding, Keyou You, Shiji Song, Cheng Wu

Abstract: This paper is concerned with a binary detection problem over a non-secure network. To satisfy the communication rate constraint and against possible cyber attacks, which are modeled as deceptive signals injected to the network, a likelihood ratio based (LRB) scheduler is designed in the sensor side to smartly select sensor measurements for transmission. By exploring the scheduler, some sensor measurements are successfully retrieved from the attacked data at the decision center. We show that even under a moderate communication rate constraint of secure networks, an optimal LRB scheduler can achieve a comparable asymptotic detection performance to the standard N-P test using the full set of measurements, and is strictly better than the random scheduler. For non-secure networks, the LRB scheduler can also maintain the detection functionality but suffers graceful performance degradation under different attack intensities. Finally, we perform simulations to validate our theoretical results.

Submitted 26 July, 2015; originally announced July 2015.

arXiv:1507.01694 [pdf, other]

doi 10.1109/TAC.2016.2604373

Distributed Algorithms for Computation of Centrality Measures in Complex Networks

Authors: Keyou You, Roberto Tempo, Li Qiu

Abstract: This paper is concerned with distributed computation of several commonly used centrality measures in complex networks. In particular, we propose deterministic algorithms, which converge in finite time, for the distributed computation of the degree, closeness and betweenness centrality measures in directed graphs. Regarding eigenvector centrality, we consider the PageRank problem as its typical variant, and design distributed randomized algorithms to compute PageRank for both fixed and time-varying graphs. A key feature of the proposed algorithms is that they do not require to know the network size, which can be simultaneously estimated at every node, and that they are clock-free. To address the PageRank problem of time-varying graphs, we introduce the novel concept of persistent graph, which eliminates the effect of spamming nodes. Moreover, we prove that these algorithms converge almost surely and in the sense of $L^p$. Finally, the effectiveness of the proposed algorithms is illustrated via extensive simulations using a classical benchmark.

Submitted 29 May, 2016; v1 submitted 7 July, 2015; originally announced July 2015.

Comments: 15 pages, 8 figures, (conditionally accepted), IEEE Transactions on Automatic Control, 2016
