Search | arXiv e-print repository

Design and Control of a Low-cost Non-backdrivable End-effector Upper Limb Rehabilitation Device

Authors: Fulan Li, Yunfei Guo, Wenda Xu, Weide Zhang, Fangyun Zhao, Baiyu Wang, Huaguang Du, Chengkun Zhang

Abstract: This paper presents the development of an upper limb end-effector based rehabilitation device for stroke patients, offering assistance or resistance along any 2-dimensional trajectory during physical therapy. It employs a non-backdrivable ball-screw-driven mechanism for enhanced control accuracy. The control system features three novel algorithms: first, the Implicit Euler velocity control (IEVC) algorithm, highlighted for its state-of-the-art accuracy, stability, efficiency and generalizability in motion restriction control; second, an Admittance Virtual Dynamics simulation algorithm that achieves a smooth and natural human interaction with the non-backdrivable end-effector; third, a generalized impedance force calculation algorithm allowing efficient impedance control on any trajectory or area boundary. Experimental validation demonstrated the system's effectiveness in accurate end-effector position control across various trajectories and configurations. The proposed upper limb end-effector-based rehabilitation device, with its high performance and adaptability, holds significant promise for extensive clinical application, potentially improving rehabilitation outcomes for stroke patients.

Submitted 20 June, 2024; originally announced June 2024.

Comments: 12 pages, 15 figures

arXiv:2406.03734 [pdf, other]

Policy Gradient Methods for the Cost-Constrained LQR: Strong Duality and Global Convergence

Authors: Feiran Zhao, Keyou You

Abstract: In safety-critical applications, reinforcement learning (RL) needs to consider safety constraints. However, theoretical understandings of constrained RL for continuous control are largely absent. As a case study, this paper presents a cost-constrained LQR formulation, where a number of LQR costs with user-defined penalty matrices are subject to constraints. To solve it, we propose a policy gradient primal-dual method to find an optimal state feedback gain. Despite the non-convexity of the cost-constrained LQR problem, we provide a constructive proof for strong duality and a geometric interpretation of an optimal multiplier set. By proving that the concave dual function is Lipschitz smooth, we further provide convergence guarantees for the PG primal-dual method. Finally, we perform simulations to validate our theoretical findings.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2404.14557 [pdf]

Efficiency and Cost Optimization of Dual Active Bridge Converter for 350kW DC Fast Chargers

Authors: Sadik Cinik, Fangzhou Zhao, Giuseppe De Falco, Xiongfei Wang

Abstract: This study focuses on optimizing the design parameters of a Dual Active Bridge (DAB) converter for use in 350 kW DC fast chargers, emphasizing the balance between efficiency and cost. Addressing the observed gaps in existing high-power application research, it introduces an optimization framework to evaluate critical design parameters (number of converter modules, switching frequency, and transformer turns ratio) within a broad operational voltage range. The analysis identifies an optimal configuration that achieves over 95% efficiency at rated power across a wide output voltage range, comprising seven 50 kW DAB converters with a switching frequency of 30 kHz, and a transformer turns ratio of 0.9.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2403.19126 [pdf, other]

Harnessing Data for Accelerating Model Predictive Control by Constraint Removal

Authors: Zhinan Hou, Feiran Zhao, Keyou You

Abstract: Model predictive control (MPC) solves a receding-horizon optimization problem in real-time, which can be computationally demanding when there are thousands of constraints. To accelerate online computation of MPC, we utilize data to adaptively remove the constraints while maintaining the MPC policy unchanged. Specifically, we design the removal rule based on the Lipschitz continuity of the MPC policy. This removal rule can use the information of historical data according to the Lipschitz constant and the distance between the current state and historical states. In particular, we provide the explicit expression for calculating the Lipschitz constant by the model parameters. Finally, simulations are performed to validate the effectiveness of the proposed method.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2402.14543 [pdf]

Low-frequency Resonances in Grid-Forming Converters: Causes and Damping Control

Authors: Fangzhou Zhao, Tianhua Zhu, Zejie Li, Xiongfei Wang

Abstract: Grid-forming voltage-source converter (GFM-VSC) may experience low-frequency resonances, such as synchronous resonance (SR) and sub-synchronous resonance (SSR), in the output power. This paper offers a comprehensive study on the root causes of low-frequency resonances with GFM-VSC systems and the damping control methods. The typical GFM control structures are introduced first, along with a mapping between the resonances and control loops. Then, the causes of SR and SSR are discussed, highlighting the impacts of control interactions on the resonances. Further, the recent advancements in stabilizing control methods for SR and SSR are critically reviewed with experimental tests of a GFM-VSC under different grid conditions.

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2401.14871 [pdf, other]

Data-Enabled Policy Optimization for Direct Adaptive Learning of the LQR

Authors: Feiran Zhao, Florian Dörfler, Alessandro Chiuso, Keyou You

Abstract: Direct data-driven design methods for the linear quadratic regulator (LQR) mainly use offline or episodic data batches, and their online adaptation has been acknowledged as an open problem. In this paper, we propose a direct adaptive method to learn the LQR from online closed-loop data. First, we propose a new policy parameterization based on the sample covariance to formulate a direct data-driven LQR problem, which is shown to be equivalent to the certainty-equivalence LQR with optimal non-asymptotic guarantees. Second, we design a novel data-enabled policy optimization (DeePO) method to directly update the policy, where the gradient is explicitly computed using only a batch of persistently exciting (PE) data. Third, we establish its global convergence via a projected gradient dominance property. Importantly, we efficiently use DeePO to adaptively learn the LQR by performing only one-step projected gradient descent per sample of the closed-loop system, which also leads to an explicit recursive update of the policy. Under PE inputs and for bounded noise, we show that the average regret of the LQR cost is upper-bounded by two terms signifying a sublinear decrease in time $\mathcal{O}(1/\sqrt{T})$ plus a bias scaling inversely with signal-to-noise ratio (SNR), which are independent of the noise statistics. Finally, we perform simulations to validate the theoretical results and demonstrate the computational and sample efficiency of our method.

Submitted 19 April, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

arXiv:2312.01785 [pdf]

Closed-Form Solutions for Grid-Forming Converters: A Design-Oriented Study

Authors: Fangzhou Zhao, Tianhua Zhu, Lennart Harnefors, Bo Fan, Heng Wu, Zichao Zhou, Yin Sun, Xiongfei Wang

Abstract: This paper derives closed-form solutions for grid-forming converters with power synchronization control (PSC) by subtly simplifying and factorizing the complex closed-loop models. The solutions can offer clear analytical insights into control-loop interactions, enabling guidelines for robust controller design. It is proved that 1) the proportional gains of PSC and alternating voltage control (AVC) can introduce negative resistance, which aggravates synchronous resonance (SR) of power control, 2) the integral gain of AVC is the cause of sub-synchronous resonance (SSR) in stiff-grid interconnections, albeit the proportional gain of AVC can help dampen the SSR, and 3) surprisingly, the current controller that dampens SR actually exacerbates SSR. Controller design guidelines are given based on analytical insights. The findings are verified by simulations and experimental results.

Submitted 4 December, 2023; originally announced December 2023.

arXiv:2310.03402 [pdf, other]

A Complementary Global and Local Knowledge Network for Ultrasound denoising with Fine-grained Refinement

Authors: Zhenyu Bu, Kai-Ni Wang, Fuxing Zhao, Shengxiao Li, Guang-Quan Zhou

Abstract: Ultrasound imaging serves as an effective and non-invasive diagnostic tool commonly employed in clinical examinations. However, the presence of speckle noise in ultrasound images invariably degrades image quality, impeding the performance of subsequent tasks, such as segmentation and classification. Existing methods for speckle noise reduction frequently induce excessive image smoothing or fail to preserve detailed information adequately. In this paper, we propose a complementary global and local knowledge network for ultrasound denoising with fine-grained refinement. Initially, the proposed architecture employs the L-CSwinTransformer as encoder to capture global information, incorporating CNN as decoder to fuse local features. We expand the resolution of the feature at different stages to extract more global information compared to the original CSwinTransformer. Subsequently, we integrate Fine-grained Refinement Block (FRB) within the skip-connection stage to further augment features. We validate our model on two public datasets, HC18 and BUSI. Experimental results demonstrate that our model can achieve competitive performance in both quantitative metrics and visual performance. Our code will be available at https://github.com/AAlkaid/USDenoising.

Submitted 5 October, 2023; originally announced October 2023.

Comments: Submitted to ICASSP 2024

arXiv:2310.00630 [pdf, other]

Sequential Monte Carlo Graph Convolutional Network for Dynamic Brain Connectivity

Authors: Fengfan Zhao, Ercan Engin Kuruoglu

Abstract: An increasingly important brain function analysis modality is functional connectivity analysis, which regards connections as statistical codependency between the signals of different brain regions. Graph-based analysis of brain connectivity provides a new way of exploring the association between brain functional deficits and the structural disruption related to brain disorders, but the current implementations have limited capability due to the assumptions of noise-free data and stationary graph topology. We propose a new methodology based on the particle filtering algorithm, with proven success in tracking problems, which estimates the hidden states of a dynamic graph with only partial and noisy observations, without the assumptions of stationarity on connectivity. We enrich the particle filtering state equation with a graph neural network called Sequential Monte Carlo Graph Convolutional Network (SMC-GCN), which, due to its nonlinear regression capability, can limit spurious connections in the graph. Experimental studies demonstrate that SMC-GCN achieves performance superior to several methods in brain disorder classification.

Submitted 4 January, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

arXiv:2309.01958 [pdf, other]

Empowering Low-Light Image Enhancer through Customized Learnable Priors

Authors: Naishan Zheng, Man Zhou, Yanmeng Dong, Xiangyu Rui, Jie Huang, Chongyi Li, Feng Zhao

Abstract: Deep neural networks have achieved remarkable progress in enhancing low-light images by improving their brightness and eliminating noise. However, most existing methods construct end-to-end mapping networks heuristically, neglecting the intrinsic prior of image enhancement task and lacking transparency and interpretability. Although some unfolding solutions have been proposed to relieve these issues, they rely on proximal operator networks that deliver ambiguous and implicit priors. In this work, we propose a paradigm for low-light image enhancement that explores the potential of customized learnable priors to improve the transparency of the deep unfolding paradigm. Motivated by the powerful feature representation capability of Masked Autoencoder (MAE), we customize MAE-based illumination and noise priors and redevelop them from two perspectives: 1) structure flow: we train the MAE from a normal-light image to its illumination properties and then embed it into the proximal operator design of the unfolding architecture; and 2) optimization flow: we train MAE from a normal-light image to its gradient representation and then employ it as a regularization term to constrain noise in the model output. These designs improve the interpretability and representation capability of the model. Extensive experiments on multiple low-light image enhancement datasets demonstrate the superiority of our proposed paradigm over state-of-the-art methods. Code is available at https://github.com/zheng980629/CUE.

Submitted 5 September, 2023; originally announced September 2023.

Comments: Accepted by ICCV 2023

arXiv:2308.14924 [pdf, other]

Optimal Economic Gas Turbine Dispatch with Deep Reinforcement Learning

Authors: Manuel Sage, Martin Staniszewski, Yaoyao Fiona Zhao

Abstract: Dispatching strategies for gas turbines (GTs) are changing in modern electricity grids. A growing incorporation of intermittent renewable energy requires GTs to operate more but shorter cycles and more frequently on partial loads. Deep reinforcement learning (DRL) has recently emerged as a tool that can cope with this development and dispatch GTs economically. The key advantages of DRL are a model-free optimization and the ability to handle uncertainties, such as those introduced by varying loads or renewable energy production. In this study, three popular DRL algorithms are implemented for an economic GT dispatch problem on a case study in Alberta, Canada. We highlight the benefits of DRL by incorporating an existing thermodynamic software provided by Siemens Energy into the environment model and by simulating uncertainty via varying electricity prices, loads, and ambient conditions. Among the tested algorithms and baseline methods, Deep Q-Networks (DQN) obtained the highest rewards while Proximal Policy Optimization (PPO) was the most sample efficient. We further propose and implement a method to assign GT operation and maintenance cost dynamically based on operating hours and cycles. Compared to existing methods, our approach better approximates the true cost of modern GT dispatch and hence leads to more realistic policies.

Submitted 28 August, 2023; originally announced August 2023.

Comments: This work has been accepted to IFAC for publication under a Creative Commons Licence CC-BY-NC-ND

arXiv:2308.00759 [pdf, other]

Decomposition Ascribed Synergistic Learning for Unified Image Restoration

Authors: Jinghao Zhang, Feng Zhao

Abstract: Learning to restore multiple image degradations within a single model is quite beneficial for real-world applications. Nevertheless, existing works typically concentrate on regarding each degradation independently, while their relationship has been less exploited to ensure the synergistic learning. To this end, we revisit the diverse degradations through the lens of singular value decomposition, with the observation that the decomposed singular vectors and singular values naturally undertake the different types of degradation information, dividing various restoration tasks into two groups, i.e., singular vector dominated and singular value dominated. The above analysis renders a more unified perspective to ascribe the diverse degradations, compared to previous task-level independent learning. The dedicated optimization of degraded singular vectors and singular values inherently utilizes the potential relationship among diverse restoration tasks, attributing to the Decomposition Ascribed Synergistic Learning (DASL). Specifically, DASL comprises two effective operators, namely, Singular VEctor Operator (SVEO) and Singular VAlue Operator (SVAO), to favor the decomposed optimization, which can be lightly integrated into existing image restoration backbones. Moreover, a congruous decomposition loss has been devised as an auxiliary objective. Extensive experiments on five blended image restoration tasks demonstrate the effectiveness of our method.

Submitted 12 March, 2024; v1 submitted 1 August, 2023; originally announced August 2023.

Comments: 16 pages, 17 figures
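
The singular value decomposition view in the abstract above can be illustrated with a small sketch. This is not the authors' DASL method or code: the random `image` patch and the global intensity scaling (used as a stand-in for a degradation) are assumptions chosen only because their effect on the SVD is exact. Scaling an image by a constant changes its singular values and leaves the singular vectors usable as they are.

```python
import numpy as np

# Hypothetical grayscale patch; the values are random and purely illustrative.
rng = np.random.default_rng(0)
image = rng.random((8, 6))

# Thin SVD: image = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(image, full_matrices=False)
reconstructed = (U * s) @ Vt  # U * s scales column j of U by s[j]

# Assumed "degradation": a global intensity scaling (not taken from the paper).
scale = 0.5
darker = scale * image

# The singular values of the scaled image are the scaled singular values.
s_dark = np.linalg.svd(darker, compute_uv=False)

# The original singular vectors with scaled singular values rebuild the
# degraded image, so this degradation lives entirely in the singular values.
rebuilt_dark = (U * (scale * s)) @ Vt
```

A degradation that changed spatial structure (blur, for example) would instead alter the singular vectors as well, which is the distinction the abstract draws between the two task groups.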

arXiv:2307.15393 [pdf, other]

Deep Reinforcement Learning Based Intelligent Reflecting Surface Optimization for TDD MultiUser MIMO Systems

Authors: Fengyu Zhao, Wen Chen, Ziwei Liu, Jun Li, Qingqing Wu

Abstract: In this letter, we investigate the discrete phase shift design of the intelligent reflecting surface (IRS) in a time division duplexing (TDD) multi-user multiple input multiple output (MIMO) system. We modify the design of deep reinforcement learning (DRL) scheme so that we can maximize the average downlink data transmission rate free from the sub-channel channel state information (CSI). Based on the characteristics of the model, we modify the proximal policy optimization (PPO) algorithm and integrate gated recurrent unit (GRU) to tackle the non-convex optimization problem. Simulation results show that the performance of the proposed PPO-GRU surpasses the benchmarks in terms of performance, convergence speed, and training stability.

Submitted 28 July, 2023; originally announced July 2023.

arXiv:2307.00700 [pdf]

doi 10.1109/JSEN.2022.3203147

Coverage Enhancement Strategy in WMSNs Based on a Novel Swarm Intelligence Algorithm: Army Ant Search Optimizer

Authors: Yindi Yao, Qin Wen, Yanpeng Cui, Feng Zhao, Bozhan Zhao, Yaoping Zeng

Abstract: As one of the most crucial scenarios of the Internet of Things (IoT), wireless multimedia sensor networks (WMSNs) pay more attention to the information-intensive data (e.g., audio, video, image) for remote environments. The area coverage reflects the perception of WMSNs to the surrounding environment, where a good coverage effect can ensure effective data collection. Given the harsh and complex physical environment of WMSNs, random deployment easily forms sensing overlapping regions and coverage holes. The intention of our research is to deal with the optimization problem of maximizing the coverage rate in WMSNs. After proving the NP-hardness of the coverage enhancement of WMSNs, and inspired by the predation behavior of army ants, this article proposes a novel swarm intelligence (SI) technology, the army ant search optimizer (AASO), to solve the above problem, which is implemented by five operators: army ant and prey initialization, recruited by prey, attack prey, update prey, and build ant bridge. The simulation results demonstrate that the optimizer shows good performance in terms of exploration and exploitation on benchmark suites when compared to other representative SI algorithms. More importantly, AASO-based coverage enhancement in WMSNs has better merits in terms of coverage effect when compared to existing approaches.

Submitted 2 July, 2023; originally announced July 2023.

Comments: 13 pages, 12 figures, 8 tables

Journal ref: in IEEE Sensors Journal, vol. 22, no. 21, pp. 21299-21311, Nov., 2022

arXiv:2305.20024 [pdf]

Cooperative IoT Data Sharing with Heterogeneity of Participants Based on Electricity Retail

Authors: Bohong Wang, Qinglai Guo, Tian Xia, Qiang Li, Di Liu, Feng Zhao

Abstract: With the development of Internet of Things (IoT) and big data technology, the data value is increasingly explored in multiple practical scenarios, including electricity transactions. However, the isolation of IoT data among several entities makes it difficult to achieve optimal allocation of data resources and convert data resources into real economic value, thus it is necessary to introduce the IoT data sharing mode to drive data circulation. To enhance the accuracy and fairness of IoT data sharing, the heterogeneity of participants is sufficiently considered, and data valuation and profit allocation in IoT data sharing are improved based on the background of electricity retail. Data valuation is supposed to be relevant to attributes of IoT data buyers, thus risk preferences of electricity retailers are applied as characteristic attributes and data premium rates are proposed to modify data value rates. Profit allocation should measure the marginal contribution shares of electricity retailers and data brokers fairly, thus asymmetric Nash bargaining model is used to guarantee that they could receive reasonable profits based on their specific contribution to the coalition of IoT data sharing. Considering the heterogeneity of participants comprehensively, the proposed IoT data sharing fits for a large coalition of IoT data sharing with multiple electricity retailers and data brokers. Finally, to demonstrate the applications of IoT data sharing in smart grids, case studies are utilized to validate the results of data value for electricity retailers with different risk preferences and the efficiency of profit allocation using asymmetric Nash bargaining model.

Submitted 31 May, 2023; originally announced May 2023.

Comments: 18 pages, 14 figures
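
The asymmetric Nash bargaining step mentioned in the abstract above can be sketched under a simplifying assumption of freely transferable profit. This is not the paper's model: the function name `asymmetric_nash_split`, the bargaining weights, and the numbers are hypothetical. With transferable profit, maximizing the weighted Nash product over all splits that exhaust the coalition profit has a closed form, in which each participant receives its disagreement payoff plus a weight-proportional share of the surplus.

```python
def asymmetric_nash_split(total, disagreement, weights):
    """Split `total` profit by the asymmetric Nash bargaining solution.

    Assumes transferable profit: maximizing prod_i (u_i - d_i) ** w_i subject
    to sum_i u_i == total gives u_i = d_i + (w_i / sum(w)) * surplus.
    """
    surplus = total - sum(disagreement)
    if surplus < 0:
        raise ValueError("coalition profit is below the sum of disagreement payoffs")
    weight_sum = sum(weights)
    return [d + (w / weight_sum) * surplus for d, w in zip(disagreement, weights)]


# Hypothetical example: one retailer and one data broker share a profit of 100.
# Standalone (disagreement) profits are 10 and 20; bargaining weights are 3 and 1.
shares = asymmetric_nash_split(100.0, [10.0, 20.0], [3.0, 1.0])
```

In the paper the weights are tied to marginal contributions; here they are simply given as inputs.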

arXiv:2304.07990 [pdf, other]

Novel Quality Measure and Efficient Resolution of Convex Hull Pricing for Unit Commitment

Authors: Mikhail A. Bragin, Farhan Hyder, Bing Yan, Peter B. Luh, Jinye Zhao, Feng Zhao, Dane A. Schiro, Tongxin Zheng

Abstract: Electricity prices determined by economic dispatch that do not consider fixed costs may lead to significant uplift payments. However, when fixed costs are included, prices become non-monotonic with respect to demand, which can adversely impact market transparency. To overcome this issue, convex hull (CH) pricing has been introduced for unit commitment with fixed costs. Several CH pricing methods have been presented, and a feasible cost has been used as a quality measure for the CH price. However, obtaining a feasible cost requires a computationally intensive optimization procedure, and the associated duality gap may not provide an accurate quality measure. This paper presents a new approach for quantifying the quality of the CH price by establishing an upper bound on the optimal dual value. The proposed approach uses Surrogate Lagrangian Relaxation (SLR) to efficiently obtain near-optimal CH prices, while the upper bound decreases rapidly due to the convergence of SLR. Testing results on the IEEE 118-bus system demonstrate that the novel quality measure is more accurate than the measure provided by a feasible cost, indicating the high quality of the upper bound and the efficiency of SLR.

Submitted 17 April, 2023; originally announced April 2023.

arXiv:2303.17958 [pdf, other]

Data-enabled Policy Optimization for the Linear Quadratic Regulator

Authors: Feiran Zhao, Florian Dörfler, Keyou You

Abstract: Policy optimization (PO), an essential approach of reinforcement learning for a broad range of system classes, requires significantly more system data than indirect (identification-followed-by-control) methods or behavioral-based direct methods even in the simplest linear quadratic regulator (LQR) problem. In this paper, we take an initial step towards bridging this gap by proposing the data-enabled policy optimization (DeePO) method, which requires only a finite number of sufficiently exciting data to iteratively solve the LQR problem via PO. Based on a data-driven closed-loop parameterization, we are able to directly compute the policy gradient from a batch of persistently exciting data. Next, we show that the nonconvex PO problem satisfies a projected gradient dominance property by relating it to an equivalent convex program, leading to the global convergence of DeePO. Moreover, we apply regularization methods to enhance certainty-equivalence and robustness of the resulting controller and show an implicit regularization property. Finally, we perform simulations to validate our results.

Submitted 15 September, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

Comments: Accepted in IEEE CDC 2023
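
For readers unfamiliar with policy optimization for the LQR, the following is a minimal sketch of the classical model-based policy gradient iteration, not the data-enabled DeePO method of the abstract above. It assumes the system matrices are known, and the matrices `A`, `B`, `Q`, `R`, the initial-state covariance `Sigma0`, the step size, and the iteration count are all illustrative assumptions. It uses the standard gradient expression for the policy u = -Kx and compares the result with the gain obtained from a Riccati recursion.

```python
import numpy as np

# Hypothetical 2-state, 1-input system; every number here is an assumption.
A = np.array([[0.7, 0.2], [0.0, 0.6]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
Sigma0 = np.eye(2)  # covariance of the random initial state


def dlyap(M, W):
    """Solve X = M X M^T + W by vectorization (needs spectral radius of M < 1)."""
    n = M.shape[0]
    x = np.linalg.solve(np.eye(n * n) - np.kron(M, M), W.reshape(-1))
    return x.reshape(n, n)


def cost_and_gradient(K):
    """LQR cost J(K) = tr(P_K Sigma0) and its gradient for the policy u = -K x."""
    Acl = A - B @ K
    P = dlyap(Acl.T, Q + K.T @ R @ K)  # value matrix of the policy
    S = dlyap(Acl, Sigma0)             # accumulated state covariance
    grad = 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ S
    return np.trace(P @ Sigma0), grad


def policy_gradient(K0, step=5e-3, iters=5000):
    """Plain gradient descent on K, started from a stabilizing gain K0."""
    K = K0.copy()
    for _ in range(iters):
        _, grad = cost_and_gradient(K)
        K = K - step * grad
    return K


def riccati_gain(iters=1000):
    """Reference gain from a fixed-point iteration of the Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ G
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


K_pg = policy_gradient(np.zeros((1, 2)))  # K = 0 is stabilizing because A is stable
K_star = riccati_gain()
```

The fixed step size is a convenience for this small example; a line search or a stability check would be the safer choice on systems closer to the stability boundary.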

arXiv:2302.08053 [pdf]

Selective Noise Suppression Methods Using Random SVPWM to Shape the Noise Spectrum of PMSMs

Authors: Jian Wen, Xiaobin Cheng, Peifeng Ji, Jun Yang, Feng Zhao

Abstract: Random pulse width modulation techniques are used in AC motors powered by two-level three-phase inverters, which cause a broadband spectrum of voltage, current, and electromagnetic force. The voltage distribution across a wide range of frequencies may increase the vibration and acoustic noise of motors. This study proposes two selective noise suppression (SNS) methods to eliminate voltage harmonics for specified frequencies. In the first method, the switching frequency is constant. The pulse position is calculated by the duty cycle of the current switching cycle. Both the pulse position and switching frequency are randomized in the second method. This involves creating a unique relationship among the switching frequency, pulse position, and duty cycle to shape the noise spectrum. Computer simulation and experimental results show that both methods effectively perform selective noise suppression at a specific frequency. The power spectrum density (PSD) using the second SNS method is more uniform near integer multiples of the switching frequency than that using random pulse width modulation techniques or the first SNS method. These methods provide a valuable reference for eliminating electromagnetic and acoustic noises at resonant frequencies in motors.

Submitted 6 June, 2024; v1 submitted 15 February, 2023; originally announced February 2023.

Comments: 8 pages, 15 figures

arXiv:2211.04051 [pdf, other]

Globally Convergent Policy Gradient Methods for Linear Quadratic Control of Partially Observed Systems

Authors: Feiran Zhao, Xingyun Fu, Keyou You

Abstract: While the optimization landscape of policy gradient methods has been recently investigated for partially observed linear systems in terms of both static output feedback and dynamical controllers, they only provide convergence guarantees to stationary points. In this paper, we propose a new policy parameterization for partially observed linear systems, using a past input-output trajectory of finite length as feedback. We show that the solution set to the parameterized optimization problem is a matrix space, which is invariant to similarity transformation. By proving a gradient dominance property, we show the global convergence of policy gradient methods. Moreover, we observe that the gradient is orthogonal to the solution set, revealing an explicit relation between the resulting solution and the initial policy. Finally, we perform simulations to validate our theoretical results.

Submitted 22 April, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: To appear at IFAC World Congress 2023

arXiv:2210.08181 [pdf, other]

Panchromatic and Multispectral Image Fusion via Alternating Reverse Filtering Network

Authors: Keyu Yan, Man Zhou, Jie Huang, Feng Zhao, Chengjun Xie, Chongyi Li, Danfeng Hong

Abstract: Panchromatic (PAN) and multi-spectral (MS) image fusion, named Pan-sharpening, refers to super-resolving the low-resolution (LR) multi-spectral (MS) images in the spatial domain to generate the expected high-resolution (HR) MS images, conditioning on the corresponding high-resolution PAN images. In this paper, we present a simple yet effective \textit{alternating reverse filtering network} for pan-sharpening. Inspired by the classical reverse filtering that reverses images to the status before filtering, we formulate pan-sharpening as an alternately iterative reverse filtering process, which fuses LR MS and HR MS in an interpretable manner. Different from existing model-driven methods that require well-designed priors and degradation assumptions, the reverse filtering process avoids the dependency on pre-defined exact priors. To guarantee the stability and convergence of the iterative process via contraction mapping on a metric space, we develop the learnable multi-scale Gaussian kernel module, instead of using specific filters. We demonstrate the theoretical feasibility of such formulations. Extensive experiments on diverse scenes thoroughly verify the performance of our method, which significantly outperforms the state of the art.

Submitted 14 October, 2022; originally announced October 2022.

Journal ref: NeurIPS2022

arXiv:2206.02311 [pdf, other]

Underdetermined 2D-DOD and 2D-DOA Estimation for Bistatic Coprime EMVS-MIMO Radar: From the Difference Coarray Perspective

Authors: Qianpeng Xie, Yihang Du, He Wang, Xiaoyi Pan, Feng Zhao

Abstract: In this paper, the underdetermined 2D-DOD and 2D-DOA estimation for bistatic coprime EMVS-MIMO radar is considered. Firstly, a 5-D tensor model was constructed by using the multi-dimensional space-time characteristics of the received data. Then, an 8-D tensor has been obtained by using the auto-correlation calculation. To obtain the difference coarrays of transmit and receive EMVS, the de-coupling process between the spatial response of EMVS and the steering vector is inevitable. Thus, a new 6-D tensor can be constructed via the tensor permutation and the generalized tensorization of the canonical polyadic decomposition. According to the theory of the Tensor-Matrix Product operation, the duplicated elements in the difference coarrays can be removed by the utilization of two designed selection matrices. Due to the centrosymmetric geometry of the difference coarrays, two DFT beamspace matrices were subsequently designed to convert the complex steering matrices into the real-valued ones, whose advantage is to improve the estimation accuracy of the 2D-DODs and 2D-DOAs. Afterwards, a third-order tensor with the third-way fixed at 36 was constructed and the Parallel Factor algorithm was deployed, which can yield the closed-form automatically paired 2D-DOD and 2D-DOA estimation. The simulation results show that the proposed algorithm can exhibit superior estimation performance for the underdetermined 2D-DOD and 2D-DOA estimation.

Submitted 5 June, 2022; originally announced June 2022.

Comments: 25 pages, 7 figures

arXiv:2206.01891 [pdf, other]

8D Parameters Estimation for Bistatic EMVS-MIMO Radar via the nested PARAFAC

Authors: Qianpeng Xie, He Wang, Yihang Du, Xiaoyi Pan, Feng Zhao

Abstract: In this letter, a novel nested PARAFAC algorithm was proposed to improve the 8D parameters estimation performance for the bistatic EMVS-MIMO radar. Firstly, the outer part PARAFAC algorithm was carried out to estimate the receive spatial response matrix and its first way factor matrix. For the estimated first way factor matrix, a theory is given to rearrange its data into a new matrix, which is the mode-1 unfolding matrix of a three-way tensor. Then, the inner part PARAFAC algorithm was used to estimate the transmit steering vector matrix, the transmit spatial response matrix and the receive steering vector matrix. Thus, the transmit 4D parameters and receive 4D parameters can be accurately located via the abovementioned process. Compared with the original PARAFAC algorithm, the proposed nested PARAFAC algorithm can avoid additional reconstruction process when estimating the transmit/receive spatial response matrix. Moreover, the proposed algorithm can offer more accurate 8D parameters estimation than the original PARAFAC algorithm. Simulated results verify the effectiveness of the proposed algorithm.

Submitted 3 June, 2022; originally announced June 2022.

arXiv:2205.14335 [pdf, other]

Convergence and Sample Complexity of Policy Gradient Methods for Stabilizing Linear Systems

Authors: Feiran Zhao, Xingyun Fu, Keyou You

Abstract: System stabilization via policy gradient (PG) methods has drawn increasing attention in both control and machine learning communities. In this paper, we study their convergence and sample complexity for stabilizing linear time-invariant systems in terms of the number of system rollouts. Our analysis is built upon a discounted linear quadratic regulator (LQR) method which alternately updates the policy and the discount factor of the LQR problem. Firstly, we propose an explicit rule to adaptively adjust the discount factor by exploring the stability margin of a linear control policy. Then, we establish the sample complexity of PG methods for stabilization, which only adds a coefficient logarithmic in the spectral radius of the state matrix to that for solving the LQR problem with a prior stabilizing policy. Finally, we perform simulations to validate our theoretical findings and demonstrate the effectiveness of our method on a class of nonlinear systems.

Submitted 14 September, 2023; v1 submitted 28 May, 2022; originally announced May 2022.

arXiv:2203.05245 [pdf, other]

Data-driven Control of Unknown Linear Systems via Quantized Feedback

Authors: Feiran Zhao, Xingchen Li, Keyou You

Abstract: Control using quantized feedback is a fundamental approach to system synthesis with limited communication capacity. In this paper, we address the stabilization problem for unknown linear systems with logarithmically quantized feedback, via a direct data-driven control method. By leveraging a recently developed matrix S-lemma, we prove a sufficient and necessary condition for the existence of a common stabilizing controller for all possible dynamics consistent with data, in the form of a linear matrix inequality. Moreover, we formulate semi-definite programming to solve the coarsest quantization density. By establishing its connections to unstable eigenvalues of the state matrix, we further prove a necessary rank condition on the data for quantized feedback stabilization. Finally, we validate our theoretical results by numerical examples.

Submitted 10 March, 2022; originally announced March 2022.

Comments: To appear at the 4th Annual Conference on Learning for Dynamics and Control

arXiv:2112.09294 [pdf, other]

Learning Stabilizing Controllers of Linear Systems via Discount Policy Gradient

Authors: Feiran Zhao, Xingyun Fu, Keyou You

Abstract: Stability is one of the most fundamental requirements for systems synthesis. In this paper, we address the stabilization problem for unknown linear systems via policy gradient (PG) methods. We leverage a key feature of PG for Linear Quadratic Regulator (LQR), i.e., it drives the policy away from the boundary of the unstabilizing region along the descent direction, provided with an initial policy with finite cost. To this end, we discount the LQR cost with a factor, by adaptively increasing which gradient leads the policy to the stabilizing set while maintaining a finite cost. Based on the Lyapunov theory, we design an update rule for the discount factor which can be directly computed from data, rendering our method purely model-free. Compared to recent work \citep{perdomo2021stabilizing}, our algorithm allows the policy to be updated only once for each discount factor. Moreover, the number of sampled trajectories and simulation time for gradient descent is significantly reduced to $\mathcal{O}(\log(1/ε))$ for the desired accuracy $ε$. Finally, we conduct simulations on both small-scale and large-scale examples to show the efficiency of our discount PG method.

Submitted 16 December, 2021; originally announced December 2021.

Comments: Submitted to L4DC 2022

arXiv:2111.03259 [pdf]

Single-shot wide-field optical section imaging

Authors: Yuyao Hu, Dong Liang, **g Wang, Ya** Xuan, Fu Zhao, Jun Liu, Ruxin Li

Abstract: Optical sectioning technology has been widely used in various fluorescence microscopes owing to its background removing capability. Here, a virtual HiLo based on edge detection (V-HiLo-ED) is proposed to achieve wide-field optical sectioning, which requires only a single wide-field image. Compared with conventional optical sectioning technologies, its imaging speed can be at least doubled, while maintaining good optical sectioning performance, low cost, and excellent artifact suppression capabilities. Furthermore, the new V-HiLo-ED can also be extended to other non-fluorescence imaging fields. This simple, cost-effective and easy-to-extend method will benefit many research and application fields that need to remove out-of-focus blurred images.

Submitted 5 November, 2021; originally announced November 2021.

Comments: 22 pages, 9 figures

arXiv:2108.02998 [pdf, other]

AI-based Aortic Vessel Tree Segmentation for Cardiovascular Diseases Treatment: Status Quo

Authors: Yuan **, Antonio Pepe, Jianning Li, Christina Gsaxner, Fen-hua Zhao, Kelsey L. Pomykala, Jens Kleesiek, Alejandro F. Frangi, Jan Egger

Abstract: The aortic vessel tree is composed of the aorta and its branching arteries, and plays a key role in supplying the whole body with blood. Aortic diseases, like aneurysms or dissections, can lead to an aortic rupture, whose treatment with open surgery is highly risky. Therefore, patients commonly undergo drug treatment under constant monitoring, which requires regular inspections of the vessels through imaging. The standard imaging modality for diagnosis and monitoring is computed tomography (CT), which can provide a detailed picture of the aorta and its branching vessels if completed with a contrast agent, called CT angiography (CTA). Optimally, the whole aortic vessel tree geometry from consecutive CTAs is overlaid and compared. This allows not only detection of changes in the aorta, but also of its branches, caused by the primary pathology or newly developed. When performed manually, this reconstruction requires slice by slice contouring, which could easily take a whole day for a single aortic vessel tree, and is therefore not feasible in clinical practice. Automatic or semi-automatic vessel tree segmentation algorithms, however, can complete this task in a fraction of the manual execution time and run in parallel to the clinical routine of the clinicians. In this paper, we systematically review computing techniques for the automatic and semi-automatic segmentation of the aortic vessel tree. The review concludes with an in-depth discussion on how close these state-of-the-art approaches are to an application in clinical practice and how active this research field is, taking into account the number of publications, datasets and challenges.

Submitted 3 April, 2023; v1 submitted 6 August, 2021; originally announced August 2021.

arXiv:2104.04901 [pdf, other]

Global Convergence of Policy Gradient Primal-dual Methods for Risk-constrained LQRs

Authors: Feiran Zhao, Keyou You, Tamer Başar

Abstract: While the techniques in optimal control theory are often model-based, the policy optimization (PO) approach directly optimizes the performance metric of interest. Even though it has been an essential approach for reinforcement learning problems, there is little theoretical understanding on its performance. In this paper, we focus on the risk-constrained linear quadratic regulator (RC-LQR) problem via the PO approach, which requires addressing a challenging non-convex constrained optimization problem. To solve it, we first build on our earlier result that an optimal policy has a time-invariant affine structure to show that the associated Lagrangian function is coercive, locally gradient dominated and has local Lipschitz continuous gradient, based on which we establish strong duality. Then, we design policy gradient primal-dual methods with global convergence guarantees in both model-based and sample-based settings. Finally, we use samples of system trajectories in simulations to validate our methods.

Submitted 21 November, 2022; v1 submitted 10 April, 2021; originally announced April 2021.

arXiv:2103.15363 [pdf, other]

Infinite-horizon Risk-constrained Linear Quadratic Regulator with Average Cost

Authors: Feiran Zhao, Keyou You, Tamer Basar

Abstract: The behaviour of a stochastic dynamical system may be largely influenced by those low-probability, yet extreme events. To address such occurrences, this paper proposes an infinite-horizon risk-constrained Linear Quadratic Regulator (LQR) framework with time-average cost. In addition to the standard LQR objective, the average one-stage predictive variance of the state penalty is constrained to lie within a user-specified level. By leveraging the duality, its optimal solution is first shown to be stationary and affine in the state, i.e., $u(x,λ^*) = -K(λ^*)x + l(λ^*)$, where $λ^*$ is an optimal multiplier, used to address the risk constraint. Then, we establish the stability of the resulting closed-loop system. Furthermore, we propose a primal-dual method with sublinear convergence rate to find an optimal policy $u(x,λ^*)$. Finally, a numerical example is provided to demonstrate the effectiveness of the proposed framework and the primal-dual method.

Submitted 29 March, 2021; originally announced March 2021.

Comments: Submitted to IEEE CDC 2021

arXiv:2102.09817 [pdf, ps, other]

Unit selection synthesis based data augmentation for fixed phrase speaker verification

Authors: Houjun Huang, Xu Xiang, Fei Zhao, Shuai Wang, Yanmin Qian

Abstract: Data augmentation is commonly used to help build a robust speaker verification system, especially in limited-resource cases. However, conventional data augmentation methods usually focus on the diversity of acoustic environment, leaving the lexicon variation neglected. For text-dependent speaker verification tasks, it is well known that preparing training data with the target transcript is the most effectual approach to build a well-performing system; however, collecting such data is time-consuming and expensive. In this work, we propose a unit selection synthesis based data augmentation method to leverage the abundant text-independent data resources. In this approach, text-independent speeches of each speaker are first broken up into speech segments, each containing one phone unit. Then segments that contain phonetics in the target transcript are selected to produce a speech with the target transcript by concatenating them in turn. Experiments are carried out on the AISHELL Speaker Verification Challenge 2019 database; the results and analysis show that our proposed method can boost the system performance significantly.

Submitted 19 February, 2021; originally announced February 2021.

Comments: Accepted to ICASSP 2021

arXiv:2011.12564 [pdf]

Soft-Median Choice: An Automatic Feature Smoothing Method for Sound Event Detection

Authors: Fengnian Zhao, Ruwei Li, Xin Liu, Liwen Xu

Abstract: In Sound Event Detection (SED) systems, the lengths of median filters for post-processing have never been optimized during training due to several problems. No gradient is received by the lengths, so they cannot be learned during back-propagation. The median filtering inserted in the models also blocks the gradient flow, and the smoothing process misleads the model by ignoring errors. To resolve these problems, we provide different channels of features smoothed to different extents along with the original feature, so the model can optimize the weights while cognizing all the errors. We then use a linear layer to integrate the results and produce a linear combination. We further design the soft-median function to dredge the gradient flow. The proposed framework is called Soft-Median Choice (SMC). Experiments show that the SMC block not only automatically smooths the features based on the training set, but also forces the model to extract common features shared by all the frames of a sound event. The proposed method outperforms the baseline by over 10% of Event-Based F1 Score (EBFS) in both the validation and the evaluation set, and also slightly outperforms the single model of the state-of-the-art SED system.

Submitted 22 March, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

Comments: 5 pages, 3 figures, 6 tables

arXiv:2011.10931 [pdf, other]

Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator

Authors: Feiran Zhao, Keyou You

Abstract: Risk-aware control, though with promise to tackle unexpected events, requires a known exact dynamical model. In this work, we propose a model-free framework to learn a risk-aware controller with a focus on the linear system. We formulate it as a discrete-time infinite-horizon LQR problem with a state predictive variance constraint. To solve it, we parameterize the policy with a feedback gain pair and leverage primal-dual methods to optimize it by solely using data. We first study the optimization landscape of the Lagrangian function and establish the strong duality in spite of its non-convex nature. Alongside, we find that the Lagrangian function enjoys an important local gradient dominance property, which is then exploited to develop a convergent random search algorithm to learn the dual function. Furthermore, we propose a primal-dual algorithm with global convergence to learn the optimal policy-multiplier pair. Finally, we validate our results via simulations.

Submitted 30 May, 2021; v1 submitted 21 November, 2020; originally announced November 2020.

Comments: To appear in the Annual Conference on Learning for Dynamics and Control (L4DC) 2021

arXiv:2010.06794 [pdf, other]

Minimax Q-learning Control for Linear Systems Using the Wasserstein Metric

Authors: Feiran Zhao, Keyou You

Abstract: Stochastic optimal control usually requires an explicit dynamical model with probability distributions, which are difficult to obtain in practice. In this work, we consider the linear quadratic regulator (LQR) problem of unknown linear systems and adopt a Wasserstein penalty to address the distribution uncertainty of additive stochastic disturbances. By constructing an equivalent deterministic game of the penalized LQR problem, we propose a Q-learning method with convergence guarantees to learn an optimal minimax controller.

Submitted 16 January, 2023; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: Accepted by Automatica, to appear in 2023

arXiv:1910.12491 [pdf, other]

Suspension Regulation of Medium-low-speed Maglev Trains via Deep Reinforcement Learning

Authors: Feiran Zhao, Keyou You, Shiji Song, Wenyue Zhang, Laisheng Tong

Abstract: The suspension regulation is critical to the operation of medium-low-speed maglev trains (mlsMTs). Due to the uncertain environment, strong disturbances and high nonlinearity of the system dynamics, this problem cannot be well solved by most of the model-based controllers. In this paper, we propose a model-free controller by reformulating it as a continuous-state, continuous-action Markov decision process (MDP) with unknown transition probabilities. With the deterministic policy gradient and neural network approximation, we design reinforcement learning (RL) algorithms to solve the MDP and obtain a state-feedback controller by using sampled data from the suspension system. To further improve its performance, we adopt a double Q-learning scheme for learning the regulation controller. We illustrate via simulations that the proposed controllers outperform the existing PID controller with a real dataset from the mlsMT in Changsha, China, and are even comparable to model-based controllers, which assume that the complete information of the model is known.

Submitted 8 May, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

Comments: 12 pages, 15 figures

arXiv:1910.09266 [pdf, ps, other]

Multi-Band Multi-Resolution Fully Convolutional Neural Networks for Singing Voice Separation

Authors: Emad M. Grais, Fei Zhao, Mark D. Plumbley

Abstract: Deep neural networks with convolutional layers usually process the entire spectrogram of an audio signal with the same time-frequency resolutions, number of filters, and dimensionality reduction scale. According to the constant-Q transform, good features can be extracted from audio signals if the low frequency bands are processed with high frequency resolution filters and the high frequency bands with high time resolution filters. In the spectrogram of a mixture of singing voices and music signals, there is usually more information about the voice in the low frequency bands than the high frequency bands. These raise the need for processing each part of the spectrogram differently. In this paper, we propose a multi-band multi-resolution fully convolutional neural network (MBR-FCN) for singing voice separation. The MBR-FCN processes the frequency bands that have more information about the target signals with more filters and smaller dimensionality reduction scale than the bands with less information. Furthermore, the MBR-FCN processes the low frequency bands with high frequency resolution filters and the high frequency bands with high time resolution filters. Our experimental results show that the proposed MBR-FCN with very few parameters achieves better singing voice separation performance than other deep neural networks.

Submitted 21 October, 2019; originally announced October 2019.

MSC Class: 68T01; 68T10; 68T45; 62H25

ACM Class: H.5.5; I.5; I.2.6; I.4.3; I.4; I.2

arXiv:1804.03961 [pdf, other]

Discriminative Learning-based Smartphone Indoor Localization

Authors: Jose Luis V. Carrera, Zhongliang Zhao, Torsten Braun, Haiyong Luo, Fang Zhao

Abstract: Due to the growing area of ubiquitous mobile applications, indoor localization of smartphones has become an interesting research topic. Most of the current indoor localization systems rely on intensive site survey to achieve high accuracy. In this work, we propose an efficient smartphone indoor localization system that is able to reduce the site survey effort while still achieving high localization accuracy. Our system is built by fusing a variety of signals, such as Wi-Fi received signal strength indicator, magnetic field and floor plan information in an enhanced particle filter. To achieve high and stable performance, we first apply discriminative learning models to integrate Wi-Fi and magnetic field readings to achieve room level landmark detection. Further, we integrate landmark detection and range-based localization models with a graph-based discretized system state representation. Because our approach requires only discriminative learning-based room level landmark detections, the time spent in the learning phase is significantly reduced compared to traditional Wi-Fi fingerprinting or landmark-based approaches. We conduct experimental studies to evaluate our system in an office-like indoor environment. Experiment results show that our system can significantly reduce the learning efforts, and the localization method can achieve performance with an average localization error of 1.55 meters.

Submitted 11 April, 2018; originally announced April 2018.

Comments: 14 pages

Showing 1–36 of 36 results for author: Zhao, F