Search | arXiv e-print repository

Robustly Optimized Deep Feature Decoupling Network for Fatty Liver Diseases Detection

Authors: Peng Huang, Shu Hu, Bo Peng, Jiashu Zhang, Xi Wu, Xin Wang

Abstract: Current medical image classification efforts mainly aim for higher average performance, often neglecting the balance between different classes. This can lead to significant differences in recognition accuracy between classes and obvious recognition weaknesses. Without the support of massive data, deep learning faces challenges in fine-grained classification of fatty liver. In this paper, we propose an innovative deep learning framework that combines feature decoupling and adaptive adversarial training. Firstly, we employ two iteratively compressed decouplers to decouple, in a supervised manner, common features and specific features related to fatty liver in abdominal ultrasound images. Subsequently, the decoupled features are concatenated with the original image after transforming the color space and are fed into the classifier. During adversarial training, we adaptively adjust the perturbation and balance the adversarial strength by the accuracy of each class. The model will eliminate recognition weaknesses by correctly classifying adversarial samples, thus improving recognition robustness. Finally, the accuracy of our method improved by 4.16%, achieving 82.95%. As demonstrated by extensive experiments, our method is a generalized learning framework that can be directly used to eliminate the recognition weaknesses of any classifier while improving its average performance. Code is available at https://github.com/HP-ML/MICCAI2024.

Submitted 25 June, 2024; originally announced June 2024.

Comments: MICCAI 2024

arXiv:2406.05652 [pdf, other]

Distributed Combinatorial Optimization of Downlink User Assignment in mmWave Cell-free Massive MIMO Using Graph Neural Networks

Authors: Bile Peng, Bihan Guo, Karl-Ludwig Besser, Luca Kunz, Ramprasad Raghunath, Anke Schmeink, Eduard A Jorswieck, Giuseppe Caire, H. Vincent Poor

Abstract: Millimeter wave (mmWave) cell-free massive MIMO (CF mMIMO) is a promising solution for future wireless communications. However, its optimization is non-trivial due to the challenging channel characteristics. We show that mmWave CF mMIMO optimization is largely an assignment problem between access points (APs) and users due to the high path loss of mmWave channels, the limited output power of the amplifier, and the almost orthogonal channels between users given a large number of AP antennas. The combinatorial nature of the assignment problem, the requirement for scalability, and the distributed implementation of CF mMIMO make this problem difficult. In this work, we propose an unsupervised machine learning (ML) enabled solution. In particular, a graph neural network (GNN) customized for scalability and distributed implementation is introduced. Moreover, the customized GNN architecture is hierarchically permutation-equivariant (HPE), i.e., if the APs or users of an AP are permuted, the output assignment is automatically permuted in the same way. To address the combinatorial problem, we relax it to a continuous problem, and introduce an information entropy-inspired penalty term. The training objective is then formulated using the augmented Lagrangian method (ALM). The test results show that the realized sum-rate outperforms that of the generalized serial dictatorship (GSD) algorithm and is very close to the upper bound in a small network scenario, while the upper bound is impossible to obtain in a large network scenario.
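The continuous relaxation with an entropy-inspired penalty mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a user's assignment is represented as a softmax distribution over APs and that the penalty is the Shannon entropy of that distribution; both choices, and the names `softmax` and `entropy_penalty`, are assumptions made for illustration and are not taken from the paper.

```python
import math

def softmax(logits):
    """Map real-valued scores to a probability vector (the relaxed assignment)."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_penalty(probs):
    """Shannon entropy of the relaxed assignment; zero for a one-hot vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# A nearly uniform (undecided) assignment is penalized more
# than a nearly one-hot (almost discrete) assignment.
soft = softmax([0.1, 0.0, -0.1])
hard = softmax([50.0, 0.0, -50.0])
print(entropy_penalty(soft), entropy_penalty(hard))
```

Adding such a penalty to a training loss pushes the relaxed assignment toward a discrete choice, which is the general idea behind entropy-style penalties for relaxed combinatorial problems.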

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2404.16522 [pdf, other]

A Deep Learning-Driven Pipeline for Differentiating Hypertrophic Cardiomyopathy from Cardiac Amyloidosis Using 2D Multi-View Echocardiography

Authors: Bo Peng, Xiaofeng Li, Xinyu Li, Zhenghan Wang, Hui Deng, Xiaoxian Luo, Lixue Yin, Hongmei Zhang

Abstract: Hypertrophic cardiomyopathy (HCM) and cardiac amyloidosis (CA) are both heart conditions that can progress to heart failure if untreated. They exhibit similar echocardiographic characteristics, often leading to diagnostic challenges. This paper introduces a novel multi-view deep learning approach that utilizes 2D echocardiography for differentiating between HCM and CA. The method begins by classifying 2D echocardiography data into five distinct echocardiographic views: apical 4-chamber, parasternal long axis of left ventricle, parasternal short axis at levels of the mitral valve, papillary muscle, and apex. It then extracts features of each view separately and combines the five features for disease classification. A total of 212 patients diagnosed with HCM and 30 patients diagnosed with CA, along with 200 individuals with normal cardiac function (Normal), were enrolled in this study from 2018 to 2022. This approach achieved a precision and recall of 0.905 and a micro-F1 score of 0.904, demonstrating its effectiveness in accurately identifying HCM and CA using a multi-view analysis.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.08549 [pdf]

Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations

Authors: Boyuan Peng, Jiaju Chen, Qihui Ye, Minjiang Chen, Peiwu Qin, Chenggang Yan, Dongmei Yu, Zhenglin Chen

Abstract: Cell segmentation is essential in biomedical research for analyzing cellular morphology and behavior. Deep learning methods, particularly convolutional neural networks (CNNs), have revolutionized cell segmentation by extracting intricate features from images. However, the robustness of these methods under microscope optical aberrations remains a critical challenge. This study comprehensively evaluates the performance of cell instance segmentation models under simulated aberration conditions using the DynamicNuclearNet (DNN) and LIVECell datasets. Aberrations, including Astigmatism, Coma, Spherical, and Trefoil, were simulated using Zernike polynomial equations. Various segmentation models, such as Mask R-CNN with different network heads (FPN, C3) and backbones (ResNet, VGG19, SwinS), were trained and tested under aberrated conditions. Results indicate that FPN combined with SwinS demonstrates superior robustness in handling simple cell images affected by minor aberrations. Conversely, Cellpose2.0 proves effective for complex cell images under similar conditions. Our findings provide insights into selecting appropriate segmentation models based on cell morphology and aberration severity, enhancing the reliability of cell segmentation in biomedical applications. Further research is warranted to validate these methods with diverse aberration types and emerging segmentation models. Overall, this research aims to guide researchers in effectively utilizing cell segmentation models in the presence of minor optical aberrations.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2403.14172 [pdf]

Lane level joint control of off-ramp and main line speed guidance on expressway in rainy weather

Authors: Boyao Peng, Lexing Zhang, Enkai Li

Abstract: Upstream of an expressway exit ramp, the speed limit difference leads to a significant deceleration of vehicles in the area adjacent to the off-ramp. The friction coefficient of the road surface decreases in rainy weather, and this deceleration process can easily lead to sideslip and rollover of the vehicle. Dynamic speed guidance is an effective way to improve the status quo. Currently, there is an emerging trend to utilize I2V technology and high-precision map technology for lane-level speed guidance control. This paper presents an optimized joint control strategy for main line and off-ramp speed guidance, which can adjust the guidance speed in real time according to the rainfall intensity. At the same time, this paper designs a progressive deceleration strategy, which works together with the speed guidance control to ensure the safe deceleration of vehicles. The simulation results show that the proposed control strategy outperforms the fixed speed limit control in terms of improving the total traveled time (TTT), total traveled distance (TTD) and standard deviation of speed (SD). Sensitivity analysis shows that the performance of the proposed control strategy improves as the compliance rate of drivers increases. The speed guidance control method established in this paper can improve vehicle operation efficiency in the off-ramp area of the expressway and reduce the speed difference between vehicles in rainy weather, which helps guarantee the safety of expressway driving on rainy days.

Submitted 21 March, 2024; originally announced March 2024.

Comments: 103rd TRB Conference

Report number: TRBAM-24-01802 MSC Class: 90-10 ACM Class: A.0

arXiv:2403.04028 [pdf, other]

RISnet: A Domain-Knowledge Driven Neural Network Architecture for RIS Optimization with Mutual Coupling and Partial CSI

Authors: Bile Peng, Karl-Ludwig Besser, Shanpu Shen, Finn Siegismund-Poschmann, Ramprasad Raghunath, Daniel Mittleman, Vahid Jamali, Eduard A. Jorswieck

Abstract: Multiple access techniques are cornerstones of wireless communications. Their performance depends on the channel properties, which can be improved by reconfigurable intelligent surfaces (RISs). In this work, we jointly optimize MA precoding at the base station (BS) and RIS configuration. We tackle difficulties of mutual coupling between RIS elements, scalability to more than 1000 RIS elements, and channel estimation. We first derive an RIS-assisted channel model considering mutual coupling, then propose an unsupervised machine learning (ML) approach to optimize the RIS. In particular, we design a dedicated neural network (NN) architecture RISnet with good scalability and desired symmetry. Moreover, we combine ML-enabled RIS configuration and analytical precoding at BS since there exist analytical precoding schemes. Furthermore, we propose another variant of RISnet, which requires the channel state information (CSI) of a small portion of RIS elements (in this work, 16 out of 1296 elements) if the channel comprises a few specular propagation paths. More generally, this work is an early contribution to combine ML technique and domain knowledge in communication for NN architecture design. Compared to generic ML, the problem-specific ML can achieve higher performance, lower complexity and symmetry.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 13 pages, 16 figures

arXiv:2401.16889 [pdf, other]

Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control

Authors: Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Abstract: This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a novel dual-history architecture, utilizing both a long-term and short-term input/output (I/O) history of the robot. This control architecture, when trained through the proposed end-to-end RL approach, consistently outperforms other methods across a diverse range of skills in both simulation and the real world. The study also delves into the adaptivity and robustness introduced by the proposed RL system in developing locomotion controllers. We demonstrate that the proposed architecture can adapt to both time-invariant dynamics shifts and time-variant changes, such as contact events, by effectively using the robot's I/O history. Additionally, we identify task randomization as another key source of robustness, fostering better task generalization and compliance to disturbances. The resulting control policies can be successfully deployed on Cassie, a torque-controlled human-sized bipedal robot. This work pushes the limits of agility for bipedal robots through extensive real-world experiments. We demonstrate a diverse range of locomotion skills, including: robust standing, versatile walking, fast running with a demonstration of a 400-meter dash, and a diverse set of jumping skills, such as standing long jumps and high jumps.

Submitted 30 January, 2024; originally announced January 2024.

arXiv:2401.14281 [pdf, other]

Energy-Efficient Power Allocation in Cell-Free Massive MIMO via Graph Neural Networks

Authors: Ramprasad Raghunath, Bile Peng, Eduard A. Jorswieck

Abstract: Cell-free massive MIMO (CF-mMIMO) systems are a promising solution to enhance the performance of 6G wireless networks. The distributed nature of the architecture makes it highly reliable, provides sufficient coverage and allows higher performance than cellular networks. Energy efficiency (EE) is an important metric: improving it reduces operating costs and is also better for the environment. In this work, we optimize the downlink EE performance with maximum ratio transmission (MRT) precoding and power allocation. Our aim is to achieve a less complex, distributed and scalable solution. To achieve this, we apply unsupervised machine learning (ML) with a permutation-equivariant architecture and use a non-convex objective function with multiple local optima. We compare the performance with the centralized and computationally expensive successive convex approximation (SCA). The results indicate that the proposed approach can outperform the baseline with significantly less computation time.

Submitted 9 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

arXiv:2312.10921 [pdf, other]

AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis

Authors: Dongze Li, Kang Zhao, Wei Wang, Bo Peng, Yingya Zhang, Jing Dong, Tieniu Tan

Abstract: Audio-driven talking head synthesis is a promising topic with wide applications in digital human, film making and virtual reality. Recent NeRF-based approaches have shown superiority in quality and fidelity compared to previous studies. However, when it comes to few-shot talking head generation, a practical scenario where only a few seconds of talking video are available for one identity, two limitations emerge: 1) they either have no base model, which serves as a facial prior for fast convergence, or ignore the importance of audio when building the prior; 2) most of them overlook the degree of correlation between different face regions and audio, e.g., the mouth is audio related, while the ear is audio independent. In this paper, we present Audio Enhanced Neural Radiance Field (AE-NeRF) to tackle the above issues, which can generate realistic portraits of a new speaker with a few-shot dataset. Specifically, we introduce an Audio Aware Aggregation module into the feature fusion stage of the reference scheme, where the weight is determined by the similarity of audio between reference and target image. Then, an Audio-Aligned Face Generation strategy is proposed to model the audio related and audio independent regions respectively, with a dual-NeRF framework. Extensive experiments have shown AE-NeRF surpasses the state-of-the-art on image fidelity, audio-lip synchronization, and generalization ability, even with a limited training set or limited training iterations.

Submitted 17 December, 2023; originally announced December 2023.

Comments: Accepted by AAAI 2024

arXiv:2309.09565 [pdf, other]

A Covariance Adaptive Student's t Based Kalman Filter

Authors: Benyang Gong, Jiacheng He, Gang Wang, Bei Peng

Abstract: In the classical Kalman filter (KF), the estimated state is a linear combination of the one-step predicted state and the measurement state, whose confidence levels change when the prediction mean square error matrix and the covariance matrix of the measurement noise vary. The existing Student's t based Kalman filter (TKF) works similarly to the KF. Both work well with impulse noise, but when it comes to Gaussian noise, TKF encounters an adjustment limit of the confidence level, which can lead to inaccuracies in such situations. This brief optimizes TKF by using the Gaussian mixture model (GMM), which generates a reasonable covariance matrix from the measurement noise to replace the one used in the existing algorithm and breaks the adjustment limit of the confidence level. At the end of the brief, the performance of the covariance adaptive Student's t based Kalman filter (TGKF) is verified.
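The statement that the Kalman estimate is a linear combination of the predicted state and the measurement can be made concrete with a minimal scalar sketch. It assumes the state is observed directly (measurement matrix equal to 1); the function name `kalman_update` and its parameter names are chosen here for illustration. This is the textbook KF update, not the TKF or TGKF from the paper.

```python
def kalman_update(x_pred, p_pred, z, r):
    """Scalar Kalman measurement update, assuming the state is observed directly.

    x_pred: one-step predicted state
    p_pred: prediction mean square error (variance of x_pred)
    z:      measurement
    r:      measurement noise variance
    """
    k = p_pred / (p_pred + r)            # Kalman gain, between 0 and 1
    x_est = (1.0 - k) * x_pred + k * z   # weighted mix of prediction and measurement
    p_est = (1.0 - k) * p_pred           # updated error variance
    return x_est, p_est, k

# Equal confidence in prediction and measurement gives the midpoint.
print(kalman_update(0.0, 1.0, 2.0, 1.0))
```

As `r` grows relative to `p_pred`, the gain shrinks and the estimate leans on the prediction; as `p_pred` grows, it leans on the measurement.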

Submitted 18 September, 2023; originally announced September 2023.

arXiv:2309.08088 [pdf, ps, other]

Interactive Model Fusion-Based GM-PHD Filter

Authors: Jiacheng He, Shan Zhong, Bei Peng, Gang Wang, Qizhen Wang

Abstract: In multi-target tracking (MTT), non-Gaussian measurement noise from sensors can diminish the performance of the Gaussian-assumed Gaussian mixture probability hypothesis density (GM-PHD) filter. In this paper, an approach that transforms the MTT problem under non-Gaussian conditions into an MTT problem under Gaussian conditions is developed. Specifically, measurement noise with a non-Gaussian distribution is modeled as a weighted sum of different Gaussian distributions. Subsequently, the GM-PHD filter is applied to compute the multi-target states under these distinct Gaussian distributions. Finally, an interactive multi-model framework is employed to fuse the diverse multi-target state information into a unified synthesis. The effectiveness of the proposed approach is validated through the simulation results.
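Modeling non-Gaussian noise as a weighted sum of Gaussian distributions can be sketched as follows. The weights, means, and standard deviations below are made-up illustrative values, and the helper names `gaussian_pdf` and `mixture_pdf` are hypothetical; the paper's own noise parameters are not given in the abstract.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, stds):
    """Density of a Gaussian mixture: a weighted sum of Gaussian densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

# Illustrative heavy-tailed noise: mostly small errors plus occasional outliers.
weights, means, stds = [0.9, 0.1], [0.0, 0.0], [1.0, 10.0]
print(mixture_pdf(5.0, weights, means, stds), gaussian_pdf(5.0, 0.0, 1.0))
```

With these values the mixture assigns far more probability to a large error than a single unit-variance Gaussian does, which is the kind of heavy-tailed behavior a Gaussian-assumed filter handles poorly.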

Submitted 14 September, 2023; originally announced September 2023.

Comments: conference

arXiv:2307.01445 [pdf, ps, other]

Distributed fusion filter over lossy wireless sensor networks with the presence of non-Gaussian noise

Authors: Jiacheng He, Bei Peng, Zhenyu Feng, Xuemei Mao, Song Gao, Gang Wang

Abstract: Information transmission between nodes in wireless sensor networks (WSNs) often suffers packet loss due to denial-of-service (DoS) attacks, energy limitations, and environmental factors, and the information that is successfully transmitted can also be contaminated by non-Gaussian noise. The presence of these two factors poses a challenge for distributed state estimation (DSE) over WSNs. In this paper, a generalized packet drop model is proposed to describe the packet loss phenomenon caused by DoS attacks and other factors. Moreover, a modified maximum correntropy Kalman filter is given, and it is extended to distributed form (DM-MCKF). In addition, a distributed modified maximum correntropy Kalman filter incorporating the generalized data packet drop (DM-MCKF-DPD) algorithm is provided to implement DSE in the presence of both non-Gaussian noise pollution and packet drop. A sufficient condition to ensure the convergence of the fixed-point iterative process of the DM-MCKF-DPD algorithm is presented and the computational complexity of the DM-MCKF-DPD algorithm is analyzed. Finally, the effectiveness and feasibility of the proposed algorithms are verified by simulations.

Submitted 6 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

arXiv:2307.01442 [pdf, ps, other]

Quantized criterion-based kernel recursive least squares adaptive filtering for time series prediction

Authors: Jiacheng He, Gang Wang, Kun Zhang, Shan Zhong, Bei Peng

Abstract: The robustness of the kernel recursive least squares (KRLS) algorithm has recently been improved by combining it with more robust information-theoretic learning criteria, such as minimum error entropy (MEE) and generalized MEE (GMEE), which also increases the computational complexity of the KRLS-type algorithms to a certain extent. To reduce the computational load of the KRLS-type algorithms, the quantized GMEE (QGMEE) criterion is combined with the KRLS algorithm in this paper, and as a result two kinds of KRLS-type algorithms, called quantized kernel recursive MEE (QKRMEE) and quantized kernel recursive GMEE (QKRGMEE), are designed. Furthermore, the mean error behavior, mean square error behavior, and computational complexity of the proposed algorithms are investigated. In addition, simulation and real experimental data are utilized to verify the feasibility of the proposed algorithms.

Submitted 6 September, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

arXiv:2306.13564 [pdf, other]

Estimating Residential Solar Potential Using Aerial Data

Authors: Ross Goroshin, Alex Wilson, Andrew Lamb, Betty Peng, Brandon Ewonus, Cornelius Ratsch, Jordan Raisher, Marisa Leung, Max Burq, Thomas Colthurst, William Rucklidge, Carl Elkin

Abstract: Project Sunroof estimates the solar potential of residential buildings using high quality aerial data. That is, it estimates the potential solar energy (and associated financial savings) that can be captured by buildings if solar panels were to be installed on their roofs. Unfortunately its coverage is limited by the lack of high resolution digital surface map (DSM) data. We present a deep learning approach that bridges this gap by enhancing widely available low-resolution data, thereby dramatically increasing the coverage of Sunroof. We also present some ongoing efforts to potentially improve accuracy even further by replacing certain algorithmic components of the Sunroof processing pipeline with deep learning.

Submitted 23 June, 2023; originally announced June 2023.

Journal ref: ICLR 2023 - Tackling Climate Change with Machine Learning Workshop

arXiv:2306.11476 [pdf, other]

A Model Fusion Distributed Kalman Filter For Non-Gaussian Observation Noise

Authors: Xuemei Mao, Gang Wang, Bei Peng, Jiacheng He, Kun Zhang, Song Gao

Abstract: The distributed Kalman filter (DKF) has attracted extensive research as an information fusion method for wireless sensor systems (WSNs). However, the DKF in non-Gaussian environments is still a pressing problem. In this paper, we approximate the non-Gaussian noise as a Gaussian mixture model and estimate the parameters through the expectation-maximization algorithm. A DKF, called model fusion DKF (MFDKF), is proposed against the non-Gaussian noise. Specifically, the proposed MFDKF is obtained by fusing the sub-models that are built based on the noise approximation with the help of the interacting multiple model (IMM). Considering that some WSNs demand high consensus or have restricted communication, consensus MFDKF (C-MFDKF) and simplified MFDKF (S-MFDKF) are proposed based on consensus theory, respectively. The convergence of MFDKF and its derivative algorithms is analyzed. A series of simulations indicate the effectiveness of the MFDKF and its derivative algorithms.

Submitted 20 June, 2023; originally announced June 2023.

arXiv:2305.18875 [pdf, other]

Centralised rehearsal of decentralised cooperation: Multi-agent reinforcement learning for the scalable coordination of residential energy flexibility

Authors: Flora Charbonnier, Bei Peng, Thomas Morstyn, Malcolm McCulloch

Abstract: This paper investigates how deep multi-agent reinforcement learning can enable the scalable and privacy-preserving coordination of residential energy flexibility. The coordination of distributed resources such as electric vehicles and heating will be critical to the successful integration of large shares of renewable energy in our electricity grid and, thus, to help mitigate climate change. The pre-learning of individual reinforcement learning policies can enable distributed control with no sharing of personal data required during execution. However, previous approaches for multi-agent reinforcement learning-based distributed energy resources coordination impose an ever greater training computational burden as the size of the system increases. We therefore adopt a deep multi-agent actor-critic method which uses a centralised but factored critic to rehearse coordination ahead of execution. Results show that coordination is achieved at scale, with minimal information and communication infrastructure requirements, no interference with daily activities, and privacy protection. Significant savings are obtained for energy users, the distribution network and greenhouse gas emissions. Moreover, training times are nearly 40 times shorter than with a previous state-of-the-art reinforcement learning approach without the factored critic for 30 homes.

Submitted 5 June, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

arXiv:2305.00692 [pdf, other]

Non-Orthogonal Multiple Access Assisted by Reconfigurable Intelligent Surface Using Unsupervised Machine Learning

Authors: Finn Siegismund-Poschmann, Bile Peng, Eduard A. Jorswieck

Abstract: Nonorthogonal multiple access (NOMA) with a multi-antenna base station (BS) is a promising technology for next-generation wireless communication, which has high potential in performance and user fairness. Since the performance of NOMA depends on the channel conditions, we can combine NOMA with a reconfigurable intelligent surface (RIS), which is a large and passive antenna array and can optimize the wireless channel. However, the high dimensionality makes the RIS optimization a complicated problem. In this work, we propose a machine learning approach to solve the problem of joint optimization of precoding and RIS configuration. We apply the RIS to realize the quasi-degradation of the channel, which allows for optimal precoding in closed form. The neural network architecture RISnet, which is designed specifically for RIS optimization, is used. The proposed solution is superior to the works in the literature in terms of performance and computation time.

Submitted 31 May, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

arXiv:2305.00667 [pdf, ps, other]

RISnet: A Scalable Approach for Reconfigurable Intelligent Surface Optimization with Partial CSI

Authors: Bile Peng, Karl-Ludwig Besser, Ramprasad Raghunath, Vahid Jamali, Eduard A. Jorswieck

Abstract: The reconfigurable intelligent surface (RIS) is a promising technology that enables wireless communication systems to achieve improved performance by intelligently manipulating wireless channels. In this paper, we consider the sum-rate maximization problem in a downlink multi-user multi-input-single-output (MISO) channel via space-division multiple access (SDMA). Two major challenges of this problem are the high dimensionality due to the large number of RIS elements and the difficulty of obtaining the full channel state information (CSI), which is assumed known in many algorithms proposed in the literature. Instead, we propose a hybrid machine learning approach using the weighted minimum mean squared error (WMMSE) precoder at the base station (BS) and a dedicated neural network (NN) architecture, RISnet, for RIS configuration. The RISnet scales well: it optimizes 1296 RIS elements and requires partial CSI of only 16 RIS elements as input. We show it achieves a high performance with a low requirement for channel estimation for geometric channel models obtained with ray-tracing simulation. The unsupervised learning lets the RISnet find an optimized RIS configuration by itself. Numerical results show that a trained model configures the RIS with low computational effort, considerably outperforms the baselines, and can work with discrete phase shifts.

Submitted 18 August, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

arXiv:2302.09450 [pdf, other]

Robust and Versatile Bipedal Jumping Control through Reinforcement Learning

Authors: Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Abstract: This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world. We present a reinforcement learning framework for training a robot to accomplish a large variety of jumping tasks, such as jumping to different locations and directions. To improve performance on these challenging tasks, we develop a new policy structure that encodes the robot's long-term input/output (I/O) history while also providing direct access to a short-term I/O history. In order to train a versatile jumping policy, we utilize a multi-stage training scheme that includes different training stages for different objectives. After multi-stage training, the policy can be directly transferred to a real bipedal Cassie robot. Training on different tasks and exploring more diverse scenarios lead to highly robust policies that can exploit the diverse set of learned maneuvers to recover from perturbations or poor landings during real-world deployment. Such robustness in the proposed policy enables Cassie to succeed in completing a variety of challenging jump tasks in the real world, such as standing long jumps, jumping onto elevated platforms, and multi-axes jumps.

Submitted 31 May, 2023; v1 submitted 18 February, 2023; originally announced February 2023.

Comments: Accepted in Robotics: Science and Systems 2023 (RSS 2023). The accompanying video is at https://youtu.be/aAPSZ2QFB-E

arXiv:2301.05867 [pdf, other]

State Estimation of Wireless Sensor Networks in the Presence of Data Packet Drops and Non-Gaussian Noise

Authors: Jiacheng He, Gang Wang, Xuemei Mao, Song Gao, Bei Peng

Abstract: Distributed Kalman filter approaches based on the maximum correntropy criterion have recently demonstrated superior state estimation performance to that of conventional distributed Kalman filters for wireless sensor networks in the presence of non-Gaussian impulsive noise. However, these algorithms currently fail to take account of data packet drops. The present work addresses this issue by proposing a distributed maximum correntropy Kalman filter that accounts for data packet drops (i.e., the DMCKF-DPD algorithm). The effectiveness and feasibility of the algorithm are verified by simulations conducted in a wireless sensor network with intermittent observations due to data packet drops under a non-Gaussian noise environment. Moreover, the computational complexity of the DMCKF-DPD algorithm is demonstrated to be moderate compared with that of a conventional distributed Kalman filter, and we provide a sufficient condition to ensure the convergence of the proposed algorithm.

Submitted 3 September, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

arXiv:2301.05813 [pdf, ps, other]

Minimum Error Entropy Rauch-Tung-Striebel Smoother

Authors: Jiacheng He, Hongwei Wang, Gang Wang, Shan Zhong, Bei Peng

Abstract: Outliers and impulsive disturbances often cause heavy-tailed distributions in practical applications, and these will degrade the performance of Gaussian approximation smoothing algorithms. To improve the robustness of the Rauch-Tung-Striebel (RTS) smoother against complicated non-Gaussian noises, a new RTS smoother integrated with the minimum error entropy (MEE) criterion (MEE-RTS) is proposed for linear systems, which is also extended to the state estimation of nonlinear systems by utilizing the Taylor series linearization approach. The mean error behavior, the mean square error behavior, as well as the computational complexity of the MEE-RTS smoother are analyzed. According to simulation results, the proposed smoothers perform better than several robust solutions in terms of steady-state error.

Submitted 2 February, 2023; v1 submitted 13 January, 2023; originally announced January 2023.

arXiv:2212.12329 [pdf, other]

Approaching Globally Optimal Energy Efficiency in Interference Networks via Machine Learning

Authors: Bile Peng, Karl-Ludwig Besser, Ramprasad Raghunath, Eduard A. Jorswieck

Abstract: This work presents a machine learning approach to optimize the energy efficiency (EE) in a multi-cell wireless network. This optimization problem is non-convex and its global optimum is difficult to find. In the literature, either simple but suboptimal approaches or optimal methods with high complexity and poor scalability are proposed. In contrast, we propose a machine learning framework to approach the global optimum. While the neural network (NN) training takes moderate time, application with the trained model requires very low computational complexity. In particular, we introduce a novel objective function based on stochastic actions to solve the non-convex optimization problem. Besides, we design a dedicated NN architecture for the multi-cell network optimization problems that is permutation-equivariant. It classifies channels according to their roles in the EE computation. In this way, we encode our domain knowledge into the NN design and shed light into the black box of machine learning. Training and testing results show that the proposed method without supervision and with reasonable computational effort achieves an EE close to the global optimum found by the branch-and-bound algorithm. Hence, the proposed approach balances between computational complexity and performance.

Submitted 14 December, 2023; v1 submitted 25 November, 2022; originally announced December 2022.

arXiv:2212.02967 [pdf, other]

RISnet: a Dedicated Scalable Neural Network Architecture for Optimization of Reconfigurable Intelligent Surfaces

Authors: Bile Peng, Finn Siegismund-Poschmann, Eduard A. Jorswieck

Abstract: The reconfigurable intelligent surface (RIS) is a promising technology for next-generation wireless communication. It comprises many passive antennas, which reflect signals from the transmitter to the receiver with adjusted phases without changing the amplitude. The large number of antennas offers huge signal-processing potential despite the simple functionality of a single antenna. However, it also makes the RIS configuration a high-dimensional problem, which might not have a closed-form solution and has a high complexity and, as a result, severe difficulty in online real-time application if we apply iterative numerical solutions. In this paper, we introduce a machine learning approach to maximize the weighted sum-rate (WSR). We propose a dedicated neural network architecture called RISnet. The RIS optimization is designed according to the RIS property of product and direct channel and homogeneous RIS antennas. The architecture is scalable because the number of trainable parameters is independent of the number of RIS antennas (all antennas share the same parameters). The weighted minimum mean squared error (WMMSE) precoding is applied and an alternating optimization (AO) training procedure is designed. Testing results show that the proposed approach outperforms the state-of-the-art block coordinate descent (BCD) algorithm. Moreover, although the training takes several hours, online testing with the trained model (application) is almost instant, which makes it feasible for real-time application. In comparison, the BCD algorithm requires much more convergence time. Therefore, the proposed method outperforms the state-of-the-art algorithm in both performance and complexity.

Submitted 15 January, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

arXiv:2210.04435 [pdf, other]

Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning

Authors: Xiaoyu Huang, Zhongyu Li, Yanzhen Xiang, Yiming Ni, Yufeng Chi, Yunhao Li, Lizhi Yang, Xue Bin Peng, Koushil Sreenath

Abstract: We present a reinforcement learning (RL) framework that enables quadrupedal robots to perform soccer goalkeeping tasks in the real world. Soccer goalkeeping using quadrupeds is a challenging problem that combines highly dynamic locomotion with precise and fast non-prehensile object (ball) manipulation. The robot needs to react to and intercept a potentially flying ball using dynamic locomotion maneuvers in a very short amount of time, usually less than one second. In this paper, we propose to address this problem using a hierarchical model-free RL framework. The first component of the framework contains multiple control policies for distinct locomotion skills, which can be used to cover different regions of the goal. Each control policy enables the robot to track random parametric end-effector trajectories while performing one specific locomotion skill, such as jump, dive, and sidestep. These skills are then utilized by the second part of the framework which is a high-level planner to determine a desired skill and end-effector trajectory in order to intercept a ball flying to different regions of the goal. We deploy the proposed framework on a Mini Cheetah quadrupedal robot and demonstrate the effectiveness of our framework for various agile interceptions of a fast-moving ball in the real world.

Submitted 10 October, 2022; originally announced October 2022.

Comments: First two authors contributed equally. Accompanying video is at https://youtu.be/iX6OgG67-ZQ

arXiv:2208.01160 [pdf, other]

Hierarchical Reinforcement Learning for Precise Soccer Shooting Skills using a Quadrupedal Robot

Authors: Yandong Ji, Zhongyu Li, Yinan Sun, Xue Bin Peng, Sergey Levine, Glen Berseth, Koushil Sreenath

Abstract: We address the problem of enabling quadrupedal robots to perform precise shooting skills in the real world using reinforcement learning. Developing algorithms to enable a legged robot to shoot a soccer ball to a given target is a challenging problem that combines robot motion control and planning into one task. To solve this problem, we need to consider the dynamics limitation and motion stability during the control of a dynamic legged robot. Moreover, we need to consider motion planning to shoot the hard-to-model deformable ball rolling on the ground with uncertain friction to a desired location. In this paper, we propose a hierarchical framework that leverages deep reinforcement learning to train (a) a robust motion control policy that can track arbitrary motions and (b) a planning policy to decide the desired kicking motion to shoot a soccer ball to a target. We deploy the proposed framework on an A1 quadrupedal robot and enable it to accurately shoot the ball to random targets in the real world.

Submitted 1 August, 2022; originally announced August 2022.

Comments: Accepted to 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

arXiv:2207.00001 [pdf]

MultiEarth 2022 -- The Champion Solution for Image-to-Image Translation Challenge via Generation Models

Authors: Yuchuan Gou, Bo Peng, Hongchen Liu, Hang Zhou, Jui-Hsin Lai

Abstract: The MultiEarth 2022 Image-to-Image Translation challenge provides a well-constrained test bed for generating the corresponding RGB Sentinel-2 imagery with the given Sentinel-1 VV & VH imagery. In this challenge, we designed various generation models and found that the SPADE [1] and pix2pixHD [2] models gave our best results. In our self-evaluation, the SPADE-2 model with L1 loss achieves an MAE of 0.02194 and a PSNR of 31.092 dB. In our final submission, the best model achieves an MAE of 0.02795, ranked No. 1 on the leaderboard.

Submitted 17 June, 2022; originally announced July 2022.

Comments: CVPR 2022, MultiEarth 2022, Image-to-Image translation, competition

arXiv:2206.09488 [pdf, other]

Two-Hop Age of Information Scheduling for Multi-UAV Assisted Mobile Edge Computing: FRL vs MADDPG

Authors: Marjan Tajik, Mohammadreza Maleki, Nader Mokari, Mohammad Reza Javan, Hamid Saeedi, Bile Peng, Eduard A. Jorswieck

Abstract: In this work, we adopt the emerging technology of mobile edge computing (MEC) in unmanned aerial vehicles (UAVs) for communication-computing systems, to optimize the age of information (AoI) in the network. We assume that tasks are processed jointly on UAVs and BS to enhance edge performance with limited connectivity and computing. Using UAVs and BS jointly with MEC can reduce AoI on the network. To maintain the freshness of the tasks, we formulate the AoI minimization in a two-hop communication framework, the first hop at the UAVs and the second hop at the BS. To approach the challenge, we optimize the problem using a deep reinforcement learning (DRL) framework, called federated reinforcement learning (FRL). In our network we have two types of agents with different states and actions but with the same policy. Our FRL enables us to handle the two-step AoI minimization and UAV trajectory problems. In addition, we compare our proposed algorithm, which has a centralized processing unit to update the weights, with fully decentralized multi-agent deep deterministic policy gradient (MADDPG), which enhances the agent's performance. As a result, the suggested algorithm outperforms the MADDPG by about 38%.

Submitted 19 June, 2022; originally announced June 2022.

arXiv:2202.06668 [pdf, other]

Resource allocation for reconfigurable intelligent surface aided broadcast channels

Authors: Cong Sun, Xian Liu, Bile Peng, Eduard Jorswieck

Abstract: A two-user downlink network aided by a reconfigurable intelligent surface is considered. The weighted sum signal to interference plus noise ratio maximization and the sum rate maximization models are presented, where the precoding vectors and the RIS matrix are jointly optimized. Since the optimization problem is non-convex and difficult, new approximation models are proposed. The upper bounds of the corresponding objective functions are derived and maximized. Two new algorithms based on the alternating direction method of multipliers are proposed. It is proved that the proposed algorithms converge to the KKT points of the approximation models as long as the iteration points converge. Simulation results show the good performance of the proposed models compared to state-of-the-art algorithms.

Submitted 14 February, 2022; originally announced February 2022.

arXiv:2201.02834 [pdf, other]

Reconfigurable Intelligent Surface Enabled Spatial Multiplexing with Fully Convolutional Network

Authors: Bile Peng, Jan-Aike Termöhlen, Cong Sun, Danping He, Ke Guan, Tim Fingscheidt, Eduard A. Jorswieck

Abstract: Reconfigurable intelligent surface (RIS) is an emerging technology for future wireless communication systems. In this work, we consider downlink spatial multiplexing enabled by the RIS for weighted sum-rate (WSR) maximization. In the literature, most solutions use alternating gradient-based optimization, which has moderate performance, high complexity, and limited scalability. We propose to apply a fully convolutional network (FCN) to solve this problem, which was originally designed for semantic segmentation of images. The rectangular shape of the RIS and the spatial correlation of channels with adjacent RIS antennas due to the short distance between them encourage us to apply it for the RIS configuration. We design a set of channel features that includes both cascaded channels via the RIS and the direct channel. In the base station (BS), the differentiable minimum mean squared error (MMSE) precoder is used for pretraining and the weighted minimum mean squared error (WMMSE) precoder is then applied for fine-tuning, which is nondifferentiable, more complex, but achieves a better performance. Evaluation results show that the proposed solution has higher performance and allows for a faster evaluation than the baselines. Hence it scales better to a large number of antennas, advancing the RIS one step closer to practical deployment.

Submitted 21 September, 2022; v1 submitted 8 January, 2022; originally announced January 2022.

arXiv:2109.13322 [pdf, other]

doi 10.1073/pnas.2012982118

Induced transparency: interference or polarization?

Authors: Changqing Wang, Xuefeng Jiang, William R. Sweeney, Chia Wei Hsu, Yiming Liu, Guangming Zhao, Bo Peng, Mengzhen Zhang, Liang Jiang, A. Douglas Stone, Lan Yang

Abstract: The polarization of optical fields is a crucial degree of freedom in the all-optical analogue of electromagnetically induced transparency (EIT). However, the physical origins of EIT and polarization induced phenomena have not been well distinguished, which can lead to confusion in associated applications such as slow light and optical/quantum storage. Here we study the polarization effects in various optical EIT systems. We find that a polarization mismatch between whispering gallery modes in two indirectly coupled resonators can induce a narrow transparency window in the transmission spectrum resembling the EIT lineshape. However, such polarization induced transparency (PIT) is distinct from EIT: it originates from strong polarization rotation effects and shows unidirectional feature. The coexistence of PIT and EIT provides new routes for the manipulation of light flow in optical resonator systems.

Submitted 27 September, 2021; originally announced September 2021.

Comments: 8 pages, 4 figures, 57 references. The published version can be found via URL: https://www.pnas.org/content/118/3/e2012982118

Journal ref: Proceedings of the National Academy of Sciences Vol. 118 No. 3 e2012982118 (19 Jan 2021)

arXiv:2109.03463 [pdf, ps, other]

Generalized Minimum Error Entropy for Adaptive Filtering

Authors: Jiacheng He, Gang Wang, Bei Peng, Zhenyu Feng, Kun Zhang

Abstract: Error entropy is an important nonlinear similarity measure, and it has received increasing attention in many practical applications. The default kernel function of the error entropy criterion is the Gaussian kernel function, which, however, is not always the best choice. In our study, a novel concept, called generalized error entropy, utilizing the generalized Gaussian density (GGD) function as the kernel function is proposed. We further derive the generalized minimum error entropy (GMEE) criterion, and a novel adaptive filtering algorithm, called the GMEE algorithm, is derived by utilizing the GMEE criterion. The stability, steady-state performance, and computational complexity of the proposed algorithm are investigated. Simulations indicate that the GMEE algorithm performs well in Gaussian, sub-Gaussian, and super-Gaussian noise environments. Finally, the GMEE algorithm is applied to acoustic echo cancellation and performs well.

Submitted 1 September, 2023; v1 submitted 8 September, 2021; originally announced September 2021.

Comments: 9 pages, 8 figures

arXiv:2108.11623 [pdf, other]

Model-based Chance-Constrained Reinforcement Learning via Separated Proportional-Integral Lagrangian

Authors: Baiyu Peng, Jingliang Duan, Jianyu Chen, Shengbo Eben Li, Genjin Xie, Congsheng Zhang, Yang Guan, Yao Mu, Enxin Sun

Abstract: Safety is essential for reinforcement learning (RL) applied in the real world. Adding chance constraints (or probabilistic constraints) is a suitable way to enhance RL safety under uncertainty. Existing chance-constrained RL methods like the penalty methods and the Lagrangian methods either exhibit periodic oscillations or learn an over-conservative or unsafe policy. In this paper, we address these shortcomings by proposing a separated proportional-integral Lagrangian (SPIL) algorithm. We first review the constrained policy optimization process from a feedback control perspective, which regards the penalty weight as the control input and the safe probability as the control output. Based on this, the penalty method is formulated as a proportional controller, and the Lagrangian method is formulated as an integral controller. We then unify them and present a proportional-integral Lagrangian method to get both their merits, with an integral separation technique to limit the integral value in a reasonable range. To accelerate training, the gradient of safe probability is computed in a model-based manner. We demonstrate our method can reduce the oscillations and conservatism of RL policy in a car-following simulation. To prove its practicality, we also apply our method to a real-world mobile robot navigation task, where our robot successfully avoids a moving obstacle with highly uncertain or even aggressive behaviors.

Submitted 26 August, 2021; originally announced August 2021.
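
Illustrative note: the feedback-control view described in the abstract above (penalty weight as control input, safe probability as control output) can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the class name `PILagrangian`, the gains, the separation threshold, and the clipping bounds are all hypothetical, and "integral separation" is taken here in its usual PID sense of pausing accumulation while the error is large.

```python
from dataclasses import dataclass


@dataclass
class PILagrangian:
    """Hypothetical PI-style penalty-weight update (illustrative only)."""
    kp: float = 1.0             # proportional gain (penalty-method-like term)
    ki: float = 0.5             # integral gain (Lagrangian-method-like term)
    sep_threshold: float = 0.2  # accumulate the integral only if |error| is below this
    integral_max: float = 10.0  # upper clip for the integral state
    integral: float = 0.0       # running integral of the constraint error

    def update(self, safe_prob: float, target_prob: float) -> float:
        # Error is positive when the policy is less safe than required.
        error = target_prob - safe_prob
        # Integral separation: skip accumulation while the error is large,
        # which keeps the integral state within a reasonable range.
        if abs(error) < self.sep_threshold:
            self.integral = min(max(self.integral + error, 0.0), self.integral_max)
        # The penalty weight is the PI control signal, kept non-negative.
        return max(self.kp * error + self.ki * self.integral, 0.0)
```

With `kp` alone the update behaves like a pure penalty (proportional) scheme, and with `ki` alone like a multiplier (integral) scheme; in an actual training loop the returned weight would scale the safety term of the policy objective.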

arXiv:2103.14295 [pdf, other]

Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots

Authors: Zhongyu Li, Xuxin Cheng, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Abstract: Developing robust walking controllers for bipedal robots is a challenging endeavor. Traditional model-based locomotion controllers require simplifying assumptions and careful modelling; any small errors can result in unstable control. To address these challenges for bipedal locomotion, we present a model-free reinforcement learning framework for training robust locomotion policies in simulation, which can then be transferred to a real bipedal Cassie robot. To facilitate sim-to-real transfer, domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics. The learned policies enable Cassie to perform a set of diverse and dynamic behaviors, while also being more robust than traditional controllers and prior learning-based methods that use residual control. We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw.

Submitted 26 March, 2021; originally announced March 2021.

Comments: To appear at the 2021 International Conference on Robotics and Automation (ICRA 2021)

arXiv:2102.08539 [pdf, other]

Separated Proportional-Integral Lagrangian for Chance Constrained Reinforcement Learning

Authors: Baiyu Peng, Yao Mu, Jingliang Duan, Yang Guan, Shengbo Eben Li, Jianyu Chen

Abstract: Safety is essential for reinforcement learning (RL) applied in real-world tasks like autonomous driving. Chance constraints, which guarantee the satisfaction of state constraints with high probability, are suitable to represent the requirements in real-world environments with uncertainty. Existing chance-constrained RL methods like the penalty method and the Lagrangian method either exhibit periodic oscillations or cannot satisfy the constraints. In this paper, we address these shortcomings by proposing a separated proportional-integral Lagrangian (SPIL) algorithm. Taking a control perspective, we first interpret the penalty method and the Lagrangian method as proportional feedback and integral feedback control, respectively. Then, a proportional-integral Lagrangian method is proposed to steady the learning process while improving safety. To prevent integral overshooting and reduce conservatism, we introduce the integral separation technique inspired by PID control. Finally, an analytical gradient of the chance constraint is utilized for model-based policy optimization. The effectiveness of SPIL is demonstrated by a narrow car-following task. Experiments indicate that compared with previous methods, SPIL improves the performance while guaranteeing safety, with a steady learning process.

Submitted 16 February, 2021; originally announced February 2021.
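
The control-theoretic reading in the SPIL abstract above (penalty method as proportional feedback, Lagrangian method as integral feedback, plus integral separation) can be illustrated with a small sketch. This is not the authors' implementation: the class name `PIMultiplier`, the gains `kp` and `ki`, the threshold `sep_threshold`, and the choice to freeze the integral when the error is large are all assumptions made for illustration, following one common form of integral separation from PID control.

```python
from dataclasses import dataclass


@dataclass
class PIMultiplier:
    """Illustrative proportional-integral penalty weight (hypothetical names)."""

    kp: float = 1.0             # proportional gain (assumed value)
    ki: float = 0.1             # integral gain (assumed value)
    sep_threshold: float = 0.2  # integral-separation threshold (assumed value)
    integral: float = 0.0       # accumulated constraint violation

    def update(self, violation: float) -> float:
        """Return a non-negative penalty weight for the current violation.

        `violation` is assumed to be (required safe probability minus
        estimated safe probability), so positive values mean "unsafe".
        """
        # Integral separation: accumulate only while the error is small,
        # so a large transient violation cannot wind up the integral term.
        if abs(violation) <= self.sep_threshold:
            self.integral = max(0.0, self.integral + violation)
        # Proportional term reacts immediately; integral term removes
        # steady-state violation. The weight is clipped at zero.
        return max(0.0, self.kp * violation + self.ki * self.integral)


if __name__ == "__main__":
    multiplier = PIMultiplier(kp=1.0, ki=0.5, sep_threshold=0.2)
    for v in (0.5, 0.1, -0.3):
        print(v, multiplier.update(v))
```

In a constrained policy-gradient loop, the returned weight would scale the gradient of the safety term; how that gradient is obtained is outside the scope of this sketch.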

arXiv:2012.10716 [pdf, other]

Model-Based Actor-Critic with Chance Constraint for Stochastic System

Authors: Baiyu Peng, Yao Mu, Yang Guan, Shengbo Eben Li, Yuming Yin, Jianyu Chen

Abstract: Safety is essential for reinforcement learning (RL) applied in real-world situations. Chance constraints are suitable to represent the safety requirements in stochastic systems. Previous chance-constrained RL methods usually have a low convergence rate, or only learn a conservative policy. In this paper, we propose a model-based chance constrained actor-critic (CCAC) algorithm which can efficiently learn a safe and non-conservative policy. Different from existing methods that optimize a conservative lower bound, CCAC directly solves the original chance constrained problems, where the objective function and safe probability is simultaneously optimized with adaptive weights. In order to improve the convergence rate, CCAC utilizes the gradient of dynamic model to accelerate policy optimization. The effectiveness of CCAC is demonstrated by a stochastic car-following task. Experiments indicate that compared with previous RL methods, CCAC improves the performance while guaranteeing safety, with a five times faster convergence rate. It also has 100 times higher online computation efficiency than traditional safety techniques such as stochastic model predictive control.

Submitted 16 March, 2021; v1 submitted 19 December, 2020; originally announced December 2020.

arXiv:2003.00848 [pdf, other]

Mixed Reinforcement Learning with Additive Stochastic Uncertainty

Authors: Yao Mu, Shengbo Eben Li, Chang Liu, Qi Sun, Bingbing Nie, Bo Cheng, Baiyu Peng

Abstract: Reinforcement learning (RL) methods often rely on massive exploration data to search optimal policies, and suffer from poor sampling efficiency. This paper presents a mixed reinforcement learning (mixed RL) algorithm by simultaneously using dual representations of environmental dynamics to search the optimal policy with the purpose of improving both learning accuracy and training speed. The dual representations indicate the environmental model and the state-action data: the former can accelerate the learning process of RL, while its inherent model uncertainty generally leads to worse policy accuracy than the latter, which comes from direct measurements of states and actions. In the framework design of the mixed RL, the compensation of the additive stochastic model uncertainty is embedded inside the policy iteration RL framework by using explored state-action data via iterative Bayesian estimator (IBE). The optimal policy is then computed in an iterative way by alternating between policy evaluation (PEV) and policy improvement (PIM). The convergence of the mixed RL is proved using the Bellman's principle of optimality, and the recursive stability of the generated policy is proved via the Lyapunov's direct method. The effectiveness of the mixed RL is demonstrated by a typical optimal control problem of stochastic non-affine nonlinear systems (i.e., double lane change task with an automated vehicle).

Submitted 28 February, 2020; originally announced March 2020.

arXiv:2002.07699 [pdf, other]

Cognitive Biomarker Prioritization in Alzheimer's Disease using Brain Morphometric Data

Authors: Bo Peng, Xiaohui Yao, Shannon L. Risacher, Andrew J. Saykin, Li Shen, Xia Ning

Abstract: Background: Cognitive assessments represent the most common clinical routine for the diagnosis of Alzheimer's Disease (AD). Given a large number of cognitive assessment tools and time-limited office visits, it is important to determine a proper set of cognitive tests for different subjects. Most current studies create guidelines of cognitive test selection for a targeted population, but they are not customized for each individual subject. In this manuscript, we develop a machine learning paradigm enabling personalized cognitive assessments prioritization. Method: We adapt a newly developed learning-to-rank approach PLTR to implement our paradigm. This method learns the latent scoring function that pushes the most effective cognitive assessments onto the top of the prioritization list. We also extend PLTR to better separate the most effective cognitive assessments and the less effective ones. Results: Our empirical study on the ADNI data shows that the proposed paradigm outperforms the state-of-the-art baselines on identifying and prioritizing individual-specific cognitive biomarkers. We conduct experiments in cross validation and level-out validation settings. In the two settings, our paradigm significantly outperforms the best baselines with improvement as much as 22.1% and 19.7%, respectively, on prioritizing cognitive features. Conclusions: The proposed paradigm achieves superior performance on prioritizing cognitive biomarkers. The cognitive biomarkers prioritized on top have great potentials to facilitate personalized diagnosis, disease subtyping, and ultimately precision medicine in AD.

Submitted 12 November, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

Comments: This paper has been accepted by BMC MIDM

arXiv:1911.04470 [pdf, other]

doi 10.1109/TCSVT.2019.2936710

Semi-Heterogeneous Three-Way Joint Embedding Network for Sketch-Based Image Retrieval

Authors: Jianjun Lei, Yuxin Song, Bo Peng, Zhanyu Ma, Ling Shao, Yi-Zhe Song

Abstract: Sketch-based image retrieval (SBIR) is a challenging task due to the large cross-domain gap between sketches and natural images. How to align abstract sketches and natural images into a common high-level semantic space remains a key problem in SBIR. In this paper, we propose a novel semi-heterogeneous three-way joint embedding network (Semi3-Net), which integrates three branches (a sketch branch, a natural image branch, and an edgemap branch) to learn more discriminative cross-domain feature representations for the SBIR task. The key insight lies with how we cultivate the mutual and subtle relationships amongst the sketches, natural images, and edgemaps. A semi-heterogeneous feature mapping is designed to extract bottom features from each domain, where the sketch and edgemap branches are shared while the natural image branch is heterogeneous to the other branches. In addition, a joint semantic embedding is introduced to embed the features from different domains into a common high-level semantic space, where all of the three branches are shared. To further capture informative features common to both natural images and the corresponding edgemaps, a co-attention model is introduced to conduct common channel-wise feature recalibration between different domains. A hybrid-loss mechanism is designed to align the three branches, where an alignment loss and a sketch-edgemap contrastive loss are presented to encourage the network to learn invariant cross-domain representations. Experimental results on two widely used category-level datasets (Sketchy and TU-Berlin Extension) demonstrate that the proposed method outperforms state-of-the-art methods.

Submitted 9 November, 2019; originally announced November 2019.

Comments: Accepted by IEEE Transactions on Circuits and Systems for Video Technology

arXiv:1911.03552 [pdf]

doi 10.1038/s41567-019-0746-7

Electromagnetically induced transparency at a chiral exceptional point

Authors: Changqing Wang, Xuefeng Jiang, Guangming Zhao, Mengzhen Zhang, Chia Wei Hsu, Bo Peng, A. Douglas Stone, Liang Jiang, Lan Yang

Abstract: Electromagnetically induced transparency, as a quantum interference effect to eliminate optical absorption in an opaque medium, has found extensive applications in slow light generation, optical storage, frequency conversion, optical quantum memory as well as enhanced nonlinear interactions at the few-photon level in all kinds of systems. Recently, there have been great interests in exceptional points, a spectral singularity that could be reached by tuning various parameters in open systems, to render unusual features to the physical systems, such as optical states with chirality. Here we theoretically and experimentally study transparency and absorption modulated by chiral optical states at exceptional points in an indirectly-coupled resonator system. By tuning one resonator to an exceptional point, transparency or absorption occurs depending on the chirality of the eigenstate. Our results demonstrate a new strategy to manipulate the light flow and the spectra of a photonic resonator system by exploiting a discrete optical state associated with specific chirality at an exceptional point as a unique control bit, which opens up a new horizon of controlling slow light using optical states. Compatible with the idea of state control in quantum gate operation, this strategy hence bridges optical computing and storage.

Submitted 8 November, 2019; originally announced November 2019.

Comments: 22 pages, 4 figures, 44 references

Journal ref: Nature Physics 16, 334-340 (2020)

arXiv:1907.04700 [pdf, other]

Cooperative Localization with Angular Measurements and Posterior Linearization

Authors: Yibo Wu, Bile Peng, Henk Wymeersch, Gonzalo Seco-Granados, Anastasios Kakkavas, Mario H. Castañeda Garcia, Richard A. Stirling-Gallacher

Abstract: The application of cooperative localization in vehicular networks is attractive to improve accuracy and coverage. Conventional distance measurements between vehicles are limited by the need for synchronization and provide no heading information of the vehicle. To address this, we present a cooperative localization algorithm using posterior linearization belief propagation (PLBP) utilizing angle-of-arrival (AoA)-only measurements. Simulation results show that both directional and positional root mean squared error (RMSE) of vehicles can be decreased significantly and converge to a low value in a few iterations. Furthermore, the influence of parameters for the vehicular network, such as vehicle density, communication radius, prior uncertainty and AoA measurements noise, is analyzed.

Submitted 10 July, 2019; originally announced July 2019.

Comments: Submitted for possible publication to an IEEE conference

arXiv:1904.09252 [pdf, ps, other]

Learning Physical-Layer Communication with Quantized Feedback

Authors: Jinxiang Song, Bile Peng, Christian Häger, Henk Wymeersch, Anant Sahai

Abstract: Data-driven optimization of transmitters and receivers can reveal new modulation and detection schemes and enable physical-layer communication over unknown channels. Previous work has shown that practical implementations of this approach require a feedback signal from the receiver to the transmitter. In this paper, we study the impact of quantized feedback in data-driven learning of physical-layer communication. A novel quantization method is proposed, which exploits the specific properties of the feedback signal and is suitable for non-stationary signal distributions. The method is evaluated for linear and nonlinear channels. Simulation results show that feedback quantization does not appreciably affect the learning process and can lead to excellent performance, even with $1$-bit quantization. In addition, it is shown that learning is surprisingly robust to noisy feedback where random bit flips are applied to the quantization bits.

Submitted 4 November, 2019; v1 submitted 19 April, 2019; originally announced April 2019.

arXiv:1710.06537 [pdf, other]

doi 10.1109/ICRA.2018.8460528

Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

Authors: Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel

Abstract: Simulations are attractive environments for training agents as they provide an abundant source of data and alleviate certain safety concerns during the training process. But the behaviours developed by agents in simulation are often specific to the characteristics of the simulator. Due to modeling error, strategies that are successful in simulation may not transfer to their real world counterparts. In this paper, we demonstrate a simple method to bridge this "reality gap". By randomizing the dynamics of the simulator during training, we are able to develop policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained. This adaptivity enables the policies to generalize to the dynamics of the real world without any training on the physical system. Our approach is demonstrated on an object pushing task using a robotic arm. Despite being trained exclusively in simulation, our policies are able to maintain a similar level of performance when deployed on a real robot, reliably moving an object to a desired location from random initial configurations. We explore the impact of various design decisions and show that the resulting policies are robust to significant calibration error.

Submitted 2 March, 2018; v1 submitted 17 October, 2017; originally announced October 2017.
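
The core idea in the abstract above, randomizing the simulator's dynamics during training, can be sketched roughly as follows. This is only an illustration under stated assumptions: the parameter names (`mass`, `friction`, `damping`), their ranges, and the helpers `sample_dynamics` and `training_episodes` are hypothetical and are not taken from the paper, and uniform per-episode sampling is just one simple choice.

```python
import random

# Hypothetical physical parameters and ranges, chosen only for illustration.
PARAM_RANGES = {
    "mass": (0.5, 2.0),
    "friction": (0.2, 1.0),
    "damping": (0.01, 0.1),
}


def sample_dynamics(rng, ranges=PARAM_RANGES):
    """Draw one set of simulator dynamics uniformly from the given ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}


def training_episodes(num_episodes, seed=0):
    """Yield (episode index, dynamics) pairs with fresh dynamics per episode.

    A real training loop would configure the simulator with these values
    and then roll out the policy; that part is omitted here.
    """
    rng = random.Random(seed)
    for episode in range(num_episodes):
        yield episode, sample_dynamics(rng)


if __name__ == "__main__":
    for episode, dynamics in training_episodes(3):
        print(episode, dynamics)
```

Because the policy never sees the same dynamics twice in a row, it cannot overfit to a single simulator configuration, which is the intuition the abstract gives for transfer to the real system.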

Showing 1–42 of 42 results for author: Peng, B