-
A Supervised Information Enhanced Multi-Granularity Contrastive Learning Framework for EEG Based Emotion Recognition
Authors:
Xiang Li,
Jian Song,
Zhigang Zhao,
Chunxiao Wang,
Dawei Song,
Bin Hu
Abstract:
This study introduces a novel Supervised Info-enhanced Contrastive Learning framework for EEG based Emotion Recognition (SI-CLEER). SI-CLEER employs multi-granularity contrastive learning to create robust EEG contextual representations, potentially improving emotion recognition effectiveness. Unlike existing methods solely guided by classification loss, we propose a joint learning model combining self-supervised contrastive learning loss and supervised classification loss. This model optimizes both loss functions, capturing subtle EEG signal differences specific to emotion detection. Extensive experiments demonstrate SI-CLEER's robustness and superior accuracy on the SEED dataset compared to state-of-the-art methods. Furthermore, we analyze electrode performance, highlighting the significance of central frontal and temporal brain region EEGs in emotion detection. This study offers a universally applicable approach with potential benefits for diverse EEG classification tasks.
Submitted 12 May, 2024;
originally announced May 2024.
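The joint objective described in the SI-CLEER abstract above, a self-supervised contrastive loss combined with a supervised classification loss, can be sketched in a few lines. This is only a minimal illustration and not the authors' implementation: the InfoNCE-style contrastive term, the cosine similarity, the `temperature` and `weight` parameters, and all function names are assumptions introduced here.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (supervised classification loss)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def cosine(u, v):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Contrastive loss: the positive should score higher than every negative."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    return cross_entropy([s / temperature for s in sims], 0)


def joint_loss(logits, label, anchor, positive, negatives, weight=1.0):
    """Assumed weighted sum of the two objectives; `weight` is hypothetical."""
    contrastive = info_nce(anchor, positive, negatives)
    return cross_entropy(logits, label) + weight * contrastive
```

In this sketch, setting `weight=0.0` recovers a model guided by classification loss alone, which is the baseline the abstract contrasts against.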
-
Stability and Performance Analysis of Discrete-Time ReLU Recurrent Neural Networks
Authors:
Sahel Vahedi Noori,
Bin Hu,
Geir Dullerud,
Peter Seiler
Abstract:
This paper presents sufficient conditions for the stability and $\ell_2$-gain performance of recurrent neural networks (RNNs) with ReLU activation functions. These conditions are derived by combining Lyapunov/dissipativity theory with Quadratic Constraints (QCs) satisfied by repeated ReLUs. We write a general class of QCs for repeated ReLUs using known properties for the scalar ReLU. Our stability and performance condition uses these QCs along with a "lifted" representation for the ReLU RNN. We show that the positive homogeneity property satisfied by a scalar ReLU does not expand the class of QCs for the repeated ReLU. We present examples to demonstrate the stability/performance condition and study the effect of the lifting horizon.
Submitted 14 May, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
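The scalar ReLU properties that the abstract above builds on can be checked numerically. The sketch below verifies four standard facts about w = max(0, v): nonnegativity, w ≥ v, complementarity w(w − v) = 0, and positive homogeneity. It then evaluates one simple quadratic expression implied by complementarity for a repeated ReLU. The function names are hypothetical, and this single constraint is only an example; it is not the general class of QCs derived in the paper.

```python
def relu(v):
    """Scalar ReLU."""
    return max(0.0, v)


def scalar_properties_hold(v, beta=2.0, tol=1e-12):
    """Check standard scalar ReLU properties at the point v (beta must be >= 0)."""
    w = relu(v)
    return (
        w >= 0.0                                   # nonnegativity
        and w >= v                                 # output dominates input
        and abs(w * (w - v)) <= tol                # complementarity
        and abs(relu(beta * v) - beta * w) <= tol  # positive homogeneity
    )


def diagonal_qc(vs, lambdas):
    """Evaluate sum_i lambda_i * w_i * (w_i - v_i) with w_i = relu(v_i).

    By complementarity every term vanishes, so the result is zero for any
    choice of the multipliers lambda_i.
    """
    return sum(lam * relu(v) * (relu(v) - v) for v, lam in zip(vs, lambdas))
```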
-
Diffusion Model Based Visual Compensation Guidance and Visual Difference Analysis for No-Reference Image Quality Assessment
Authors:
Zhaoyang Wang,
Bo Hu,
Mingyang Zhang,
Jie Li,
Leida Li,
Maoguo Gong,
Xinbo Gao
Abstract:
Existing free-energy guided No-Reference Image Quality Assessment (NR-IQA) methods still struggle to balance learning pixel-level feature information with capturing high-level feature information, and efficiently utilizing the obtained high-level feature information remains a challenge. As a novel class of state-of-the-art (SOTA) generative models, the diffusion model can capture intricate relationships, enabling a comprehensive understanding of images and better learning of both high-level and low-level visual features. In view of this, we pioneer the exploration of the diffusion model in the domain of NR-IQA. Firstly, we devise a new diffusion restoration network that leverages the produced enhanced image and noise-containing images, incorporating nonlinear features obtained during the denoising process of the diffusion model, as high-level visual information. Secondly, two visual evaluation branches are designed to comprehensively analyze the obtained high-level feature information. These include the visual compensation guidance branch, grounded in the transformer architecture and noise embedding strategy, and the visual difference analysis branch, built on the ResNet architecture and the residual transposed attention block. Extensive experiments are conducted on seven public NR-IQA datasets, and the results demonstrate that the proposed model outperforms SOTA methods for NR-IQA.
Submitted 22 February, 2024;
originally announced February 2024.
-
Online Signed Sampling of Bandlimited Graph Signals
Authors:
Wenwei Liu,
Hui Feng,
Feng Ji,
Bo Hu
Abstract:
The theory of sampling and recovery of bandlimited graph signals has been extensively studied. However, in many cases, the observation of a signal is quite coarse. For example, users only provide simple comments such as "like" or "dislike" for a product on an e-commerce platform. This is a particular scenario where only the sign information of a graph signal can be measured. In this paper, we are interested in how to sample based on sign information in an online manner, by which the direction of the original graph signal can be estimated. The online signed sampling problem of a graph signal can be formulated as a Markov decision process in a finite horizon. Unfortunately, it is intractable for large graphs. We propose a low-complexity greedy signed sampling algorithm (GSS) as well as a stopping criterion. Meanwhile, we prove that the objective function is adaptive monotonic and adaptive submodular, so that the performance is close enough to the global optimum with a lower bound. Finally, we demonstrate the effectiveness of the GSS algorithm on both synthetic and real-world data.
Submitted 18 February, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition
Authors:
Yi Chang,
Zhao Ren,
Zixing Zhang,
Xin Jing,
Kun Qian,
Xi Shao,
Bin Hu,
Tanja Schultz,
Björn W. Schuller
Abstract:
Speech contains rich information on the emotions of humans, and Speech Emotion Recognition (SER) has been an important topic in the area of human-computer interaction. The robustness of SER models is crucial, particularly in privacy-sensitive and reliability-demanding domains like private healthcare. Recently, the vulnerability of deep neural networks in the audio domain to adversarial attacks has become a popular area of research. However, prior works on adversarial attacks in the audio domain primarily rely on iterative gradient-based techniques, which are time-consuming and prone to overfitting the specific threat model. Furthermore, the exploration of sparse perturbations, which have the potential for better stealthiness, remains limited in the audio domain. To address these challenges, we propose a generator-based attack method to generate sparse and transferable adversarial examples to deceive SER models in an end-to-end and efficient manner. We evaluate our method on two widely-used SER datasets, Database of Elicited Mood in Speech (DEMoS) and Interactive Emotional dyadic MOtion CAPture (IEMOCAP), and demonstrate its ability to generate successful sparse adversarial examples in an efficient manner. Moreover, our generated adversarial examples exhibit model-agnostic transferability, enabling effective adversarial attacks on advanced victim models.
Submitted 2 February, 2024;
originally announced February 2024.
-
Segment Anything in Defect Detection
Authors:
Bozhen Hu,
Bin Gao,
Cheng Tan,
Tongle Wu,
Stan Z. Li
Abstract:
Defect detection plays a crucial role in infrared non-destructive testing systems, offering non-contact, safe, and efficient inspection capabilities. However, challenges such as low resolution, high noise, and uneven heating in infrared thermal images hinder comprehensive and accurate defect detection. In this study, we propose DefectSAM, a novel approach for segmenting defects on highly noisy thermal images based on the widely adopted model, Segment Anything (SAM)\cite{kirillov2023segany}. Harnessing the power of a meticulously curated dataset generated through labor-intensive lab experiments and valuable prompts from experienced experts, DefectSAM surpasses existing state-of-the-art segmentation algorithms and achieves significant improvements in defect detection rates. Notably, DefectSAM excels in detecting weaker and smaller defects on complex and irregular surfaces, reducing the occurrence of missed detections and providing more accurate defect size estimations. Experimental studies conducted on various materials have validated the effectiveness of our solutions in defect detection, which hold significant potential to expedite the evolution of defect detection tools, enabling enhanced inspection capabilities and accuracy in defect identification.
Submitted 16 November, 2023;
originally announced November 2023.
-
SG-GAN: Fine Stereoscopic-Aware Generation for 3D Brain Point Cloud Up-sampling from a Single Image
Authors:
Bowen Hu,
Baiying Lei,
Shuqiang Wang
Abstract:
In minimally-invasive brain surgeries with indirect and narrow operating environments, 3D brain reconstruction is crucial. However, as the accuracy requirements of some new minimally-invasive surgeries (such as brain-computer interface surgery) continue to rise, the outputs of conventional 3D reconstruction, such as point clouds (PCs), face the challenges that sample points are too sparse and the precision is insufficient. On the other hand, there is a scarcity of high-density point cloud datasets, which makes it challenging to train models for direct reconstruction of high-density brain point clouds. In this work, a novel two-stage model named stereoscopic-aware graph generative adversarial network (SG-GAN) is proposed to generate fine high-density PCs conditioned on a single image. The Stage-I GAN sketches the primitive shape and basic structure of the organ based on the given image, yielding Stage-I point clouds. The Stage-II GAN takes the results from Stage-I and generates high-density point clouds with detailed features. The Stage-II GAN is capable of correcting defects and restoring the detailed features of the region of interest (ROI) through the up-sampling process. Furthermore, a parameter-free-attention-based free-transforming module is developed to learn the efficient features of the input while maintaining promising performance. Compared with existing methods, the SG-GAN model shows superior performance in terms of visual quality, objective measurements, and performance in classification, as demonstrated by comprehensive results measured by several evaluation metrics including PC-to-PC error and Chamfer distance.
Submitted 21 May, 2023;
originally announced May 2023.
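The Chamfer distance named as an evaluation metric in the SG-GAN abstract above can be sketched as follows. Definitions vary between papers (squared or unsquared distances, sums or means), so this version, which adds the mean squared nearest-neighbour distance in each direction, is one common convention assumed here and not necessarily the one the authors used. The function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance between two non-empty point clouds.

    Each direction averages, over the points of one cloud, the squared
    distance to the nearest point of the other cloud. Brute force, O(n * m).
    """
    a_to_b = sum(min(sq_dist(p, q) for q in cloud_b) for p in cloud_a) / len(cloud_a)
    b_to_a = sum(min(sq_dist(q, p) for p in cloud_a) for q in cloud_b) / len(cloud_b)
    return a_to_b + b_to_a
```

For a sparse cloud and an up-sampled version of it, the sparse-to-dense term is zero when every sparse point survives in the dense cloud, so the distance is driven by how far the added points lie from the sparse ones.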
-
Learning the Kalman Filter with Fine-Grained Sample Complexity
Authors:
Xiangyuan Zhang,
Bin Hu,
Tamer Başar
Abstract:
We develop the first end-to-end sample complexity of model-free policy gradient (PG) methods in discrete-time infinite-horizon Kalman filtering. Specifically, we introduce the receding-horizon policy gradient (RHPG-KF) framework and demonstrate $\tilde{\mathcal{O}}(ε^{-2})$ sample complexity for RHPG-KF in learning a stabilizing filter that is $ε$-close to the optimal Kalman filter. Notably, the proposed RHPG-KF framework neither requires the system to be open-loop stable nor assumes any prior knowledge of a stabilizing filter. Our results shed light on applying model-free PG methods to control a linear dynamical system where the state measurements could be corrupted by statistical noise and other (possibly adversarial) disturbances.
Submitted 27 February, 2023; v1 submitted 29 January, 2023;
originally announced January 2023.
-
Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data
Authors:
Yuhao Zhang,
Chen Xu,
Bojie Hu,
Chunliang Zhang,
Tong Xiao,
Jingbo Zhu
Abstract:
We present a method for introducing a text encoder into pre-trained end-to-end speech translation systems. It enhances the ability to adapt one modality (i.e., source-language speech) to another (i.e., source-language text). Thus, the speech translation model can learn from both unlabeled and labeled data, especially when the source-language text data is abundant. Beyond this, we present a denoising method to build a robust text encoder that can deal with both normal and noisy text data. Our system sets new state-of-the-art results on the MuST-C En-De, En-Fr, and LibriSpeech En-Fr tasks.
Submitted 4 December, 2022;
originally announced December 2022.
-
Linear RNNs Provably Learn Linear Dynamic Systems
Authors:
Lifu Wang,
Tianyu Wang,
Shengwei Yi,
Bo Shen,
Bo Hu,
Xing Cao
Abstract:
We study the learning ability of linear recurrent neural networks with Gradient Descent. We prove the first theoretical guarantee on linear RNNs to learn any stable linear dynamic system using a large class of loss functions. For an arbitrary stable linear system with a parameter $ρ_C$ related to the transition matrix $C$, we show that, despite the non-convexity of the parameter optimization loss, if the width of the RNN is large enough (and the required width in hidden layers does not rely on the length of the input sequence), a linear RNN can provably learn any stable linear dynamic system with sample and time complexity polynomial in $\frac{1}{1-ρ_C}$. Our results provide the first theoretical guarantee to learn a linear RNN and demonstrate how the recurrent structure can help to learn a dynamic system.
Submitted 22 October, 2023; v1 submitted 18 November, 2022;
originally announced November 2022.
-
Knowing the Past to Predict the Future: Reinforcement Virtual Learning
Authors:
Peng Zhang,
Yawen Huang,
Bingzhang Hu,
Shizheng Wang,
Haoran Duan,
Noura Al Moubayed,
Yefeng Zheng,
Yang Long
Abstract:
Reinforcement Learning (RL)-based control systems have received considerable attention in recent decades. However, in many real-world problems, such as Batch Process Control, the environment is uncertain, which requires expensive interaction to acquire the state and reward values. In this paper, we present a cost-efficient framework, such that the RL model can evolve by itself in a Virtual Space using predictive models with only historical data. The proposed framework enables a step-by-step RL model to predict the future state and select optimal actions for long-sight decisions. The main focuses are summarized as: 1) how to balance the long-sight and short-sight rewards with an optimal strategy; 2) how to make the virtual model interact with the real environment to converge to a final learning policy. Under the experimental settings of Fed-Batch Process, our method consistently outperforms the existing state-of-the-art methods.
Submitted 2 November, 2022;
originally announced November 2022.
-
Global Convergence of Direct Policy Search for State-Feedback $\mathcal{H}_\infty$ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential
Authors:
Xingang Guo,
Bin Hu
Abstract:
Direct policy search has been widely applied in modern reinforcement learning and continuous control. However, the theoretical properties of direct policy search on nonsmooth robust control synthesis have not been fully understood. The optimal $\mathcal{H}_\infty$ control framework aims at designing a policy to minimize the closed-loop $\mathcal{H}_\infty$ norm, and is arguably the most fundamental robust control paradigm. In this work, we show that direct policy search is guaranteed to find the global solution of the robust $\mathcal{H}_\infty$ state-feedback control design problem. Notice that policy search for optimal $\mathcal{H}_\infty$ control leads to a constrained nonconvex nonsmooth optimization problem, where the nonconvex feasible set consists of all the policies stabilizing the closed-loop dynamics. We show that for this nonsmooth optimization problem, all Clarke stationary points are global minima. Next, we identify the coerciveness of the closed-loop $\mathcal{H}_\infty$ objective function, and prove that all the sublevel sets of the resultant policy search problem are compact. Based on these properties, we show that Goldstein's subgradient method and its implementable variants can be guaranteed to stay in the nonconvex feasible set and eventually find the global optimal solution of the $\mathcal{H}_\infty$ state-feedback synthesis problem. Our work builds a new connection between nonconvex nonsmooth optimization theory and robust control, leading to an interesting global convergence result for direct policy search on optimal $\mathcal{H}_\infty$ synthesis.
Submitted 20 October, 2022;
originally announced October 2022.
-
Sampling of Correlated Bandlimited Continuous Signals by Joint Time-vertex Graph Fourier Transform
Authors:
Zhongyi Ni,
Feng Ji,
Hang Sheng,
Hui Feng,
Bo Hu
Abstract:
When sampling multiple signals, the correlation between the signals can be exploited to reduce the overall number of samples. In this paper, we study the sampling theory of multiple correlated signals, using correlation to sample them at the lowest sampling rate. Based on the correlation between signal sources, we model multiple continuous-time signals as continuous time-vertex graph signals. The graph signals are projected onto orthogonal bases to remove spatial correlation and reduce dimensions by graph Fourier transform. When the bandwidths of the original signals and the reduced dimension signals are given, we prove the minimum sampling rate required for recovery of the original signals, and propose a feasible sampling scheme.
Submitted 10 October, 2022;
originally announced October 2022.
-
Weight-based Channel-model Matrix Framework provides a reasonable solution for EEG-based cross-dataset emotion recognition
Authors:
Huayu Chen,
Huanhuan He,
Jing Zhu,
Shuting Sun,
Jianxiu Li,
Xuexiao Shao,
Junxiang Li,
Xiaowei Li,
Bin Hu
Abstract:
Cross-dataset emotion recognition, an extremely challenging task in the field of EEG-based affective computing, is influenced by many factors, which makes universal models yield unsatisfactory results. Given the lack of research on decoding EEG information, we first analyzed the impact of different EEG information (individual, session, emotion and trial) on emotion recognition by sample space visualization, sample aggregation phenomena quantification, and energy pattern analysis on five public datasets. Based on these phenomena and patterns, we provided processing methods and interpretable work for various EEG differences. Through the analysis of emotional feature distribution patterns, the Individual Emotional Feature Distribution Difference (IEFDD) was found, which was also considered the main factor affecting the stability of emotion recognition. After analyzing the limitations of the traditional modeling approach under IEFDD, the Weight-based Channel-model Matrix Framework (WCMF) was proposed. To reasonably characterize emotional feature distribution patterns, four weight extraction methods were designed, and the optimal one was the correction T-test (CT) weight extraction method. Finally, the performance of WCMF was validated on cross-dataset tasks in two kinds of experiments that simulated different practical scenarios, and the results showed that WCMF had more stable and better emotion recognition ability.
Submitted 3 November, 2022; v1 submitted 13 September, 2022;
originally announced September 2022.
-
Constraining Pseudo-label in Self-training Unsupervised Domain Adaptation with Energy-based Model
Authors:
Lingsheng Kong,
Bo Hu,
Xiongchang Liu,
Jun Lu,
Jane You,
Xiaofeng Liu
Abstract:
Deep learning is usually data starved, and unsupervised domain adaptation (UDA) has been developed to transfer knowledge from the labeled source domain to the unlabeled target domain. Recently, deep self-training has emerged as a powerful means for UDA, involving an iterative process of predicting on the target domain and then taking the confident predictions as hard pseudo-labels for retraining. However, the pseudo-labels are usually unreliable, thus easily leading to deviated solutions with propagated errors. In this paper, we resort to the energy-based model and constrain the training of the unlabeled target sample with an energy function minimization objective. This can be achieved via a simple additional regularization or an energy-based loss. This framework allows us to gain the benefits of the energy-based model while retaining strong discriminative performance in a plug-and-play fashion. The convergence property and its connection with classification expectation minimization are investigated. We deliver extensive experiments on the most popular and large-scale UDA benchmarks of image classification as well as semantic segmentation to demonstrate its generality and effectiveness.
Submitted 26 August, 2022;
originally announced August 2022.
-
Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation
Authors:
Xingang Guo,
Bin Hu
Abstract:
In this paper, we consider the policy evaluation problem in multi-agent reinforcement learning (MARL) and derive exact closed-form formulas for the finite-time mean-squared estimation errors of decentralized temporal difference (TD) learning with linear function approximation. Our analysis hinges upon the fact that the decentralized TD learning method can be viewed as a Markov jump linear system (MJLS). Then standard MJLS theory can be applied to quantify the mean and covariance matrix of the estimation error of the decentralized TD method at every time step. Various implications of our exact formulas on the algorithm performance are also discussed. An interesting finding is that under a necessary and sufficient stability condition, the mean-squared TD estimation error will converge to an exact limit at a specific exponential rate.
Submitted 20 April, 2022;
originally announced April 2022.
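For readers unfamiliar with the update analyzed in the abstract above, a single-agent TD(0) step with linear function approximation can be sketched as below. This is a simplified illustration under stated assumptions: it omits the decentralized (multi-agent) averaging that the paper studies, and the function name, the step size `alpha`, and the discount `gamma` are choices made here for the example.

```python
def td0_update(theta, phi_s, phi_next, reward, alpha=0.1, gamma=0.9):
    """One TD(0) step for a linear value estimate V(s) = phi(s) . theta.

    theta    : current weight vector (list of floats)
    phi_s    : feature vector of the current state
    phi_next : feature vector of the next state
    """
    v_s = sum(t * f for t, f in zip(theta, phi_s))
    v_next = sum(t * f for t, f in zip(theta, phi_next))
    td_error = reward + gamma * v_next - v_s
    # Move the weights along the current features, scaled by the TD error.
    return [t + alpha * td_error * f for t, f in zip(theta, phi_s)]


# Toy usage: one state with a self-loop and reward 1, so the true value is
# 1 / (1 - gamma) = 10. The iterate obeys theta <- 0.99 * theta + 0.1, which
# converges to 10 geometrically.
theta = [0.0]
for _ in range(2000):
    theta = td0_update(theta, [1.0], [1.0], 1.0)
```

In this deterministic toy case the error shrinks by a factor of 0.99 per step, a small-scale analogue of convergence to a limit at an exponential rate.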
-
EEG based Emotion Recognition: A Tutorial and Review
Authors:
Xiang Li,
Yazhou Zhang,
Prayag Tiwari,
Dawei Song,
Bin Hu,
Meihong Yang,
Zhigang Zhao,
Neeraj Kumar,
Pekka Marttinen
Abstract:
Emotion recognition technology through analyzing the EEG signal is currently an essential concept in Artificial Intelligence and holds great potential in emotional health care, human-computer interaction, multimedia content recommendation, etc. Though there have been several works devoted to reviewing EEG-based emotion recognition, the content of these reviews needs to be updated. In addition, those works are either fragmented in content or only focus on specific techniques adopted in this area but neglect the holistic perspective of the entire technical routes. Hence, in this paper, we review from the perspective of researchers who are taking their first steps on this topic. We review the recent representative works in EEG-based emotion recognition research and provide a tutorial to guide researchers starting from the beginning. The scientific basis of EEG-based emotion recognition at the psychological and physiological levels is introduced. Further, we categorize the reviewed works into different technical routes and illustrate the theoretical basis and the research motivation, which will help readers better understand why those techniques are studied and employed. Finally, existing challenges and future investigations are also discussed, to guide researchers in deciding on potential future research directions.
Submitted 16 March, 2022;
originally announced March 2022.
-
Audio Self-supervised Learning: A Survey
Authors:
Shuo Liu,
Adria Mallol-Ragolta,
Emilia Parada-Cabeleiro,
Kun Qian,
Xin Jing,
Alexander Kathan,
Bin Hu,
Bjoern W. Schuller
Abstract:
Inspired by humans' cognitive ability to generalise knowledge and skills, Self-Supervised Learning (SSL) aims to discover general representations from large-scale data without requiring human annotations, which are expensive and time-consuming to obtain. Its success in the fields of computer vision and natural language processing has prompted its recent adoption into the field of audio and speech processing. Comprehensive reviews summarising the knowledge in audio SSL are currently missing. To fill this gap, in the present work, we provide an overview of the SSL methods used for audio and speech processing applications. Herein, we also summarise the empirical works that exploit the audio modality in multi-modal SSL frameworks, and the existing suitable benchmarks to evaluate the power of SSL in the computer audition domain. Finally, we discuss some open problems and point out the future directions on the development of audio SSL.
Submitted 2 March, 2022;
originally announced March 2022.
-
Convex Programs and Lyapunov Functions for Reinforcement Learning: A Unified Perspective on the Analysis of Value-Based Methods
Authors:
Xingang Guo,
Bin Hu
Abstract:
Value-based methods play a fundamental role in Markov decision processes (MDPs) and reinforcement learning (RL). In this paper, we present a unified control-theoretic framework for analyzing valued-based methods such as value computation (VC), value iteration (VI), and temporal difference (TD) learning (with linear function approximation). Built upon an intrinsic connection between value-based methods and dynamic systems, we can directly use existing convex testing conditions in control theory to derive various convergence results for the aforementioned value-based methods. These testing conditions are convex programs in form of either linear programming (LP) or semidefinite programming (SDP), and can be solved to construct Lyapunov functions in a straightforward manner. Our analysis reveals some intriguing connections between feedback control systems and RL algorithms. It is our hope that such connections can inspire more work at the intersection of system/control theory and RL.
△ Less
Submitted 14 February, 2022;
originally announced February 2022.
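For readers unfamiliar with the value-based methods named in the abstract above, the following is a minimal sketch of plain value iteration on a finite MDP, viewed as the discrete-time iteration v_{k+1} = T(v_k). It is not the paper's LP/SDP analysis. The function name `value_iteration`, the array layout, and the two-state demo MDP are assumptions made only for illustration.

```python
import numpy as np


def value_iteration(P, r, gamma, tol=1e-10, max_iter=10_000):
    """Run value iteration on a finite MDP (illustrative sketch).

    P: array of shape (A, S, S); P[a, s, t] = Pr(next state t | state s, action a).
    r: array of shape (S, A); expected one-step reward.
    gamma: discount factor in [0, 1).
    Returns the value estimate and the number of sweeps performed.
    """
    P = np.asarray(P, dtype=float)
    r = np.asarray(r, dtype=float)
    v = np.zeros(r.shape[0])
    for k in range(max_iter):
        # Bellman optimality backup: q[s, a] = r[s, a] + gamma * sum_t P[a, s, t] * v[t]
        q = r + gamma * np.einsum("ast,t->sa", P, v)
        v_next = q.max(axis=1)
        # Stop once successive iterates are within tol in the sup norm.
        if np.max(np.abs(v_next - v)) < tol:
            return v_next, k + 1
        v = v_next
    return v, max_iter


# Hypothetical two-state, two-action MDP, used only for illustration.
P_demo = np.array([
    [[0.9, 0.1], [0.2, 0.8]],  # transitions under action 0
    [[0.5, 0.5], [0.0, 1.0]],  # transitions under action 1
])
r_demo = np.array([[1.0, 0.0], [0.0, 2.0]])
v_star, sweeps = value_iteration(P_demo, r_demo, gamma=0.9)
```

Because the Bellman backup is a contraction in the sup norm for gamma < 1, the iterates behave like the state of a stable dynamical system, which is the kind of connection the abstract refers to.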
-
Delving into Rectifiers in Style-Based Image Translation
Authors:
Yipeng Zhang,
Bingliang Hu,
Hailong Ning,
Quang Wang
Abstract:
While modern image translation techniques can create photorealistic synthetic images, they have limited style controllability and thus can suffer from translation errors. In this work, we show that the activation function is one of the crucial components in controlling the direction of image synthesis. Specifically, we explicitly demonstrate that the slope parameters of the rectifier can change the data distribution and be used independently to control the direction of translation. To improve style controllability, two simple but effective techniques are proposed: Adaptive ReLU (AdaReLU) and a structural adaptive function. AdaReLU can dynamically adjust the slope parameters according to the target style and can be combined with Adaptive Instance Normalization (AdaIN) to increase controllability. Meanwhile, the structural adaptive function enables rectifiers to manipulate the structure of feature maps more effectively. It is composed of the proposed structural convolution (StruConv), an efficient convolutional module that can choose the area to be activated based on the mean and variance specified by AdaIN. Extensive experiments show that the proposed techniques can greatly increase network controllability and output diversity in style-based image translation tasks.
Submitted 23 November, 2021; v1 submitted 20 November, 2021;
originally announced November 2021.
-
Recovery of Graph Signals from Sign Measurements
Authors:
Wenwei Liu,
Hui Feng,
Kaixuan Wang,
Feng Ji,
Bo Hu
Abstract:
Sampling and interpolation have been extensively studied in order to reconstruct or estimate an entire graph signal from the signal values on a subset of vertices; most of the results concern continuous signals. However, in many signal processing tasks, signals are not fully observed and only their signs are available; for example, a rating system may only provide several simple options. In this paper, the reconstruction of band-limited graph signals based on sign sampling is discussed and a greedy sampling strategy is proposed. Simulation experiments are presented in which the greedy sampling algorithm is compared with a random sampling algorithm, and the results verify the validity of the proposed approach.
Submitted 26 September, 2021;
originally announced September 2021.
-
3D Brain Reconstruction by Hierarchical Shape-Perception Network from a Single Incomplete Image
Authors:
Bowen Hu,
Baiying Lei,
Shuqiang Wang,
Yong Liu,
Bingchuan Wang,
Min Gan,
Yanyan Shen
Abstract:
3D shape reconstruction is essential in the navigation of minimally invasive and auto robot-guided surgeries, whose operating environments are indirect and narrow, and some works have focused on reconstructing the 3D shape of the surgical organ from the limited 2D information available. However, the lack and incompleteness of such information caused by intraoperative emergencies (such as bleeding) and risk control conditions have not been considered. In this paper, a novel hierarchical shape-perception network (HSPN) is proposed to reconstruct the 3D point clouds (PCs) of specific brains from one single incomplete image with low latency. A branching predictor and several hierarchical attention pipelines are constructed to generate point clouds that accurately describe the incomplete images and then complete these point clouds with high quality. Meanwhile, attention gate blocks (AGBs) are designed to efficiently aggregate geometric local features of incomplete PCs transmitted by hierarchical attention pipelines and internal features of reconstructing point clouds. With the proposed HSPN, 3D shape perception and completion can be achieved spontaneously. Comprehensive results measured by Chamfer distance and PC-to-PC error demonstrate that the proposed HSPN outperforms other competitive methods in terms of qualitative displays, quantitative experiments, and classification evaluation.
Submitted 11 October, 2021; v1 submitted 22 July, 2021;
originally announced July 2021.
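The abstract above reports results in terms of Chamfer distance between point clouds. As a rough illustration of that metric, here is a minimal NumPy sketch. Conventions differ between papers (squared versus unsquared distances, sum versus mean), so the particular form below, with squared distances averaged in each direction, is an assumption and not necessarily the variant the authors used. The name `chamfer_distance` is likewise hypothetical.

```python
import numpy as np


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point clouds a (N, D) and b (M, D).

    Assumed convention: mean squared nearest-neighbour distance from a to b,
    plus the same quantity from b to a.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    # Nearest neighbour in b for every point of a, and vice versa.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

This brute-force version uses O(N·M) memory, which is fine for a few thousand points; larger clouds would typically need a spatial index or batching.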
-
A Point Cloud Generative Model via Tree-Structured Graph Convolutions for 3D Brain Shape Reconstruction
Authors:
Bowen Hu,
Baiying Lei,
Yanyan Shen,
Yong Liu,
Shuqiang Wang
Abstract:
Fusing medical images and the corresponding 3D shape representation can provide complementary information and microstructure details to improve the operational performance and accuracy in brain surgery. However, compared to the substantial image data, it is almost impossible to obtain the intraoperative 3D shape information by using physical methods such as sensor scanning, especially in minimally invasive surgery and robot-guided surgery. In this paper, a general generative adversarial network (GAN) architecture based on graph convolutional networks is proposed to reconstruct the 3D point clouds (PCs) of brains by using one single 2D image, thus relieving the limitation of acquiring 3D shape data during surgery. Specifically, a tree-structured generative mechanism is constructed to use the latent vector effectively and transfer features between hidden layers accurately. With the proposed generative model, a spontaneous image-to-PC conversion is finished in real-time. Competitive qualitative and quantitative experimental results have been achieved on our model. In multiple evaluation methods, the proposed model outperforms another common point cloud generative model PointOutNet.
Submitted 21 July, 2021;
originally announced July 2021.
-
Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation
Authors:
Kai Ye,
Yinru Ye,
Minqiang Yang,
Bin Hu
Abstract:
The main challenges of image-to-image (I2I) translation are to make the translated image realistic and to retain as much information from the source domain as possible. To address this issue, we propose a novel architecture, termed IEGAN, which removes the encoder of each network and introduces an encoder that is independent of the other networks. Compared with previous models, our model has three advantages. Firstly, it grasps image information more directly and comprehensively, since the encoder no longer receives loss from the generator and discriminator. Secondly, the independent encoder allows each network to focus more on its own goal, which makes the translated image more realistic. Thirdly, the reduction in the number of encoders yields a more unified image representation. However, when the independent encoder applies two down-sampling blocks, it is hard to extract semantic information. To tackle this problem, we propose deep and shallow information spaces containing characteristic and semantic information, which can guide the model to translate high-quality images in tasks with significant shape or texture change. We compare IEGAN with previous models and also conduct studies on semantic information consistency and component ablation. These experiments show the superiority and effectiveness of our architecture. Our code is published at: https://github.com/Elvinky/IEGAN.
Submitted 6 July, 2021;
originally announced July 2021.
-
A Hybrid Wired/Wireless Deterministic Network for Smart Grid
Authors:
Bin Hu,
Hamid Gharavi
Abstract:
With the rapid growth of time-critical applications in smart grid, robotics, autonomous vehicles, and industrial automation, demand for high reliability, low latency and strictly bounded jitter is sharply increasing. High-precision time synchronization communications, such as Time Triggered Ethernet (TTE), have been successfully developed for wired networks. However, the high cost of deploying additional equipment and extra wiring limits the scalability of these networks. Therefore, in this paper, a hybrid wired/wireless high-precision time synchronization network based on a combination of high-speed TTE and 5G Ultra-Reliable and Low-Latency Communications (URLLC) is proposed. The main motivation is to comply with the low latency, low jitter, and high reliability requirements of time critical applications, such as smart grid synchrophasor communications. Therefore, in the proposed hybrid network architecture, a high-speed TTE is considered as the main bus (i.e., backbone network), whereas a Precision Time Protocol (PTP) aided 5G-URLLC-based wireless access is used as a sub-network. The main challenge is to achieve interoperability between the PTP aided URLLC and the TTE, while ensuring high precision timing and synchronization. The simulation results demonstrate the impact of the PTP-aided URLLC in maintaining network reliability, latency, and jitter in full coordination with the TTE-network.
Submitted 12 May, 2021;
originally announced May 2021.
-
Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders
Authors:
Chen Xu,
Bojie Hu,
Yanyang Li,
Yuhao Zhang,
shen huang,
Qi Ju,
Tong Xiao,
Jingbo Zhu
Abstract:
Encoder pre-training is promising in end-to-end Speech Translation (ST), given the fact that speech-to-translation data is scarce. But ST encoders are not simple instances of Automatic Speech Recognition (ASR) or Machine Translation (MT) encoders. For example, we find that ASR encoders lack the global context representation, which is necessary for translation, whereas MT encoders are not designed to deal with long but locally attentive acoustic sequences. In this work, we propose a Stacked Acoustic-and-Textual Encoding (SATE) method for speech translation. Our encoder begins with processing the acoustic sequence as usual, but later behaves more like an MT encoder for a global representation of the input sequence. In this way, it is straightforward to incorporate the pre-trained models into the system. Also, we develop an adaptor module to alleviate the representation inconsistency between the pre-trained ASR encoder and MT encoder, and develop a multi-teacher knowledge distillation method to preserve the pre-training knowledge. Experimental results on the LibriSpeech En-Fr and MuST-C En-De ST tasks show that our method achieves state-of-the-art BLEU scores of 18.3 and 25.2. To our knowledge, we are the first to develop an end-to-end ST system that achieves comparable or even better BLEU performance than the cascaded ST counterpart when large-scale ASR and MT data is available.
Submitted 15 June, 2021; v1 submitted 12 May, 2021;
originally announced May 2021.
-
On Imitation Learning of Linear Control Policies: Enforcing Stability and Robustness Constraints via LMI Conditions
Authors:
Aaron Havens,
Bin Hu
Abstract:
When applying imitation learning techniques to fit a policy from expert demonstrations, one can take advantage of prior stability/robustness assumptions on the expert's policy and incorporate such control-theoretic prior knowledge explicitly into the learning process. In this paper, we formulate the imitation learning of linear policies as a constrained optimization problem, and present efficient methods that can be used to enforce stability and robustness constraints during the learning process. Specifically, we show that one can guarantee closed-loop stability and robustness by imposing linear matrix inequality (LMI) constraints on the fitted policy. Then both the projected gradient descent method and the alternating direction method of multipliers (ADMM) can be applied to solve the resulting constrained policy fitting problem. Finally, we provide numerical results to demonstrate the effectiveness of our methods in producing linear policies with various stability and robustness guarantees.
Submitted 23 March, 2021;
originally announced March 2021.
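To make the stability constraint mentioned in the abstract above concrete, here is a minimal sketch that checks whether a given linear policy u = Kx stabilizes a discrete-time system x_{k+1} = Ax + Bu by solving a Lyapunov equation and testing positive definiteness. This is only a certificate check for a fixed gain, not the paper's LMI-constrained fitting procedure. The discrete-time setting and the function name `lyapunov_certificate` are assumptions for illustration.

```python
import numpy as np


def lyapunov_certificate(A, B, K):
    """Check closed-loop stability of x_{k+1} = (A + B K) x_k (illustrative sketch).

    Solves P - A_cl^T P A_cl = I for P. The closed loop is Schur stable
    exactly when this equation has a positive-definite solution.
    Returns (is_stable, P); P is None if the linear system is singular.
    """
    A, B, K = (np.atleast_2d(np.asarray(M, dtype=float)) for M in (A, B, K))
    A_cl = A + B @ K
    n = A_cl.shape[0]
    # Vectorised Lyapunov equation: (I - A_cl^T kron A_cl^T) vec(P) = vec(I).
    lhs = np.eye(n * n) - np.kron(A_cl.T, A_cl.T)
    try:
        P = np.linalg.solve(lhs, np.eye(n).reshape(-1)).reshape(n, n)
    except np.linalg.LinAlgError:
        return False, None
    P = 0.5 * (P + P.T)  # symmetrise against round-off
    is_stable = bool(np.all(np.linalg.eigvalsh(P) > 0))
    return is_stable, P
```

In an LMI-based formulation, the matrix P and the gain would be decision variables of a semidefinite program; here P is computed directly because K is fixed, which keeps the sketch dependency-free.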
-
The impact of data volume on performance of deep learning based building rooftop extraction using very high spatial resolution aerial images
Authors:
Hongjie He,
Ke Yang,
Yuwei Cai,
Zijian Jiang,
Qiutong Yu,
Kun Zhao,
Junbo Wang,
Sarah Narges Fatholahi,
Yan Liu,
Hasti Andon Petrosians,
Bingxu Hu,
Liyuan Qing,
Zhehan Zhang,
Hongzhang Xu,
Siyu Li,
Kyle Gao,
Linlin Xu,
Jonathan Li
Abstract:
Building rooftop data are of importance in several urban applications and in natural disaster management. In contrast to traditional surveying and mapping, deep learning-based building rooftop extraction methods that use high spatial resolution aerial images are efficient and accurate. Although more training data is preferred in deep learning-based tasks, the effect of data volume on building extraction models is underexplored. Therefore, this paper explores the impact of data volume on the performance of building rooftop extraction from very-high-spatial-resolution (VHSR) images using deep learning-based methods. To do so, we manually labelled 0.12 m spatial resolution aerial images and performed a comparative analysis of models trained on datasets of different sizes using popular deep learning architectures for segmentation tasks, including Fully Convolutional Networks (FCN)-8s, U-Net and DeepLabv3+. The experiments showed that with more training data, algorithms converged faster and achieved higher accuracy, while better algorithms were better able to mitigate the lack of training data.
Submitted 4 October, 2021; v1 submitted 16 March, 2021;
originally announced March 2021.
-
Regularized Recovery by Multi-order Partial Hypergraph Total Variation
Authors:
Ruyuan Qu,
Jiaqi He,
Hui Feng,
Chongbin Xu,
Bo Hu
Abstract:
Capturing complex high-order interactions among data is an important task in many scenarios. A common way to model high-order interactions is to use hypergraphs whose topology can be mathematically represented by tensors. Existing methods use a fixed-order tensor to describe the topology of the whole hypergraph, which ignores the divergence of different-order interactions. In this work, we take this divergence into consideration, and propose a multi-order hypergraph Laplacian and the corresponding total variation. Taking this total variation as a regularization term, we can utilize the topology information contained by it to smooth the hypergraph signal. This can help distinguish different-order interactions and represent high-order interactions accurately.
Submitted 19 February, 2021;
originally announced February 2021.
-
Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity
Authors:
Kaiqing Zhang,
Xiangyuan Zhang,
Bin Hu,
Tamer Başar
Abstract:
Direct policy search serves as one of the workhorses in modern reinforcement learning (RL), and its applications in continuous control tasks have recently attracted increasing attention. In this work, we investigate the convergence theory of policy gradient (PG) methods for learning the linear risk-sensitive and robust controller. In particular, we develop PG methods that can be implemented in a derivative-free fashion by sampling system trajectories, and establish both global convergence and sample complexity results for two fundamental settings in risk-sensitive and robust control: the finite-horizon linear exponential quadratic Gaussian, and the finite-horizon linear-quadratic disturbance attenuation problems. As a by-product, our results also provide the first sample complexity for the global convergence of PG methods on solving zero-sum linear-quadratic dynamic games, a nonconvex-nonconcave minimax optimization problem that serves as a baseline setting in multi-agent reinforcement learning (MARL) with continuous spaces. One feature of our algorithms is that during the learning phase, a certain level of robustness/risk-sensitivity of the controller is preserved, which we term the implicit regularization property and which is an essential requirement in safety-critical control systems.
Submitted 30 December, 2021; v1 submitted 4 January, 2021;
originally announced January 2021.
-
Co-design of Optimal Transmission Power and Controller for Networked Control Systems Under State-dependent Markovian Channels
Authors:
Bin Hu,
Tua A. Tamba
Abstract:
This paper considers a co-design problem for industrial networked control systems to ensure both the stability and efficiency properties of such systems. The assurance of such properties is particularly challenging because wireless communications in industrial environments are not only subject to shadow fading but also stochastically correlated with their surrounding environments. To address such challenges, this paper first introduces a novel state-dependent Markov channel (SD-MC) model that explicitly captures the state-dependent features of industrial wireless communication systems by defining the proposed model's transition probabilities as a function of both its environment's states and transmission power. Under the proposed channel model, sufficient conditions on the Maximum Allowable Transmission Interval (MATI) are presented to ensure both asymptotic stability in expectation and almost sure asymptotic stability properties of a continuous nonlinear control system with state-dependent fading channels. Based on such conditions, the co-design problem is then formulated as a constrained polynomial optimization problem (CPOP), which can be efficiently solved using semidefinite programming methods for the case of a two-state state-dependent Markovian channel. The solutions to such a CPOP represent optimal control and power strategies that optimize the average expected joint costs in an infinite time horizon while still respecting the stability constraints. For a general SD-MC model, this paper further shows that sub-optimal solutions can be obtained from linear programming formulations of the considered CPOP. Simulation results are given to illustrate the efficacy of the proposed co-design scheme.
Submitted 27 November, 2020;
originally announced November 2020.
-
Policy Optimization for Markovian Jump Linear Quadratic Control: Gradient-Based Methods and Global Convergence
Authors:
Joao Paulo Jansch-Porto,
Bin Hu,
Geir Dullerud
Abstract:
Recently, policy optimization for control purposes has received renewed attention due to the increasing interest in reinforcement learning. In this paper, we investigate the global convergence of gradient-based policy optimization methods for quadratic optimal control of discrete-time Markovian jump linear systems (MJLS). First, we study the optimization landscape of direct policy optimization for MJLS, with static state feedback controllers and quadratic performance costs. Despite the non-convexity of the resultant problem, we are still able to identify several useful properties such as coercivity, gradient dominance, and almost smoothness. Based on these properties, we show global convergence of three types of policy optimization methods: the gradient descent method; the Gauss-Newton method; and the natural policy gradient method. We prove that all three methods converge to the optimal state feedback controller for MJLS at a linear rate if initialized at a controller which is mean-square stabilizing. Some numerical examples are presented to support the theory. This work brings new insights for understanding the performance of policy gradient methods on the Markovian jump linear quadratic control problem.
Submitted 23 November, 2020;
originally announced November 2020.
-
UDC 2020 Challenge on Image Restoration of Under-Display Camera: Methods and Results
Authors:
Yuqian Zhou,
Michael Kwan,
Kyle Tolentino,
Neil Emerton,
Sehoon Lim,
Tim Large,
Lijiang Fu,
Zhihong Pan,
Baopu Li,
Qirui Yang,
Yihao Liu,
Jigang Tang,
Tao Ku,
Shibin Ma,
Bingnan Hu,
Jiarong Wang,
Densen Puthussery,
Hrishikesh P S,
Melvin Kuriakose,
Jiji C V,
Varun Sundar,
Sumanth Hegde,
Divya Kothandaraman,
Kaushik Mitra,
Akashdeep Jassal
, et al. (20 additional authors not shown)
Abstract:
This paper is the report of the first Under-Display Camera (UDC) image restoration challenge, held in conjunction with the RLQ workshop at ECCV 2020. The challenge is based on a newly collected Under-Display Camera database. The challenge tracks correspond to two types of display: a 4k Transparent OLED (T-OLED) and a phone Pentile OLED (P-OLED). About 150 teams registered for the challenge, and eight and nine teams, respectively, submitted results during the testing phase for the two tracks. The results in the paper represent state-of-the-art performance in Under-Display Camera restoration. Datasets and paper are available at https://yzhouas.github.io/projects/UDC/udc.html.
Submitted 18 August, 2020;
originally announced August 2020.
-
Policy Learning of MDPs with Mixed Continuous/Discrete Variables: A Case Study on Model-Free Control of Markovian Jump Systems
Authors:
Joao Paulo Jansch-Porto,
Bin Hu,
Geir Dullerud
Abstract:
Markovian jump linear systems (MJLS) are an important class of dynamical systems that arise in many control applications. In this paper, we introduce the problem of controlling unknown (discrete-time) MJLS as a new benchmark for policy-based reinforcement learning of Markov decision processes (MDPs) with mixed continuous/discrete state variables. Compared with the traditional linear quadratic regulator (LQR), our proposed problem leads to a special hybrid MDP (with mixed continuous and discrete variables) and poses significant new challenges due to the appearance of an underlying Markov jump parameter governing the mode of the system dynamics. Specifically, the state of an MJLS does not form a Markov chain, and hence one cannot study the MJLS control problem as an MDP with solely continuous state variables. However, one can augment the state with the jump parameter to obtain an MDP with a mixed continuous/discrete state space. We discuss how control theory sheds light on the policy parameterization of such hybrid MDPs. Then we modify the widely used natural policy gradient method to directly learn the optimal state feedback control policy for MJLS without identifying either the system dynamics or the transition probability of the switching parameter. We implement the (data-driven) natural policy gradient method on different MJLS examples. Our simulation results suggest that the natural gradient method can efficiently learn the optimal controller for MJLS with unknown dynamics.
Submitted 14 July, 2020; v1 submitted 4 June, 2020;
originally announced June 2020.
-
Convolutional Neural Network for Behavioral Modeling and Predistortion of Wideband Power Amplifiers
Authors:
Xin Hu,
Zhijun Liu,
Xiaofei Yu,
Yulong Zhao,
Wenhua Chen,
Biao Hu,
Xuekun Du,
Xiang Li,
Mohamed Helaoui,
Weidong Wang,
Fadhel M. Ghannouchi
Abstract:
In this paper, we propose a novel behavior model for wideband PAs using a real-valued time-delay convolutional neural network (RVTDCNN). The input data of the model are sorted and arranged as a graph composed of the in-phase and quadrature (I/Q) components and envelope-dependent terms of current and past signals. We construct a pre-designed filter using the convolutional layer to extract the basis functions required for the PA forward or reverse modeling. The generated rich basis functions are modeled using a simple fully connected layer. Because of the weight-sharing characteristics of the convolutional structure, the strong memory effect does not lead to an obvious increase in the complexity of the model. Meanwhile, the extraction effect of the pre-designed filter also reduces the training complexity of the model. The experimental results show that the performance of the RVTDCNN model is almost the same as that of the NN models and the multilayer NN models.
Submitted 20 May, 2020;
originally announced May 2020.
-
On a notion of stochastic zeroing barrier function
Authors:
Tua A. Tamba,
Bin Hu,
Yul Y. Nazaruddin
Abstract:
This note examines the safety verification of the solution of Ito stochastic differential equations using the notion of stochastic zeroing barrier function. The main tools in the proposed method include Ito calculus and the concept of stochastic invariant set.
Submitted 4 April, 2020;
originally announced April 2020.
-
A Novel Decision Tree for Depression Recognition in Speech
Authors:
Zhenyu Liu,
Dongyu Wang,
Lan Zhang,
Bin Hu
Abstract:
Depression is a common mental disorder worldwide which causes a range of serious outcomes. The diagnosis of depression relies on patient-reported scales and psychiatrist interviews, which may lead to subjective bias. In recent years, more and more researchers have devoted themselves to depression recognition in speech, which may be an effective and objective indicator. This study proposes a new speech segment fusion method based on a decision tree to improve depression recognition accuracy and validates it on a sample of 52 subjects (23 depressed patients and 29 healthy controls). The recognition accuracies are 75.8% and 68.5% for males and females, respectively, on gender-dependent models. It can be concluded from the data that the proposed decision tree model can improve depression classification performance.
Submitted 22 February, 2020;
originally announced February 2020.
-
A study of resting-state EEG biomarkers for depression recognition
Authors:
Shuting Sun,
Jianxiu Li,
Huayu Chen,
Tao Gong,
Xiaowei Li,
Bin Hu
Abstract:
Background: Depression has become a major health burden worldwide, and effective detection of depression is a great public-health challenge. This electroencephalography (EEG)-based research explores effective biomarkers for depression recognition. Methods: Resting-state EEG data were collected from 24 patients with major depressive disorder (MDD) and 29 normal controls using a 128-channel HydroCel Geodesic Sensor Net (HCGSN). To better identify depression, we extracted different types of EEG features, including linear features, nonlinear features and the functional connectivity feature phase lagging index (PLI), to comprehensively analyze the EEG signals in patients with MDD. We then used different feature selection methods and classifiers to evaluate the optimal feature sets. Results: The functional connectivity feature PLI is superior to the linear and nonlinear features. When combining all the types of features to classify MDD patients, we obtain the highest classification accuracy of 82.31% using the ReliefF feature selection method and a logistic regression (LR) classifier. Analyzing the distribution of the optimal feature set, we found that intrahemispheric connection edges of PLI were far more numerous than interhemispheric connection edges, and that the intrahemispheric connection edges differed significantly between the two groups. Conclusion: The functional connectivity feature PLI plays an important role in depression recognition. In particular, intrahemispheric connection edges of PLI might be an effective biomarker to identify depression. Statistical results suggested that MDD patients might have functional dysfunction in the left hemisphere.
Submitted 23 February, 2020;
originally announced February 2020.
-
Depression Detection using Resting State Three-channel EEG Signal
Authors:
Qiuxia Shi,
Ang Liu,
Rongyan Chen,
Jian Shen,
Qinglin Zhao,
Bin Hu
Abstract:
In a universal environment, a patient-friendly, inexpensive method is needed to realize the early diagnosis of depression, which is believed to be an effective way to reduce the mortality of depression. The purpose of this study is to collect EEG signals from only three electrodes (Fp1, Fpz and Fp2) and then use linear and nonlinear EEG features to classify depression patients and healthy controls. The EEG recordings were carried out on a group of 18 medication-free depressive patients and 25 gender- and age-matched controls. In this paper, the selected features include three linear features (maximum, mean and center values of the power) and three nonlinear features (correlation dimension, Renyi entropy and C0 complexity). The accuracy and effectiveness of the classification model for depressive and control subjects were calculated using leave-one-out cross-validation. The experimental results indicate that the selected three-channel EEG and features can distinguish depressed subjects from healthy ones, with a classification accuracy of 72.25%. It is hoped that these results can provide more choices for the early diagnosis of depression in a universal environment.
Submitted 26 February, 2020; v1 submitted 21 February, 2020;
originally announced February 2020.
-
Sampling Policy Design for Tracking Time-Varying Graph Signals with Adaptive Budget Allocation
Authors:
Xuan Xie,
Hui Feng,
Bo Hu
Abstract:
There have been many works that focus on the sampling set design for a static graph signal, but few for time-varying graph signals (GS). In this paper, we concentrate on how to select vertices to sample and how to allocate the sampling budget for a time-varying GS to achieve a minimal tracking error for the long-term. In the Kalman Filter (KF) framework, the problem of sampling policy design and budget allocation is formulated as an infinite horizon sequential decision process, in which the optimal sampling policy is obtained by Dynamic Programming (DP). Since the optimal policy is intractable, an approximate algorithm is proposed by truncating the infinite horizon. By introducing a new tool for analyzing the convexity or concavity of composite functions, we prove that the truncated problem is convex. Finally, we demonstrate the performance of the proposed approach through numerical experiments.
Submitted 22 October, 2020; v1 submitted 14 February, 2020;
originally announced February 2020.
-
Convergence Guarantees of Policy Optimization Methods for Markovian Jump Linear Systems
Authors:
Joao Paulo Jansch-Porto,
Bin Hu,
Geir Dullerud
Abstract:
Recently, policy optimization for control purposes has received renewed attention due to the increasing interest in reinforcement learning. In this paper, we investigate the convergence of policy optimization for quadratic control of Markovian jump linear systems (MJLS). First, we study the optimization landscape of direct policy optimization for MJLS, and, in particular, show that despite the non-convexity of the resultant problem the unique stationary point is the global optimal solution. Next, we prove that the Gauss-Newton method and the natural policy gradient method converge to the optimal state feedback controller for MJLS at a linear rate if initialized at a controller which stabilizes the closed-loop dynamics in the mean square sense. We propose a novel Lyapunov argument to fix a key stability issue in the convergence proof. Finally, we present a numerical example to support our theory. Our work brings new insights for understanding the performance of policy learning methods on controlling unknown MJLS.
Submitted 10 February, 2020;
originally announced February 2020.
-
Fast Color-guided Depth Denoising for RGB-D Images by Graph Filtering
Authors:
Qiwei Huang,
Ruikang Li,
Zidong Jiang,
Wei Feng,
Sijie Lin,
Hui Feng,
Bo Hu
Abstract:
Depth images captured by off-the-shelf RGB-D cameras suffer from much stronger noise than color images. In this paper, we propose a method to denoise the depth images in RGB-D images by color-guided graph filtering. Our iterative method contains two components: color-guided similarity graph construction, and graph filtering on the depth signal. Implemented in the graph vertex domain, filtering is accelerated as computation only occurs among neighboring vertices. Experimental results show that our method significantly outperforms state-of-the-art depth image denoising methods in both quality and efficiency.
Submitted 7 December, 2019; v1 submitted 4 December, 2019;
originally announced December 2019.
-
High-Resolution, Respiratory-Resolved Coronary MRA Using a Phyllotaxis-Reordered Variable-Density 3D Cones Trajectory
Authors:
Srivathsan P. Koundinyan,
Corey A. Baron,
Mario O. Malave,
Frank Ong,
Nii Okai Addy,
Joseph Y. Cheng,
Phillip C. Yang,
Bob S. Hu,
Dwight G. Nishimura
Abstract:
Purpose: To develop a respiratory-resolved motion-compensation method for free-breathing, high-resolution coronary magnetic resonance angiography using a 3D cones trajectory.
Methods: To achieve respiratory-resolved 0.98 mm resolution images in a clinically relevant scan time, we undersample the imaging data with a variable-density 3D cones trajectory. For retrospective motion compensation, translational estimates from 3D image-based navigators (3D iNAVs) are used to bin the imaging data into four phases from end-expiration to end-inspiration. To ensure pseudo-random undersampling within each respiratory phase, we devise a phyllotaxis readout ordering scheme mindful of eddy current artifacts in steady state free precession imaging. Following binning, residual 3D translational motion within each phase is computed using the 3D iNAVs and corrected for in the imaging data. The noise-like aliasing characteristic of the combined phyllotaxis and cones sampling pattern is leveraged in a compressed sensing reconstruction with spatial and temporal regularization to reduce aliasing in each of the respiratory phases.
Results: In a volunteer and 5 patients, respiratory motion compensation using the proposed method yields improved image quality compared to non-respiratory-resolved approaches with no motion correction and with 3D translational correction. Qualitative assessment by two cardiologists indicates the superior sharpness of coronary segments reconstructed with the proposed method (P < 0.01).
Conclusion: The proposed method better mitigates motion artifacts in free-breathing, high-resolution coronary angiography exams compared to translational correction.
Submitted 27 October, 2019;
originally announced October 2019.
-
Unraveling the Effect of Spatial Resolution and Scan Acceleration on 3D Image-Based Navigators for Respiratory Motion Tracking in Coronary MR Angiography
Authors:
Srivathsan P. Koundinyan,
Joseph Y. Cheng,
Mario O. Malave,
Phillip C. Yang,
Bob S. Hu,
Dwight G. Nishimura,
Corey A. Baron
Abstract:
Purpose: To study the accuracy of motion information extracted from beat-to-beat 3D image-based navigators (3D iNAVs) collected using a variable-density cones trajectory with different combinations of spatial resolutions and scan acceleration factors.
Methods: Fully sampled, breath-held 4.4 mm 3D iNAV datasets for six respiratory phases are acquired in a volunteer. Ground truth translational and nonrigid motion information is derived from these datasets. Subsequently, the motion estimates from synthesized undersampled 3D iNAVs with isotropic spatial resolutions of 4.4 mm (acceleration factor = 10.9), 5.4 mm (acceleration factor = 7.2), 6.4 mm (acceleration factor = 4.2), and 7.8 mm (acceleration factor = 2.9) are assessed against the ground truth information. The undersampled 3D iNAV configuration with the highest accuracy motion estimates in simulation is then compared with the originally proposed 4.4 mm undersampled 3D iNAV in six volunteer studies.
Results: The simulations indicate that for navigators beyond certain scan acceleration factors, the accuracy of motion estimates is compromised due to errors from residual aliasing and blurring/smoothening effects following compressed sensing reconstruction. The 6.4 mm 3D iNAV achieves an acceptable spatial resolution with a small acceleration factor, resulting in the highest accuracy motion information among all assessed undersampled 3D iNAVs. Reader scores for six volunteer studies demonstrate superior coronary vessel sharpness when applying an autofocusing nonrigid correction technique using the 6.4 mm 3D iNAVs in place of 4.4 mm 3D iNAVs.
Conclusion: Undersampled 6.4 mm 3D iNAVs enable motion tracking with improved accuracy relative to previously proposed undersampled 4.4 mm 3D iNAVs.
Submitted 27 October, 2019;
originally announced October 2019.
-
Policy Optimization for $\mathcal{H}_2$ Linear Control with $\mathcal{H}_\infty$ Robustness Guarantee: Implicit Regularization and Global Convergence
Authors:
Kaiqing Zhang,
Bin Hu,
Tamer Başar
Abstract:
Policy optimization (PO) is a key ingredient for reinforcement learning (RL). For control design, certain constraints are usually enforced on the policies to optimize, accounting for either the stability, robustness, or safety concerns on the system. Hence, PO is by nature a constrained (nonconvex) optimization in most cases, whose global convergence is challenging to analyze in general. More importantly, some constraints that are safety-critical, e.g., the $\mathcal{H}_\infty$-norm constraint that guarantees the system robustness, are difficult to enforce as the PO methods proceed. Recently, policy gradient methods have been shown to converge to the global optimum of linear quadratic regulator (LQR), a classical optimal control problem, without regularizing/projecting the control iterates onto the stabilizing set, its (implicit) feasible set. This striking result is built upon the coercive property of the cost, ensuring that the iterates remain feasible as the cost decreases. In this paper, we study the convergence theory of PO for $\mathcal{H}_2$ linear control with $\mathcal{H}_\infty$-norm robustness guarantee. One significant new feature of this problem is the lack of coercivity, i.e., the cost may have finite value around the feasible set boundary, breaking the existing analysis for LQR. Interestingly, we show that two PO methods enjoy the implicit regularization property, i.e., the iterates preserve the $\mathcal{H}_\infty$ robustness constraint as if they are regularized by the algorithms. Furthermore, despite the nonconvexity of the problem, we show that these algorithms converge to the globally optimal policies with globally sublinear rates, avoiding all suboptimal stationary points/local minima, and with locally (super-)linear rates under certain conditions.
Submitted 14 February, 2021; v1 submitted 21 October, 2019;
originally announced October 2019.
-
Bayesian Design of Sampling Set for Bandlimited Graph Signals
Authors:
Xuan Xie,
Junhao Yu,
Hui Feng,
Bo Hu
Abstract:
The design of sampling set (DoS) for bandlimited graph signals (GS) has been extensively studied in recent years, but few works exploit the benefits of the stochastic prior of GS. In this work, we introduce the optimization framework for Bayesian DoS of bandlimited GS. We also illustrate how the choice of different sampling sets affects the estimation error, and how the prior knowledge influences the result of DoS compared with non-Bayesian DoS, by analyzing the Gershgorin discs of the error metric matrix. Finally, based on our analysis, we propose a heuristic algorithm for DoS to avoid solving the optimization problem directly.
Submitted 7 September, 2019;
originally announced September 2019.
-
On Critical Sampling of Time-Vertex Graph Signals
Authors:
Junhao Yu,
Xuan Xie,
Hui Feng,
Bo Hu
Abstract:
Joint time-vertex graph signals are pervasive in the real world. This paper focuses on the fundamental problem of sampling and reconstruction of joint time-vertex graph signals. We prove the existence and the necessary condition of a critical sampling set using the minimum number of samples in the time and graph domains, respectively. The theory proposed in this paper suggests assigning a heterogeneous sampling pattern to each node in a network under the constraint of minimum resources. An efficient algorithm is also provided to construct a critical sampling set.
Submitted 19 November, 2019; v1 submitted 5 September, 2019;
originally announced September 2019.
-
A Preliminary Study on Data Augmentation of Deep Learning for Image Classification
Authors:
Benlin Hu,
Cheng Lei,
Dong Wang,
Shu Zhang,
Zhenyu Chen
Abstract:
Deep learning models have a large number of free parameters that need to be calculated by effective training of the models on a great deal of training data to improve their generalization performance. However, data obtaining and labeling is expensive in practice. Data augmentation is one of the methods to alleviate this problem. In this paper, we conduct a preliminary study on how three variables (augmentation method, augmentation rate and size of basic dataset per label) can affect the accuracy of deep learning for image classification. The study provides some guidelines: (1) it is better to use transformations that alter the geometry of the images rather than those just lighting and color. (2) 2-3 times augmentation rate is good enough for training. (3) the smaller amount of data, the more obvious contributions could have.
Submitted 9 June, 2019;
originally announced June 2019.
-
Active Sampling for Approximately Bandlimited Graph Signals
Authors:
Sijie Lin,
Xuan Xie,
Hui Feng,
Bo Hu
Abstract:
This paper investigates the active sampling for estimation of approximately bandlimited graph signals. With the assistance of a graph filter, an approximately bandlimited graph signal can be formulated by a Gaussian random field over the graph. In contrast to offline sampling set design methods which usually rely on accurate prior knowledge about the model, unknown parameters in signal and noise distribution are allowed in the proposed active sampling algorithm. The active sampling process is divided into two alternating stages: unknown parameters are first estimated by Expectation Maximization (EM), with which the next node to sample is selected based on historical observations according to predictive uncertainty. Validated by simulations compared with related approaches, the proposed algorithm can reduce the sample size to reach a certain estimation accuracy.
Submitted 16 February, 2019; v1 submitted 12 February, 2019;
originally announced February 2019.
-
Robust Beamforming for Downlink 3D-MIMO Systems with $l_1$-norm Bounded CSI Uncertainty
Authors:
Kai Liu,
Hui Feng,
Tao Yang,
Bo Hu
Abstract:
In this paper, a novel robust beamforming scheme is proposed for three-dimensional multi-input multi-output (3D-MIMO) systems. As one of the typical deployments of massive MIMO, a 3D-MIMO system has sparse channels in the angular domain. Thus, various sparse channel estimation algorithms produce sparse channel estimation errors, which can be utilized to narrow down the perturbation region of imperfect CSI. We investigate an $l_1$-norm bounded channel uncertainty model for the robust beamforming problems, which captures the sparse nature of channel errors. Compared with the conventional spherical uncertainty, we prove that the scheme with $l_1$-norm bounded uncertainty consumes less beamforming power with the same signal to interference and noise ratio (SINR) thresholds. The proposed scheme is reformulated as a second-order cone program (SOCP), and simulation results verify the effectiveness of our algorithm.
Submitted 1 November, 2018;
originally announced December 2018.