
A Mel Spectrogram Enhancement Paradigm Based on CWT in Speech Synthesis

Authors: Guoqiang Hu, Huaning Tan, Ruilai Li

Abstract: Acoustic features play an important role in improving the quality of the synthesised speech. Currently, the Mel spectrogram is a widely employed acoustic feature in most acoustic models. However, due to the fine-grained loss caused by its Fourier transform process, the clarity of speech synthesised by Mel spectrogram is compromised in mutant signals. In order to obtain a more detailed Mel spectrogram, we propose a Mel spectrogram enhancement paradigm based on the continuous wavelet transform (CWT). This paradigm introduces an additional task: a more detailed wavelet spectrogram, which like the post-processing network takes as input the Mel spectrogram output by the decoder. We choose Tacotron2 and Fastspeech2 for experimental validation in order to test autoregressive (AR) and non-autoregressive (NAR) speech systems, respectively. The experimental results demonstrate that the speech synthesised using the model with the Mel spectrogram enhancement paradigm exhibits higher MOS, with an improvement of 0.14 and 0.09 compared to the baseline model, respectively. These findings provide some validation for the universality of the enhancement paradigm, as they demonstrate the success of the paradigm in different architectures.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.10052 [pdf, other]

Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection

Authors: Haoyu Wang, Guoqiang Hu, Guodong Lin, Wei-Qiang Zhang, Jian Li

Abstract: As a robust and large-scale multilingual speech recognition model, Whisper has demonstrated impressive results in many low-resource and out-of-distribution scenarios. However, its encoder-decoder structure hinders its application to streaming speech recognition. In this paper, we introduce Simul-Whisper, which uses the time alignment embedded in Whisper's cross-attention to guide auto-regressive decoding and achieve chunk-based streaming ASR without any fine-tuning of the pre-trained model. Furthermore, we observe the negative effect of the truncated words at the chunk boundaries on the decoding results and propose an integrate-and-fire-based truncation detection model to address this issue. Experiments on multiple languages and Whisper architectures show that Simul-Whisper achieves an average absolute word error rate degradation of only 1.46% at a chunk size of 1 second, which significantly outperforms the current state-of-the-art baseline.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Accepted by INTERSPEECH 2024

arXiv:2406.09304 [pdf]

Self-reconfigurable Multifunctional Memristive Nociceptor for Intelligent Robotics

Authors: Shengbo Wang, Mingchao Fang, Lekai Song, Cong Li, Jian Zhang, Arokia Nathan, Guohua Hu, Shuo Gao

Abstract: Artificial nociceptors, mimicking human-like stimuli perception, are of significance for intelligent robotics to work in hazardous and dynamic scenarios. One of the most essential characteristics of the human nociceptor is its self-adjustable attribute, which indicates that the threshold of determination of a potentially hazardous stimulus relies on environmental knowledge. This critical attribute has been currently omitted, but it is highly desired for artificial nociceptors. Inspired by these shortcomings, this article presents, for the first time, a Self-Directed Channel (SDC) memristor-based self-reconfigurable nociceptor, capable of perceiving hazardous pressure stimuli under different temperatures and demonstrates key features of tactile nociceptors, including 'threshold,' 'no-adaptation,' and 'sensitization.' The maximum amplification of hazardous external stimuli is 1000%, and its response characteristics dynamically adapt to current temperature conditions by automatically altering the generated modulation schemes for the memristor. The maximum difference ratio of the response of memristors at different temperatures is 500%, and this adaptability closely mimics the functions of biological tactile nociceptors, resulting in accurate danger perception in various conditions. Beyond temperature adaptation, this memristor-based nociceptor has the potential to integrate different sensory modalities by applying various sensors, thereby achieving human-like perception capabilities in real-world environments.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 14 pages, 4 figures

arXiv:2406.00497 [pdf, ps, other]

Recent Advances in End-to-End Simultaneous Speech Translation

Authors: Xiaoqian Liu, Guoqiang Hu, Yangfan Du, Erfeng He, YingFeng Luo, Chen Xu, Tong Xiao, Jingbo Zhu

Abstract: Simultaneous speech translation (SimulST) is a demanding task that involves generating translations in real-time while continuously processing speech input. This paper offers a comprehensive overview of the recent developments in SimulST research, focusing on four major challenges. Firstly, the complexities associated with processing lengthy and continuous speech streams pose significant hurdles. Secondly, satisfying real-time requirements presents inherent difficulties due to the need for immediate translation output. Thirdly, striking a balance between translation quality and latency constraints remains a critical challenge. Finally, the scarcity of annotated data adds another layer of complexity to the task. Through our exploration of these challenges and the proposed solutions, we aim to provide valuable insights into the current landscape of SimulST research and suggest promising directions for future exploration.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2403.06439 [pdf, other]

Wide-Field, High-Resolution Reconstruction in Computational Multi-Aperture Miniscope Using a Fourier Neural Network

Authors: Qianwan Yang, Ruipeng Guo, Guorong Hu, Yujia Xue, Yunzhe Li, Lei Tian

Abstract: Traditional fluorescence microscopy is constrained by inherent trade-offs among resolution, field-of-view, and system complexity. To navigate these challenges, we introduce a simple and low-cost computational multi-aperture miniature microscope, utilizing a microlens array for single-shot wide-field, high-resolution imaging. Addressing the challenges posed by extensive view multiplexing and non-local, shift-variant aberrations in this device, we present SV-FourierNet, a novel multi-channel Fourier neural network. SV-FourierNet facilitates high-resolution image reconstruction across the entire imaging field through its learned global receptive field. We establish a close relationship between the physical spatially-varying point-spread functions and the network's learned effective receptive field. This ensures that SV-FourierNet has effectively encapsulated the spatially-varying aberrations in our system, and learned a physically meaningful function for image reconstruction. Training of SV-FourierNet is conducted entirely on a physics-based simulator. We showcase wide-field, high-resolution video reconstructions on colonies of freely moving C. elegans and imaging of a mouse brain section. Our computational multi-aperture miniature microscope, augmented with SV-FourierNet, represents a major advancement in computational microscopy and may find broad applications in biomedical research and other fields requiring compact microscopy solutions.

Submitted 30 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2402.16908 [pdf]

Lightweight, error-tolerant edge detection using memristor-enabled stochastic logics

Authors: Lekai Song, Pengyu Liu, Jingfang Pei, Yang Liu, Songwei Liu, Shengbo Wang, Leonard W. T. Ng, Tawfique Hasan, Kong-Pang Pun, Shuo Gao, Guohua Hu

Abstract: The demand for efficient edge vision has spurred the interest in developing stochastic computing approaches for performing image processing tasks. Memristors with inherent stochasticity readily introduce probability into the computations and thus enable stochastic image processing computations. Here, we present a stochastic computing approach for edge detection, a fundamental image processing technique, facilitated with memristor-enabled stochastic logics. Specifically, we integrate the memristors with logic circuits and harness the stochasticity from the memristors to realize compact stochastic logics for stochastic number encoding and processing. The stochastic numbers, exhibiting well-regulated probabilities and correlations, can be processed to perform logic operations with statistical probabilities. This can facilitate lightweight stochastic edge detection for edge visual scenarios characterized with high-level noise errors. As a practical demonstration, we implement a hardware stochastic Roberts cross operator using the stochastic logics, and prove its exceptional edge detection performance, remarkably, with 95% less computational cost while withstanding 50% bit-flip errors. The results underscore the great potential of our stochastic edge detection approach in developing lightweight, error-tolerant edge vision hardware and systems for autonomous driving, virtual/augmented reality, medical imaging diagnosis, industrial automation, and beyond.

Submitted 20 March, 2024; v1 submitted 25 February, 2024; originally announced February 2024.
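
Note on the entry above: its abstract refers to the Roberts cross operator. For readers unfamiliar with it, the following is a minimal sketch of the conventional deterministic operator in software, not the memristor-based stochastic hardware implementation the paper describes. The function name `roberts_cross` and the nested-list image format are assumptions made for illustration only.

```python
import math


def roberts_cross(image):
    """Conventional (deterministic) Roberts cross gradient magnitude.

    `image` is a hypothetical 2D list of numbers (rows of pixel intensities).
    The output has one fewer row and column than the input, because each
    output value is computed from a 2x2 neighbourhood.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * (cols - 1) for _ in range(rows - 1)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            # Differences along the two diagonals of the 2x2 window.
            gx = image[r][c] - image[r + 1][c + 1]
            gy = image[r][c + 1] - image[r + 1][c]
            out[r][c] = math.hypot(gx, gy)
    return out


# A vertical step edge between columns 1 and 2.
step = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edges = roberts_cross(step)
```

In this example the response is nonzero only in the middle output column, where the 2x2 window straddles the step.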

arXiv:2312.05763 [pdf, ps, other]

Fluid Antennas-Enabled Multiuser Uplink: A Low-Complexity Gradient Descent for Total Transmit Power Minimization

Authors: Guojie Hu, Qingqing Wu, Kui Xu, Jian Ouyang, Jiangbo Si, Yunlong Cai, Naofal Al-Dhahir

Abstract: We investigate multiuser uplink communication from multiple single-antenna users to a base station (BS), which is equipped with a movable-antenna (MA) array and adopts zero-forcing receivers to decode multiple signals. We aim to optimize the MAs' positions at the BS, to minimize the total transmit power of all users subject to the minimum rate requirement. After applying transformations, we show that the problem is equivalent to minimizing the sum of each eigenvalue's reciprocal of a matrix, which is a function of all MAs' positions. Subsequently, the projected gradient descent (PGD) method is utilized to find a locally optimal solution. In particular, different from the latest related work, we exploit the eigenvalue decomposition to successfully derive a closed-form gradient for the PGD, which facilitates the practical implementation greatly. We demonstrate by simulations that via careful optimization for all MAs' positions in our proposed design, the total transmit power of all users can be decreased significantly as compared to competitive benchmarks.

Submitted 8 January, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

arXiv:2311.11814 [pdf, ps, other]

Movable-Antenna-Array-Enabled Communications with CoMP Reception

Authors: Guojie Hu, Qingqing Wu, Jian Ouyang, Kui Xu, Yunlong Cai, Naofal Al-Dhahir

Abstract: We consider the movable-antenna (MA) array-enabled wireless communication with coordinate multi-point (CoMP) reception, where multiple destinations adopt the maximal ratio combination technique to jointly decode the common message sent from the transmitter equipped with the MA array. Our goal is to maximize the effective received signal-to-noise ratio, by jointly optimizing the transmit beamforming and the positions of the MA array. Although the formulated problem is highly non-convex, we reveal that it is fundamental to maximize the principal eigenvalue of a hermite channel matrix which is a function of the positions of the MA array. The corresponding sub-problem is still non-convex, for which we develop a computationally efficient algorithm. Afterwards, the optimal transmit beamforming is determined with a closed-form solution. In addition, the theoretical performance upper bound is analyzed. Since the MA array brings an additional spatial degree of freedom by flexibly adjusting all antennas' positions, it achieves significant performance gain compared to competitive benchmarks.

Submitted 25 January, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

arXiv:2311.07104 [pdf, ps, other]

Secure Wireless Communication via Movable-Antenna Array

Authors: Guojie Hu, Qingqing Wu, Kui Xu, Jiangbo Si, Naofal Al-Dhahir

Abstract: Movable antenna (MA) array is a novel technology recently developed where positions of transmit/receive antennas can be flexibly adjusted in the specified region to reconfigure the wireless channel and achieve a higher capacity. In this letter, we, for the first time, investigate the MA array-assisted physical-layer security where the confidential information is transmitted from a MA array-enabled Alice to a single-antenna Bob, in the presence of multiple single-antenna and colluding eavesdroppers. We aim to maximize the achievable secrecy rate by jointly designing the transmit beamforming and positions of all antennas at Alice subject to the transmit power budget and specified regions for positions of all transmit antennas. The resulting problem is highly non-convex, for which the projected gradient ascent (PGA) and the alternating optimization methods are utilized to obtain a high-quality suboptimal solution. Simulation results demonstrate that since the additional spatial degree of freedom (DoF) can be fully exploited, the MA array significantly enhances the secrecy rate compared to the conventional fixed-position antenna (FPA) array.

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2311.02376 [pdf, ps, other]

Intelligent Reflecting Surface-Aided Wireless Communication with Movable Elements

Authors: Guojie Hu, Qingqing Wu, Dognhui Xu, Kui Xu, Jiangbo Si, Yunlong Cai, Naofal Al-Dhahir

Abstract: Intelligent reflecting surface (IRS) has been recognized as a powerful technology for boosting communication performance. To reduce manufacturing and control costs, it is preferable to consider discrete phase shifts (DPSs) for IRS, which are set by default as uniformly distributed in the range of $[-π, π)$ in the literature. Such setting, however, cannot achieve a desirable performance over the general Rician fading where the channel phase concentrates in a narrow range with a higher probability. Motivated by this drawback, we in this paper design optimal non-uniform DPSs for IRS to achieve a desirable performance level. The fundamental challenge is the possible offset in phase distribution across different cascaded source-element-destination channels, if adopting conventional IRS where the position of each element is fixed. Such phenomenon leads to different patterns of optimal non-uniform DPSs for each IRS element and thus causes huge manufacturing costs especially when the number of IRS elements is large. Driven by the recently emerging fluid antenna system (or movable antenna technology), we demonstrate that if the position of each IRS element can be flexibly adjusted, the above phase distribution offset can be surprisingly eliminated, leading to the same pattern of DPSs for each IRS element. Armed with this, we then determine the form of unified non-uniform DPSs based on a low-complexity iterative algorithm. Simulations show that our proposed design significantly improves the system performance compared to competitive benchmarks.

Submitted 4 November, 2023; originally announced November 2023.

arXiv:2310.00730 [pdf]

EventLFM: Event Camera integrated Fourier Light Field Microscopy for Ultrafast 3D imaging

Authors: Ruipeng Guo, Qianwan Yang, Andrew S. Chang, Guorong Hu, Joseph Greene, Christopher V. Gabel, Sixian You, Lei Tian

Abstract: Ultrafast 3D imaging is indispensable for visualizing complex and dynamic biological processes. Conventional scanning-based techniques necessitate an inherent trade-off between acquisition speed and space-bandwidth product (SBP). Emerging single-shot 3D wide-field techniques offer a promising alternative but are bottlenecked by the synchronous readout constraints of conventional CMOS systems, thus restricting data throughput to maintain high SBP at limited frame rates. To address this, we introduce EventLFM, a straightforward and cost-effective system that overcomes these challenges by integrating an event camera with Fourier light field microscopy (LFM), a state-of-the-art single-shot 3D wide-field imaging technique. The event camera operates on a novel asynchronous readout architecture, thereby bypassing the frame rate limitations inherent to conventional CMOS systems. We further develop a simple and robust event-driven LFM reconstruction algorithm that can reliably reconstruct 3D dynamics from the unique spatiotemporal measurements captured by EventLFM. Experimental results demonstrate that EventLFM can robustly reconstruct fast-moving and rapidly blinking 3D fluorescent samples at kHz frame rates. Furthermore, we highlight EventLFM's capability for imaging of blinking neuronal signals in scattering mouse brain tissues and 3D tracking of GFP-labeled neurons in freely moving C. elegans. We believe that the combined ultrafast speed and large 3D SBP offered by EventLFM may open up new possibilities across many biomedical applications.

Submitted 3 April, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

arXiv:2309.08835 [pdf]

doi 10.1038/s41467-024-48908-8

Intelligent machines work in unstructured environments by differential neuromorphic computing

Authors: Shengbo Wang, Shuo Gao, Chenyu Tang, Edoardo Occhipinti, Cong Li, Shurui Wang, Jiaqi Wang, Hubin Zhao, Guohua Hu, Arokia Nathan, Ravinder Dahiya, Luigi Occhipinti

Abstract: Efficient operation of intelligent machines in the real world requires methods that allow them to understand and predict the uncertainties presented by the unstructured environments with good accuracy, scalability and generalization, similar to humans. Current methods rely on pretrained networks instead of continuously learning from the dynamic signal properties of working environments and suffer inherent limitations, such as data-hungry procedures, and limited generalization capabilities. Herein, we present a memristor-based differential neuromorphic computing, perceptual signal processing and learning method for intelligent machines. The main features of environmental information such as amplification (>720%) and adaptation (<50%) of mechanical stimuli encoded in memristors, are extracted to obtain human-like processing in unstructured environments. The developed method takes advantage of the intrinsic multi-state property of memristors and exhibits good scalability and generalization, as confirmed by validation in two different application scenarios: object grasping and autonomous driving. In the former, a robot hand experimentally realizes safe and stable grasping through fast learning (in ~1 ms) the unknown object features (e.g., sharp corner and smooth surface) with a single memristor. In the latter, the decision-making information of 10 unstructured environments in autonomous driving (e.g., overtaking cars, pedestrians) is accurately (94%) extracted with a 40*25 memristor array. By mimicking the intrinsic nature of human low-level perception mechanisms, the electronic memristive neuromorphic circuit-based method, presented here shows the potential for adapting to diverse sensing technologies and helping intelligent machines generate smart high-level decisions in the real world.

Submitted 17 November, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: 16 pages, 5 figures

Journal ref: Nat Commun, vol. 15, no. 1, p. 4671, May 2024

arXiv:2306.14646 [pdf, other]

Multi-View Attention Learning for Residual Disease Prediction of Ovarian Cancer

Authors: Xiangneng Gao, Shulan Ruan, Jun Shi, Guoqing Hu, Wei Wei

Abstract: In the treatment of ovarian cancer, precise residual disease prediction is significant for clinical and surgical decision-making. However, traditional methods are either invasive (e.g., laparoscopy) or time-consuming (e.g., manual analysis). Recently, deep learning methods make many efforts in automatic analysis of medical images. Despite the remarkable progress, most of them underestimated the importance of 3D image information of disease, which might brings a limited performance for residual disease prediction, especially in small-scale datasets. To this end, in this paper, we propose a novel Multi-View Attention Learning (MuVAL) method for residual disease prediction, which focuses on the comprehensive learning of 3D Computed Tomography (CT) images in a multi-view manner. Specifically, we first obtain multi-view of 3D CT images from transverse, coronal and sagittal views. To better represent the image features in a multi-view manner, we further leverage attention mechanism to help find the more relevant slices in each view. Extensive experiments on a dataset of 111 patients show that our method outperforms existing deep-learning methods.

Submitted 26 June, 2023; originally announced June 2023.

arXiv:2303.12573 [pdf, other]

Robust single-shot 3D fluorescence imaging in scattering media with a simulator-trained neural network

Authors: Jeffrey Alido, Joseph Greene, Yujia Xue, Guorong Hu, Yunzhe Li, Mitchell Gilmore, Kevin J. Monk, Brett T. DiBenedictis, Ian G. Davison, Lei Tian

Abstract: Imaging through scattering is a pervasive and difficult problem in many biological applications. The high background and the exponentially attenuated target signals due to scattering fundamentally limits the imaging depth of fluorescence microscopy. Light-field systems are favorable for high-speed volumetric imaging, but the 2D-to-3D reconstruction is fundamentally ill-posed, and scattering exacerbates the condition of the inverse problem. Here, we develop a scattering simulator that models low-contrast target signals buried in heterogeneous strong background. We then train a deep neural network solely on synthetic data to descatter and reconstruct a 3D volume from a single-shot light-field measurement with low signal-to-background ratio (SBR). We apply this network to our previously developed Computational Miniature Mesoscope and demonstrate the robustness of our deep learning algorithm on scattering phantoms with different scattering conditions. The network can robustly reconstruct emitters in 3D with a 2D measurement of SBR as low as 1.05 and as deep as a scattering length. We analyze fundamental tradeoffs based on network design factors and out-of-distribution data that affect the deep learning model's generalizability to real experimental data. Broadly, we believe that our simulator-based deep learning approach can be applied to a wide range of imaging through scattering techniques where experimental paired training data is lacking.

Submitted 8 December, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

arXiv:2210.12361 [pdf]

doi 10.2147/JMDH.S417068

MS-DCANet: A Novel Segmentation Network For Multi-Modality COVID-19 Medical Images

Authors: Xiaoyu Pan, Huazheng Zhu, Jinglong Du, Guangtao Hu, Baoru Han, Yuanyuan Jia

Abstract: The Coronavirus Disease 2019 (COVID-19) pandemic has increased the public health burden and brought profound disaster to humans. For the particularity of the COVID-19 medical images with blurred boundaries, low contrast and different infection sites, some researchers have improved the accuracy by adding more complexity. Also, they overlook the complexity of lesions, which hinder their ability to capture the relationship between segmentation sites and the background, as well as the edge contours and global context. However, increasing the computational complexity, parameters and inference speed is unfavorable for model transfer from laboratory to clinic. A perfect segmentation network needs to balance the above three factors completely. To solve the above issues, this paper propose a symmetric automatic segmentation framework named MS-DCANet. We introduce Tokenized MLP block, a novel attention scheme that use a shift-window mechanism to conditionally fuse local and global features to get more continuous boundaries and spatial positioning capabilities. It has greater understanding of irregular lesions contours. MS-DCANet also uses several Dual Channel blocks and a Res-ASPP block to improve the ability to recognize small targets. On multi-modality COVID-19 tasks, MS-DCANet achieved state-of-the-art performance compared with other baselines. It can well trade off the accuracy and complexity. To prove the strong generalization ability of our proposed model, we apply it to other tasks (ISIC 2018 and BAA) and achieve satisfactory results.

Submitted 19 July, 2023; v1 submitted 22 October, 2022; originally announced October 2022.

Comments: 21 pages, 13 figures, 9 tables

Journal ref: J Multidiscip Healthc. 2023;16:2023-2043

arXiv:2206.06657 [pdf, other]

The Open Kidney Ultrasound Data Set

Authors: Rohit Singla, Cailin Ringstrom, Grace Hu, Victoria Lessoway, Janice Reid, Christopher Nguan, Robert Rohling

Abstract: Ultrasound, because of its low cost, non-ionizing, and non-invasive characteristics, has established itself as a cornerstone radiological examination. Research on ultrasound applications has also expanded, especially with image analysis with machine learning. However, ultrasound data are frequently restricted to closed data sets, with only a few openly available. Despite being a frequently examined organ, the kidney lacks a publicly available ultrasonography data set. The proposed Open Kidney Ultrasound Data Set is the first publicly available set of kidney brightness mode (B-mode) ultrasound data that includes annotations for multi-class semantic segmentation. It is based on data retrospectively collected in a 5-year period from over 500 patients with a mean age of 53.2 +/- 14.7 years, body mass index of 27.0 +/- 5.4 kg/m2, and most common primary diseases being diabetes mellitus, immunoglobulin A (IgA) nephropathy, and hypertension. There are labels for the view and fine-grained manual annotations from two expert sonographers. Notably, this data includes native and transplanted kidneys. Initial bench-marking measurements are performed, demonstrating a state-of-the-art algorithm achieving a Dice Sorenson Coefficient of 0.85 for the kidney capsule. This data set is a high-quality data set, including two sets of expert annotations, with a larger breadth of images than previously available. In increasing access to kidney ultrasound data, future researchers may be able to create novel image analysis techniques for tissue characterization, disease detection, and prognostication.

Submitted 3 December, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

Comments: 21 pages, 1 figure, 5 tables

arXiv:2205.00123 [pdf, other]

doi 10.1364/OPTICA.464700

Deep-learning-augmented Computational Miniature Mesoscope

Authors: Yujia Xue, Qianwan Yang, Guorong Hu, Kehan Guo, Lei Tian

Abstract: Fluorescence microscopy is essential to study biological structures and dynamics. However, existing systems suffer from a tradeoff between field-of-view (FOV), resolution, and complexity, and thus cannot fulfill the emerging need of miniaturized platforms providing micron-scale resolution across centimeter-scale FOVs. To overcome this challenge, we developed Computational Miniature Mesoscope (CM$^2$) that exploits a computational imaging strategy to enable single-shot 3D high-resolution imaging across a wide FOV in a miniaturized platform. Here, we present CM$^2$ V2 that significantly advances both the hardware and computation. We complement the 3$\times$3 microlens array with a new hybrid emission filter that improves the imaging contrast by 5$\times$, and design a 3D-printed freeform collimator for the LED illuminator that improves the excitation efficiency by 3$\times$. To enable high-resolution reconstruction across the large imaging volume, we develop an accurate and efficient 3D linear shift-variant (LSV) model that characterizes the spatially varying aberrations. We then train a multi-module deep learning model, CM$^2$Net, using only the 3D-LSV simulator. We show that CM$^2$Net generalizes well to experiments and achieves accurate 3D reconstruction across a $\sim$7-mm FOV and 800-$μ$m depth, and provides $\sim$6-$μ$m lateral and $\sim$25-$μ$m axial resolution. This provides $\sim$8$\times$ better axial localization and $\sim$1400$\times$ faster speed as compared to the previous model-based algorithm. We anticipate this simple and low-cost computational miniature imaging system will be impactful to many large-scale 3D fluorescence imaging applications.

Submitted 7 September, 2022; v1 submitted 29 April, 2022; originally announced May 2022.

Journal ref: Optica 9, 1009-1021 (2022)

arXiv:2108.00952 [pdf, other]

An Applied Deep Learning Approach for Estimating Soybean Relative Maturity from UAV Imagery to Aid Plant Breeding Decisions

Authors: Saba Moeinizade, Hieu Pham, Ye Han, Austin Dobbels, Guiping Hu

Abstract: For a global breeding organization, identifying the next generation of superior crops is vital for its success. Recognizing new genetic varieties requires years of in-field testing to gather data about the crop's yield, pest resistance, heat resistance, etc. At the conclusion of the growing season, organizations need to determine which varieties will be advanced to the next growing season (or sold to farmers) and which ones will be discarded from the candidate pool. Specifically for soybeans, identifying their relative maturity is a vital piece of information used for advancement decisions. However, this trait needs to be physically observed, and there are resource limitations (time, money, etc.) that bottleneck the data collection process. To combat this, breeding organizations are moving toward advanced image capturing devices. In this paper, we develop a robust and automatic approach for estimating the relative maturity of soybeans using a time series of UAV images. An end-to-end hybrid model combining Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) is proposed to extract features and capture the sequential behavior of time series data. The proposed deep learning model was tested on six different environments across the United States. Results suggest the effectiveness of our proposed CNN-LSTM model compared to the local regression method. Furthermore, we demonstrate how this newfound information can be used to aid in plant breeding advancement decisions.

Submitted 2 August, 2021; originally announced August 2021.

Comments: 22 pages, 7 figures

arXiv:2105.14286 [pdf, other]

Social Cost Optimization for Prosumer Community with Two Price-Package Incentives in Two-Settlement Based Electricity Market

Authors: Jianzheng Wang, Guoqiang Hu

Abstract: In this paper, we consider a future electricity market consisting of aggregated energy prosumers, who are equipped with local wind power plants (WPPs) to support (part of) their energy demands and can also trade energy with day-ahead market (DAM) and energy balancing market (EBM). In addition, an energy aggregator (EA) is established, who can provide the trading gateways between prosumers and the markets. The EA is responsible for making pricing strategies on the prosumers to influence their trading behaviours such that the social benefit of the prosumer community is improved. Specifically, two price packages are provided by the EA: wholesale price (WP) package and lump-sum (LS) package, which can be flexibly selected by prosumers based on their own preferences. Analytical energy-trading strategies will be derived for WP prosumers and LS prosumers based on non-cooperative games and Nash resource allocation strategies, respectively. In this work, a social cost optimization problem will be formulated for the EA, where the detailed WP/LS selection plans are unknown in advance. Consequently, a stochastic Stackelberg game between prosumers and the EA is formulated, and a two-level stochastic convex programming algorithm is proposed to minimize the expectation of the social cost. The performance of the proposed algorithm is demonstrated with a two-settlement based market model in the simulation.

Submitted 29 May, 2021; originally announced May 2021.

Comments: 15 pages, 10 figures

arXiv:2105.02427 [pdf, ps, other]

Resilient Time-Varying Output Formation Tracking of Linear Multi-Agent Systems Against Unbounded FDI Sensor Attacks and Unreliable Digraphs

Authors: Zhi Feng, Guoqiang Hu

Abstract: One salient feature of cooperative formation tracking is its distributed nature that relies on localized control and information sharing over a sparse communication network. That is, a distributed control manner could be prone to malicious attacks and unreliable communication that deteriorate the formation tracking performance or even destabilize the whole multi-agent system. This paper studies a safe and reliable time-varying output formation tracking problem of linear multi-agent systems, where an attacker adversely injects any unbounded time-varying signals (false data injection (FDI) attacks), while an interruption of communication channels between the agents is caused by an unreliable network. Both characteristics improve the practical relevance of the problem to be addressed, which poses some technical challenges to the distributed algorithm design and stability analysis. To mitigate the adverse effect, a novel resilient distributed control architecture is established to guarantee time-varying output formation tracking exponentially. The key features of the proposed framework are threefold: 1) an observer-based identifier is integrated to compensate for adverse effects; 2) a reliable distributed algorithm is proposed to deal with time-varying topologies caused by unreliable communication; and 3) in contrast to the existing remedies that deal with attacks as bounded disturbances/faults with known knowledge, we propose resilience strategies to handle unknown and unbounded attacks for exponential convergence of dynamic formation tracking errors, whereas most of existing results achieve uniformly ultimately boundedness (UUB) results. Numerical simulations are given to show the effectiveness of the proposed design.

Submitted 5 May, 2021; originally announced May 2021.

arXiv:2105.02423 [pdf, ps, other]

Attack-Resilient Distributed Convex Optimization of Linear Multi-Agent Systems Against Malicious Cyber-Attacks over Random Digraphs

Authors: Zhi Feng, Guoqiang Hu

Abstract: This paper addresses a resilient exponential distributed convex optimization problem for a heterogeneous linear multi-agent system under Denial-of-Service (DoS) attacks over random digraphs. The random digraphs are caused by unreliable networks and the DoS attacks, allowed to occur aperiodically, refer to an interruption of the communication channels carried out by the intelligent adversaries. In contrast to many existing distributed convex optimization works over a perfect communication network, the global optimal solution might not be sought under the adverse influences that result in performance degradations or even failures of optimization algorithms. The aforementioned setting poses certain technical challenges to optimization algorithm design and exponential convergence analysis. In this work, several resilient algorithms are presented such that a team of agents minimizes a sum of local non-quadratic cost functions in a safe and reliable manner with global exponential convergence. Inspired by the preliminary works in [15]-[18], an explicit analysis of frequency and duration of attacks is investigated to guarantee exponential optimal solutions. Numerical simulation results are presented to demonstrate the effectiveness of the proposed design.

Submitted 5 May, 2021; originally announced May 2021.

arXiv:2103.01452 [pdf, other]

Social Profit Optimization with Demand Response Management in Electricity Market: A Multi-timescale Leader-following Approach

Authors: Jianzheng Wang, Yipeng Pang, Guoqiang Hu

Abstract: In the electricity market, it is quite common that the market participants make "selfish" strategies to harvest the maximum profits for themselves, which may cause the social benefit loss and impair the sustainability of the society in the long term. Regarding this issue, in this work, we will discuss how the social profit can be improved through strategic demand response (DR) management. Specifically, we explore two interaction mechanisms in the market: Nash equilibrium (NE) and Stackelberg equilibrium (SE) among utility companies (UCs) and user-UC interactions, respectively. At the user side, each user determines the optimal energy-purchasing strategy to maximize its own profit. At the UC side, a governmental UC (g-UC) is considered, who aims to optimize the social profit of the market. Meanwhile, normal UCs play games to maximize their own profits. As a result, a basic leader-following problem among the UCs is formulated under the coordination of the independent system operator (ISO). Moreover, by using our proposed demand function amelioration (DFA) strategy, a multi-timescale leader-following problem is formulated. In this case, the maximal market efficiency can be achieved without changing the "selfish instinct" of normal UCs. In addition, by considering the local constraints for the UCs, two projection-based pricing algorithms are proposed for UCs, which can provide approximate optimal solutions for the resulting non-convex social profit optimization problems. The feasibility of the proposed algorithms is verified by using the concept of price of anarchy (PoA) in a multi-UC multi-user market model in the simulation.

Submitted 28 May, 2022; v1 submitted 1 March, 2021; originally announced March 2021.

Comments: 33 pages, 15 figures

arXiv:2012.11374 [pdf]

Dual-energy CT Reconstruction from Dual Quarter Scans

Authors: Wenkun Zhang, Ningning Liang, Linyuan Wang, Ailong Cai, Zhizhong Zheng, Chao Tang, Yizhong Wang, Lei Li, Bin Yan, Guoen Hu

Abstract: Compared with conventional single-energy computed tomography (CT), dual-energy CT (DECT) provides better material differentiation but most DECT imaging systems require dual full-angle projection data at different X-ray spectra. Relaxing the requirement of data acquisition is a particularly attractive research to promote the applications of DECT in a wide range of imaging areas. In this work, we design a novel DECT imaging scheme with dual quarter scans and propose an efficient method to reconstruct the desired DECT images from dual limited-angle projection data, which enables DECT on imaging configurations with half-scan and largely reduces scanning angles and radiation doses. We first study the characteristics of image artifacts under dual quarter scans scheme, and find that the directional limited-angle artifacts of DECT images are complementarily distributed in image domain because the corresponding X-rays of high- and low-energy scans are orthogonal. Inspired by this finding, a fusion CT image is generated by integrating the limited-angle DECT images of dual quarter scans. This strategy largely reduces the limited-angle artifacts and preserves the image edges and inner structures. Utilizing the capability of neural network in the modeling of nonlinear problem, a novel Anchor network with single-entry double-out architecture is designed in this work to yield the desired DECT images from the generated fusion CT image. Experimental results on the simulated and real data verify the effectiveness of the proposed method.

Submitted 17 December, 2020; originally announced December 2020.

arXiv:2011.14336 [pdf]

An Features Extraction and Recognition Method for Underwater Acoustic Target Based on ATCNN

Authors: Gang Hu, Kejun Wang, Liangliang Liu

Abstract: Facing the complex marine environment, it is extremely challenging to conduct underwater acoustic target recognition (UATR) using ship-radiated noise. Inspired by neural mechanism of auditory perception, this paper provides a new deep neural network trained by original underwater acoustic signals with depthwise separable convolution (DWS) and time-dilated convolution neural network, named auditory perception inspired time-dilated convolution neural network (ATCNN), and then implements detection and classification for underwater acoustic signals. The proposed ATCNN model consists of learnable features extractor and integration layer inspired by auditory perception, and time-dilated convolution inspired by language model. This paper decomposes original time-domain ship-radiated noise signals into different frequency components with depthwise separable convolution filter, and then extracts signal features based on auditory perception. The deep features are integrated on integration layer. The time-dilated convolution is used for long-term contextual modeling. As a result, like language model, intra-class and inter-class information can be fully used for UATR. For UATR task, the classification accuracy reaches 90.9%, which is the highest in contrast experiment. Experimental results show that ATCNN has great potential to improve the performance of UATR classification.

Submitted 29 November, 2020; originally announced November 2020.

arXiv:2009.11649 [pdf, ps, other]

Prescribed-Time Fully Distributed Nash Equilibrium Seeking in Noncooperative Games

Authors: Zhi Feng, Guoqiang Hu

Abstract: In this paper, we investigate a prescribed-time and fully distributed Nash Equilibrium (NE) seeking problem for continuous-time noncooperative games. By exploiting pseudo-gradient play and consensus-based schemes, various distributed NE seeking algorithms are presented over either fixed or switching communication topologies so that the convergence to the NE is reached in a prescribed time. In particular, a prescribed-time distributed NE seeking algorithm is firstly developed under a fixed graph to find the NE in a prior-given and user-defined time, provided that a static controller gain can be selected based on certain global information such as the algebraic connectivity of the communication graph and both the Lipschitz and monotone constants of the pseudo-gradient associated with players' objective functions. Secondly, a prescribed-time and fully distributed NE seeking algorithm is proposed to remove global information by designing heterogeneous dynamic gains that turn on-line the weights of the communication topology. Further, we extend this algorithm to accommodate jointly switching topologies. It is theoretically proved that the global convergence of those proposed algorithms to the NE is rigorously guaranteed in a prescribed time based on a time function transformation approach. In the last, numerical simulation results are presented to verify the effectiveness of the designs.

Submitted 22 September, 2020; originally announced September 2020.

Comments: arXiv admin note: text overlap with arXiv:2009.10666

arXiv:2009.10700 [pdf, ps, other]

Fault-Tolerant Formation Tracking of Heterogeneous Multi-Agent Systems with Time-Varying Actuator Faults and Its Application to Task-Space Cooperative Tracking of Manipulators

Authors: Z. Feng, G. Hu

Abstract: This paper addresses a formation tracking problem for nonlinear multi-agent systems with time-varying actuator faults, in which only a subset of agents has access to the leader's information over the directed leader-follower network with a spanning tree. Both the amplitudes and signs of control coefficients induced by actuator faults are unknown and time-varying. The aforementioned setting improves the practical relevance of the problem to be investigated, and meanwhile, it poses technical challenges to distributed controller design and asymptotic stability analysis. By introducing a distributed estimation and control framework, a novel distributed control law based on a Nussbaum gain technique is developed to achieve robust fault-tolerant formation tracking for heterogeneous nonlinear multi-agent systems with time-varying actuator faults. It can be proved that the asymptotic convergence is guaranteed. In addition, the proposed approach is applied to task-space cooperative tracking of networked manipulators irrespective of the uncertain kinematics, dynamics, and actuator faults. Numerical simulation results are presented to verify the effectiveness of the proposed designs.

Submitted 22 September, 2020; originally announced September 2020.

arXiv:2009.10666 [pdf, ps, other]

Attack-Resilient Distributed Algorithms for Exponential Nash Equilibrium Seeking

Authors: Zhi Feng, Guoqiang Hu

Abstract: This paper investigates a resilient distributed Nash equilibrium (NE) seeking problem on a directed communication network subject to malicious cyber-attacks. The considered attacks, named as Denial-of-Service (DoS) attacks, are allowed to occur aperiodically, which refers to interruptions of communication channels carried out by intelligent adversaries. In such an insecure network environment, the existence of cyber-attacks may result in undesirable performance degradations or even the failures of distributed algorithm to seek the NE of noncooperative games. Hence, the aforementioned setting can improve the practical relevance of the problem to be addressed and meanwhile, it poses some technical challenges to the distributed algorithm design and exponential convergence analysis. In contrast to the existing distributed NE seeking results over a perfect communication network, an attack-resilient distributed algorithm is presented such that the NE can be exactly reached with an exponential convergence rate in the presence of DoS attacks. Inspired by the previous works in [21]-[26], an explicit analysis of the attack frequency and duration is investigated to enable exponential NE seeking with resilience against attacks. Examples and numerical simulation results are given to show the effectiveness of the proposed design.

Submitted 22 September, 2020; originally announced September 2020.

Comments: 9 pages

arXiv:2009.03289 [pdf]

Data-Driven Transferred Energy Management Strategy for Hybrid Electric Vehicles via Deep Reinforcement Learning

Authors: Hao Chen, Gang Guo, Bangbei Tang, Guo Hu, Xiaolin Tang, Teng Liu

Abstract: Real-time applications of energy management strategies (EMSs) in hybrid electric vehicles (HEVs) are the harshest requirements for researchers and engineers. Inspired by the excellent problem-solving capabilities of deep reinforcement learning (DRL), this paper proposes a real-time EMS via incorporating the DRL method and transfer learning (TL). The related EMSs are derived from and evaluated on the real-world collected driving cycle dataset from Transportation Secure Data Center (TSDC). The concrete DRL algorithm is proximal policy optimization (PPO) belonging to the policy gradient (PG) techniques. For specification, many source driving cycles are utilized for training the parameters of deep network based on PPO. The learned parameters are transformed into the target driving cycles under the TL framework. The EMSs related to the target driving cycles are estimated and compared in different training conditions. Simulation results indicate that the presented transfer DRL-based EMS could effectively reduce time consumption and guarantee control performance.

Submitted 12 December, 2022; v1 submitted 7 September, 2020; originally announced September 2020.

Comments: 28 pages, 14 figures

arXiv:2007.10126 [pdf]

Human-like Energy Management Based on Deep Reinforcement Learning and Historical Driving Experiences

Authors: Hao Chen, Xiaolin Tang, Guo Hu, Teng Liu

Abstract: Development of hybrid electric vehicles depends on an advanced and efficient energy management strategy (EMS). With online and real-time requirements in mind, this article presents a human-like energy management framework for hybrid electric vehicles according to deep reinforcement learning methods and collected historical driving data. The hybrid powertrain studied has a series-parallel topology, and its control-oriented modeling is founded first. Then, the distinctive deep reinforcement learning (DRL) algorithm, named deep deterministic policy gradient (DDPG), is introduced. To enhance the derived power split controls in the DRL framework, the global optimal control trajectories obtained from dynamic programming (DP) are regarded as expert knowledge to train the DDPG model. This operation guarantees the optimality of the proposed control architecture. Moreover, the collected historical driving data based on experienced drivers are employed to replace the DP-based controls, and thus construct the human-like EMSs. Finally, different categories of experiments are executed to estimate the optimality and adaptability of the proposed human-like EMS. Improvements in fuel economy and convergence rate indicate the effectiveness of the constructed control structure.

Submitted 25 September, 2023; v1 submitted 16 July, 2020; originally announced July 2020.

Comments: 8 pages, 10 figures

arXiv:2003.08208 [pdf, other]

Distributed Control of Multi-zone HVAC Systems Considering Indoor Air Quality

Authors: Yu Yang, Seshadhri Srinivasan, Guoqiang Hu, Costas J. Spanos

Abstract: This paper studies a scalable control method for multi-zone heating, ventilation and air-conditioning (HVAC) systems to optimize the energy cost for maintaining thermal comfort and indoor air quality (IAQ) (represented by CO2) simultaneously. This problem is computationally challenging due to the complex system dynamics, various spatial and temporal couplings as well as multiple control variables to be coordinated. To address the challenges, we propose a two-level distributed method (TLDM) with an upper level and lower level control integrated. The upper level computes zone mass flow rates for maintaining zone thermal comfort with minimal energy cost, and then the lower level strategically regulates zone mass flow rates and the ventilation rate to achieve IAQ while preserving the near energy saving performance of upper level. As both the upper and lower level computation are deployed in a distributed manner, the proposed method is scalable and computationally efficient. The near-optimal performance of the method in energy cost saving is demonstrated through comparison with the centralized method. In addition, the comparisons with the existing distributed method show that our method can provide IAQ with only little increase of energy cost while the latter fails. Moreover, we demonstrate our method outperforms the demand controlled ventilation strategies (DCVs) for IAQ management with about 8-10% energy cost reduction.

Submitted 4 January, 2021; v1 submitted 17 March, 2020; originally announced March 2020.

Comments: 12 pages, 12 figures

arXiv:2002.03914 [pdf, other]

Smartphone Impostor Detection with Built-in Sensors and Deep Learning

Authors: Guangyuan Hu, Zecheng He, Ruby Lee

Abstract: In this paper, we show that sensor-based impostor detection with deep learning can achieve excellent impostor detection accuracy at lower hardware cost compared to past work on sensor-based user authentication (the inverse problem) which used more conventional machine learning algorithms. While these methods use other smartphone users' sensor data to build the (user, non-user) classification models, we go further to show that using only the legitimate user's sensor data can still achieve very good accuracy while preserving the privacy of the user's sensor data (behavioral biometrics). For this use case, a key contribution is showing that the detection accuracy of a Recurrent Neural Network (RNN) deep learning model can be significantly improved by comparing prediction error distributions. This requires generating and comparing empirical probability distributions, which we show in an efficient hardware design. Another novel contribution is in the design of SID (Smartphone impostor Detection), a minimalist hardware accelerator that can be integrated into future smartphones for efficient impostor detection for different scenarios. Our SID module can implement many common Machine Learning and Deep Learning algorithms. SID is also scalable in parallelism and performance and easy to program. We show an FPGA prototype of SID, which can provide more than enough performance for real-time impostor detection, with very low hardware complexity and power consumption (one to two orders of magnitude less than related performance-oriented FPGA accelerators). We also show that the FPGA implementation of SID consumes 64.41X less energy than an implementation using the CPU with a GPU.

Submitted 10 February, 2020; originally announced February 2020.

arXiv:1911.00840 [pdf, other]

Stochastic Optimal Control of HVAC system for Energy-efficient Buildings

Authors: Yu Yang, Guoqiang Hu, Costas J. Spanos

Abstract: The heating, ventilation and air-conditioning (HVAC) system accounts for substantial energy use in buildings, yet many occupants still do not feel comfortable staying inside. This poses the issue of developing energy-efficient HVAC control, i.e., reducing energy use (cost) while simultaneously enhancing human comfort. This paper pursues this objective and studies stochastic optimal HVAC control subject to uncertain thermal demand (i.e., the weather, occupancy, etc.). In particular, we incorporate the elaborate predicted mean vote (PMV) thermal comfort model in the optimization. The problem is computationally challenging due to the non-linear and non-analytical constraints imposed by the system dynamics and the PMV model. We make the following contributions to address it. First, we formulate the problem as a Markov decision process (MDP), which is a desirable modeling technique capable of handling the complexities. Second, we propose a gradient-based learning (GB-L) method for progressively learning a stochastic control policy off-line and storing it for on-line execution. Third, we prove theoretically that the learning method converges to the optimal policies, and demonstrate its performance (i.e., energy cost, thermal comfort and on-line computation) for HVAC control via simulations. Comparisons with the existing model predictive control based relaxation (MPC-R) method, which is assumed to have accurate future information and is supposed to provide near-optimal bounds, show that although there is some performance loss in energy cost reduction (i.e., 6.5%), the proposed method enables efficient on-line implementation (less than 1 second) and provides a high probability of thermal comfort under uncertainties.

Submitted 4 February, 2021; v1 submitted 3 November, 2019; originally announced November 2019.

Comments: 11 pages, 10 figures

arXiv:1909.08331 [pdf, other]

Coupling Chaotic System Based on Unit Transform and Its Applications in Image Encryption

Authors: Guozhen Hu, Baobin Li

Abstract: Chaotic maps are very important for establishing chaos-based image encryption systems. This paper introduces a coupling chaotic system based on a certain unit transform, which can combine any two 1D chaotic maps to generate a new one with excellent performance. The chaotic behavior analysis has verified this coupling system's effectiveness and progress. In particular, we give a specific strategy for selecting an appropriate unit transform function to enhance the chaos of the generated maps. In addition, a new chaos-based pseudo-random number generator, abbreviated as CBPRNG, is designed to improve the distribution of chaotic sequences. We give a mathematical illustration of the uniformity of CBPRNG and test its randomness. Moreover, based on CBPRNG, an image encryption algorithm is introduced. Simulation results and security analysis indicate that the proposed image encryption scheme is competitive with some advanced existing methods.

Submitted 18 September, 2019; originally announced September 2019.

Comments: 41 pages, 15 figures

arXiv:1908.07307 [pdf, other]

Investigation of wind pressures on tall building under interference effects using machine learning techniques

Authors: Gang Hu, Lingbo Liu, Dacheng Tao, Jie Song, K. C. S. Kwok

Abstract: Interference effects of tall buildings have attracted numerous studies due to the boom of clusters of tall buildings in megacities. Fully understanding the interference effects of buildings often requires a substantial number of wind tunnel tests. Limited wind tunnel tests that only cover part of the interference scenarios are unable to fully reveal the interference effects. This study used machine learning techniques to resolve the conflict between limited wind tunnel tests that produce unreliable results and a complete investigation of the interference effects that is costly and time-consuming. Four machine learning models, namely decision tree, random forest, XGBoost, and generative adversarial networks (GANs), were trained based on 30% of a dataset to predict both mean and fluctuating pressure coefficients on the principal building. The GANs model exhibited the best performance in predicting these pressure coefficients. A number of GANs models were then trained based on different portions of the dataset ranging from 10% to 90%. It was found that the GANs model based on 30% of the dataset is capable of accurately predicting both mean and fluctuating pressure coefficients under unseen interference conditions. By using this GANs model, 70% of the wind tunnel test cases can be saved, largely alleviating the cost of this kind of wind tunnel testing study.

Submitted 20 August, 2019; originally announced August 2019.

Comments: 15 pages, 14 figures

arXiv:1905.10934 [pdf, other]

HVAC Energy Cost Optimization for a Multi-zone Building via a Decentralized Approach

Authors: Yu Yang, Guoqiang Hu, Costas J. Spanos

Abstract: It has been well acknowledged that buildings account for a large proportion of the world's energy consumption. However, the energy use of buildings, especially the heating, ventilation and air-conditioning (HVAC), is far from being efficient. There still exists a dramatic potential to save energy through improving building energy efficiency. Therefore, this paper studies the control of HVAC systems for multi-zone buildings with the objective of reducing energy consumption cost while satisfying thermal comfort. In particular, the thermal couplings due to the heat transfer between the adjacent zones are incorporated in the optimization. Considering that a centralized method is generally computationally prohibitive for large buildings, an efficient decentralized approach is developed, based on the Accelerated Distributed Augmented Lagrangian (ADAL) method [1]. To evaluate the performance of the proposed method, we first compare it with a centralized method, in which the optimal solution of a small-scale problem can be obtained. We find that this decentralized approach comes close to the optimal solution of the problem. Further, this decentralized approach is compared with the Distributed Token-Based Scheduling Strategy (DTBSS) [2]. The numerical results reveal that when the number of zones is relatively small (less than 20), the two decentralized methods achieve comparable performance regarding the cost of the HVAC system. However, as the number of zones in the building increases, the proposed decentralized approach demonstrates better performance with a considerable reduction of the total cost. Moreover, the decentralized approach proposed in this paper demonstrates better scalability with less average computation required.

Submitted 21 May, 2019; originally announced May 2019.

Comments: 13 pages, 8 figures

arXiv:1802.03122 [pdf, ps, other]

doi 10.1109/TCYB.2021.3119461

Delay-Dependent Distributed Kalman Fusion Estimation with Dimensionality Reduction in Cyber-Physical Systems

Authors: Bo Chen, Daniel W. C. Ho, Guoqiang Hu, Li Yu

Abstract: This paper studies the distributed dimensionality reduction fusion estimation problem with communication delays for a class of cyber-physical systems (CPSs). The raw measurements are preprocessed in each sink node to obtain the local optimal estimate (LOE) of a CPS, and the compressed LOE under dimensionality reduction encounters communication delays during transmission. In this case, a mathematical model with a compensation strategy is proposed to characterize the dimensionality reduction and communication delays. This model also has the property of reducing the information loss caused by the dimensionality reduction and delays. Based on this model, a recursive distributed Kalman fusion estimator (DKFE) is derived using the optimal weighted fusion criterion in the linear minimum variance sense. A stability condition for the DKFE, which can be easily verified by existing software, is derived. In addition, this condition guarantees that the estimation error covariance matrix of the DKFE converges to the unique steady-state matrix for any initial values, and thus the steady-state DKFE (SDKFE) is given. Notice that the computational complexity of the SDKFE is much lower than that of the DKFE. Moreover, a probability selection criterion for determining the dimensionality reduction strategy is also presented to guarantee the stability of the DKFE. Two illustrative examples are given to show the advantage and effectiveness of the proposed methods.

Submitted 1 July, 2021; v1 submitted 8 February, 2018; originally announced February 2018.

Journal ref: IEEE Transactions on Cybernetics, 2021

arXiv:1708.08583 [pdf, ps, other]

doi 10.1109/TAC.2018.2849612

A New Approach to Linear/Nonlinear Distributed Fusion Estimation Problem

Authors: Bo Chen, Guoqiang Hu, Daniel W. C. Ho, Li Yu

Abstract: Disturbance noises are always bounded in a practical system, while fusion estimation aims to best utilize multiple sensor data containing noises for the purpose of estimating a quantity (a parameter or process). However, few results focus on the information fusion estimation problem under bounded noises. In this paper, we study the distributed fusion estimation problem for linear time-varying systems and nonlinear systems with bounded noises, where the addressed noises do not provide any statistical information, and are unknown but bounded. When considering linear time-varying fusion systems with bounded noises, a new local Kalman-like estimator is designed such that the square error of the estimator is bounded as time goes to $\infty$. A novel constructive method is proposed to find an upper bound of the fusion estimation error; a convex optimization problem on the design of an optimal weighting fusion criterion is then established in terms of linear matrix inequalities, which can be solved by standard software packages. Furthermore, following the design method for linear time-varying fusion systems, each local nonlinear estimator is derived for nonlinear systems with bounded noises by using Taylor series expansion, and a corresponding distributed fusion criterion is obtained by solving a convex optimization problem. Finally, a target tracking system and the localization of a mobile robot are given as examples to show the advantages and effectiveness of the proposed methods.

Submitted 19 July, 2018; v1 submitted 28 August, 2017; originally announced August 2017.

Comments: 9 pages, 3 figures

MSC Class: 93E10; 93E11

Journal ref: IEEE Transactions on Automatic Control, 2018

arXiv:1707.05044 [pdf, ps, other]

Economic MPC of Nonlinear Systems with Non-Monotonic Lyapunov Functions and Its Application to HVAC Control

Authors: Zheming Wang, Guoqiang Hu

Abstract: This paper proposes a Lyapunov-based economic MPC scheme for nonlinear systems with non-monotonic Lyapunov functions. Relaxed Lyapunov-based constraints are used in the MPC formulation to improve the economic performance. These constraints will enforce a Lyapunov decrease after every few steps. Recursive feasibility and asymptotic convergence to the steady state can be achieved using Lyapunov-like stability analysis. The proposed economic MPC can be applied to minimize energy consumption in HVAC control of commercial buildings. The Lyapunov-based constraints in the online MPC problem enable the tracking of the desired set-point temperature. The performance is demonstrated by a virtual building composed of two adjacent zones.

Submitted 17 July, 2017; originally announced July 2017.

arXiv:1508.02636 [pdf, ps, other]

doi 10.1109/TCYB.2016.2524452

Game Design and Analysis for Price based Demand Response: An Aggregate Game Approach

Authors: Maojiao Ye, Guoqiang Hu

Abstract: In this paper, an aggregate game approach is proposed for the modeling and analysis of energy consumption control in the smart grid. Since the electricity user's cost function depends on the aggregate load, which is unknown to the end users, an aggregate load estimator is employed to estimate it. Based on the communication among the users about their estimates of the aggregate load, Nash equilibrium seeking strategies are proposed for the electricity users. By using singular perturbation analysis and Lyapunov stability analysis, a local convergence result to the Nash equilibrium is presented for the energy consumption game that may have multiple Nash equilibria. For the energy consumption game with a unique Nash equilibrium, it is shown that the players' strategies converge to the Nash equilibrium non-locally. More specifically, if the unique Nash equilibrium is an inner Nash equilibrium, then the convergence rate can be quantified. An energy consumption game with stubborn players is also investigated. Convergence to the best response strategies for the rational players is ensured. Numerical examples are provided to verify the effectiveness of the proposed methods.

Submitted 1 February, 2016; v1 submitted 3 August, 2015; originally announced August 2015.

Showing 1–39 of 39 results for author: Hu, G