Search | arXiv e-print repository

arXiv:2406.18536 [pdf, other]

Reliable Interval Prediction of Minimum Operating Voltage Based on On-chip Monitors via Conformalized Quantile Regression

Authors: Yuxuan Yin, Xiaoxiao Wang, Rebecca Chen, Chen He, Peng Li

Abstract: Predicting the minimum operating voltage ($V_{min}$) of chips is one of the important techniques for improving the manufacturing testing flow, as well as ensuring the long-term reliability and safety of in-field systems. Current $V_{min}$ prediction methods often provide only point estimates, necessitating additional techniques for constructing prediction confidence intervals to cover uncertainties caused by different sources of variations. While some existing techniques offer region predictions, they rely on certain distributional assumptions and/or provide no coverage guarantees. In response to these limitations, we propose a novel distribution-free $V_{min}$ interval estimation methodology possessing a theoretical guarantee of coverage. Our approach leverages conformalized quantile regression and on-chip monitors to generate reliable prediction intervals. We demonstrate the effectiveness of the proposed method on an industrial 5nm automotive chip dataset. Moreover, we show that the use of on-chip monitors can reduce the interval length significantly for $V_{min}$ prediction.

Submitted 3 May, 2024; originally announced June 2024.

Comments: Accepted by DATE 2024. Camera-ready version
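
As a rough illustration of the conformalized quantile regression idea named in the abstract above, the following Python sketch shows only the generic split-conformal calibration step. It is not the authors' implementation. The function names, the millivolt values, and the assumption that lower and upper quantile predictions come from some already-trained quantile regressor are all hypothetical.

```python
import math
import numpy as np


def cqr_calibrate(lo_cal, hi_cal, y_cal, alpha):
    """Return the conformal correction for a target miscoverage rate alpha.

    lo_cal / hi_cal are lower and upper quantile predictions on a held-out
    calibration set (assumed to come from any quantile regressor), and
    y_cal holds the measured values for the same chips.
    """
    lo_cal, hi_cal, y_cal = map(np.asarray, (lo_cal, hi_cal, y_cal))
    # Conformity score: how far each measurement falls outside its interval
    # (negative when the measurement lies strictly inside).
    scores = np.maximum(lo_cal - y_cal, y_cal - hi_cal)
    n = len(scores)
    # Finite-sample rank used by split conformal prediction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return float("inf")
    return float(np.sort(scores)[k - 1])


def cqr_interval(lo_test, hi_test, correction):
    """Widen (or shrink) the raw quantile interval by the correction."""
    return np.asarray(lo_test) - correction, np.asarray(hi_test) + correction


# Hypothetical calibration data in millivolts, for illustration only.
lo_cal = [500, 520, 480, 510]
hi_cal = [600, 620, 580, 610]
y_cal = [550, 630, 470, 560]

correction = cqr_calibrate(lo_cal, hi_cal, y_cal, alpha=0.25)
lower, upper = cqr_interval([505], [605], correction)
print(correction, lower, upper)
```

With exchangeable calibration and test data, intervals built this way cover the true value with probability at least 1 - alpha, whatever the underlying distribution. This is the kind of distribution-free guarantee the abstract refers to.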

arXiv:2406.11917 [pdf, other]

doi 10.1016/j.aei.2024.102568

Interpretable modulated differentiable STFT and physics-informed balanced spectrum metric for freight train wheelset bearing cross-machine transfer fault diagnosis under speed fluctuations

Authors: Chao He, Hongmei Shi, Ruixin Li, Jianbo Li, ZuJun Yu

Abstract: The service conditions of wheelset bearings, which are key components, have a direct impact on the safe operation of railway heavy haul freight trains. However, speed fluctuation of the trains and few fault samples are the two main problems that restrict the accuracy of bearing fault diagnosis. Therefore, a cross-machine transfer diagnosis (pyDSN) network coupled with interpretable modulated differentiable short-time Fourier transform (STFT) and physics-informed balanced spectrum quality metric is proposed to learn domain-invariant and discriminative features under time-varying speeds. Firstly, because fixed windows are insufficient for extracting frequency components of time-varying speed signals, a modulated differentiable STFT (MDSTFT) that is interpretable with STFT-informed theoretical support is proposed to extract the robust time-frequency spectrum (TFS). During the training process, multiple windows with different lengths dynamically change. Also, in addition to the classification metric and domain discrepancy metric, we creatively introduce a third kind of metric, referred to as the physics-informed metric, to enhance transferable TFS. A physics-informed balanced spectrum quality (BSQ) regularization loss is devised to guide an optimization direction for MDSTFT and the model. With it, not only can the model acquire high-quality TFS, but a physics-restricted domain adaptation network can also be acquired, making it learn real-world physics knowledge and ultimately diminishing the domain discrepancy across different datasets. The experiment is conducted in the scenario of migrating from the laboratory datasets to the freight train dataset, indicating that the hybrid-driven pyDSN outperforms existing methods and has practical value.

Submitted 16 June, 2024; originally announced June 2024.

Journal ref: Advanced Engineering Informatics, 2024

arXiv:2405.05353 [pdf, other]

Eco-driving Accounting for Interactive Cut-in Vehicles

Authors: Chaozhe R. He, Nan Li

Abstract: Automated vehicles can gather information about surrounding traffic and plan safe and energy-efficient driving behavior, which is known as eco-driving. Conventional eco-driving designs only consider preceding vehicles in the same lane as the ego vehicle. In heavy traffic, however, vehicles in adjacent lanes may cut into the ego vehicle's lane, influencing the ego vehicle's eco-driving behavior and compromising the energy-saving performance. Therefore, in this paper, we propose an eco-driving design that accounts for neighbor vehicles that have cut-in intentions. Specifically, we integrate a leader-follower game to predict the interaction between the ego and the cut-in vehicles and a model-predictive controller for planning energy-efficient behavior for the automated ego vehicle. We show that the leader-follower game model can reasonably represent the interactive motion between the ego vehicle and the cut-in vehicle. More importantly, we show that the proposed design can predict and react to neighbor vehicles' cut-in behaviors properly, leading to improved energy efficiency in cut-in scenarios compared to baseline designs that consider preceding vehicles only.

Submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted at 2024 IEEE International Conference on Mobility: Operations, Services, and Technologies (MOST)

arXiv:2404.15992 [pdf, other]

HDDGAN: A Heterogeneous Dual-Discriminator Generative Adversarial Network for Infrared and Visible Image Fusion

Authors: Guosheng Lu, Zile Fang, Chunming He, Zhigang Zhao

Abstract: Infrared and visible image fusion (IVIF) aims to preserve thermal radiation information from infrared images while integrating texture details from visible images, enabling the capture of important features and hidden details of subjects in complex scenes and disturbed environments. Consequently, IVIF offers distinct advantages in practical applications such as video surveillance, night navigation, and target recognition. However, prevailing methods often face challenges in simultaneously capturing thermal region features and detailed information due to the disparate characteristics of infrared and visible images. Consequently, fusion outcomes frequently entail a compromise between thermal target area information and texture details. In this study, we introduce a novel heterogeneous dual-discriminator generative adversarial network (HDDGAN) to address this issue. Specifically, the generator is structured as a multi-scale skip-connected structure, facilitating the extraction of essential features from different source images. To enhance the information representation ability of the fusion result, an attention mechanism is employed to construct the information fusion layer within the generator, leveraging the disparities between the source images. Moreover, recognizing the distinct learning requirements of information in infrared and visible images, we design two discriminators with differing structures. This approach aims to guide the model to learn salient information from infrared images while simultaneously capturing detailed information from visible images. Extensive experiments conducted on various public datasets demonstrate the superiority of our proposed HDDGAN over other state-of-the-art (SOTA) algorithms, highlighting its enhanced potential for practical applications.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.15341 [pdf, other]

Classifier-guided neural blind deconvolution: a physics-informed denoising module for bearing fault diagnosis under heavy noise

Authors: **g-Xiao Liao, Chao He, Jipu Li, **wei Sun, Shi** Zhang, Xiaoge Zhang

Abstract: Blind deconvolution (BD) has been demonstrated as an efficacious approach for extracting bearing fault-specific features from vibration signals under strong background noise. Despite BD's desirable feature in adaptability and mathematical interpretability, a significant challenge persists: How to effectively integrate BD with fault-diagnosing classifiers? This issue arises because the traditional BD method is solely designed for feature extraction with its own optimizer and objective function. When BD is combined with downstream deep learning classifiers, the different learning objectives will be in conflict. To address this problem, this paper introduces classifier-guided BD (ClassBD) for joint learning of BD-based feature extraction and deep learning-based fault classification. Firstly, we present a time and frequency neural BD that employs neural networks to implement conventional BD, thereby facilitating the seamless integration of BD and the deep learning classifier for co-optimization of model parameters. Subsequently, we develop a unified framework to use a deep learning classifier to guide the learning of BD filters. In addition, we devise a physics-informed loss function composed of kurtosis, $l_2/l_4$ norm, and a cross-entropy loss to jointly optimize the BD filters and deep learning classifier. Consequently, the fault labels provide useful information to direct BD to extract features that distinguish classes amidst strong noise. To the best of our knowledge, this is the first of its kind that BD is successfully applied to bearing fault diagnosis. Experimental results from three datasets demonstrate that ClassBD outperforms other state-of-the-art methods under noisy conditions.

Submitted 10 April, 2024; originally announced April 2024.
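
The abstract above names kurtosis and the $l_2/l_4$ norm as ingredients of a physics-informed loss. The Python sketch below only computes these two textbook quantities for a signal, to show why they respond to impulsive content. How the paper weights or combines them in its loss is not stated in the abstract, so nothing here should be read as the ClassBD loss itself. The function names and toy signals are hypothetical.

```python
import numpy as np


def kurtosis(x):
    """Fourth central moment divided by the squared variance."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    second = np.mean(centered ** 2)
    fourth = np.mean(centered ** 4)
    return float(fourth / second ** 2)


def l2_over_l4(x):
    """Ratio of the l2 norm to the l4 norm of the signal."""
    x = np.asarray(x, dtype=float)
    l2 = np.sum(x ** 2) ** 0.5
    l4 = np.sum(x ** 4) ** 0.25
    return float(l2 / l4)


# Toy signals (illustrative only): one evenly spread, one with a single impulse.
spread = [1, -1, 1, -1]
impulsive = [0, 0, 0, 2]

for name, signal in (("spread", spread), ("impulsive", impulsive)):
    print(name, kurtosis(signal), l2_over_l4(signal))
```

On these toy inputs the impulsive signal has the higher kurtosis and the lower $l_2/l_4$ ratio. This is why sparsity measures of this kind are commonly used to highlight fault impulses in vibration signals.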

arXiv:2404.08334 [pdf, other]

Guaranteed Completion of Complex Tasks via Temporal Logic Trees and Hamilton-Jacobi Reachability

Authors: Frank J. Jiang, Kaj Munhoz Arfvidsson, Chong He, Mo Chen, Karl H. Johansson

Abstract: In this paper, we present an approach for guaranteeing the completion of complex tasks with cyber-physical systems (CPS). Specifically, we leverage temporal logic trees constructed using Hamilton-Jacobi reachability analysis to (1) check for the existence of control policies that complete a specified task and (2) develop a computationally-efficient approach to synthesize the full set of control inputs the CPS can implement in real-time to ensure the task is completed. We show that, by checking the approximation directions of each state set in the temporal logic tree, we can check if the temporal logic tree suffers from the "leaking corner issue," where the intersection of reachable sets yields an incorrect approximation. By ensuring a temporal logic tree has no leaking corners, we know the temporal logic tree correctly verifies the existence of control policies that satisfy the specified task. After confirming the existence of control policies, we show that we can leverage the value functions obtained through Hamilton-Jacobi reachability analysis to efficiently compute the set of control inputs the CPS can implement throughout the deployment time horizon to guarantee the completion of the specified task. Finally, we use a newly released Python toolbox to evaluate the presented approach on a simulated driving task.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2403.15853 [pdf]

An edge detection-based deep learning approach for tear meniscus height measurement

Authors: Kesheng Wang, Kunhui Xu, Xiaoyu Chen, Chunlei He, Jianfeng Zhang, Dexing Kong, Qi Dai, Shoujun Huang

Abstract: Automatic measurements of tear meniscus height (TMH) have been achieved by using deep learning techniques; however, annotation is significantly influenced by subjective factors and is both time-consuming and labor-intensive. In this paper, we introduce an automatic TMH measurement technique based on edge detection-assisted annotation within a deep learning framework. This method generates mask labels less affected by subjective factors with enhanced efficiency compared to previous annotation approaches. For improved segmentation of the pupil and tear meniscus areas, the convolutional neural network Inceptionv3 was first implemented as an image quality assessment model, effectively identifying higher-quality images with an accuracy of 98.224%. Subsequently, by using the generated labels, various algorithms, including Unet, ResUnet, Deeplabv3+FcnResnet101, Deeplabv3+FcnResnet50, FcnResnet50, and FcnResnet101 were trained, with Unet demonstrating the best performance. Finally, Unet was used for automatic pupil and tear meniscus segmentation to locate the center of the pupil and calculate TMH, respectively. An evaluation of the mask quality predicted by Unet indicated a Mean Intersection over Union of 0.9362, a recall of 0.9261, a precision of 0.9423, and an F1-Score of 0.9326. Additionally, the TMH predicted by the model was assessed, with the fitting curve represented as y = 0.982x - 0.862, an overall correlation coefficient of r^2 = 0.961, and an accuracy of 94.80% (237/250). In summary, the algorithm can automatically screen images based on their quality, segment the pupil and tear meniscus areas, and automatically measure TMH. Measurement results using the AI algorithm demonstrate a high level of consistency with manual measurements, offering significant support to clinical doctors in diagnosing dry eye disease.

Submitted 23 March, 2024; originally announced March 2024.

Comments: 22 pages, 5 figures

arXiv:2403.00529 [pdf, other]

VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis

Authors: Weiwei Lin, Chenhang He, Man-Wai Mak, Jiachen Lian, Kong Aik Lee

Abstract: Achieving nuanced and accurate emulation of human voice has been a longstanding goal in artificial intelligence. Although significant progress has been made in recent years, the mainstream of speech synthesis models still relies on supervised speaker modeling and explicit reference utterances. However, there are many aspects of human voice, such as emotion, intonation, and speaking style, for which it is hard to obtain accurate labels. In this paper, we propose VoxGenesis, a novel unsupervised speech synthesis framework that can discover a latent speaker manifold and meaningful voice editing directions without supervision. VoxGenesis is conceptually simple. Instead of mapping speech features to waveforms deterministically, VoxGenesis transforms a Gaussian distribution into speech distributions conditioned and aligned by semantic tokens. This forces the model to learn a speaker distribution disentangled from the semantic content. During inference, sampling from the Gaussian distribution enables the creation of novel speakers with distinct characteristics. More importantly, the exploration of latent space uncovers human-interpretable directions associated with specific speaker characteristics such as gender attributes, pitch, tone, and emotion, allowing for voice editing by manipulating the latent codes along these identified directions. We conduct extensive experiments to evaluate the proposed VoxGenesis using both subjective and objective metrics, finding that it produces significantly more diverse and realistic speakers with distinct characteristics than the previous approaches. We also show that latent space manipulation produces consistent and human-identifiable effects that are not detrimental to the speech quality, which was not possible with previous approaches. Audio samples of VoxGenesis can be found at: https://bit.ly/VoxGenesis.

Submitted 1 March, 2024; originally announced March 2024.

Comments: preprint

arXiv:2402.17645 [pdf, other]

SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation

Authors: Shuangrui Ding, Zihan Liu, Xiaoyi Dong, Pan Zhang, Rui Qian, Conghui He, Dahua Lin, Jiaqi Wang

Abstract: We present SongComposer, an innovative LLM designed for song composition. It can understand and generate melodies and lyrics in symbolic song representations by leveraging the capability of LLMs. Existing music-related LLMs treat music as quantized audio signals, but such implicit encoding is inefficient and inflexible. In contrast, we resort to symbolic song representation, the mature and efficient way humans designed for music, and enable the LLM to explicitly compose songs like humans. In practice, we design a novel tuple format for the lyric and three note attributes (pitch, duration, and rest duration) in the melody, which guarantees that the LLM understands musical symbols correctly and realizes precise alignment between lyrics and melody. To impart basic music understanding to the LLM, we carefully collected SongCompose-PT, a large-scale song pretraining dataset that includes lyrics, melodies, and paired lyrics-melodies in either Chinese or English. After adequate pre-training, 10K carefully crafted QA pairs are used to empower the LLM with the instruction-following capability and solve diverse tasks. With extensive experiments, SongComposer demonstrates superior performance in lyric-to-melody generation, melody-to-lyric generation, song continuation, and text-to-song creation, outperforming advanced LLMs like GPT-4.

Submitted 27 February, 2024; originally announced February 2024.

Comments: project page: https://pjlab-songcomposer.github.io/ code: https://github.com/pjlab-songcomposer/songcomposer

arXiv:2311.13196 [pdf, other]

Optimal Time of Arrival Estimation for MIMO Backscatter Channels

Authors: Chen He, Luyang Han, Z. Jane Wang

Abstract: In this paper, we propose a novel time of arrival (TOA) estimator for multiple-input-multiple-output (MIMO) backscatter channels in closed form. The proposed estimator refines the estimation precision from the topological structure of the MIMO backscatter channels, and can considerably enhance the estimation accuracy. Particularly, we show that for the general $M \times N$ bistatic topology, the mean square error (MSE) is $\frac{M+N-1}{MN}\sigma^2_0$, and for the general $M \times M$ monostatic topology, it is $\frac{2M-1}{M^2}\sigma^2_0$ for the diagonal subchannels, and $\frac{M-1}{M^2}\sigma^2_0$ for the off-diagonal subchannels, where $\sigma^2_0$ is the MSE of the conventional least square estimator. In addition, we derive the Cramer-Rao lower bound (CRLB) for MIMO backscatter TOA estimation which indicates that the proposed estimator is optimal. Simulation results verify that the proposed TOA estimator can considerably improve both estimation and positioning accuracy, especially when the MIMO scale is large.

Submitted 22 November, 2023; originally announced November 2023.
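
To make the closed-form MSE expressions quoted in the abstract above concrete, the short Python sketch below evaluates the three factors that multiply the least-squares MSE. It only transcribes the formulas as stated in the abstract. The function names and the 4 x 4 example are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def bistatic_factor(m, n):
    """MSE factor (M + N - 1) / (M N) for an M x N bistatic topology."""
    return Fraction(m + n - 1, m * n)


def monostatic_diagonal_factor(m):
    """MSE factor (2M - 1) / M^2 for diagonal subchannels, M x M monostatic."""
    return Fraction(2 * m - 1, m * m)


def monostatic_off_diagonal_factor(m):
    """MSE factor (M - 1) / M^2 for off-diagonal subchannels, M x M monostatic."""
    return Fraction(m - 1, m * m)


# Example: a hypothetical 4 x 4 antenna configuration.
print(bistatic_factor(4, 4))                # 7/16
print(monostatic_diagonal_factor(4))        # 7/16
print(monostatic_off_diagonal_factor(4))    # 3/16
```

All three factors shrink roughly like 1/M as the array grows, which matches the abstract's remark that the gain is largest when the MIMO scale is large. For a single-antenna 1 x 1 bistatic link the factor is 1, meaning no improvement over the least-squares baseline.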

arXiv:2307.07829 [pdf, other]

HQG-Net: Unpaired Medical Image Enhancement with High-Quality Guidance

Authors: Chunming He, Kai Li, Guoxia Xu, Jiangpeng Yan, Longxiang Tang, Yulun Zhang, Xiu Li, Yaowei Wang

Abstract: Unpaired Medical Image Enhancement (UMIE) aims to transform a low-quality (LQ) medical image into a high-quality (HQ) one without relying on paired images for training. While most existing approaches are based on Pix2Pix/CycleGAN and are effective to some extent, they fail to explicitly use HQ information to guide the enhancement process, which can lead to undesired artifacts and structural distortions. In this paper, we propose a novel UMIE approach that avoids the above limitation of existing methods by directly encoding HQ cues into the LQ enhancement process in a variational fashion and thus models the UMIE task under the joint distribution between the LQ and HQ domains. Specifically, we extract features from an HQ image and explicitly insert the features, which are expected to encode HQ cues, into the enhancement network to guide the LQ enhancement with the variational normalization module. We train the enhancement network adversarially with a discriminator to ensure the generated HQ image falls into the HQ domain. We further propose a content-aware loss to guide the enhancement process with wavelet-based pixel-level and multi-encoder-based feature-level constraints. Additionally, as a key motivation for performing image enhancement is to make the enhanced images serve better for downstream tasks, we propose a bi-level learning scheme to optimize the UMIE task and downstream tasks cooperatively, helping generate HQ images both visually appealing and favorable for downstream tasks. Experiments on three medical datasets, including two newly collected datasets, verify that the proposed method outperforms existing techniques in terms of both enhancement quality and downstream task performance. We will make the code and the newly collected datasets publicly available for community study.

Submitted 15 July, 2023; originally announced July 2023.

Comments: 14 pages, 10 figures

arXiv:2306.17634 [pdf, other]

Enhancing Feature Extraction for Indoor Fingerprint Localization Using Diversified Data

Authors: Jiyu Jiao, Xiaojun Wang, Chenlin He

Abstract: Given the rapid advancements in wireless communication and terminal devices, high-speed and convenient WiFi has permeated various aspects of people's lives, and attention has been drawn to the location services that WiFi can provide. Fingerprint-based methods, as an excellent approach for localization, have gradually become a hot research topic. However, in practical localization, fingerprint features of traditional methods suffer from low reliability and a lack of robustness in complex indoor environments. To overcome these limitations, this paper proposes an innovative feature extraction-enhanced intelligent localization scheme named Secci, based on diversified channel state information (CSI). By modifying the device driver, diversified CSI data are extracted and transformed into RGB CSI images, which serve as input to a deep convolutional neural network (DCNN) with SE attention mechanism-assisted training in the offline stage. Employing a greedy probabilistic approach, rapid prediction of the estimated location is performed in the online stage using test RGB CSI images. The Secci system is implemented using off-the-shelf WiFi devices, and comprehensive experiments are carried out in two representative indoor environments to showcase the superior performance of Secci compared to four existing algorithms.

Submitted 30 June, 2023; originally announced June 2023.

arXiv:2306.00812 [pdf, other]

Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

Authors: Xiaohuai Le, Tong Lei, Li Chen, Yiqing Guo, Chao He, Cheng Chen, Xianjun Xia, Hua Gao, Yijian Xiao, Piao Ding, Shenyi Song, **g Lu

Abstract: With fewer feature dimensions, filter banks are often used in light-weight full-band speech enhancement models. In order to further enhance the coarse speech in the sub-band domain, it is necessary to apply post-filtering for harmonic retrieval. The signal processing-based comb filters used in RNNoise and PercepNet have limited performance and may cause speech quality degradation due to inaccurate fundamental frequency estimation. To tackle this problem, we propose a learnable comb filter to enhance harmonics. Based on the sub-band model, we design a DNN-based fundamental frequency estimator to estimate the discrete fundamental frequencies and a comb filter for harmonic enhancement, which are trained via an end-to-end pattern. The experiments show the advantages of our proposed method over PercepNet and DeepFilterNet.

Submitted 1 June, 2023; originally announced June 2023.

Comments: accepted by Interspeech 2023

arXiv:2305.13839 [pdf, other]

SAR-to-Optical Image Translation via Thermodynamics-inspired Network

Authors: Ming** Zhang, Jiamin Xu, Chengyu He, Wenteng Shang, Yunsong Li, Xinbo Gao

Abstract: Synthetic aperture radar (SAR) is prevalent in the remote sensing field but is difficult to interpret in human visual perception. Recently, SAR-to-optical (S2O) image conversion methods have provided a prospective solution for interpretation. However, since there is a huge domain difference between optical and SAR images, they suffer from low image quality and geometric distortion in the produced optical images. Motivated by the analogy between pixels during the S2O image translation and molecules in a heat field, Thermodynamics-inspired Network for SAR-to-Optical Image Translation (S2O-TDN) is proposed in this paper. Specifically, we design a Third-order Finite Difference (TFD) residual structure in light of the TFD equation of thermodynamics, which allows us to efficiently extract inter-domain invariant features and facilitate the learning of the nonlinear translation mapping. In addition, we exploit the first law of thermodynamics (FLT) to devise an FLT-guided branch that promotes the state transition of the feature values from the unstable diffusion state to the stable one, aiming to regularize the feature diffusion and preserve image structures during S2O image translation. S2O-TDN follows an explicit design principle derived from thermodynamic theory and enjoys the advantage of explainability. Experiments on the public SEN1-2 dataset show the advantages of the proposed S2O-TDN over the current methods with more delicate textures and higher quantitative results.

Submitted 23 May, 2023; originally announced May 2023.

arXiv:2305.11049 [pdf, other]

NODE-ImgNet: a PDE-informed effective and robust model for image denoising

Authors: Xinheng Xie, Yue Wu, Hao Ni, Cuiyu He

Abstract: Inspired by the traditional partial differential equation (PDE) approach for image denoising, we propose a novel neural network architecture, referred to as NODE-ImgNet, that combines neural ordinary differential equations (NODEs) with convolutional neural network (CNN) blocks. NODE-ImgNet is intrinsically a PDE model, where the dynamic system is learned implicitly without the explicit specification of the PDE. This naturally circumvents the typical issues associated with introducing artifacts during the learning process. By invoking such a NODE structure, which can also be viewed as a continuous variant of a residual network (ResNet) and inherits its advantage in image denoising, our model achieves enhanced accuracy and parameter efficiency. In particular, our model exhibits consistent effectiveness in different scenarios, including denoising gray and color images perturbed by Gaussian noise, as well as real-noisy images, and demonstrates superiority in learning from small image datasets.

Submitted 6 November, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

arXiv:2305.08099 [pdf, other]

Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations

Authors: Weiwei Lin, Chenhang He, Man-Wai Mak, Youzhi Tu

Abstract: Self-supervised learning (SSL) speech models such as wav2vec and HuBERT have demonstrated state-of-the-art performance on automatic speech recognition (ASR) and proved to be extremely useful in low label-resource settings. However, the success of SSL models has yet to transfer to utterance-level tasks such as speaker, emotion, and language recognition, which still require supervised fine-tuning of the SSL models to obtain good performance. We argue that the problem is caused by the lack of disentangled representations and an utterance-level learning objective for these tasks. Inspired by how HuBERT uses clustering to discover hidden acoustic units, we formulate a factor analysis (FA) model that uses the discovered hidden acoustic units to align the SSL features. The underlying utterance-level representations are disentangled from the content of speech using probabilistic inference on the aligned features. Furthermore, the variational lower bound derived from the FA model provides an utterance-level objective, allowing error gradients to be backpropagated to the Transformer layers to learn highly discriminative acoustic units. When used in conjunction with HuBERT's masked prediction training, our models outperform the current best model, WavLM, on all utterance-level non-semantic tasks on the SUPERB benchmark with only 20% of labeled data.

Submitted 4 October, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

Comments: accepted by ICML 2023

arXiv:2302.09953 [pdf, other]

Personalized speech enhancement combining band-split RNN and speaker attentive module

Authors: Xiaohuai Le, Li Chen, Chao He, Yiqing Guo, Cheng Chen, Xianjun Xia, Jing Lu

Abstract: Target speaker information can be utilized in speech enhancement (SE) models to more effectively extract the desired speech. Previous works introduce the speaker embedding into speech enhancement models by means of concatenation or affine transformation. In this paper, we propose a speaker attentive module to calculate the attention scores between the speaker embedding and the intermediate features, which are used to rescale the features. By merging this module in the state-of-the-art SE model, we construct the personalized SE model for ICASSP Signal Processing Grand Challenge: DNS Challenge 5 (2023). Our system achieves a final score of 0.529 on the blind test set of track1 and 0.549 on track2.

Submitted 16 March, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

arXiv:2212.03986 [pdf, other]

Experimental Validation of a Safe Controller Integration Scheme for Connected Automated Trucks

Authors: Anil Alan, Chaozhe R. He, Tamas G. Molnar, Johaan C. Mathew, A. Harvey Bell, Gabor Orosz

Abstract: Accomplishing safe and efficient driving is one of the predominant challenges in the controller design of connected automated vehicles (CAVs). It is often more convenient to address these goals separately and integrate the resulting controllers. In this study, we propose a controller integration scheme to fuse performance-based controllers and safety-oriented controllers safely for the longitudinal motion of a CAV. The resulting structure is compatible with a large class of controllers, and offers flexibility to design each controller individually without affecting the performance of the others. We implement the proposed safe integration scheme on a connected automated truck using an optimal-in-energy controller and a safety-oriented connected cruise controller. We validate the premise of the safe integration through experiments with a full-scale truck in two scenarios: a controlled experiment on a test track and a real-world experiment on a public highway. In both scenarios, we achieve energy efficient driving without violating safety.

Submitted 7 December, 2022; originally announced December 2022.

Comments: 14 pages, 11 figures

arXiv:2211.05309 [pdf]

Generic Cryo-CMOS Device Modeling and EDACompatible Platform for Reliable Cryogenic IC Design

Authors: Zhidong Tang, Zewei Wang, Yumeng Yuan, Chang He, Xin Luo, Ao Guo, Renhe Chen, Yongqi Hu, Longfei Yang, Chengwei Cao, Linlin Liu, Liujiang Yu, Ganbing Shang, Yongfeng Cao, Shoumian Chen, Yuhang Zhao, Shaojian Hu, Xufeng Kou

Abstract: This paper outlines the establishment of a generic cryogenic CMOS database in which key electrical parameters and transfer characteristics of the MOSFETs are quantified as functions of device size, temperature/frequency responses. Meanwhile, comprehensive device statistical study is conducted to evaluate the influence of variation and mismatch effects at low temperatures. Furthermore, by incorporating the Cryo-CMOS compact model into the process design kit (PDK), the cryogenic 4 Kb SRAM, 5-bit flash ADC and 8-bit current steering DAC are designed, and their performance is readily investigated and optimized on the EDA-compatible platform, hence laying a solid foundation for large-scale cryogenic IC design.

Submitted 9 February, 2024; v1 submitted 9 November, 2022; originally announced November 2022.

arXiv:2210.04397 [pdf, other]

Energy-efficient Reactive and Predictive Connected Cruise Control

Authors: Minghao Shen, R. Austin Dollar, Tamas G. Molnar, Chaozhe R. He, Ardalan Vahidi, Gabor Orosz

Abstract: In this paper, we propose a framework for the longitudinal control of connected and automated vehicles traveling in mixed traffic consisting of connected and non-connected human-driven vehicles. Reactive and predictive controllers are proposed. Reactive controllers are given by explicit feedback control laws. In predictive controllers, the control input is optimized in a receding-horizon fashion, which depends on the predictions of motions of preceding vehicles. Beyond-line-of-sight information is obtained via vehicle-to-vehicle (V2V) communication, and is utilized in the proposed reactive and predictive controllers. Simulations utilizing real traffic data are used to show that connectivity can bring significant energy savings.

Submitted 9 October, 2022; originally announced October 2022.

Comments: 18 pages, 12 figures, submitted to Transportation Research Part C: Emerging Technologies

arXiv:2206.13801 [pdf, other]

Joint Precoding for Active Intelligent Transmitting Surface Empowered Outdoor-to-Indoor Communication in mmWave Cellular Networks

Authors: Xie Xie, Chen He, Feifei Gao, Zhu Han, Z. Jane Wang

Abstract: Outdoor-to-indoor communications in millimeter-wave (mmWave) cellular networks have been one challenging research problem due to the severe attenuation and the high penetration loss caused by the propagation characteristics of mmWave signals. We propose a viable solution to implement the outdoor-to-indoor mmWave communication system with the aid of an active intelligent transmitting surface (active-ITS), where the active-ITS allows the incoming signal from an outdoor base station (BS) to pass through the surface and be received by the indoor user-equipments (UEs) after shifting its phase and magnifying its amplitude. Then, the problem of joint precoding of the BS and active-ITS is investigated to maximize the weighted sum-rate (WSR) of the communication system. An efficient block coordinate descent (BCD) based algorithm is developed to solve it with the suboptimal solutions in nearly closed-forms. In addition, to reduce the size and hardware cost of an active-ITS, we provide a block-amplifying architecture to partially remove the circuit components for power-amplifying, where multiple transmissive-type elements (TEs) in each block share a same power amplifier. Simulations indicate that active-ITS has the potential of achieving a given performance with much fewer TEs compared to the passive-ITS under the same total system power consumption, which makes it suitable for application to the size-limited and aesthetic-needed scenario, and the inevitable performance degradation caused by the block-amplifying architecture is acceptable.

Submitted 28 June, 2022; originally announced June 2022.

Comments: 30 pages, 8 figures

arXiv:2206.03568 [pdf, other]

Control Barrier Functions and Input-to-State Safety with Application to Automated Vehicles

Authors: Anil Alan, Andrew J. Taylor, Chaozhe R. He, Aaron D. Ames, Gabor Orosz

Abstract: Balancing safety and performance is one of the predominant challenges in modern control system design. Moreover, it is crucial to robustly ensure safety without inducing unnecessary conservativeness that degrades performance. In this work we present a constructive approach for safety-critical control synthesis via Control Barrier Functions (CBF). By filtering a hand-designed controller via a CBF, we are able to attain performant behavior while providing rigorous guarantees of safety. In the face of disturbances, robust safety and performance are simultaneously achieved through the notion of Input-to-State Safety (ISSf). We take a tutorial approach by developing the CBF-design methodology in parallel with an inverted pendulum example, making the challenges and sensitivities in the design process concrete. To establish the capability of the proposed approach, we consider the practical setting of safety-critical design via CBFs for a connected automated vehicle (CAV) in the form of a class-8 truck without a trailer. Through experimentation we see the impact of unmodeled disturbances in the truck's actuation system on the safety guarantees provided by CBFs. We characterize these disturbances and using ISSf, produce a robust controller that achieves safety without conceding performance. We evaluate our design both in simulation, and for the first time on an automotive system, experimentally.

Submitted 7 June, 2022; originally announced June 2022.

Comments: 16 pages, 16 figures

arXiv:2205.03473 [pdf, other]

Energy-efficient Connected Cruise Control with Lean Penetration of Connected Vehicles

Authors: Minghao Shen, Chaozhe R. He, Tamas Molnar, A. Harvey Bell, Gabor Orosz

Abstract: This paper focuses on energy-efficient longitudinal controller design for a connected automated truck that travels in mixed traffic consisting of connected and non-connected vehicles. The truck has access to information about connected vehicles beyond line of sight using vehicle-to-vehicle (V2V) communication. A novel connected cruise control design is proposed which incorporates additional delays into the control law when responding to distant connected vehicles to account for the finite propagation of traffic waves. The speeds of non-connected vehicles are modeled as stochastic processes. A fundamental theorem is proven which links the spectral properties of the motion signals to the average energy consumption. This enables us to tune controller parameters and maximize energy efficiency. Simulations with synthetic data and real traffic data are used to demonstrate the energy efficiency of the control design. It is demonstrated that even with lean penetration of connected vehicles, our controller can bring significant energy savings.

Submitted 6 May, 2022; originally announced May 2022.

Comments: This is submitted to IEEE Transactions on Intelligent Transportation Systems

arXiv:2204.03947 [pdf, other]

Lensless coherent diffraction imaging based on spatial light modulator with unknown modulation curve

Authors: Hao Sha, Chao He, Shaowei Jiang, Pengming Song, Shuai Liu, Wenzhen Zou, Peiwu Qin, Haoqian Wang, Yongbing Zhang

Abstract: Lensless imaging has become a popular research field in recent years owing to its small size, wide field of view and low aberration. However, some traditional lensless imaging methods suffer from slow convergence, mechanical errors and conjugate solution interference, which limit their further application and development. In this work, we propose a lensless imaging method based on a spatial light modulator (SLM) with an unknown modulation curve. In our imaging system, we use the SLM to modulate the wavefront of the object, and introduce a ptychographic scanning algorithm that is able to recover the complex amplitude information even when the SLM modulation curve is inaccurate or unknown. In addition, we design a split-beam interference experiment to calibrate the modulation curve of the SLM; using the calibrated modulation function as the initial value of the extended ptychographical iterative engine (ePIE) algorithm improves the convergence speed. We further analyze the effect of the modulation function, algorithm parameters and the characteristics of the coherent light source on the quality of the reconstructed image. Simulated and real experiments show that the proposed method is superior to traditional mechanical scanning methods in terms of recovery speed and accuracy, with a recovered resolution of up to 14 μm.

Submitted 8 April, 2022; originally announced April 2022.

arXiv:2203.16025 [pdf]

Multiple Narrow-band signals Direction Finding with TMLA by Nonuniform Period Modulation

Authors: Kebin Liu, Lening Zhang, Qingkui Zhan, Chong He

Abstract: A new array signal reconstruction and signal-channel DOA estimation method based on TMLA by nonuniform period modulation is proposed. By using non-uniform period modulation, the harmonic components produced by different elements can be separated. Therefore, the conventional snapshot can be reconstructed by analyzing the spectrum of the combined signal. A spatial spectrum estimation method is then used to implement DOA estimation. Numerical simulations are provided to verify the feasibility and accuracy of the proposed method. Since the duration of the signal processed in a single pass of the frequency-domain analysis is very short, this method is also applicable to narrowband signals. Another highlight is that this method can simultaneously measure the angles of a number of incident signals equal to the number of elements minus one.

Submitted 29 March, 2022; originally announced March 2022.

arXiv:2203.14532 [pdf, ps, other]

Joint Active and Passive Beamforming Design for IRS-Aided Radar-Communication

Authors: Meng Hua, Qingqing Wu, Chong He, Shaodan Ma, Wen Chen

Abstract: In this paper, we study an intelligent reflecting surface (IRS)-aided radar-communication (Radcom) system, where the IRS is leveraged to help Radcom base station (BS) transmit the joint of communication signals and radar signals for serving communication users and tracking targets simultaneously. The objective of this paper is to minimize the total transmit power at the Radcom BS by jointly optimizing the active beamformers, including communication beamformers and radar beamformers, at the Radcom BS and the phase shifts at the IRS, subject to the minimum signal-to-interference-plus-noise ratio (SINR) required by communication users, the minimum SINR required by the radar, and the cross-correlation pattern design. In particular, we consider two cases, namely, case I and case II, based on the presence or absence of the radar cross-correlation design and the interference introduced by the IRS on the Radcom BS. For case I where the cross correlation design and the interference are not considered, we prove that the dedicated radar signals are not needed, which significantly reduces implementation complexity and simplifies algorithm design. Then, a penalty-based algorithm is proposed to solve the resulting non-convex optimization problem. Whereas for case II considering the cross-correlation design and the interference, we unveil that the dedicated radar signals are needed in general to enhance the system performance. Since the resulting optimization problem is more challenging to solve as compared with the case I, the semidefinite relaxation (SDR) based alternating optimization (AO) algorithm is proposed. Simulation results demonstrate the effectiveness of proposed algorithms and also show the superiority of the proposed scheme over various benchmark schemes.

Submitted 28 March, 2022; originally announced March 2022.

Comments: This paper answers the fundamental question: whether the dedicated radar signals are needed under two considered scenarios. The manuscript has been submitted to IEEE journal for possible publication

arXiv:2201.12565 [pdf, other]

Active IRS Aided Multiple Access for Energy-Constrained IoT Systems

Authors: Guangji Chen, Qingqing Wu, Chong He, Wen Chen, Jie Tang, Shi Jin

Abstract: We investigate the fundamental multiple access (MA) scheme in an active intelligent reflecting surface (IRS) aided energy-constrained Internet-of-Things (IoT) system, where an active IRS is deployed to assist the uplink transmission from multiple IoT devices to an access point (AP). Our goal is to maximize the sum throughput by optimizing the IRS beamforming vectors across time and resource allocation. To this end, we first study two typical active IRS aided MA schemes, namely time division multiple access (TDMA) and non-orthogonal multiple access (NOMA), by analytically comparing their achievable sum throughput and proposing corresponding algorithms. Interestingly, we prove that given only one available IRS beamforming vector, the NOMA-based scheme generally achieves a larger throughput than the TDMA-based scheme, whereas the latter can potentially outperform the former if multiple IRS beamforming vectors are available to harness the favorable time selectivity of the IRS. To strike a flexible balance between the system performance and the associated signaling overhead incurred by more IRS beamforming vectors, we then propose a general hybrid TDMA-NOMA scheme with user grouping, where the devices in the same group transmit simultaneously via NOMA while devices in different groups occupy orthogonal time slots. By controlling the number of groups, the hybrid TDMA-NOMA scheme is applicable for any given number of IRS beamforming vectors available. Despite the non-convexity of the considered optimization problem, we propose an efficient algorithm based on alternating optimization. Simulation results illustrate the practical superiorities of the active IRS over the passive IRS in terms of the coverage extension and supporting multiple energy-limited devices, and demonstrate the effectiveness of our proposed hybrid MA scheme for flexibly balancing the performance-cost tradeoff.

Submitted 29 January, 2022; originally announced January 2022.

arXiv:2201.09685 [pdf, other]

Robust Joint Design for Intelligent Reflecting Surfaces Assisted Cell-Free Networks

Authors: Xie Xie, Chen He, Xiaoya Li, Zhu Han, Z. Jane Wang

Abstract: Intelligent reflecting surfaces (IRSs) have emerged as a promising economical solution to implement cell-free networks. However, the performance gains achieved by IRSs critically depend on smartly tuned passive beamforming based on the assumption that the accurate channel state information (CSI) knowledge is available, which is practically impossible. Thus, in this paper, we investigate the impact of the CSI uncertainty on IRS-assisted cell-free networks. We adopt a stochastic programming method to cope with the CSI uncertainty by maximizing the expectation of the sum-rate, which guarantees robust performance over the average. Accordingly, an average sum-rate maximization problem is formulated, which is non-convex and whose optimal solution is arduous to obtain due to the coupled variables and the expectation operation with respect to CSI uncertainties. As a compromise, we develop an efficient robust joint design algorithm with low complexity. Particularly, the original problem is equivalently transformed into a tractable form, and then, the locally optimal solution can be obtained by employing the block coordinate descent method. We further prove that the CSI uncertainty impacts the design of the active transmitting beamforming of APs, but surprisingly does not directly impact the design of the passive reflecting beamforming of IRSs. It is worth noting that the investigated scenario is flexible and general, and thus the proposed algorithm can act as a general framework to solve various sum-rate maximization problems. Simulation results demonstrate that IRSs can achieve considerable data rate improvement for conventional cell-free networks, and confirm the resilience of the proposed algorithm against the CSI uncertainty.

Submitted 20 February, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

Comments: 30 pages

arXiv:2201.02309 [pdf, other]

A three-dimensional dual-domain deep network for high-pitch and sparse helical CT reconstruction

Authors: Wei Wang, Xiang-Gen Xia, Chuanjiang He, Zemin Ren, Jian Lu

Abstract: In this paper, we propose a new GPU implementation of the Katsevich algorithm for helical CT reconstruction. Our implementation divides the sinograms and reconstructs the CT images pitch by pitch. By utilizing the periodic properties of the parameters of the Katsevich algorithm, our method only needs to calculate these parameters once for all the pitches and so has lower GPU-memory burdens and is very suitable for deep learning. By embedding our implementation into the network, we propose an end-to-end deep network for the high pitch helical CT reconstruction with sparse detectors. Since our network utilizes the features extracted from both sinograms and CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of sinograms and preserve fine details in the CT images. Experiments show that our network outperforms the related methods both in subjective and objective evaluations.

Submitted 6 January, 2022; originally announced January 2022.

Comments: 13 pages, 5 figures

arXiv:2201.00927 [pdf]

Classifying Autism from Crowdsourced Semi-Structured Speech Recordings: A Machine Learning Approach

Authors: Nathan A. Chi, Peter Washington, Aaron Kline, Arman Husic, Cathy Hou, Chloe He, Kaitlyn Dunlap, Dennis Wall

Abstract: Autism spectrum disorder (ASD) is a neurodevelopmental disorder which results in altered behavior, social development, and communication patterns. In past years, autism prevalence has tripled, with 1 in 54 children now affected. Given that traditional diagnosis is a lengthy, labor-intensive process, significant attention has been given to developing systems that automatically screen for autism. Prosody abnormalities are among the clearest signs of autism, with affected children displaying speech idiosyncrasies including echolalia, monotonous intonation, atypical pitch, and irregular linguistic stress patterns. In this work, we present a suite of machine learning approaches to detect autism in self-recorded speech audio captured from autistic and neurotypical (NT) children in home environments. We consider three methods to detect autism in child speech: first, Random Forests trained on extracted audio features (including Mel-frequency cepstral coefficients); second, convolutional neural networks (CNNs) trained on spectrograms; and third, fine-tuned wav2vec 2.0--a state-of-the-art Transformer-based ASR model. We train our classifiers on our novel dataset of cellphone-recorded child speech audio curated from Stanford's Guess What? mobile game, an app designed to crowdsource videos of autistic and neurotypical children in a natural home environment. The Random Forest classifier achieves 70% accuracy, the fine-tuned wav2vec 2.0 model achieves 77% accuracy, and the CNN achieves 79% accuracy when classifying children's audio as either ASD or NT. Our models were able to predict autism status when training on a varied selection of home audio clips with inconsistent recording quality, which may be more generalizable to real world conditions. These results demonstrate that machine learning methods offer promise in detecting autism automatically from speech without specialized equipment.

Submitted 3 January, 2022; originally announced January 2022.

Comments: 17 pages, 4 figures, submitted to JMIR Pediatrics and Parenting

arXiv:2112.11490 [pdf, other]

Do Androids Dream of Electric Fences? Safety-Aware Reinforcement Learning with Latent Shielding

Authors: Chloe He, Borja G. Leon, Francesco Belardinelli

Abstract: The growing trend of fledgling reinforcement learning systems making their way into real-world applications has been accompanied by growing concerns for their safety and robustness. In recent years, a variety of approaches have been put forward to address the challenges of safety-aware reinforcement learning; however, these methods often either require a handcrafted model of the environment to be provided beforehand, or that the environment is relatively simple and low-dimensional. We present a novel approach to safety-aware deep reinforcement learning in high-dimensional environments called latent shielding. Latent shielding leverages internal representations of the environment learnt by model-based agents to "imagine" future trajectories and avoid those deemed unsafe. We experimentally demonstrate that this approach leads to improved adherence to formally-defined safety specifications.

Submitted 21 December, 2021; originally announced December 2021.

Comments: Accepted at SafeAI 2022

arXiv:2111.15380 [pdf, other]

Transient Stability of Low-Inertia Power Systems with Inverter-Based Generation

Authors: Changjun He, Xiuqiang He, Hua Geng, Huadong Sun, Shiyun Xu

Abstract: This study examines the transient stability of low-inertia power systems with inverter-based generation (IBG) and proposes a sufficient stability criterion. In low-inertia grids, transient interactions are induced between the electromagnetic dynamics of the IBG and the electromechanical dynamics of the synchronous generator (SG) under a fault. For this, a hybrid IBG-SG system is established and a delta-power-frequency model is developed. Based on this model, new mechanisms of transient instability different from those of conventional power systems from the energy perspective are discovered. First, two loss-of-synchronization (LOS) types are identified based on the relative power imbalance owing to the mismatch between the inertia of the IBG and SG under a fault. Second, the relative angle and frequency will jump at the moment of a fault, thus affecting the system energy. Third, the cosine damping coefficient induces a positive energy dissipation, thereby contributing to the system stability. A unified criterion for identifying the two LOS types is proposed using the energy function method. This criterion is proved to be a sufficient stability condition for addressing the effects of the jumps and cosine damping coefficient on the system stability. The new mechanisms and effectiveness of the criterion are verified based on simulation results.

Submitted 21 April, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

arXiv:2110.12610 [pdf, other]

Antenna Array Enabled Space/Air/Ground Communications and Networking for 6G

Authors: Zhenyu Xiao, Zhu Han, Arumugam Nallanathan, Octavia A. Dobre, Bruno Clerckx, Jinho Choi, Chong He, Wen Tong

Abstract: Antenna arrays have a long history of more than 100 years and have evolved closely with the development of electronic and information technologies, playing an indispensable role in wireless communications and radar. With the rapid development of electronic and information technologies, the demand for all-time, all-domain, and full-space network services has exploded, and new communication requirements have been put forward on various space/air/ground platforms. To meet the ever-increasing requirements of the future sixth generation (6G) wireless communications, such as high capacity, wide coverage, low latency, and strong robustness, it is promising to employ different types of antenna arrays with various beamforming technologies in space/air/ground communication networks, bringing in advantages such as considerable antenna gains, multiplexing gains, and diversity gains. However, enabling antenna arrays for space/air/ground communication networks poses specific, distinctive and tricky challenges, which have attracted extensive research attention. This paper aims to overview the field of antenna array enabled space/air/ground communications and networking. The technical potentials and challenges of antenna array enabled space/air/ground communications and networking are presented first. Subsequently, the antenna array structures and designs are discussed. We then discuss various emerging technologies facilitated by antenna arrays to meet the new communication requirements of space/air/ground communication systems. Enabled by these emerging technologies, the distinct characteristics, challenges, and solutions for space communications, airborne communications, and ground communications are reviewed. Finally, we present promising directions for future research in antenna array enabled space/air/ground communications and networking.

Submitted 26 March, 2022; v1 submitted 24 October, 2021; originally announced October 2021.

arXiv:2109.05462 [pdf, other]

Multi-Antenna Systems by Transmissive Reconfigurable Meta-Surface

Authors: Zhendong Li, Wen Chen, Chong He, Xudong Bai, Jianmin Lu

Abstract: Reconfigurable meta-surface (RMS) is proposed as a very promising and novel technology, which is composed of a large number of low-cost passive elements, and can achieve passive beamforming by controlling the amplitude and phase of incident electromagnetic (EM) waves. Therefore, in order to solve the challenges of high power consumption and high cost of existing base stations (BSs), we propose a low-cost and low-power consumption transmissive RMS multi-antenna system in this paper. Specifically, we first provide an overview of the transmissive RMS multi-antenna system, including its advantages, network architecture, transmission mechanism, modulation principle, channel model and channel estimation technique. Then, we address transceiver design and optimization for downlink (DL) and uplink (UL), and some numerical results are also given to verify the effectiveness of the proposed algorithm. Finally, several potential research directions of the transmissive RMS multi-antenna system are given to inspire further investigation in future work.

Submitted 20 February, 2022; v1 submitted 12 September, 2021; originally announced September 2021.

arXiv:2106.07976 [pdf, other]

doi 10.1145/3485730.3493444

Federated Learning for Internet of Things: A Federated Learning Framework for On-device Anomaly Data Detection

Authors: Tuo Zhang, Chaoyang He, Tianhao Ma, Lei Gao, Mark Ma, Salman Avestimehr

Abstract: Federated learning can be a promising solution for enabling IoT cybersecurity (i.e., anomaly detection in the IoT environment) while preserving data privacy and mitigating the high communication/storage overhead (e.g., high-frequency data from time-series sensors) of centralized over-the-cloud approaches. In this paper, to further push forward this direction with a comprehensive study in both algorithm and system design, we build the FedIoT platform, which contains the FedDetect algorithm for on-device anomaly data detection and a system design for realistic evaluation of federated learning on IoT devices. Furthermore, the proposed FedDetect learning framework improves the performance by utilizing a local adaptive optimizer (e.g., Adam) and a cross-round learning rate scheduler. In a network of realistic IoT devices (Raspberry Pi), we evaluate the FedIoT platform and the FedDetect algorithm in both model and system performance. Our results demonstrate the efficacy of federated learning in detecting a wider range of attack types occurring at multiple devices. The system efficiency analysis indicates that both end-to-end training time and memory cost are affordable and promising for resource-constrained IoT devices. The source code is publicly available at https://github.com/FedML-AI/FedIoT.

Submitted 18 October, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Journal ref: Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems, November 2021, Pages 413-419

arXiv:2105.14545 [pdf, other]

A Joint Power Splitting, Active and Passive Beamforming Optimization Framework for IRS Assisted MIMO SWIPT System

Authors: Chen He, Xie Xie, Kun Yang, Z. Jane Wang

Abstract: This paper considers an intelligent reflecting surface (IRS) assisted multi-input multi-output (MIMO) power splitting (PS) based simultaneous wireless information and power transfer (SWIPT) system with multiple PS receivers (PSRs). The objective is to maximize the achievable data rate of the system by jointly optimizing the PS ratios at the PSRs, the active transmit beamforming (ATB) at the access point (AP), and the passive reflective beamforming (PRB) at the IRS, while the constraints on maximum transmission power at the AP, the reflective phase shift of each element at the IRS, the individual minimum harvested energy requirement of each PSR, and the domain of PS ratio of each PSR are all satisfied. For this unsolved problem, however, since the optimization variables are intricately coupled and the constraints are conflicting, the formulated problem is non-convex, and cannot be addressed by employing existing approaches directly. To this end, we propose a joint optimization framework to solve this problem. Particularly, we reformulate it as an equivalent form by employing the Lagrangian dual transform and the fractional programming transform, and decompose the transformed problem into several sub-problems. Then, we propose an alternate optimization algorithm by capitalizing on the dual sub-gradient method, the successive convex approximation method, and the penalty-based majorization-minimization approach, to solve the sub-problems iteratively, and obtain the optimal solutions in nearly closed forms. Numerical simulation results verify the effectiveness of the IRS in the SWIPT system and indicate that the proposed algorithm offers a substantial performance gain.

Submitted 30 May, 2021; originally announced May 2021.

Comments: 13 pages, 7 figures

arXiv:2105.03939 [pdf, other]

Differentiable Neural Architecture Search for Extremely Lightweight Image Super-Resolution

Authors: Han Huang, Li Shen, Chaoyang He, Weisheng Dong, Wei Liu

Abstract: Single Image Super-Resolution (SISR) tasks have achieved significant performance with deep neural networks. However, the large number of parameters in CNN-based methods for SISR tasks requires heavy computations. Although several efficient SISR models have been recently proposed, most are handcrafted and thus lack flexibility. In this work, we propose a novel differentiable Neural Architecture Search (NAS) approach on both the cell-level and network-level to search for lightweight SISR models. Specifically, the cell-level search space is designed based on an information distillation mechanism, focusing on the combinations of lightweight operations and aiming to build a more lightweight and accurate SR structure. The network-level search space is designed to consider the feature connections among the cells and aims to find which information flow benefits the cell most to boost the performance. Unlike the existing Reinforcement Learning (RL) or Evolutionary Algorithm (EA) based NAS methods for SISR tasks, our search pipeline is fully differentiable, and the lightweight SISR models can be efficiently searched on both the cell-level and network-level jointly on a single GPU. Experiments show that our methods can achieve state-of-the-art performance on the benchmark datasets in terms of PSNR, SSIM, and model complexity with merely 68G Multi-Adds for $\times 2$ and 18G Multi-Adds for $\times 4$ SR tasks.

Submitted 19 December, 2022; v1 submitted 9 May, 2021; originally announced May 2021.

Comments: Accepted to IEEE Transactions on Circuits and Systems for Video Technology

arXiv:2103.08041 [pdf, ps, other]

Safe Controller Synthesis with Tunable Input-to-State Safe Control Barrier Functions

Authors: Anil Alan, Andrew J. Taylor, Chaozhe R. He, Gábor Orosz, Aaron D. Ames

Abstract: To bring complex systems into real world environments in a safe manner, they will have to be robust to uncertainties - both in the environment and the system. This paper investigates the safety of control systems under input disturbances, wherein the disturbances can capture uncertainties in the system. Safety, framed as forward invariance of sets in the state space, is ensured with the framework of control barrier functions (CBFs). Concretely, the definition of input to state safety (ISSf) is generalized to allow the synthesis of non-conservative, tunable controllers that are provably safe under varying disturbances. This is achieved by formulating the concept of tunable input to state safe control barrier functions (TISSf-CBFs) which guarantee safety for disturbances that vary with state and, therefore, provide less conservative means of accommodating uncertainty. The theoretical results are demonstrated with a simple control system with input disturbance and also applied to design a safe connected cruise controller for a heavy duty truck.

Submitted 4 June, 2021; v1 submitted 14 March, 2021; originally announced March 2021.

arXiv:2101.01886 [pdf, other]

A New Weighting Scheme for Fan-beam and Circle Cone-beam CT Reconstructions

Authors: Wei Wang, Xiang-Gen Xia, Chuanjiang He, Zemin Ren, Jian Lu, Tianfu Wang, Baiying Lei

Abstract: In this paper, we first present an arc based algorithm for fan-beam computed tomography (CT) reconstruction via applying Katsevich's helical CT formula to 2D fan-beam CT reconstruction. Then, we propose a new weighting function to deal with the redundant projection data. By extending the weighted arc based fan-beam algorithm to circle cone-beam geometry, we also obtain a new FDK-similar algorithm for circle cone-beam CT reconstruction. Experiments show that our methods can obtain higher PSNR and SSIM compared to the Parker-weighted conventional fan-beam algorithm and the FDK algorithm for super-short-scan trajectories.

Submitted 6 January, 2021; originally announced January 2021.

arXiv:2011.10316 [pdf]

doi 10.1109/TPWRS.2021.3098393

Synchronization Instability of Inverter-Based Generation During Asymmetrical Grid Faults

Authors: Xiuqiang He, Changjun He, Sisi Pan, Hua Geng, Feng Liu

Abstract: The transient stability of traditional power systems is concerned with the ability of generators to stay synchronized with the positive-sequence voltage of the network, whether for symmetrical or asymmetrical faults. In contrast, both positive- and negative-sequence synchronizations should be of concern for inverter-based generation (IBG) under asymmetrical faults. This is because the latest grid codes stipulate that IBG should inject dual-sequence current when riding through asymmetrical faults. Currently, much less is known about the synchronization stability during asymmetrical faults. This significantly differs from the positive-sequence synchronization alone because the coupled dual-sequence synchronization is involved. This paper aims to fill this gap. Considering the sequence coupling under asymmetrical faults, the dual-sequence synchronization model of IBG is developed. Based on the model, the conditions that steady-state equilibrium points should follow are identified. The conditions throw light on the possible types of synchronization instability, including the positive-sequence dominated instability and the negative-sequence dominated one. For different types of instability, the dominant factors are analyzed quantitatively, which are reflected by the limit on the current injection amplitude. Exceeding the limit will lead to the loss of both positive- and negative-sequence synchronizations. The model and the analysis are verified by simulations and hardware-in-the-loop experiments.

Submitted 22 July, 2021; v1 submitted 20 November, 2020; originally announced November 2020.

arXiv:2010.06504 [pdf]

Single-Sideband Time-Modulated Phased Array With 2-bit Phased Shifters

Authors: Yanchang Gao, Gang Ni, Kun Wang, Yiqing Liu, Chong He, Ronghong Jin, Xianling Liang

Abstract: A novel single-sideband (SSB) time-modulated technique with 2-bit phase shifters is proposed. The time-modulated module is implemented by adding periodic phase modulation to 2-bit phase shifters, which is simpler than existing SSB time-modulated methods and incurs no performance loss. During one modulation period, four phase states (0, π/2, π, 3π/2) of 2-bit phase shifters are switched in sequence. After the modulation, the SSB time modulation is realized and the main power is distributed to the first harmonic component. The feasibility of the proposed method is verified by experiments. The undesired harmonics are efficiently suppressed. Meanwhile, an 80° beam scanning range is realized through the proposed module.

Submitted 6 October, 2020; originally announced October 2020.
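
The harmonic distribution described in the abstract above can be checked numerically. The following is a minimal sketch, assuming an idealized unit-amplitude modulation in which each of the four phase states (0, π/2, π, 3π/2) is held for exactly one quarter of the modulation period; the function names and the sample count are hypothetical and are not taken from the paper. Under these assumptions the Fourier-series coefficient of the first harmonic has magnitude 2√2/π (about 81% of the power), while the carrier and the mirror harmonic vanish. Which sideband is called "first" depends on the sign convention and the switching order.

```python
import cmath
import math

# Phase sequence from the abstract; equal quarter-period dwell time is an assumption.
PHASE_STATES = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]


def modulation_sample(n, samples_per_period):
    """Unit-amplitude modulation value at sample n (hypothetical helper)."""
    state = (4 * n) // samples_per_period  # index of the active phase state, 0..3
    return cmath.exp(1j * PHASE_STATES[state])


def harmonic_coefficient(m, samples_per_period=1024):
    """Discrete estimate of the m-th Fourier-series coefficient of the modulation.

    samples_per_period should be a multiple of 4 so each state gets equal time.
    """
    total = 0j
    for n in range(samples_per_period):
        total += modulation_sample(n, samples_per_period) * cmath.exp(
            -2j * math.pi * m * n / samples_per_period
        )
    return total / samples_per_period


if __name__ == "__main__":
    # Print the fraction of power in harmonics -3..3 of the idealized waveform.
    for m in range(-3, 4):
        print(m, round(abs(harmonic_coefficient(m)) ** 2, 4))
```

In this idealized model the only nonzero harmonics are those of order 1 + 4k (for example 1, -3, 5), which is consistent with the abstract's statement that the main power goes to the first harmonic. Real hardware with finite switching time and amplitude imbalance would deviate from these figures.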

arXiv:2008.03988 [pdf, other]

A model-guided deep network for limited-angle computed tomography

Authors: Wei Wang, Xiang-Gen Xia, Chuanjiang He, Zemin Ren, Jian Lu, Tianfu Wang, Baiying Lei

Abstract: In this paper, we first propose a variational model for the limited-angle computed tomography (CT) image reconstruction and then convert the model into an end-to-end deep network. We use the penalty method to solve the model and divide it into three iterative subproblems, where the first subproblem completes the sinograms by utilizing the prior information of sinograms in the frequency domain, the second refines the CT images by using the prior information of CT images in the spatial domain, and the last merges the outputs of the first two subproblems. In each iteration, we use convolutional neural networks (CNNs) to approximate the solutions of the first two subproblems and, thus, obtain an end-to-end deep network for the limited-angle CT image reconstruction. Our network tackles both the sinograms and the CT images, and can simultaneously suppress the artifacts caused by the incomplete data and recover fine structural information in the CT images. Experimental results show that our method outperforms the existing algorithms for the limited-angle CT image reconstruction.

Submitted 10 August, 2020; originally announced August 2020.

arXiv:2002.10053 [pdf, other]

doi 10.1103/PhysRevApplied.14.034006

Effective statistical fringe removal algorithm for high-sensitivity imaging of ultracold atoms

Authors: Bo Song, Chengdong He, Zejian Ren, Entong Zhao, Jeongwon Lee, Gyu-Boong Jo

Abstract: High-sensitivity imaging of ultracold atoms is often challenging when interference patterns are imprinted on the imaging light. Such image noises result in low signal-to-noise ratio and limit the capability to extract subtle physical quantities. Here we demonstrate an advanced fringe removal algorithm for absorption imaging of ultracold atoms, which efficiently suppresses unwanted fringe patterns using a small number of sample images without taking additional reference images. The protocol is based on an image decomposition and projection method with an extended image basis. We apply this scheme to raw absorption images of degenerate Fermi gases for the measurement of atomic density fluctuations and temperatures. The quantitative analysis shows that image noises can be efficiently removed with only tens of reference images, which manifests the efficiency of our protocol. Our algorithm would be of particular interest for the quantum emulation experiments in which several physical parameters need to be scanned within a limited time duration.

Submitted 23 February, 2020; originally announced February 2020.

Comments: 6 pages, 5 figures, supplementary materials

Journal ref: Phys. Rev. Applied 14, 034006 (2020)

arXiv:2001.09193 [pdf, other]

doi 10.1016/j.media.2021.102166

VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images

Authors: Anjany Sekuboyina, Malek E. Husseini, Amirhossein Bayat, Maximilian Löffler, Hans Liebl, Hongwei Li, Giles Tetteh, Jan Kukačka, Christian Payer, Darko Štern, Martin Urschler, Maodong Chen, Dalong Cheng, Nikolas Lessmann, Yu** Hu, Tianfu Wang, Dong Yang, Daguang Xu, Felix Ambellan, Tamaz Amiranashvili, Moritz Ehlke, Hans Lamecker, Sebastian Lehnert, Marilia Lirio, Nicolás Pérez de Olaguer , et al. (44 additional authors not shown)

Abstract: Vertebral labelling and segmentation are two fundamental tasks in an automated spine processing pipeline. Reliable and accurate processing of spine images is expected to benefit clinical decision-support systems for diagnosis, surgery planning, and population-based analysis on spine and bone health. However, designing automated algorithms for spine processing is challenging predominantly due to considerable variations in anatomy and acquisition protocols and due to a severe shortage of publicly available data. Addressing these limitations, the Large Scale Vertebrae Segmentation Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020, with a call for algorithms towards labelling and segmentation of vertebrae. Two datasets containing a total of 374 multi-detector CT scans from 355 patients were prepared and 4505 vertebrae have individually been annotated at voxel-level by a human-machine hybrid algorithm (https://osf.io/nqjyw/, https://osf.io/t98fz/). A total of 25 algorithms were benchmarked on these datasets. In this work, we present the results of this evaluation and further investigate the performance-variation at vertebra-level, scan-level, and at different fields-of-view. We also evaluate the generalisability of the approaches to an implicit domain shift in data by evaluating the top performing algorithms of one challenge iteration on data from the other iteration. The principal takeaway from VerSe: the performance of an algorithm in labelling and segmenting a spine scan hinges on its ability to correctly identify vertebrae in cases of rare anatomical variations. The content and code concerning VerSe can be accessed at: https://github.com/anjany/verse.

Submitted 5 April, 2022; v1 submitted 24 January, 2020; originally announced January 2020.

Comments: Challenge report for the VerSe 2019 and 2020. Published in Medical Image Analysis (DOI: https://doi.org/10.1016/j.media.2021.102166)

Journal ref: Medical Image Analysis, Volume 73, October 2021, 102166

arXiv:2001.07150 [pdf, other]

A deep network for sinogram and CT image reconstruction

Authors: Wei Wang, Xiang-Gen Xia, Chuanjiang He, Zemin Ren, Jian Lu, Tianfu Wang, Baiying Lei

Abstract: A CT image can be well reconstructed when the sampling rate of the sinogram satisfies the Nyquist criterion and the sampled signal is noise-free. However, in practice, the sinogram is usually contaminated by noise, which degrades the quality of a reconstructed CT image. In this paper, we design a deep network for sinogram and CT image reconstruction. The network consists of two cascaded blocks that are linked by a filtered backprojection (FBP) layer, where the former block is responsible for denoising and completing the sinograms while the latter is used to remove the noise and artifacts of the CT images. Experimental results show that the CT images reconstructed by our methods have the highest PSNR and SSIM on average compared to state-of-the-art methods.

Submitted 20 January, 2020; originally announced January 2020.

arXiv:1912.03661 [pdf]

Adaptive Trajectory Estimation with Power Limited Steering Model under Perturbation Compensation

Authors: Weipeng Li, Xiaogang Yang, Ruitao Lu, Jiwei Fan, Tao Zhang, Chuan He

Abstract: Trajectory estimation of maneuvering objects is applied in numerous tasks such as navigation, path planning and visual tracking. Many previous works achieve impressive results under strictly controlled conditions, with accurate prior statistics and a dedicated dynamic model for a certain object. But in challenging conditions without a dedicated dynamic model and precise prior statistics, the performance of these methods declines significantly. To solve the problem, a dynamic model called the power-limited steering model (PLS) is proposed to describe the motion of a non-cooperative object. It is a natural combination of instantaneous power and instantaneous angular velocity, which relies on nonlinearity instead of the state switching probability to achieve switching of states. The renormalization group is introduced to compensate for the nonlinear effect of perturbation in the PLS model. For robust and efficient trajectory estimation, an adaptive trajectory estimation (AdaTE) algorithm is proposed. By updating the statistics and truncation time online, it corrects the estimation error caused by biased prior statistics and observation drift, while reducing the computational complexity to below O(n). The trajectory estimation experiment demonstrates the convergence of AdaTE and its better robustness to biased prior statistics and observation drift compared with EKF, UKF and sparse MAP. Other experiments demonstrate that, with slight modification, AdaTE can also be applied to local navigation in random obstacle environments and to trajectory optimization in visual tracking.

Submitted 1 July, 2020; v1 submitted 8 December, 2019; originally announced December 2019.

Comments: 19 pages, 7 figures

ACM Class: G.3.13; J.2.7

arXiv:1906.10886 [pdf, other]

Joint Multi-frame Detection and Segmentation for Multi-cell Tracking

Authors: Zibin Zhou, Fei Wang, Wenjuan Xi, Huaying Chen, Peng Gao, Chengkang He

Abstract: Tracking living cells in video sequences is difficult because of cell morphology and high similarities between cells. Tracking-by-detection methods are widely used in multi-cell tracking. We perform multi-cell tracking based on cell centroid detection, and the performance of the detector has a high impact on tracking performance. In this paper, UNet is utilized to extract inter-frame and intra-frame spatio-temporal information of cells. Detection performance of cells in mitotic phase is improved by multi-frame input. Good detection results facilitate multi-cell tracking. A mitosis detection algorithm is proposed to detect cell mitosis, and the cell lineage is built up. Another UNet is utilized to acquire primary segmentation. By jointly using detection and primary segmentation, cells can be finely segmented in highly dense cell populations. Experiments are conducted to evaluate the effectiveness of our method, and results show its state-of-the-art performance.

Submitted 26 June, 2019; originally announced June 2019.

Comments: Accepted by International Conference on Image and Graphics (ICIG 2019)

arXiv:1810.11548

On the Identifiability of the Influence Model for Stochastic Spatiotemporal Spread Processes

Authors: Chenyuan He, Yan Wan, Frank L. Lewis

Abstract: The influence model is a discrete-time stochastic model that succinctly captures the interactions of a network of Markov chains. The model produces a reduced-order representation of the stochastic network, and can be used to describe and tractably analyze probabilistic spatiotemporal spread dynamics, and hence has found broad usage in network applications such as social networks, traffic management, and failure cascades in power systems. This paper provides sufficient and necessary conditions for the identifiability of the influence model, and also develops estimators for the model structure through exploiting the model's special properties. In addition, we analyze conditions for the identifiability of the partially observed influence model (POIM), for which not all of the sites can be measured.

Submitted 6 November, 2018; v1 submitted 26 October, 2018; originally announced October 2018.

Comments: This temporary draft version of this paper has caused conflict of interest and we request to withdraw this paper from arXiv

arXiv:1810.04840 [pdf, other]

A Comparison of CP-OFDM, PCC-OFDM and UFMC for 5G Uplink Communications

Authors: Gayathri Kongara, Lei Yang, Cuiwei He, Jean Armstrong

Abstract: Polynomial-cancellation-coded orthogonal frequency division multiplexing (PCC-OFDM) is a form of OFDM that has waveforms which are very well localized in both the time and frequency domains and so it is ideally suited for use in the 5G network. This paper analyzes the performance of PCC-OFDM in the uplink of a multiuser system using orthogonal frequency division multiple access (OFDMA) and compares it with conventional cyclic prefix OFDM (CP-OFDM), and universal filtered multicarrier (UFMC). PCC-OFDM is shown to be much less sensitive than either CP-OFDM or UFMC to time and frequency offsets. For a given constellation size, PCC-OFDM in additive white Gaussian noise (AWGN) requires 3dB lower signal-to-noise ratio (SNR) for a given bit-error-rate, and the SNR advantage of PCC-OFDM increases rapidly when there are timing and/or frequency offsets. For PCC-OFDM no frequency guard band is required between different OFDMA users. PCC-OFDM is completely compatible with CP-OFDM and adds negligible complexity and latency, as it uses a simple mapping of data onto pairs of subcarriers at the transmitter, and a simple weighting-and-adding of pairs of subcarriers at the receiver. The weighting and adding step, which has been omitted in some of the literature, is shown to contribute substantially to the SNR advantage of PCC-OFDM. A disadvantage of PCC-OFDM (without overlapping) is the potential reduction in spectral efficiency because subcarriers are modulated in pairs, but this reduction is more than regained because no guard band or cyclic prefix is required and because, for a given channel, larger constellations can be used.

Submitted 11 October, 2018; originally announced October 2018.

arXiv:1808.04627 [pdf]

A Novel Sliding Mode Control for a Class of Affine Dynamic Systems

Authors: Zuren Feng, Ruizhi Sha, Na Lu, Chenlong He

Abstract: This paper proposes a novel sliding mode control (SMC) method for a class of affine dynamic systems. In this type of system, the high-frequency gain matrix (HFGM), which is the matrix multiplying the control vector in the dynamic equation of the sliding variables vector, is neither deterministic nor positive definite. This case has rarely been covered by general SMC methods, which perform well under the condition that the HFGM is certain, or uncertain but positive definite. In this study, the control law is determined by solving a nonlinear vector equation instead of using the conventional algebraic expression, which is not applicable when the HFGM is uncertain and non-positive definite. Theorems with some relaxed system parametric uncertainty assumptions are proposed to guarantee the existence and uniqueness of the solution, and their proofs, based on the principle of the convex cone set, are given in the text. The proposed control strategy can be easily applied in practice, and the chattering caused by the discontinuous control can be suppressed, as it can in general SMCs. The proposed controller was used in two affine dynamic systems, and the simulation results demonstrate its effectiveness.

Submitted 14 August, 2018; originally announced August 2018.

Showing 1–50 of 53 results for author: He, C