-
Real-time Deformation Correction in Additively Printed Flexible Antenna Arrays
Authors:
Sreeni Poolakkal,
Abdullah Islam,
Shrestha Bansal,
Arpit Rao,
Ted Dabrowski,
Kalsi Kwan,
Amit Mishra,
Quiyan Xu,
Erfan Ghaderi,
Pradeep Lall,
Sudip Shekhar,
Julio Navarro,
Shenqiang Ren,
John Williams,
Subhanshu Gupta
Abstract:
Conformal phased arrays provide multiple degrees of freedom to the scan angle, which is typically limited by antenna aperture in rigid arrays. Silicon-based RF signal processing offers reliable, reconfigurable, multi-functional, and compact control for conformal phased arrays that can be used for on-the-move communication. While the light weight, compactness, and shape-changing properties of conformal phased arrays are attractive, these features result in dynamic deformation of the array during motion, leading to significant dynamic beam pointing errors. We propose a silicon-based, compact, reconfigurable solution to self-correct these dynamic deformation-induced beam pointing errors. Furthermore, additive printing is leveraged to enhance the flexibility of the conformal phased arrays, as the printed conductive ink is more flexible than bulk copper and can be easily deposited on flexible sheets using different printing tools, providing an environmentally friendly solution for large-scale production. Conventional silver inks are expensive, and copper-based printable inks suffer from spontaneous metal oxidation that alters trace impedance and degrades beamforming performance. This work uses a low-cost molecular copper decomposition ink with reliable RF properties at different temperatures and strains to print the proposed intelligent conformal phased array operating at 2.1 GHz. A proof-of-concept prototype $2\times2$ array self-corrects the deformation-induced beam pointing error to a residual error of $<1.25^\circ$. The silicon-based array processing part occupies only 2.58 mm$^2$ of area and consumes 83 mW of power per tile.
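To make the idea of a deformation-induced pointing error concrete, the following is a minimal sketch of textbook array steering, not the authors' silicon-based correction scheme. It assumes the deformed element positions are already known, uses a hypothetical 2-D geometry (`x` along the array, `z` out of plane), and adopts one common sign convention for the steering phase. The function names are illustrative only.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def steering_phases(positions, theta_deg, freq_hz=2.1e9):
    """Phase (radians) applied at each element so the beam points at theta_deg.

    positions: list of (x, z) element coordinates in metres; z is the
    out-of-plane displacement caused by bending (0 for a flat array).
    """
    k = 2.0 * math.pi * freq_hz / C  # free-space wavenumber
    theta = math.radians(theta_deg)
    ux, uz = math.sin(theta), math.cos(theta)  # unit vector of the beam direction
    return [-k * (x * ux + z * uz) for x, z in positions]


def correction(flat, bent, theta_deg, freq_hz=2.1e9):
    """Extra phase per element that compensates for the change in geometry."""
    p_flat = steering_phases(flat, theta_deg, freq_hz)
    p_bent = steering_phases(bent, theta_deg, freq_hz)
    return [b - f for f, b in zip(p_flat, p_bent)]
```

For example, `correction([(0.0, 0.0)], [(0.0, 0.01)], 0.0)` gives the extra phase needed at broadside when a single element is lifted 1 cm out of plane. If that phase is left uncorrected, the beam no longer points where the flat-array phases intended.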
Submitted 21 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
ATDM: An Anthropomorphic Aerial Tendon-driven Manipulator with Low-Inertia and High-Stiffness
Authors:
Quman Xu,
Zhan Li,
Hai Li,
Xinghu Yu,
Yipeng Yang
Abstract:
Aerial Manipulator Systems (AMS) have garnered significant interest for their utility in aerial operations. Nonetheless, challenges related to the manipulator's limited stiffness and the coupling disturbance with manipulator movement persist. This paper introduces the Aerial Tendon-Driven Manipulator (ATDM), an innovative AMS that integrates a hexrotor Unmanned Aerial Vehicle (UAV) with a 4-degree-of-freedom (4-DOF) anthropomorphic tendon-driven manipulator. The design of the manipulator is anatomically inspired, emulating the human arm anatomy from the shoulder joint downward. To enhance the structural integrity and performance, finite element topology optimization and lattice optimization are employed on the links to replicate the radially graded structure characteristic of bone; this approach effectively reduces weight and inertia while simultaneously maximizing stiffness. A novel tensioning mechanism with adjustable tension is introduced to address cable relaxation, and a tension-amplification tendon mechanism is implemented to increase the manipulator's overall stiffness and output. The paper presents a kinematic model based on virtual coupled joints, a comprehensive workspace analysis, and detailed calculations of output torques and stiffness for individual arm joints.
The prototype arm has a total weight of 2.7 kg, with the end effector contributing only 0.818 kg. By positioning all actuators at the base, coupling disturbances are minimized. The paper includes a detailed mechanical design and validates the system's performance through semi-physical multi-body dynamics simulations, confirming the efficacy of the proposed design.
Submitted 8 May, 2024;
originally announced May 2024.
-
On the Foundations of Earth and Climate Foundation Models
Authors:
Xiao Xiang Zhu,
Zhitong Xiong,
Yi Wang,
Adam J. Stewart,
Konrad Heidler,
Yuanyuan Wang,
Zhenghang Yuan,
Thomas Dujardin,
Qingsong Xu,
Yilei Shi
Abstract:
Foundation models have enormous potential in advancing Earth and climate sciences; however, current approaches may not be optimal as they focus on a few basic features of a desirable Earth and climate foundation model. Crafting the ideal Earth foundation model, we define eleven features which would allow such a foundation model to be beneficial for any geoscientific downstream application in an environmental- and human-centric manner. We further shed light on the way forward to achieve the ideal model and to evaluate Earth foundation models. What comes after foundation models? Energy-efficient adaptation, adversarial defenses, and interpretability are among the emerging directions.
Submitted 7 May, 2024;
originally announced May 2024.
-
Light-weight Retinal Layer Segmentation with Global Reasoning
Authors:
Xiang He,
Weiye Song,
Yiming Wang,
Fabio Poiesi,
Ji Yi,
Manishi Desai,
Quanqing Xu,
Kongzheng Yang,
Yi Wan
Abstract:
Automatic retinal layer segmentation with medical images, such as optical coherence tomography (OCT) images, serves as an important tool for diagnosing ophthalmic diseases. However, it is challenging to achieve accurate segmentation due to the low contrast and blood flow noise present in the images. In addition, the algorithm should be light-weight to be deployed for practical clinical applications. Therefore, it is desirable to design a light-weight network with high performance for retinal layer segmentation. In this paper, we propose LightReSeg for retinal layer segmentation, which can be applied to OCT images. Specifically, our approach follows an encoder-decoder structure, where the encoder part employs multi-scale feature extraction and a Transformer block to fully exploit the semantic information of feature maps at all scales and give the features better global reasoning capabilities, while for the decoder part we design a multi-scale asymmetric attention (MAA) module for preserving the semantic information at each encoder scale. The experiments show that our approach, with only 3.3M parameters, achieves better segmentation performance than the current state-of-the-art method TransUnet (105.7M parameters) on both our collected dataset and two other public datasets.
Submitted 25 April, 2024;
originally announced April 2024.
-
Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey
Authors:
Marcos V. Conde,
Florin-Alexandru Vasluianu,
Radu Timofte,
Jianxing Zhang,
Jia Li,
Fan Wang,
Xiaopeng Li,
Zikun Liu,
Hyunhee Park,
Sejun Song,
Changho Kim,
Zhijuan Huang,
Hongyuan Yu,
Cheng Wan,
Wending Xiang,
Jiamin Lin,
Hang Zhong,
Qiaosong Zhang,
Yue Sun,
Xuanwu Yin,
Kunlong Zuo,
Senyan Xu,
Siyuan Jiang,
Zhijing Sun,
Jiaying Zhu
, et al. (10 additional authors not shown)
Abstract:
This paper reviews the NTIRE 2024 RAW Image Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines; however, this problem is not as explored as in the RGB domain. The goal of this challenge is to upscale RAW Bayer images by 2x, considering unknown degradations such as noise and blur. In the challenge, a total of 230 participants registered, and 45 submitted results during the challenge period. The performance of the top-5 submissions is reviewed and provided here as a gauge for the current state-of-the-art in RAW Image Super-Resolution.
Submitted 24 April, 2024;
originally announced April 2024.
-
An Alternative Method to Identify the Susceptibility Threshold Level of Device under Test in a Reverberation Chamber
Authors:
Qian Xu,
Kai Chen,
Xueqi Shen,
Lei Xing,
Yi Huang,
Tian Hong Loh
Abstract:
By counting the number of pass/fail occurrences of a DUT (Device under Test) during the stirring process in a reverberation chamber (RC), the threshold electric field (E-field) level can be well estimated without tuning the input power and repeating the whole test many times. The Monte-Carlo method is used to verify the results. Estimated values and uncertainties are given for Rayleigh-distributed fields and for Rice-distributed fields with different K-factors.
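The counting idea can be illustrated for the Rayleigh case with a short sketch. It rests on assumptions the abstract does not state: the field magnitude seen by the DUT is Rayleigh-distributed with a known scale `sigma`, and the DUT fails exactly when that magnitude exceeds its threshold. Under those assumptions $P(E > E_{th}) = \exp(-E_{th}^2 / (2\sigma^2))$, so the observed failure fraction can be inverted for $E_{th}$. The function names are hypothetical, and the paper's own estimator and uncertainty analysis may differ.

```python
import math
import random


def estimate_threshold(n_fail, n_total, sigma):
    """Threshold level from the observed failure fraction, Rayleigh case.

    Inverts P(E > E_th) = exp(-E_th**2 / (2 * sigma**2)) for E_th.
    """
    if not 0 < n_fail < n_total:
        raise ValueError("need at least one pass and one fail")
    p_fail = n_fail / n_total
    return sigma * math.sqrt(-2.0 * math.log(p_fail))


def simulate(true_threshold, sigma, n_total, seed=0):
    """Monte-Carlo check: draw one field sample per stirrer position, count fails."""
    rng = random.Random(seed)
    n_fail = 0
    for _ in range(n_total):
        u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
        e = sigma * math.sqrt(-2.0 * math.log(u))  # Rayleigh sample via inverse CDF
        if e > true_threshold:
            n_fail += 1
    return estimate_threshold(n_fail, n_total, sigma)
```

Calling `simulate(1.5, 1.0, 200_000)` should return a value close to the true threshold of 1.5. With far fewer stirrer positions the estimate becomes noticeably noisier, which is where the uncertainties mentioned in the abstract matter.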
Submitted 23 April, 2024;
originally announced April 2024.
-
NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results
Authors:
Xin Li,
Kun Yuan,
Yajing Pei,
Yiting Lu,
Ming Sun,
Chao Zhou,
Zhibo Chen,
Radu Timofte,
Wei Sun,
Haoning Wu,
Zicheng Zhang,
Jun Jia,
Zhichao Zhang,
Linhan Cao,
Qiubo Chen,
Xiongkuo Min,
Weisi Lin,
Guangtao Zhai,
Jianhui Sun,
Tianyi Wang,
Lei Li,
Han Kong,
Wenxuan Wang,
Bing Li,
Cheng Luo
, et al. (43 additional authors not shown)
Abstract:
This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions were submitted and evaluated on the collected dataset KVQ from a popular short-form video platform, i.e., the Kuaishou/Kwai platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The purpose is to build new benchmarks and advance the development of S-UGC VQA. The competition had 200 participants, and 13 teams submitted valid solutions for the final testing phase. The proposed solutions achieved state-of-the-art performance for S-UGC VQA. The project can be found at https://github.com/lixinustc/KVQChallenge-CVPR-NTIRE2024.
Submitted 17 April, 2024;
originally announced April 2024.
-
The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report
Authors:
Bin Ren,
Yawei Li,
Nancy Mehta,
Radu Timofte,
Hongyuan Yu,
Cheng Wan,
Yuxin Hong,
Bingnan Han,
Zhuoyuan Wu,
Yajun Zou,
Yuqing Liu,
Jizhe Li,
Keji He,
Chao Fan,
Heng Zhang,
Xiaolin Zhang,
Xuanwu Yin,
Kunlong Zuo,
Bohao Liao,
Peizhe Xia,
Long Peng,
Zhibo Du,
Xin Di,
Wangkai Li,
Yang Wang
, et al. (109 additional authors not shown)
Abstract:
This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such as runtime, parameters, and FLOPs, while still maintaining a peak signal-to-noise ratio (PSNR) of approximately 26.90 dB on the DIV2K_LSDIR_valid dataset and 26.99 dB on the DIV2K_LSDIR_test dataset. In addition, this challenge has 4 tracks including the main track (overall performance), sub-track 1 (runtime), sub-track 2 (FLOPs), and sub-track 3 (parameters). In the main track, all three metrics (i.e., runtime, FLOPs, and parameter count) were considered. The ranking of the main track is calculated based on a weighted sum of the scores of all other sub-tracks. In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking. In sub-track 2, the number of FLOPs was considered. The score calculated based on the corresponding FLOPs was used to determine the ranking. In sub-track 3, the number of parameters was considered. The score calculated based on the corresponding parameters was used to determine the ranking. RLFN is set as the baseline for efficiency measurement. The challenge had 262 registered participants, and 34 teams made valid submissions. These submissions gauge the state of the art in efficient single-image super-resolution. To facilitate the reproducibility of the challenge and enable other researchers to build upon these findings, the code and the pre-trained model of validated solutions are made publicly available at https://github.com/Amazingren/NTIRE2024_ESR/.
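Since the challenge fixes a PSNR floor that every efficient network must still reach, a minimal sketch of the standard PSNR definition may help. It uses the usual formula $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$ on flat pixel sequences. The challenge's exact evaluation protocol (colour space, border cropping, averaging over images) is not given in the abstract and is not reproduced here, and the function name is illustrative.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A submission would then be checked against the stated floor, for instance by comparing its PSNR on the validation set with 26.90 dB, before runtime, FLOPs, and parameter count are scored.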
Submitted 25 June, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Tunable Superconducting Magnetic Levitation with Self-Stability
Authors:
Qi Xu,
Yi Lin,
Yunfei Tan,
Jianzhao Geng
Abstract:
Magnetic levitation based on the flux pinning nature of type II superconductors has the merit of self-stability, making it appealing for applications such as high speed bearings, maglev trains, space generators, etc. However, such levitation systems physically rely on the superconductor pre-capturing magnetic flux (i.e., the field cooling process) before establishing the levitation state, which is nonadjustable afterwards. Moreover, practical type II superconductors in the levitation system inevitably suffer from various sources of energy loss, leading to continuous levitation force decay. These intrinsic drawbacks make superconducting maglev inflexible and impractical for long term operation. Here we propose and demonstrate a new form of superconducting maglev which is both tunable and self-stable. The maglev system uses a closed-loop type II superconducting coil to lock the flux of a magnet, establishing self-stable levitation between the two objects. A flux pump is used to modulate the total magnetic flux of the coil without breaking its superconductivity, thus flexibly tuning levitation force and height while maintaining self-stability. For the first time, we experimentally demonstrate a self-stable type II superconducting maglev system which is able to: counteract long term levitation force decay, adjust levitation force and equilibrium position, and establish levitation under zero field cooling condition. These breakthroughs may bridge the gap between demonstrations and practical applications of type II superconducting maglevs.
Submitted 28 March, 2024;
originally announced March 2024.
-
UN-SAM: Universal Prompt-Free Segmentation for Generalized Nuclei Images
Authors:
Zhen Chen,
Qing Xu,
Xinyu Liu,
Yixuan Yuan
Abstract:
In digital pathology, precise nuclei segmentation is pivotal yet challenged by the diversity of tissue types, staining protocols, and imaging conditions. Recently, the segment anything model (SAM) demonstrated overwhelming performance in natural scenarios and impressive adaptation to medical imaging. Despite these advantages, the reliance on labor-intensive manual annotation as segmentation prompts severely hinders its clinical applicability, especially for nuclei image analysis containing massive cells where dense manual prompts are impractical. To overcome the limitations of current SAM methods while retaining the advantages, we propose the Universal prompt-free SAM framework for Nuclei segmentation (UN-SAM), which provides a fully automated solution with remarkable generalization capabilities. Specifically, to eliminate the labor-intensive requirement of per-nucleus annotations for prompts, we devise a multi-scale Self-Prompt Generation (SPGen) module to revolutionize clinical workflow by automatically generating high-quality mask hints to guide the segmentation tasks. Moreover, to unleash the generalization capability of SAM across a variety of nuclei images, we devise a Domain-adaptive Tuning Encoder (DT-Encoder) to seamlessly harmonize visual features with domain-common and domain-specific knowledge, and further devise a Domain Query-enhanced Decoder (DQ-Decoder) by leveraging learnable domain queries for segmentation decoding in different nuclei domains. Extensive experiments show that UN-SAM surpasses state-of-the-art methods in nuclei instance and semantic segmentation, especially in generalization capability in zero-shot scenarios. The source code is available at https://github.com/CUHK-AIM-Group/UN-SAM.
Submitted 26 February, 2024;
originally announced February 2024.
-
Robust Data-EnablEd Predictive Leading Cruise Control via Reachability Analysis
Authors:
Shuai Li,
Chaoyi Chen,
Haotian Zheng,
Jiawei Wang,
Qing Xu,
Keqiang Li
Abstract:
Data-driven predictive control promises model-free wave-dampening strategies for Connected and Autonomous Vehicles (CAVs) in mixed traffic flow. However, its performance relies on data quality, which suffers from unknown noise and disturbances. This paper introduces a Robust Data-EnablEd Predictive Leading Cruise Control (RDeeP-LCC) method based on reachability analysis, aiming to achieve safe and optimal CAV control under bounded process noise and external disturbances. Specifically, the matrix zonotope set technique and Willems' Fundamental Lemma are employed to derive the over-approximated system dynamics directly from data, and a data-driven feedback control technique is utilized to obtain an additional feedback input for stability. We decouple the mixed platoon into an error system and a nominal system, where the error system provides data-driven reachability sets for the enhanced safety constraints in the nominal system. Finally, a data-driven predictive control framework is formulated in a tube-based control manner for robustness guarantees. Nonlinear simulations with noise-corrupted data demonstrate that the proposed method outperforms baseline methods in mitigating traffic waves.
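Willems' Fundamental Lemma states, roughly, that for a controllable linear time-invariant system driven by a persistently exciting input, every trajectory of a given length is a linear combination of the columns of a Hankel matrix built from recorded data. The sketch below only shows how such a matrix is assembled from a scalar sequence. It is a simplified illustration with a hypothetical function name, not the RDeeP-LCC formulation, which works with multi-channel data and matrix zonotopes.

```python
def hankel(signal, depth):
    """Hankel matrix with `depth` rows built from a 1-D data sequence.

    Column j holds the window signal[j : j + depth], so each column is one
    length-`depth` trajectory observed in the recorded data.
    """
    n_cols = len(signal) - depth + 1
    if depth < 1 or n_cols < 1:
        raise ValueError("depth must be between 1 and len(signal)")
    return [[signal[i + j] for j in range(n_cols)] for i in range(depth)]
```

For instance, `hankel([1, 2, 3, 4, 5], 3)` yields three overlapping windows as columns. A data-driven predictor then searches for a combination of those columns that matches the most recent measurements and extends them into the future.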
Submitted 14 May, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
BATON: Aligning Text-to-Audio Model with Human Preference Feedback
Authors:
Huan Liao,
Haonan Han,
Kai Yang,
Tianjiao Du,
Rui Yang,
Zunnan Xu,
Qinmei Xu,
Jingquan Liu,
Jiasheng Lu,
Xiu Li
Abstract:
With the development of AI-Generated Content (AIGC), text-to-audio models are gaining widespread attention. However, it is challenging for these models to generate audio aligned with human preference due to the inherent information density of natural language and limited model understanding ability. To alleviate this issue, we formulate BATON, a framework designed to enhance the alignment between generated audio and text prompt using human preference feedback. BATON comprises three key stages: Firstly, we curated a dataset containing both prompts and the corresponding generated audio, which was then annotated based on human feedback. Secondly, we introduced a reward model using the constructed dataset, which can mimic human preference by assigning rewards to input text-audio pairs. Finally, we employed the reward model to fine-tune an off-the-shelf text-to-audio model. The experimental results demonstrate that BATON can significantly improve the generation quality of the original text-to-audio models in terms of audio integrity, temporal relationship, and alignment with human preference.
Submitted 1 February, 2024;
originally announced February 2024.
-
Localization of Dummy Data Injection Attacks in Power Systems Considering Incomplete Topological Information: A Spatio-Temporal Graph Wavelet Convolutional Neural Network Approach
Authors:
Zhaoyang Qu,
Yunchang Dong,
Yang Li,
Siqi Song,
Tao Jiang,
Min Li,
Qiming Wang,
Lei Wang,
Xiaoyong Bo,
Jiye Zang,
Qi Xu
Abstract:
The emergence of the novel dummy data injection attack (DDIA) poses a severe threat to the secure and stable operation of power systems. These attacks are particularly perilous due to the minimal Euclidean spatial separation between the injected malicious data and legitimate data, rendering their precise detection challenging using conventional distance-based methods. Furthermore, existing research predominantly focuses on various machine learning techniques, often analyzing the temporal data sequences post-attack or relying solely on Euclidean spatial characteristics. Unfortunately, this approach tends to overlook the inherent topological correlations within the non-Euclidean spatial attributes of power grid data, consequently leading to diminished accuracy in attack localization. To address this issue, this study takes a comprehensive approach. Initially, it examines the underlying principles of these new DDIAs on power systems. Here, an intricate mathematical model of the DDIA is designed, accounting for incomplete topological knowledge and alternating current (AC) state estimation from an attacker's perspective. Subsequently, by integrating a priori knowledge of grid topology and considering the temporal correlations within measurement data and the topology-dependent attributes of the power grid, this study introduces temporal and spatial attention matrices. These matrices adaptively capture the spatio-temporal correlations within the attacks. Leveraging gated stacked causal convolution and graph wavelet sparse convolution, the study jointly extracts spatio-temporal DDIA features. Finally, the research proposes a DDIA localization method based on spatio-temporal graph neural networks. The accuracy and effectiveness of the DDIA model are rigorously demonstrated through comprehensive analytical cases.
Submitted 27 January, 2024;
originally announced January 2024.
-
Asynchronous Parallel Reinforcement Learning for Optimizing Propulsive Performance in Fin Ray Control
Authors:
Xin-Yang Liu,
Dariush Bodaghi,
Qian Xue,
Xudong Zheng,
Jian-Xun Wang
Abstract:
Fish fin rays constitute a sophisticated control system for ray-finned fish, facilitating versatile locomotion within complex fluid environments. Despite extensive research on the kinematics and hydrodynamics of fish locomotion, the intricate control strategies in fin-ray actuation remain largely unexplored. While deep reinforcement learning (DRL) has demonstrated potential in managing complex nonlinear dynamics, its trial-and-error nature limits its application to problems involving computationally demanding environmental interactions. This study introduces a cutting-edge off-policy DRL algorithm, interacting with a fluid-structure interaction (FSI) environment to acquire intricate fin-ray control strategies tailored for various propulsive performance objectives. To enhance training efficiency and enable scalable parallelism, an innovative asynchronous parallel training (APT) strategy is proposed, which fully decouples FSI environment interactions and policy/value network optimization. The results demonstrate the success of the proposed method in discovering optimal complex policies for fin-ray actuation control, resulting in superior propulsive performance compared to the optimal sinusoidal actuation function identified through a parametric grid search. The merit and effectiveness of the APT approach are also showcased through comprehensive comparison with conventional DRL training strategies in numerical experiments of controlling nonlinear dynamics.
Submitted 20 January, 2024;
originally announced January 2024.
-
On a Discrete-Time Networked SIV Epidemic Model with Polar Opinion Dynamics
Authors:
Qiulin Xu,
Hideaki Ishii
Abstract:
This paper studies novel epidemic spreading problems influenced by opinion evolution in social networks, where the opinions reflect public health concerns. A coupled bilayer network is proposed, where the epidemics spread over several communities through a physical network layer while the opinions evolve over the same communities through a social network layer. The epidemic spreading process is described by a susceptible-infected-vigilant (SIV) model, which, compared with classical epidemic models, introduces an opinion-dependent epidemic vigilance state. The opinion process is modeled by a polar opinion dynamics model, which incorporates infection prevalence and human stubbornness into the opinion evolution. By introducing an opinion-dependent reproduction number, we analyze the stability of disease-free and endemic equilibria and derive sufficient conditions for their global asymptotic stability. We also discuss the mutual effects between epidemic eradication and opinion consensus, and the possibility of suppressing epidemics by intervening in the opinions or implementing public health strategies. Simulations are conducted to verify the theoretical results and demonstrate the feasibility of epidemic suppression.
Submitted 9 January, 2024;
originally announced January 2024.
-
The Arrow of Time in Music -- Revisiting the Temporal Structure of Music with Distinguishability and Unique Orientability as the Anchor Point
Authors:
Qi Xu
Abstract:
Driven by the term "the arrow of time" as a general topic, the article develops a musical discussion by referring to the etymological origin of the term: philosophy (epistemology) and physics (thermodynamics). In particular, the article explores two specific conditions, distinguishability and unique orientability, from which it derives respective musical propositions and case studies. For the distinguishability condition, the article focuses on "recurrence" in music and tries to interpret Bach's Christmas Oratorio from the perspective of "birth/resurrection". For the unique orientability condition, the article discusses the process of delaying the climax, thereby proposing an "AB-AAB left-replication" model, implying an organicist view by treating the temporal structure of music (e.g. form) as the product of a dynamic process: organic growth.
Submitted 28 December, 2023;
originally announced December 2023.
-
Safety-Enhanced Self-Learning for Optimal Power Converter Control
Authors:
Yihao Wan,
Qianwen Xu,
Tomislav Dragičević
Abstract:
Data-driven learning-based control methods such as reinforcement learning (RL) have become increasingly popular with the recent proliferation of the machine learning paradigm. These methods address the parameter sensitivity and unmodeled dynamics of model-based controllers, such as finite control-set model predictive control. RL agents are typically trained in simulation environments, where they are allowed to explore multiple "unsafe" actions during the learning process. However, this type of learning is not applicable to online self-learning of controllers in physical power converters, because unsafe actions would damage them. To address this, this letter proposes a safe online RL-based control framework that autonomously finds the optimal switching strategy for power converters while ensuring system safety during the entire self-learning process. The proposed safe online RL-based control is validated in a practical testbed on a two-level voltage source converter system, and the results confirm the effectiveness of the proposed method.
Submitted 7 December, 2023;
originally announced December 2023.
-
Sparsity-Driven EEG Channel Selection for Brain-Assisted Speech Enhancement
Authors:
Jie Zhang,
Qing-Tian Xu,
Zhen-Hua Ling,
Haizhou Li
Abstract:
Speech enhancement is widely used as a front-end to improve the speech quality in many audio systems, while it is hard to extract the target speech in multi-talker conditions without prior information on the speaker identity. It was shown that the auditory attention on the target speaker can be decoded from the electroencephalogram (EEG) of the listener implicitly. In this work, we therefore propose a novel end-to-end brain-assisted speech enhancement network (BASEN), which incorporates the listeners' EEG signals and adopts a temporal convolutional network together with a convolutional multi-layer cross attention module to fuse EEG-audio features. Considering that an EEG cap with sparse channels exhibits multiple benefits and in practice many electrodes might contribute marginally, we further propose two channel selection methods, called residual Gumbel selection and convolutional regularization selection. They are dedicated to tackling training instability and duplicated channel selections, respectively. Experimental results on a public dataset show the superiority of the proposed BASEN over existing approaches. The proposed channel selection methods can significantly reduce the amount of informative EEG channels with a negligible impact on the performance.
Submitted 25 June, 2024; v1 submitted 22 November, 2023;
originally announced November 2023.
-
Resilient Clock Synchronization Architecture for Industrial Time-Sensitive Networking
Authors:
Yafei Sun,
Qimin Xu,
Cailian Chen,
** Guan
Abstract:
Time-Sensitive Networking (TSN) is a promising industrial Internet of Things technology. Clock synchronization provides unified time reference, which is critical to the deterministic communication of TSN. However, changes in internal network status and external work environments of devices both degrade practical synchronization performance. This paper proposes a temperature-resilient architecture considering delay asymmetry (TACD) to enhance the timing accuracy under the impacts of internal delay and external thermal changes. In TACD, an anti-delay-asymmetry method is developed, which employs a partial variational Bayesian algorithm to promote adaptability to non-stationary delay variation. An optimized skew estimator is further proposed, fusing the temperature skew model for ambiance perception with the traditional linear clock model to compensate for nonlinear error caused by temperature changes. Theoretical derivation of skew estimation lower bound proves the promotion of optimal accuracy after the fusion of clock models. Evaluations based on measured delay data demonstrate accuracy advantages regardless of internal or external influences.
Submitted 4 October, 2023;
originally announced October 2023.
-
AI/ML for Beam Management in 5G-Advanced
Authors:
Qing Xue,
Jiajia Guo,
Binggui Zhou,
Yongjun Xu,
Zhidu Li,
Shaodan Ma
Abstract:
In beamformed wireless cellular systems such as 5G New Radio (NR) networks, beam management (BM) is a crucial operation. In the second phase of 5G NR standardization, known as 5G-Advanced, which is being vigorously promoted, the key component is the use of artificial intelligence (AI) based on machine learning (ML) techniques. AI/ML for BM is selected as a representative use case. This article provides an overview of the AI/ML for BM in 5G-Advanced. The legacy non-AI and prime AI-enabled BM frameworks are first introduced and compared. Then, the main scope of AI/ML for BM is presented, including improving accuracy, reducing overhead and latency. Finally, the key challenges and open issues in the standardization of AI/ML for BM are discussed, especially the design of new protocols for AI-enabled BM. This article provides a guideline for the study of AI/ML-based BM standardization.
Submitted 19 September, 2023;
originally announced September 2023.
-
Scalable Scheduling for Industrial Time-Sensitive Networking: A Hyper-flow Graph Based Scheme
Authors:
Yanzhou Zhang,
Cailian Chen,
Qimin Xu,
Shouliang Wang,
Lei Xu,
** Guan
Abstract:
Industrial Time-Sensitive Networking (TSN) provides deterministic mechanisms for real-time and reliable flow transmission. Increasing attention has been paid to efficient scheduling for time-sensitive flows with stringent requirements such as ultra-low latency and jitter. In TSN, the fine-grained traffic shaping protocol, cyclic queuing and forwarding (CQF), eliminates uncertain delay and frame loss by cyclic traffic forwarding and queuing. However, it inevitably causes high scheduling complexity. Moreover, the complexity is quite sensitive to flow attributes and network scale. The problem stems in part from the lack of an attribute mining mechanism in existing frame-based scheduling. For time-critical industrial networks with large-scale complex flows, a hyper-flow graph based scheduling scheme is proposed to improve scheduling scalability in terms of schedulability, scheduling efficiency, and latency and jitter. The hyper-flow graph is built by aggregating similar flow sets as hyper-flow nodes and designing a hierarchical scheduling framework. The flow attribute-sensitive scheduling information is embedded into the condensed maximal cliques, which are then reverse-mapped precisely to congestion flow portions for re-scheduling. Its parallel scheduling reduces the complexity induced by network scale. Further, this scheme is designed in its entirety as a comprehensive scheduling algorithm, GH^2. It improves the three criteria of scalability along a Pareto front. Extensive simulation studies demonstrate its superiority. Notably, GH^2 demonstrates scheduling stability, with a runtime of less than 100 ms for 1000 flows and nearly 1/430 that of the SOTA FITS method for 2000 flows.
Submitted 12 September, 2023;
originally announced September 2023.
-
Cloud Control of Connected Vehicle under Bi-directional Time-varying delay: An Application of Predictor-observer Structured Controller
Authors:
Ji-An Pan,
Qing Xu,
Keqiang Li,
Chunying Yang,
Jianqiang Wang
Abstract:
This article addresses the cloud control of connected vehicles, focusing specifically on the effect of bi-directional communication-induced delays. To mitigate the adverse effects of such delays, a novel predictor-observer structured controller is proposed, which compensates for both measurable output delays and unmeasurable, yet bounded, input delays simultaneously. The study begins by constructing a novel equivalent delay-free inter-connected system model that incorporates the predictor-observer controller, considering certain delay boundaries and model uncertainties. Subsequently, a stability analysis is conducted to assess the system's robustness under these conditions. Next, a connected vehicle lateral control scenario containing a high-fidelity vehicle dynamic model is built. The results demonstrate the controller's ability to accurately predict the system states, even under time-varying bi-directional delays. Finally, the proposed method is deployed in a real connected vehicle lateral control system. Comparative tests with a conventional linear feedback controller show significantly improved control performance under dominant bi-directional delay conditions, affirming the superiority of the proposed method in the presence of delay.
Submitted 9 December, 2023; v1 submitted 5 September, 2023;
originally announced September 2023.
-
Information Flow Topology in Mixed Traffic: A Comparative Study between "Looking Ahead" and "Looking Behind"
Authors:
Shuai Li,
Haotian Zheng,
Jiawei Wang,
Chaoyi Chen,
Qing Xu,
Jianqiang Wang,
Keqiang Li
Abstract:
The emergence of connected and automated vehicles (CAVs) promises smoother traffic flow. In mixed traffic where human-driven vehicles (HDVs) also exist, existing research mostly focuses on "looking ahead" (i.e., the CAVs receive information from preceding vehicles) strategies for CAVs, while recent work reveals that "looking behind" (i.e., the CAVs receive information from their rear vehicles) strategies might provide more possibilities for CAV longitudinal control. This paper presents a comparative study between these two types of information flow topology (IFT) from the string stability perspective, with the role of maximum platoon size (MPS) also under investigation. Precisely, we provide a dynamical modeling framework for the mixed platoon under the multi-predecessor-following (MPF) topology and the multi-successor-leading (MSL) topology. Then, a unified method for string stability analysis is presented, with explicit consideration of both IFT and MPS. Numerical results suggest that MSL ("looking behind") outperforms MPF ("looking ahead") in mitigating traffic perturbations. In addition, increasing MPS could further improve string stability of mixed traffic flow.
Submitted 4 September, 2023;
originally announced September 2023.
-
SPPNet: A Single-Point Prompt Network for Nuclei Image Segmentation
Authors:
Qing Xu,
Wenwei Kuang,
Zeyu Zhang,
Xueyao Bao,
Haoran Chen,
Wenting Duan
Abstract:
Image segmentation plays an essential role in nuclei image analysis. Recently, the segment anything model has made a significant breakthrough in such tasks. However, the current model has two major issues for cell segmentation: (1) The image encoder of the segment anything model involves a large number of parameters, so retraining or even fine-tuning the model still requires expensive computational resources. (2) In point prompt mode, points are sampled from the center of the ground truth and more than one set of points is expected to achieve reliable performance, which is not efficient for practical applications. In this paper, a single-point prompt network, called SPPNet, is proposed for nuclei image segmentation. We replace the original image encoder with a lightweight vision transformer. In addition, an effective convolutional block is added in parallel to extract low-level semantic information from the image and compensate for the performance degradation due to the small image encoder. We propose a new point-sampling method based on the Gaussian kernel. The proposed model is evaluated on the MoNuSeg-2018 dataset. The results demonstrate that SPPNet outperforms existing U-shape architectures and converges faster in training. Compared to the segment anything model, SPPNet shows roughly 20 times faster inference, with 1/70 of the parameters and computational cost. In particular, only one set of points is required in both the training and inference phases, which is more reasonable for clinical applications. The code for our work and more technical details can be found at https://github.com/xq141839/SPPNet.
Submitted 23 August, 2023;
originally announced August 2023.
-
A Survey of Beam Management for mmWave and THz Communications Towards 6G
Authors:
Qing Xue,
Chengwang Ji,
Shaodan Ma,
Jiajia Guo,
Yongjun Xu,
Qianbin Chen,
Wei Zhang
Abstract:
Communication in millimeter wave (mmWave) and even terahertz (THz) frequency bands is ushering in a new era of wireless communications. Beam management, namely initial access and beam tracking, has been recognized as an essential technique to ensure robust mmWave/THz communications, especially for mobile scenarios. However, narrow beams at higher carrier frequency lead to huge beam measurement overhead, which has a negative impact on beam acquisition and tracking. In addition, the beam management process is further complicated by the fluctuation of mmWave/THz channels, the random movement patterns of users, and the dynamic changes in the environment. For mmWave and THz communications toward 6G, we have witnessed a substantial increase in research and industrial attention on artificial intelligence (AI), reconfigurable intelligent surface (RIS), and integrated sensing and communications (ISAC). The introduction of these enabling technologies presents both open opportunities and unique challenges for beam management. In this paper, we present a comprehensive survey on mmWave and THz beam management. Further, we give some insights on technical challenges and future research directions in this promising area.
Submitted 6 February, 2024; v1 submitted 4 August, 2023;
originally announced August 2023.
-
UCDFormer: Unsupervised Change Detection Using a Transformer-driven Image Translation
Authors:
Qingsong Xu,
Yilei Shi,
Jianhua Guo,
Chaojun Ouyang,
Xiao Xiang Zhu
Abstract:
Change detection (CD) by comparing two bi-temporal images is a crucial task in remote sensing. With the advantages of requiring no cumbersome labeled change information, unsupervised CD has attracted extensive attention in the community. However, existing unsupervised CD approaches rarely consider the seasonal and style differences incurred by the illumination and atmospheric conditions in multi-temporal images. To this end, we propose a change detection with domain shift setting for remote sensing images. Furthermore, we present a novel unsupervised CD method using a light-weight transformer, called UCDFormer. Specifically, a transformer-driven image translation composed of a light-weight transformer and a domain-specific affinity weight is first proposed to mitigate domain shift between two images with real-time efficiency. After image translation, we can generate the difference map between the translated before-event image and the original after-event image. Then, a novel reliable pixel extraction module is proposed to select significantly changed/unchanged pixel positions by fusing the pseudo change maps of fuzzy c-means clustering and adaptive threshold. Finally, a binary change map is obtained based on these selected pixel pairs and a binary classifier. Experimental results on different unsupervised CD tasks with seasonal and style changes demonstrate the effectiveness of the proposed UCDFormer. For example, compared with several other related methods, UCDFormer improves performance on the Kappa coefficient by more than 12%. In addition, UCDFormer achieves excellent performance for earthquake-induced landslide detection when considering large-scale applications. The code is available at https://github.com/zhu-xlab/UCDFormer.
Submitted 2 August, 2023;
originally announced August 2023.
-
FlexDelta: A flexure-based fully decoupled parallel $xyz$ positioning stage with long stroke
Authors:
Qianjun Zhang,
Wei Dong,
Qingsong Xu,
Bimal J. Goteea,
Yongzhuo Gao
Abstract:
Decoupled parallel $xyz$ positioning stages with large stroke have been desired in high-speed and precise positioning fields. However, currently such stages are either short in stroke or unqualified in parasitic motion and coupling rate. This paper proposes a novel flexure-based decoupled parallel $xyz$ positioning stage (FlexDelta) and conducts its conceptual design, modeling, and experimental study. Firstly, the working principle of FlexDelta is introduced, followed by its mechanism design with flexure. Secondly, the stiffness model of flexure is established via matrix-based Castigliano's second theorem, and the influence of its lateral stiffness on the stiffness model of FlexDelta is comprehensively investigated and then optimally designed. Finally, experimental study was carried out based on the prototype fabricated. The results reveal that the positioning stage features centimeter-stroke in three axes, with coupling rate less than 0.53%, parasitic motion less than 1.72 mrad over full range. And its natural frequencies are 20.8 Hz, 20.8 Hz, and 22.4 Hz for $x$, $y$, and $z$ axis respectively. Multi-axis path tracking tests were also carried out, which validates its dynamic performance with micrometer error.
Submitted 19 July, 2023;
originally announced July 2023.
-
NTIRE 2023 Quality Assessment of Video Enhancement Challenge
Authors:
Xiaohong Liu,
Xiongkuo Min,
Wei Sun,
Yulun Zhang,
Kai Zhang,
Radu Timofte,
Guangtao Zhai,
Yixuan Gao,
Yuqin Cao,
Tengchuan Kou,
Yunlong Dong,
Ziheng Jia,
Yilin Li,
Wei Wu,
Shuming Hu,
Sibin Deng,
Pengxiang Xiao,
Ying Chen,
Kai Li,
Kai Zhao,
Kun Yuan,
Ming Sun,
Heng Cong,
Hao Wang,
Lingzhi Fu
, et al. (47 additional authors not shown)
Abstract:
This paper reports on the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2023. This challenge is to address a major challenge in the field of video processing, namely, video quality assessment (VQA) for enhanced videos. The challenge uses the VQA Dataset for Perceptual Video Enhancement (VDPVE), which has a total of 1211 enhanced videos, including 600 videos with color, brightness, and contrast enhancements, 310 videos with deblurring, and 301 deshaked videos. The challenge has a total of 167 registered participants. 61 participating teams submitted their prediction results during the development phase, with a total of 3168 submissions. A total of 176 submissions were submitted by 37 participating teams during the final testing phase. Finally, 19 participating teams submitted their models and fact sheets, and detailed the methods they used. Some methods have achieved better results than baseline methods, and the winning methods have demonstrated superior prediction performance.
Submitted 18 July, 2023;
originally announced July 2023.
-
DualAttNet: Synergistic Fusion of Image-level and Fine-Grained Disease Attention for Multi-Label Lesion Detection in Chest X-rays
Authors:
Qing Xu,
Wenting Duan
Abstract:
Chest radiographs are the most commonly performed radiological examinations for lesion detection. Recent advances in deep learning have led to encouraging results in various thoracic disease detection tasks. In particular, architectures with a feature pyramid network are able to recognise targets of different sizes. However, such networks have difficulty focusing on lesion regions in chest X-rays because of their high visual resemblance. In this paper, we propose a dual attention supervised module for multi-label lesion detection in chest radiographs, named DualAttNet. It efficiently fuses global and local lesion classification information based on an image-level attention block and a fine-grained disease attention algorithm. A binary cross entropy loss function is used to calculate the difference between the attention map and ground truth at image level. The generated gradient flow is leveraged to refine pyramid representations and highlight lesion-related features. We evaluate the proposed model on VinDr-CXR, ChestX-ray8 and COVID-19 datasets. The experimental results show that DualAttNet surpasses baselines by 0.6% to 2.7% mAP and 1.4% to 4.7% AP50 with different detection architectures. The code for our work and more technical details can be found at https://github.com/xq141839/DualAttNet.
Submitted 23 June, 2023;
originally announced June 2023.
-
Sea Ice Extraction via Remote Sensed Imagery: Algorithms, Datasets, Applications and Challenges
Authors:
Anzhu Yu,
Wenjun Huang,
Qing Xu,
Qun Sun,
Wenyue Guo,
Song Ji,
Bowei Wen,
Chun** Qiu
Abstract:
Deep learning, a dominant technique in artificial intelligence, has completely changed image understanding over the past decade. As a consequence, the sea ice extraction (SIE) problem has entered a new era. We present a comprehensive review of four important aspects of SIE: algorithms, datasets, applications, and future trends. Our review covers research published from 2016 to the present, with a specific focus on deep learning-based approaches in the last five years. We divide all related algorithms into three categories: classical image segmentation approaches, machine learning-based approaches, and deep learning-based methods. We review the accessible ice datasets, including SAR-based datasets, optical-based datasets, and others. The applications are presented in four aspects: climate research, navigation, geographic information systems (GIS) production, and others. We also provide insightful observations and inspiring future research directions.
Submitted 31 May, 2023;
originally announced June 2023.
-
BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions
Authors:
Jie Zhang,
Qing-Tian Xu,
Qiu-Shi Zhu,
Zhen-Hua Ling
Abstract:
Time-domain single-channel speech enhancement (SE) still remains challenging to extract the target speaker without any prior information on multi-talker conditions. It has been shown via auditory attention decoding that the brain activity of the listener contains the auditory information of the attended speaker. In this paper, we thus propose a novel time-domain brain-assisted SE network (BASEN) incorporating electroencephalography (EEG) signals recorded from the listener for extracting the target speaker from monaural speech mixtures. The proposed BASEN is based on the fully-convolutional time-domain audio separation network. In order to fully leverage the complementary information contained in the EEG signals, we further propose a convolutional multi-layer cross attention module to fuse the dual-branch features. Experimental results on a public dataset show that the proposed model outperforms the state-of-the-art method in several evaluation metrics. The reproducible code is available at https://github.com/jzhangU/Basen.git.
Submitted 17 May, 2023;
originally announced May 2023.
-
Pi-ViMo: Physiology-inspired Robust Vital Sign Monitoring using mmWave Radars
Authors:
Bo Zhang,
Boyu Jiang,
Rong Zheng,
** Zhang,
Jun Li,
Qiang Xu
Abstract:
Continuous monitoring of human vital signs using non-contact mmWave radars is attractive due to their ability to penetrate garments and operate under different lighting conditions. Unfortunately, most prior research requires subjects to stay at a fixed distance from radar sensors and to remain still during monitoring. These restrictions limit the applications of radar vital sign monitoring in real-life scenarios. In this paper, we address these limitations and present "Pi-ViMo", a non-contact Physiology-inspired Robust Vital Sign Monitoring system, using mmWave radars. We first derive a multi-scattering point model for the human body, and introduce a coherent combining of multiple scatterings to enhance the quality of estimated chest-wall movements. It enables vital sign estimations of subjects at any location in a radar's field of view. We then propose a template matching method to extract human vital signs by adopting physical models of respiration and cardiac activities. The proposed method is capable of separating respiration and heartbeat in the presence of micro-level random body movements (RBM) when a subject is at any location within the field of view of a radar. Experiments in a radar testbed show average respiration rate errors of 6% and heart rate errors of 11.9% for stationary subjects, and average errors of 13.5% for respiration rate and 13.6% for heart rate for subjects under different RBMs.
Submitted 24 March, 2023;
originally announced March 2023.
-
An Efficient and Robust Method for Chest X-Ray Rib Suppression that Improves Pulmonary Abnormality Diagnosis
Authors:
Di Xu,
Qifan Xu,
Kevin Nhieu,
Dan Ruan,
Ke Sheng
Abstract:
Suppression of thoracic bone shadows on chest X-rays (CXRs) has been indicated to improve the diagnosis of pulmonary disease. Previous approaches can be categorized as unsupervised physical and supervised deep learning (DL) models. However, physical models preserve morphological details only at the cost of extremely long processing time, while existing DL methods face challenges in gathering sufficient/qualitative ground truth (GT) for robust training, leading to failure in maintaining clinically acceptable false positive rates. We hereby propose a generalizable yet efficient workflow of two stages: (1) generation of training pairs, with GT bone shadows eliminated by a physical model in spatially transformed gradient fields; (2) fully supervised image denoising network training on stage-one datasets for fast rib removal on incoming CXRs. For stage two, we designed a densely connected network called SADXNet, combined with peak signal to noise ratio and multi-scale structure similarity index measure objective minimization to suppress bony structures. The SADXNet organizes spatial filters in a U shape (e.g., X=7; filters = 16, 64, 256, 512, 256, 64, 16) and preserves the feature map dimension throughout the network flow. Visually, SADXNet can suppress the rib edge and that near the lung wall/vertebra without jeopardizing the vessel/abnormality conspicuity. Quantitatively, it achieves RMSE of ~0 during testing with one prediction taking <1s. Downstream tasks including lung nodule detection as well as common lung disease classification and localization are used to evaluate our proposed rib suppression mechanism. We observed 3.23% and 6.62% area under the curve (AUC) increases as well as 203 and 385 absolute false positive decreases for lung nodule detection and common lung disease localization, respectively.
Submitted 19 February, 2023;
originally announced February 2023.
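The abstract above names peak signal-to-noise ratio as one ingredient of the training objective. As a rough illustration of that metric only (not the authors' loss, whose exact form and weighting are not given here), PSNR can be sketched as follows. The function name `psnr`, the `data_range` parameter, and the synthetic test images are assumptions made for this example.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    `data_range` is the assumed maximum possible pixel value (1.0 for
    images scaled to [0, 1]); it is a parameter of this sketch.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images: PSNR is unbounded
    return 10.0 * np.log10(data_range ** 2 / mse)


# Synthetic stand-ins for a bone-suppressed target and a network prediction.
rng = np.random.default_rng(0)
clean = rng.random((64, 64))
noisy = np.clip(clean + 0.05 * rng.standard_normal(clean.shape), 0.0, 1.0)
print(f"PSNR of noisy vs. clean: {psnr(clean, noisy):.2f} dB")
```

Higher PSNR means lower mean squared error, so a training loss built on it would typically minimize the negative PSNR (or an equivalent MSE term); how SADXNet combines it with the multi-scale similarity term is not specified in this listing.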
-
Task modules Partitioning, Scheduling and Floorplanning for Partially Dynamically Reconfigurable Systems Based on Modern Heterogeneous FPGAs
Authors:
Bo Ding,
Jinglei Huang,
Junpeng Wang,
Qi Xu,
Song Chen,
Yi Kang
Abstract:
Modern field programmable gate arrays (FPGAs) can be partially dynamically reconfigurable, with heterogeneous resources distributed on the chip. FPGA-based partially dynamically reconfigurable systems (FPGA-PDRS) can be used to accelerate computing and improve computing flexibility.
However, the traditional design of FPGA-PDRS is manual.
Implementing the automation of FPGA-PDRS needs to solve the problems of task modules partitioning, scheduling, and floorplanning on heterogeneous resources.
Existing works only partly solve the problems in the automation process of FPGA-PDRS or model only homogeneous resources for FPGA-PDRS.
To better solve the problems in the automation process of FPGA-PDRS and narrow the gap between algorithm and application, in this paper, we propose a complete workflow comprising three parts: pre-processing to generate the list of candidate shapes for task modules according to their resource requirements; an exploration process to search for a solution to task module partitioning, scheduling, and floorplanning; and post-optimization to improve the success rate of floorplanning.
Experimental results show that, compared with state-of-the-art work, the proposed complete workflow improves performance by 18.7% and reduces communication cost by 8.6% on average, while improving the reuse rate of the heterogeneous resources on the chip. Based on the solution generated by the exploration process, the post-optimization improves the success rate of the floorplan by 14%.
Submitted 10 December, 2022;
originally announced December 2022.
-
Mixed Cloud Control Testbed: Validating Vehicle-Road-Cloud Integration via Mixed Digital Twin
Authors:
Jianghong Dong,
Qing Xu,
Jiawei Wang,
Chunying Yang,
Mengchi Cai,
Chaoyi Chen,
Jianqiang Wang,
Keqiang Li
Abstract:
Reliable and efficient validation technologies are critical for the recent development of multi-vehicle cooperation and vehicle-road-cloud integration. In this paper, we introduce our miniature experimental platform, Mixed Cloud Control Testbed (MCCT), developed based on a new notion of Mixed Digital Twin (mixedDT). Combining Mixed Reality with Digital Twin, mixedDT integrates the virtual and physical spaces into a mixed one, where physical entities coexist and interact with virtual entities via their digital counterparts. Under the framework of mixedDT, MCCT contains three major experimental platforms in the physical, virtual and mixed spaces respectively, and provides unified access for various human-machine interfaces and external devices such as driving simulators. A cloud unit, where the mixed experimental platform is deployed, is responsible for fusing multi-platform information and assigning control instructions, contributing to synchronous operation and real-time cross-platform interaction. In particular, MCCT allows for multi-vehicle coordination among vehicles from different sources (e.g., physical vehicles, virtual vehicles and human-driven vehicles). Validations on vehicle platooning demonstrate the flexibility and scalability of MCCT.
Submitted 4 December, 2022;
originally announced December 2022.
-
Collaborative Honeypot Defense in UAV Networks: A Learning-Based Game Approach
Authors:
Yuntao Wang,
Zhou Su,
Abderrahim Benslimane,
Qichao Xu,
Minghui Dai,
Ruidong Li
Abstract:
The proliferation of unmanned aerial vehicles (UAVs) opens up new opportunities for on-demand service provisioning anywhere and anytime, but also exposes UAVs to a variety of cyber threats. Low/medium interaction honeypots offer a promising lightweight defense for actively protecting the mobile Internet of Things, particularly UAV networks. While previous research has primarily focused on honeypot system design and attack pattern recognition, the incentive issue of motivating UAVs' participation (e.g., sharing trapped attack data in honeypots) to collaboratively resist distributed and sophisticated attacks remains unexplored. This paper proposes a novel game-theoretical collaborative defense approach to address optimal, fair, and feasible incentive design, in the presence of network dynamics and UAVs' multi-dimensional private information (e.g., valid defense data (VDD) volume, communication delay, and UAV cost). Specifically, we first develop a honeypot game between UAVs and the network operator under both partial and complete information asymmetry scenarios. The optimal VDD-reward contract design problem with partial information asymmetry is then solved using a contract-theoretic approach that ensures budget feasibility, truthfulness, fairness, and computational efficiency. In addition, under complete information asymmetry, we devise a distributed reinforcement learning algorithm to dynamically design optimal contracts for distinct types of UAVs in the time-varying UAV network. Extensive simulations demonstrate that the proposed scheme can motivate UAVs' cooperation in VDD sharing and improve defensive effectiveness, compared with conventional schemes.
Submitted 29 August, 2023; v1 submitted 28 October, 2022;
originally announced November 2022.
-
Monolingual Recognizers Fusion for Code-switching Speech Recognition
Authors:
Tongtong Song,
Qiang Xu,
Haoyu Lu,
Longbiao Wang,
Hao Shi,
Yuqin Lin,
Yanbing Yang,
Jianwu Dang
Abstract:
The bi-encoder structure has been intensively investigated in code-switching (CS) automatic speech recognition (ASR). However, most existing methods require the two monolingual ASR models (MAMs) to have the same structure and use only the encoders of the MAMs. As a result, pre-trained MAMs cannot be promptly and fully used for CS ASR. In this paper, we propose a monolingual recognizers fusion method for CS ASR. It has two stages: the speech awareness (SA) stage and the language fusion (LF) stage. In the SA stage, acoustic features are mapped to two language-specific predictions by two independent MAMs. To keep the MAMs focused on their own language, we further extend the language-aware training strategy for the MAMs. In the LF stage, the BELM fuses two language-specific predictions to get the final prediction. Moreover, we propose a text simulation strategy to simplify the training process of the BELM and reduce reliance on CS data. Experiments on a Mandarin-English corpus show the efficiency of the proposed method. The mix error rate is significantly reduced on the test set after using open-source pre-trained MAMs.
Submitted 2 November, 2022;
originally announced November 2022.
-
Distributed data-driven predictive control for cooperatively smoothing mixed traffic flow
Authors:
Jiawei Wang,
Yingzhao Lian,
Yuning Jiang,
Qing Xu,
Keqiang Li,
Colin N. Jones
Abstract:
Cooperative control of connected and automated vehicles (CAVs) promises smoother traffic flow. In mixed traffic, where human-driven vehicles with unknown dynamics coexist, data-driven predictive control techniques allow safe and optimal CAV control with measurable traffic data. However, the centralized control setting in most existing strategies limits their scalability for large-scale mixed traffic flow. To address this problem, this paper proposes a cooperative DeeP-LCC (Data-EnablEd Predictive Leading Cruise Control) formulation and its distributed implementation algorithm. In cooperative DeeP-LCC, the traffic system is naturally partitioned into multiple subsystems, each with a single CAV that collects local trajectory data for subsystem behavior prediction based on Willems' fundamental lemma. Meanwhile, the cross-subsystem interaction is formulated as a coupling constraint. Then, we employ the Alternating Direction Method of Multipliers (ADMM) to design the distributed DeeP-LCC algorithm. This algorithm achieves both computation and communication efficiency, as well as trajectory data privacy, through parallel calculation. Our simulations on different traffic scales verify the real-time wave-dampening potential of distributed DeeP-LCC, which can reduce fuel consumption by over 31.84% in a large-scale traffic system of 100 vehicles with only 5%-20% CAVs.
Submitted 28 April, 2023; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Joint Optimization of Active and Passive Beamforming in Multi-IRS Aided mmWave Communications
Authors:
Renlong Wei,
Qing Xue,
Shaodan Ma,
Yongjun Xu,
Li Yan,
Xuming Fang
Abstract:
Intelligent reflecting surface (IRS) has been considered a promising technology to alleviate the blockage effect and enhance coverage in millimeter wave (mmWave) communication. To explore the impact of IRS on the performance of mmWave communication, we investigate a multi-IRS assisted mmWave communication network and formulate a sum rate maximization problem by jointly optimizing the active and passive beamforming and the set of IRSs for assistance. The optimization problem is intractable due to the lack of convexity of the objective function and the binary nature of the IRS selection variables. To tackle the complex non-convex problem, an alternating iterative approach is proposed. In particular, the fractional programming method is used to optimize the active and passive beamforming, and the IRS selection is solved by enumeration. Simulation results demonstrate the performance gain of our proposed approach.
Submitted 3 October, 2022;
originally announced October 2022.
-
A Fast Algorithm for Onboard Atmospheric Powered Descent Guidance
Authors:
Yushu Chen,
Guangwen Yang,
Lu Wang,
Qingzhong Gan,
Haipeng Chen,
Quanyong Xu
Abstract:
Atmospheric powered descent guidance can be solved by successive convexification; however, its onboard application is impeded by the sharp increase in computation caused by nonlinear aerodynamic forces. The problem has to be converted into a sequence of convex subproblems, rather than the single convex problem that suffices when aerodynamic forces are ignored. Moreover, each subproblem is significantly more complicated, which further increases computation. In this work, a fast real-time interior point method is presented to solve the correlated convex subproblems onboard. The main contributions are as follows. First, an algorithm is proposed that exploits the specific problem structure to accelerate the solution of the linear systems, which account for most of the computation in each iteration. Second, a warm-starting scheme is introduced that refines the initial value of a subproblem using a rough approximate solution of the preceding subproblem, which reduces the number of iterations required for each subproblem. In Monte Carlo simulations conducted to evaluate solver efficiency, the proposed method reduced the run time by a factor of 9 compared with the fastest publicly available solver tested. Runtimes on the order of 0.6 s are achieved on a radiation-hardened flight processor, demonstrating the potential for real-time onboard application.
Submitted 6 June, 2023; v1 submitted 9 September, 2022;
originally announced September 2022.
-
A Language Agnostic Multilingual Streaming On-Device ASR System
Authors:
Bo Li,
Tara N. Sainath,
Ruoming Pang,
Shuo-yiin Chang,
Qiumin Xu,
Trevor Strohman,
Vince Chen,
Qiao Liang,
Heguang Liu,
Yanzhang He,
Parisa Haghani,
Sameer Bidichandani
Abstract:
On-device end-to-end (E2E) models have shown improvements over a conventional model on English Voice Search tasks in both quality and latency. E2E models have also shown promising results for multilingual automatic speech recognition (ASR). In this paper, we extend our previous capacity solution to streaming applications and present a streaming multilingual E2E ASR system that runs fully on device with comparable quality and latency to individual monolingual models. To achieve that, we propose an Encoder Endpointer model and an End-of-Utterance (EOU) Joint Layer for a better quality and latency trade-off. Our system is built in a language agnostic manner allowing it to natively support intersentential code switching in real time. To address the feasibility concerns on large models, we conducted on-device profiling and replaced the time consuming LSTM decoder with the recently developed Embedding decoder. With these changes, we managed to run such a system on a mobile device in less than real time.
Submitted 29 August, 2022;
originally announced August 2022.
-
Language-specific Characteristic Assistance for Code-switching Speech Recognition
Authors:
Tongtong Song,
Qiang Xu,
Meng Ge,
Longbiao Wang,
Hao Shi,
Yongjie Lv,
Yuqin Lin,
Jianwu Dang
Abstract:
The dual-encoder structure successfully utilizes two language-specific encoders (LSEs) for code-switching speech recognition. Because LSEs are initialized by two pre-trained language-specific models (LSMs), the dual-encoder structure can exploit sufficient monolingual data and capture the individual language attributes. However, most existing methods have no language constraints on LSEs and underutilize language-specific knowledge of LSMs. In this paper, we propose a language-specific characteristic assistance (LSCA) method to mitigate the above problems. Specifically, during training, we introduce two language-specific losses as language constraints and generate corresponding language-specific targets for them. During decoding, we take the decoding abilities of LSMs into account by combining the output probabilities of two LSMs and the mixture model to obtain the final predictions. Experiments show that either the training or decoding method of LSCA can improve the model's performance. Furthermore, combining the training and decoding methods of LSCA yields up to 15.4% relative error reduction on the code-switching test set. Moreover, by using our method, the system can process code-switching speech recognition tasks well based on two pre-trained LSMs, without extra shared parameters or even retraining.
Submitted 11 July, 2022; v1 submitted 29 June, 2022;
originally announced June 2022.
-
The Musical Arrow of Time -- The Role of Temporal Asymmetry in Music and Its Organicist Implications
Authors:
Qi Xu
Abstract:
Adopting a performer-centric perspective, we frequently encounter two statements: "music flows", and "music is life-like". This dissertation builds on top of the two statements above, resulting in an exploration of the role of temporal asymmetry in music (generalizing "music flows") and its relation to the idea of organicism (generalizing "music is life-like"). We focus on two aspects of temporal asymmetry. The first aspect concerns the vastly different epistemic mechanisms with which we obtain knowledge of the past and the future. A particular musical consequence follows: recurrence. The epistemic difference between the past and the future shapes our experience and interpretation of recurring events in music. The second aspect concerns the arrow of time: the unambiguous ordering imposed on temporal events gives rise to the a priori pointedness of time, rendering time asymmetrical and irreversible. A discussion on thermodynamics informs us musically: the arrow of time effectuates itself in musical forms by delaying the placement of the climax.
Organicism serves as a mediating topic, engaging with the concept of life as in organisms. On the one hand, organicism is related to temporal asymmetry in science via a thermodynamical interpretation of life as entropy-reducing entities. On the other hand, organicism is a topic native to music via the universally acknowledged artistic idea that music should be interpreted as a vital force possessing volitional power. With organicism as a mediator, we better understand the role of temporal asymmetry in music. In particular, we view musical form as a process of expansion and elaboration analogous to organic growth. Finally, we present an organicist interpretation of delaying the climax: viewing musical form as the result of organic growth, the arrow of time translates to a preference for prepending structure over appending structure.
Submitted 2 June, 2022;
originally announced June 2022.
-
Implementation and Experimental Validation of Data-Driven Predictive Control for Dissipating Stop-and-Go Waves in Mixed Traffic
Authors:
Jiawei Wang,
Yang Zheng,
Jianghong Dong,
Chaoyi Chen,
Mengchi Cai,
Keqiang Li,
Qing Xu
Abstract:
In this paper, we present the first experimental results of data-driven predictive control for connected and autonomous vehicles (CAVs) in dissipating traffic waves. In particular, we consider a recent strategy of Data-EnablEd Predictive Leading Cruise Control (DeeP-LCC), which bypasses the need to identify the driving behaviors of surrounding vehicles and directly relies on measurable traffic data to achieve safe and optimal CAV control in mixed traffic. We present the implementation details of DeeP-LCC, including data collection, equilibrium estimation, and control execution. Based on a miniature experiment platform, we reproduce the phenomenon of stop-and-go waves in two typical traffic scenarios: 1) open straight-road scenario under external disturbances and 2) closed ring-road scenario with no bottlenecks. Our experiments clearly demonstrate that DeeP-LCC enables one or a few CAVs to dissipate the traffic waves in both traffic scenarios. These experimental findings validate the great potential of DeeP-LCC in smoothing practical traffic flow in the presence of noisy data, uncertain low-level vehicle dynamics, and communication and computation delays. The code and videos of our experimental results are available at https://github.com/soc-ucsd/DeeP-LCC.
Submitted 23 November, 2022; v1 submitted 7 April, 2022;
originally announced April 2022.
-
DeeP-LCC: Data-EnablEd Predictive Leading Cruise Control in Mixed Traffic Flow
Authors:
Jiawei Wang,
Yang Zheng,
Keqiang Li,
Qing Xu
Abstract:
For the control of connected and autonomous vehicles (CAVs), most existing methods focus on model-based strategies. They require explicit knowledge of the car-following dynamics of human-driven vehicles, which are non-trivial to identify accurately. In this paper, instead of relying on a parametric car-following model, we introduce a data-driven non-parametric strategy, called DeeP-LCC (Data-EnablEd Predictive Leading Cruise Control), to achieve safe and optimal control of CAVs in mixed traffic. We first utilize Willems' fundamental lemma to obtain a data-centric representation of mixed traffic behavior. This is justified by rigorous analysis of the controllability and observability properties of mixed traffic. We then employ a receding horizon strategy to solve a finite-horizon optimal control problem at each time step, in which input/output constraints are incorporated for collision-free guarantees. Numerical experiments validate the performance of DeeP-LCC compared to a standard predictive controller that requires an accurate model. Multiple nonlinear traffic simulations further confirm its great potential in improving traffic efficiency, driving safety, and fuel economy.
Submitted 16 January, 2023; v1 submitted 20 March, 2022;
originally announced March 2022.
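The DeeP-LCC abstract above relies on Willems' fundamental lemma to obtain a data-centric representation of system behavior. A minimal sketch of the underlying idea follows, assuming a toy first-order linear system rather than any traffic model: if the recorded input is persistently exciting, every length-L input/output trajectory of the system lies in the column space of a Hankel matrix built from one recorded trajectory. The helper names `hankel_matrix` and `simulate`, the system coefficient, and all dimensions are hypothetical choices for illustration.

```python
import numpy as np


def hankel_matrix(signal, depth):
    """Stack all length-`depth` windows of `signal` as columns."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim == 1:
        signal = signal[:, None]  # treat a 1-D signal as T samples of 1 channel
    n_cols = signal.shape[0] - depth + 1
    if n_cols < 1:
        raise ValueError("signal too short for the requested depth")
    return np.column_stack(
        [signal[i:i + depth].reshape(-1) for i in range(n_cols)]
    )


def simulate(u, x0=0.0, a=0.9):
    """Toy system (an assumption): x[k+1] = a*x[k] + u[k], y[k] = x[k]."""
    y = np.empty(len(u))
    x = x0
    for k, uk in enumerate(u):
        y[k] = x
        x = a * x + uk
    return y


rng = np.random.default_rng(1)
n, L, T = 1, 6, 50  # state order, trajectory length, recorded samples

# Offline: record one trajectory with a random (persistently exciting) input.
u_data = rng.standard_normal(T)
y_data = simulate(u_data)
H = np.vstack([hankel_matrix(u_data, L), hankel_matrix(y_data, L)])

# Online: a fresh trajectory from a different initial state should be
# reproducible as H @ g for some g, without ever identifying `a`.
u_new = rng.standard_normal(L)
y_new = simulate(u_new, x0=0.3)
w_new = np.concatenate([u_new, y_new])
g, *_ = np.linalg.lstsq(H, w_new, rcond=None)
residual = np.linalg.norm(H @ g - w_new)
print(f"residual of data-driven representation: {residual:.2e}")
```

In a predictive controller built on this representation, the vector `g` becomes a decision variable and the lower rows of the Hankel matrix predict future outputs; that optimization step is omitted here.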
-
To what extent can Plug-and-Play methods outperform neural networks alone in low-dose CT reconstruction
Authors:
Qifan Xu,
Qihui Lyu,
Dan Ruan,
Ke Sheng
Abstract:
The Plug-and-Play (PnP) framework was recently introduced for low-dose CT reconstruction to leverage the interpretability and the flexibility of model-based methods to incorporate various plugins, such as trained deep learning (DL) neural networks. However, the benefits of PnP vs. state-of-the-art DL methods have not been clearly demonstrated. In this work, we propose an improved PnP framework to address the previous limitations and develop clinically relevant segmentation metrics for quantitative result assessment. Compared with the DL-alone methods, our proposed PnP framework was slightly inferior in MSE and PSNR. However, the power spectrum of the resulting images better matched that of full-dose images than that of DL denoised images. The resulting images supported higher accuracy in airway segmentation than DL denoised images for all ten patients in the test set, more substantially on the airways with a cross-section smaller than 0.61 cm$^2$, and outperformed the DL denoised images for 45 out of 50 lung lobes in lobar segmentation. Our PnP method proved to be significantly better at preserving the image texture, which translated to task-specific benefits in automated structure segmentation and detection.
Submitted 14 February, 2022;
originally announced February 2022.
-
DCSAU-Net: A Deeper and More Compact Split-Attention U-Net for Medical Image Segmentation
Authors:
Qing Xu,
Zhicheng Ma,
Na HE,
Wenting Duan
Abstract:
Deep learning architectures with convolutional neural networks (CNNs) have achieved outstanding success in the field of computer vision. U-Net, an encoder-decoder architecture built from CNNs, has made a great breakthrough in biomedical image segmentation and has been applied in a wide range of practical scenarios. However, the identical design of every downsampling layer in the encoder and the simply stacked convolutions do not allow U-Net to extract sufficient feature information from different depths. The increasing complexity of medical images brings new challenges to the existing methods. In this paper, we propose a deeper and more compact split-attention u-shape network (DCSAU-Net), which efficiently utilises low-level and high-level semantic information based on two novel frameworks: primary feature conservation and compact split-attention block. We evaluate the proposed model on CVC-ClinicDB, 2018 Data Science Bowl, ISIC-2018 and SegPC-2021 datasets. As a result, DCSAU-Net displays better performance than other state-of-the-art (SOTA) methods in terms of the mean Intersection over Union (mIoU) and F1-score. More significantly, the proposed model demonstrates excellent segmentation performance on challenging images. The code for our work and more technical details can be found at https://github.com/xq141839/DCSAU-Net.
Submitted 24 September, 2022; v1 submitted 2 February, 2022;
originally announced February 2022.
-
Multi-Mode Spatial Signal Processor with Rainbow-like Fast Beam Training and Wideband Communications using True-Time-Delay Arrays
Authors:
Chung-Ching Lin,
Chase Puglisi,
Veljko Boljanovic,
Han Yan,
Erfan Ghaderi,
Jayce Gaddis,
Qiuyan Xu,
Sreeni Poolakkal,
Danijela Cabric,
Subhanshu Gupta
Abstract:
Initial access in millimeter-wave (mmW) wireless is critical toward successful realization of the fifth-generation (5G) wireless networks and beyond. Limited bandwidth in existing standards and use of phase-shifters in analog/hybrid phased-antenna arrays (PAA) are not suited for these emerging standards demanding low-latency direction finding. This work proposes a reconfigurable true-time-delay (TTD) based spatial signal processor (SSP) with a frequency-division beam training methodology and beam-squint-less wideband data communications. A discrete-time delay-compensated clocking technique is used to support 800 MHz bandwidth with a large unity-gain bandwidth ring-amplifier (RAMP)-based signal combiner. To extensively characterize the proposed SSP across different SSP modes and frequency-angle pairs, an automated testbed is developed using computer-vision techniques that significantly speeds up the testing process and minimizes possible human errors. Using seven levels of time-interleaving for each of the 4 antenna elements, the TTD SSP has a delay range of 3.8 ns over 800 MHz and achieves unique frequency-to-angle mapping in the beam-training mode with nearly 12 dB frequency-independent gain in the beamforming mode. The SSP is prototyped in 65 nm CMOS with an area of 1.98 mm$^2$, consuming only 29 mW excluding buffers. Further, an error vector magnitude (EVM) of 9.8% is realized for 16-QAM modulation at a speed of 122.8 Mb/s.
Submitted 7 January, 2022;
originally announced January 2022.
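The abstract above contrasts phase-shifter arrays with true-time-delay (TTD) arrays and reports frequency-independent gain in beamforming mode. The following sketch illustrates that contrast for an idealized 4-element uniform linear array; it is not a model of the reported chip. The carrier frequency (28 GHz), half-wavelength spacing, 30 degree steering angle, and the function name `combined_gain` are assumptions made for this example; only the element count and the 800 MHz bandwidth are taken from the abstract.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def combined_gain(freq_hz, theta_rad, theta0_rad, fc_hz, n_elem, use_ttd):
    """Magnitude of the combined output for a unit plane wave from theta_rad.

    The array is steered to theta0_rad either with true time delays
    (compensating phase scales with frequency) or with phase shifters
    (compensating phase fixed at its carrier-frequency value).
    """
    idx = np.arange(n_elem)
    spacing = C / (2.0 * fc_hz)  # half wavelength at the carrier (assumed)
    tau_in = idx * spacing * np.sin(theta_rad) / C      # arrival delays
    tau_steer = idx * spacing * np.sin(theta0_rad) / C  # steering delays
    arriving = np.exp(-2j * np.pi * freq_hz * tau_in)
    comp_freq = freq_hz if use_ttd else fc_hz
    compensation = np.exp(2j * np.pi * comp_freq * tau_steer)
    return np.abs(np.sum(arriving * compensation))


fc = 28e9                # hypothetical carrier frequency
edge = fc + 400e6        # upper edge of an 800 MHz band
theta0 = np.deg2rad(30)  # hypothetical steering angle

gain_ttd_edge = combined_gain(edge, theta0, theta0, fc, 4, use_ttd=True)
gain_ps_center = combined_gain(fc, theta0, theta0, fc, 4, use_ttd=False)
gain_ps_edge = combined_gain(edge, theta0, theta0, fc, 4, use_ttd=False)

print(f"TTD gain at band edge:           {20 * np.log10(gain_ttd_edge):.3f} dB")
print(f"Phase-shifter gain at carrier:   {20 * np.log10(gain_ps_center):.3f} dB")
print(f"Phase-shifter gain at band edge: {20 * np.log10(gain_ps_edge):.3f} dB")
```

Ideal coherent combining of 4 elements gives 20 log10(4), about 12.04 dB, in line with the "nearly 12 dB" figure quoted above. Under these assumed parameters the phase-shifter loss at the band edge is small; it grows with array size, fractional bandwidth, and steering angle.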
-
Experimental Validation of Multi-lane Formation Control for Connected and Automated Vehicles in Multiple Scenarios
Authors:
Mengchi Cai,
Qing Xu,
Chunying Yang,
Jianghong Dong,
Chaoyi Chen,
Jiawei Wang,
Jianqiang Wang,
Keqiang Li
Abstract:
Formation control methods for connected and automated vehicles have been proposed to smoothly switch the structure of vehicular formations in different scenarios. In previous research, simulations were often conducted to verify the performance of formation control methods. This paper presents the experimental results of multi-lane formation control for connected and automated vehicles. The coordinated formation control framework and the specific methods utilized for different scenarios are introduced. The details of the experimental platform and the vehicle control strategy are provided. Simulations and experiments are conducted in different scenarios, and the results indicate that the formation control method is applicable to multiple traffic scenarios and able to improve formation-structure-switching efficiency compared with benchmark methods.
Submitted 1 December, 2021;
originally announced December 2021.
-
Wideband Beamforming with Rainbow Beam Training using Reconfigurable True-Time-Delay Arrays for Millimeter-Wave Wireless
Authors:
Chung-Ching Lin,
Veljko Boljanovic,
Han Yan,
Erfan Ghaderi,
Mohammad Ali Mokri,
Jayce Jeron Gaddis,
Aditya Wadaskar,
Chase Puglisi,
Soumen Mohapatra,
Qiuyan Xu,
Sreeni Poolakkal,
Deukhyoun Heo,
Subhanshu Gupta,
Danijela Cabric
Abstract:
Research over the past decade in integrated true-time-delay arrays has seen organic growth, enabling the realization of wideband beamformers for large arrays with wide aperture widths. This article introduces highly reconfigurable delay elements, implementable at analog or digital baseband, that enable multiple SSP functions including wideband beamforming, wideband interference cancellation, and fast beam training. Details of the beam-training algorithm, system design considerations, system architecture, and circuits with large delay range-to-resolution ratios are presented, leveraging integrated delay compensation techniques. The article lays out the framework for true-time-delay based arrays in next-generation network infrastructure supporting 3D beam training in planar arrays, low-latency massive multiple access, and emerging wireless communications standards.
Submitted 30 November, 2021;
originally announced November 2021.