arXiv:2406.18536 [pdf, other]

Reliable Interval Prediction of Minimum Operating Voltage Based on On-chip Monitors via Conformalized Quantile Regression

Authors: Yuxuan Yin, Xiaoxiao Wang, Rebecca Chen, Chen He, Peng Li

Abstract: Predicting the minimum operating voltage ($V_{min}$) of chips is one of the important techniques for improving the manufacturing testing flow, as well as ensuring the long-term reliability and safety of in-field systems. Current $V_{min}$ prediction methods often provide only point estimates, necessitating additional techniques for constructing prediction confidence intervals to cover uncertainties caused by different sources of variations. While some existing techniques offer region predictions, they rely on certain distributional assumptions and/or provide no coverage guarantees. In response to these limitations, we propose a novel distribution-free $V_{min}$ interval estimation methodology possessing a theoretical guarantee of coverage. Our approach leverages conformalized quantile regression and on-chip monitors to generate reliable prediction intervals. We demonstrate the effectiveness of the proposed method on an industrial 5nm automotive chip dataset. Moreover, we show that the use of on-chip monitors can reduce the interval length significantly for $V_{min}$ prediction.

Submitted 3 May, 2024; originally announced June 2024.

Comments: Accepted by DATE 2024. Camera-ready version

arXiv:2406.15232 [pdf, ps, other]

Damping Wind Farm Resonances with Current Based Model Predictive Pulse Pattern Control

Authors: Orcun Karaca, Ioannis Tsoumas, Tinus Dorfling, Ran Chen, Lennart Harnefors

Abstract: It is well-established that a proportional current control gain emulates a resistor in the converter output impedance. Even though this resistance can provide additional damping to grid resonances, its effect for traditional linear current controllers is known to be rather limited. Moreover, for medium-voltage systems, high switching frequencies are not an option due to the high switching losses. To meet the harmonic standards, it is expedient to use optimized pulse patterns. This further exacerbates the problems with the resistance of classical controllers, since an additional filtering would be required so that the current controller acts only on the fundamental component (and not on the ripple component). Such a design limits the damping effect not only in its amplitude but also in the frequency range where it is active. This paper shows that a high-bandwidth current-based model predictive pulse pattern controller can alleviate these limitations. The pulse pattern control approach can achieve a high gain even at low switching frequencies, while controlling directly the instantaneous currents (i.e., the fundamental component and the ripple together). With a fast implementation cycle, the frequency range where this damping effect is active can be further extended. Numerical studies showcase these benefits for a multi-phase medium-voltage wind power conversion system.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.15222 [pdf]

Rapid and Accurate Diagnosis of Acute Aortic Syndrome using Non-contrast CT: A Large-scale, Retrospective, Multi-center and AI-based Study

Authors: Yujian Hu, Yilang Xiang, Yan-Jie Zhou, Yangyan He, Shifeng Yang, Xiaolong Du, Chunlan Den, Youyao Xu, Gaofeng Wang, Zhengyao Ding, Jingyong Huang, Wenjun Zhao, Xuejun Wu, Donglin Li, Qianqian Zhu, Zhenjiang Li, Chenyang Qiu, Ziheng Wu, Yunjun He, Chen Tian, Yihui Qiu, Zuodong Lin, Xiaolong Zhang, Yuan He, Zhenpeng Yuan, et al. (15 additional authors not shown)

Abstract: Chest pain symptoms are highly prevalent in emergency departments (EDs), where acute aortic syndrome (AAS) is a catastrophic cardiovascular emergency with a high fatality rate, especially when timely and accurate treatment is not administered. However, current triage practices in the ED can cause up to approximately half of patients with AAS to have an initially missed diagnosis or be misdiagnosed as having other acute chest pain conditions. Subsequently, these AAS patients will undergo clinically inaccurate or suboptimal differential diagnosis. Fortunately, even under these suboptimal protocols, nearly all these patients underwent non-contrast CT covering the aorta anatomy at the early stage of differential diagnosis. In this study, we developed an artificial intelligence model (DeepAAS) using non-contrast CT, which is highly accurate for identifying AAS and provides interpretable results to assist in clinical decision-making. Performance was assessed in two major phases: a multi-center retrospective study (n = 20,750) and an exploration in real-world emergency scenarios (n = 137,525). In the multi-center cohort, DeepAAS achieved a mean area under the receiver operating characteristic curve of 0.958 (95% CI 0.950-0.967). In the real-world cohort, DeepAAS detected 109 AAS patients with misguided initial suspicion, achieving 92.6% (95% CI 76.2%-97.5%) in mean sensitivity and 99.2% (95% CI 99.1%-99.3%) in mean specificity.
Our AI model performed well on non-contrast CT at all applicable early stages of differential diagnosis workflows, effectively reduced the overall missed diagnosis and misdiagnosis rate from 48.8% to 4.8% and shortened the diagnosis time for patients with misguided initial suspicion from an average of 681.8 (74-11,820) mins to 68.5 (23-195) mins. DeepAAS could effectively fill the gap in the current clinical workflow without requiring additional tests.

Submitted 24 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: under peer review

arXiv:2406.11175 [pdf, other]

SMRU: Split-and-Merge Recurrent-based UNet for Acoustic Echo Cancellation and Noise Suppression

Authors: Zhihang Sun, Andong Li, Rilin Chen, Hao Zhang, Meng Yu, Yi Zhou, Dong Yu

Abstract: The proliferation of deep neural networks has spawned the rapid development of acoustic echo cancellation and noise suppression, and plenty of prior arts have been proposed, which yield promising performance. Nevertheless, they rarely consider the deployment generality in different processing scenarios, such as edge devices and cloud processing. To this end, this paper proposes a general model, termed SMRU, to cover different application scenarios. The novelty is two-fold. First, a multi-scale band split layer and band merge layer are proposed to effectively fuse local frequency bands for lower complexity modeling. Besides, by simulating the multi-resolution feature modeling characteristic of the classical UNet structure, a novel recurrent-dominated UNet is devised. It consists of multiple variable frame rate blocks, each of which involves the causal time down-/up-sampling layer with varying compression ratios and the dual-path structure for inter- and intra-band modeling. The model is configured from 50 M/s to 6.8 G/s in terms of MACs, and the experimental results show that the proposed approach yields competitive or even better performance over existing baselines, and has the full potential to adapt to more general scenarios with varying complexity requirements.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2406.05325 [pdf, other]

LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance

Authors: Shihao Chen, Yu Gu, Jie Zhang, Na Li, Rilin Chen, Liping Chen, Lirong Dai

Abstract: Any-to-any singing voice conversion (SVC) is an interesting audio editing technique, aiming to convert the singing voice of one singer into that of another, given only a few seconds of singing data. However, during the conversion process, the issue of timbre leakage is inevitable: the converted singing voice still sounds like the original singer's voice. To tackle this, we propose a latent diffusion model for SVC (LDM-SVC) in this work, which attempts to perform SVC in the latent space using an LDM. We pretrain a variational autoencoder structure using the noted open-source So-VITS-SVC project based on the VITS framework, which is then used for the LDM training. Besides, we propose a singer guidance training method based on classifier-free guidance to further suppress the timbre of the original singer. Experimental results show the superiority of the proposed method over previous works in both subjective and objective evaluations of timbre similarity.

Submitted 7 June, 2024; originally announced June 2024.

Comments: Accepted by Interspeech 2024

arXiv:2406.03882 [pdf, other]

Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models

Authors: Ziyun Cui, Chang Lei, Wen Wu, Yinan Duan, Diyang Qu, Ji Wu, Runsen Chen, Chao Zhang

Abstract: The early detection of suicide risk is important since it enables the intervention to prevent potential suicide attempts. This paper studies the automatic detection of suicide risk based on spontaneous speech from adolescents, and collects a Mandarin dataset with 15 hours of suicide speech from more than a thousand adolescents aged from ten to eighteen for our experiments. To leverage the diverse acoustic and linguistic features embedded in spontaneous speech, both the Whisper speech model and textual large language models (LLMs) are used for suicide risk detection. Both all-parameter finetuning and parameter-efficient finetuning approaches are used to adapt the pre-trained models for suicide risk detection, and multiple audio-text fusion approaches are evaluated to combine the representations of Whisper and the LLM. The proposed system achieves a detection accuracy of 0.807 and an F1-score of 0.846 on the test set with 119 subjects, indicating promising potential for real suicide risk detection applications.

Submitted 6 June, 2024; originally announced June 2024.

Comments: Accepted by Interspeech 2024

arXiv:2406.01993 [pdf]

Choroidal Vessel Segmentation on Indocyanine Green Angiography Images via Human-in-the-Loop Labeling

Authors: Ruoyu Chen, Ziwei Zhao, Mayinuer Yusufu, Xianwen Shang, Danli Shi, Mingguang He

Abstract: Human-in-the-loop (HITL) strategy has been recently introduced into the field of medical image processing. Indocyanine green angiography (ICGA) stands as a well-established examination for visualizing choroidal vasculature and detecting chorioretinal diseases. However, the intricate nature of choroidal vascular networks makes large-scale manual segmentation of ICGA images challenging. Thus, the study aims to develop a high-precision choroidal vessel segmentation model with limited labor using HITL framework. We utilized a multi-source ICGA dataset, including 55 degree view and ultra-widefield ICGA (UWF-ICGA) images for model development. The choroidal vessel network was pre-segmented by a pre-trained vessel segmentation model, and then manually modified by two ophthalmologists. Choroidal vascular diameter, density, complexity, tortuosity, and branching angle were automatically quantified based on the segmentation. We finally conducted four cycles of HITL. One hundred and fifty 55 degree view ICGA images were used for the first three cycles (50 images per cycle), and twenty UWF-ICGA images for the last cycle. The average time needed to manually correct a pre-segmented ICGA image per cycle reduced from 20 minutes to 1 minute. High segmentation accuracy has been achieved on both 55 degree view ICGA and UWF-ICGA images. Additionally, the multi-dimensional choroidal vascular parameters were significantly associated with various chorioretinal diseases.
Our study not only demonstrated the feasibility of the HITL strategy in improving segmentation performance with reduced manual labeling, but also innovatively introduced several risk predictors for choroidal abnormalities.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 25 pages, 4 figures

arXiv:2405.11380 [pdf, other]

Meta-Control: Automatic Model-based Control Synthesis for Heterogeneous Robot Skills

Authors: Tianhao Wei, Liqian Ma, Rui Chen, Weiye Zhao, Changliu Liu

Abstract: The requirements for real-world manipulation tasks are diverse and often conflicting; some tasks require precise motion while others require force compliance; some tasks require avoidance of certain regions, while others require convergence to certain states. Satisfying these varied requirements with a fixed state-action representation and control strategy is challenging, impeding the development of a universal robotic foundation model. In this work, we propose Meta-Control, the first LLM-enabled automatic control synthesis approach that creates customized state representations and control strategies tailored to specific tasks. Our core insight is that a meta-control system can be built to automate the thought process that human experts use to design control systems. Specifically, human experts heavily use a model-based, hierarchical (from abstract to concrete) thought model, then compose various dynamic models and controllers together to form a control system. Meta-Control mimics the thought model and harnesses LLM's extensive control knowledge with Socrates' "art of midwifery" to automate the thought process. Meta-Control stands out for its fully model-based nature, allowing rigorous analysis, generalizability, robustness, efficient parameter tuning, and reliable real-time execution.

Submitted 7 June, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

arXiv:2403.14968 [pdf, other]

Real-time Safety Index Adaptation for Parameter-varying Systems via Determinant Gradient Ascend

Authors: Rui Chen, Weiye Zhao, Ruixuan Liu, Weiyang Zhang, Changliu Liu

Abstract: Safety Index Synthesis (SIS) is critical for deriving safe control laws. Recent works propose to synthesize a safety index (SI) via nonlinear programming and derive a safe control law such that the system 1) achieves forward invariant (FI) with some safe set and 2) guarantees finite time convergence (FTC) to that safe set. However, real-world system dynamics can vary during run-time, making the control law infeasible and invalidating the initial SI. Since the full SIS nonlinear programming is computationally expensive, it is infeasible to re-synthesize the SI each time the dynamics are perturbed. To address that, this paper proposes an efficient approach to adapting the SI to varying system dynamics and maintaining the feasibility of the safe control law. The proposed method leverages determinant gradient ascend and derives a closed-form update to safety index parameters, enabling real-time adaptation performance. A numerical study validates the effectiveness of our approach.

Submitted 22 March, 2024; originally announced March 2024.

Comments: Accepted to American Control Conference (ACC) 2024

arXiv:2403.06066 [pdf]

CausalCellSegmenter: Causal Inference inspired Diversified Aggregation Convolution for Pathology Image Segmentation

Authors: Dawei Fan, Yifan Gao, Jiaming Yu, Yanping Chen, Wencheng Li, Chuancong Lin, Kaibin Li, Changcai Yang, Riqing Chen, Lifang Wei

Abstract: Deep learning models have shown promising performance for cell nucleus segmentation in the field of pathology image analysis. However, training a robust model from multiple domains remains a great challenge for cell nucleus segmentation. Additionally, background noise, highly overlapping cell nuclei, and blurred edges often lead to poor performance. To address these challenges, we propose a novel framework termed CausalCellSegmenter, which combines Causal Inference Module (CIM) with Diversified Aggregation Convolution (DAC) techniques. The DAC module is designed to incorporate diverse downsampling features through a simple, parameter-free attention module (SimAM), aiming to overcome the problems of false-positive identification and edge blurring. Furthermore, we introduce CIM to leverage sample weighting by directly removing the spurious correlations between features for every input sample and concentrating more on the correlation between features and labels. Extensive experiments on the MoNuSeg-2018 dataset achieve promising results, outperforming other state-of-the-art methods, with the mIoU and DSC scores growing by 3.6% and 2.65%.

Submitted 9 March, 2024; originally announced March 2024.

Comments: 10 pages, 5 figures, 2 tables, MICCAI

arXiv:2402.09735 [pdf, other]

DFORM: Diffeomorphic vector field alignment for assessing dynamics across learned models

Authors: Ruiqi Chen, Giacomo Vedovati, Todd Braver, ShiNung Ching

Abstract: Dynamical system models such as Recurrent Neural Networks (RNNs) have become increasingly popular as hypothesis-generating tools in scientific research. Evaluating the dynamics in such networks is key to understanding their learned generative mechanisms. However, comparison of learned dynamics across models is challenging due to their inherent nonlinearity and because a priori there is no enforced equivalence of their coordinate systems. Here, we propose the DFORM (Diffeomorphic vector field alignment for comparing dynamics across learned models) framework. DFORM learns a nonlinear coordinate transformation which provides a continuous, maximally one-to-one mapping between the trajectories of learned models, thus approximating a diffeomorphism between them. The mismatch between DFORM-transformed vector fields defines the orbital similarity between two models, thus providing a generalization of the concepts of smooth orbital and topological equivalence. As an example, we apply DFORM to models trained on a canonical neuroscience task, showing that learned dynamics may be functionally similar, despite overt differences in attractor landscapes.

Submitted 15 February, 2024; originally announced February 2024.

Comments: 12 pages, 8 figures

arXiv:2312.09439 [pdf, other]

Smart Roads: Roadside Perception, Vehicle-Road Cooperation and Business Model

Authors: Rui Chen, Lu Gao, Yutian Liu, Yong Liang Guan, Yan Zhang

Abstract: Smart roads have become an essential component of intelligent transportation systems (ITS). The roadside perception technology, a critical aspect of smart roads, utilizes various sensors, roadside units (RSUs), and edge computing devices to gather real-time traffic data for vehicle-road cooperation. However, the full potential of smart roads in improving the safety and efficiency of autonomous vehicles can only be realized through the mass deployment of roadside perception and communication devices. On the one hand, roadside devices require significant investment but can currently only achieve a monitoring function, resulting in no profitability for investors. On the other hand, drivers lack trust in the safety of autonomous driving technology, making it difficult to promote large-scale commercial applications. To deal with the dilemma of mass deployment, we propose a novel smart-road vehicle-guiding architecture for vehicle-road cooperative autonomous driving, based on which we then propose the corresponding business model and analyze its benefits from both operator and driver perspectives. The numerical simulations validate that our proposed smart road solution can enhance driving safety and traffic efficiency. Moreover, we utilize the cost-benefit analysis (CBA) model to assess the economic advantages of the proposed business model, which indicates that a smart highway that can provide vehicle-guided-driving services for autonomous vehicles yields more profit than a regular highway.

Submitted 19 October, 2023; originally announced December 2023.

arXiv:2312.07864 [pdf, other]

MMSE Design of RIS-aided Communications

Authors: Wen-Xuan Long, Marco Moretti, Andrea Abrardo, Luca Sanguinetti, Rui Chen

Abstract: Consider a communication system in which a single antenna user equipment exchanges information with a multi-antenna base station via a reconfigurable intelligent surface (RIS) in the presence of spatially correlated channels and electromagnetic interference (EMI). To exploit the attractive advantages of RIS technology, accurate configuration of its reflecting elements is crucial. In this paper, we use statistical knowledge of channels and EMI to optimize the RIS elements for i) accurate channel estimation and ii) reliable data transmission. In both cases, our goal is to determine the RIS coefficients that minimize the mean square error, resulting in the formulation of two non-convex problems that share the same structure. To solve these two problems, we present an alternating optimization approach that reliably converges to a locally optimal solution. The incorporation of the diagonally scaled steepest descent algorithm, derived from Newton's method, ensures fast convergence with manageable complexity. Numerical results demonstrate the effectiveness of the proposed method under various propagation conditions. Notably, it shows significant advantages over existing alternatives that depend on a sub-optimal configuration of the RIS and are derived on the basis of different criteria.

Submitted 12 December, 2023; originally announced December 2023.

Comments: 13 pages, 10 figures

arXiv:2311.14515 [pdf, other]

On RIS-Aided SIMO Gaussian Channels: Towards A Single-RF MIMO Transceiver Architecture

Authors: Ru-Han Chen, Jing Zhou, Yonggang Zhu, Kai Zhang

Abstract: In this paper, for a single-input multiple-output (SIMO) system aided by a passive reconfigurable intelligent surface (RIS), the joint transmission accomplished by the single transmit antenna and the RIS with multiple controllable reflective elements is considered. Relying on a general capacity upper bound derived by a maximum-trace argument, we respectively characterize the capacity of such a channel in the low-SNR or the rank-one regimes, in which the optimal configuration of the RIS is proved to be beamforming with carefully-chosen phase shifts. To exploit the potential of modulating extra information on the RIS, based on the QR decomposition, successive interference cancellation, and a strategy named \textit{partially beamforming and partially information-carrying}, we propose a novel transceiver architecture with only a single RF front end at the transmitter, by which the considered channel can be regarded as a concatenation of a vector Gaussian channel and several phase-modulated channels. Especially, we investigate a class of vector Gaussian channels with a hypersphere input support constraint, and not only generalize the existing result to arbitrary-dimensional real spaces but also present its high-order capacity asymptotics, by which both capacities of hypersphere-constrained channels and achievable rates of the proposed transceiver with two different signaling schemes can be well-approximated.
Information-theoretic analyses show that the transceiver architecture designed for the SIMO channel has a boosted multiplexing gain, rather than one for the conventionally-used optimized beamforming scheme. Numerical results verify our derived asymptotics and show notable superiority of the proposed transceiver.

Submitted 24 November, 2023; originally announced November 2023.

Comments: A Shortened version is submitted to IEEE journal

arXiv:2311.09019 [pdf, ps, other]

Closed-Loop Identification of Stabilized Models Using Dual Input-Output Parameterization

Authors: Ran Chen, Amber Srivastava, Mingzhou Yin, Roy S. Smith

Abstract: This paper introduces a dual input-output parameterization (dual IOP) for the identification of linear time-invariant systems from closed-loop data. It draws inspiration from the recent input-output parameterization developed to synthesize a stabilizing controller. The controller is parameterized in terms of closed-loop transfer functions, from the external disturbances to the input and output of the system, constrained to lie in a given subspace. Analogously, the dual IOP method parameterizes the unknown plant with analogous closed-loop transfer functions, also referred to as dual parameters. In this case, these closed-loop transfer functions are constrained to lie in an affine subspace guaranteeing that the identified plant is \emph{stabilized} by the known controller. Compared with existing closed-loop identification techniques guaranteeing closed-loop stability, such as the dual Youla parameterization, the dual IOP neither requires a doubly-coprime factorization of the controller nor a nominal plant that is stabilized by the controller. The dual IOP does not depend on the order and the state-space realization of the controller either, as in the dual system-level parameterization. Simulation shows that the dual IOP outperforms the existing benchmark methods.

Submitted 15 November, 2023; originally announced November 2023.

arXiv:2311.08415 [pdf]

Scanning phase imaging without accurate positioning system

Authors: Tao Liu, Bingyang Wang, JiangTao Zhao, Fu rong Chen, Fucai Zhang

Abstract: Ptychography, a high-resolution phase imaging technique using precise in-plane translation information, has been widely applied in modern synchrotron radiation sources across the globe. A key requirement for successful ptychographic reconstruction is the precise knowledge of the scanning positions, which are typically obtained by a physical interferometric positioning system. However, high-throughput positioning poses an engineering challenge, especially at the nanoscale or smaller. In this work, we propose a novel scanning imaging framework that does not require any prior position information from the positioning system. Specifically, our scheme utilizes the wavefront modulation mechanism to reconstruct the object functions at each scan position and the shared illumination function simultaneously. The scanning trajectory information is extracted by our subpixel image registration algorithm from the overlap region of reconstructed object functions. Then, a complete object function can be obtained by assembling each part of the reconstructed sample functions. High-quality imaging of a biological sample and position recovery with sub-pixel accuracy are demonstrated in a proof-of-concept experiment. Based on current results, we find it may have great potential applications in high-resolution and high-throughput phase imaging.

Submitted 31 October, 2023; originally announced November 2023.

Comments: 9 pages,4 figures

arXiv:2311.02865 [pdf, other]

Geometrically-Shaped Constellation for Visible Light Communications at Short Blocklength

Authors: Jia-Ning Guo, Ru-Han Chen, Jian Zhang, Longguang Li, Xu Yang, Jing Zhou

Abstract: In this paper, we present a general framework for designing geometrically shaped constellations for short-packet visible light communications with peak- and average-intensity constraints. By leveraging tools from large deviation theory, we first characterize the second-order asymptotics of the optimal constellation shaping region under the aforementioned intensity constraints, which serves as a good performance measure for the best geometric shaping in finite blocklength. To further incorporate a sufficiently large coding gain and a nearly-maximum shaping gain, we construct multidimensional constellations by the nested structure of Construction B lattices, where the constellation shaping is implemented by controlling the boundary of the embedded sublattice, i.e., a strategy called coarsely shaping and finely coding. Fast algorithms for constellation mapping and demodulation are presented as well. As an illustrative example, we present an energy-efficient $24$-dimensional constellation design based on the Leech lattice, whose superiority over existing constellation designs is verified by numerical results.

Submitted 28 April, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

arXiv:2311.01781 [pdf, other]

Passive Handwriting Tracking via Weak mmWave Communication Signals

Authors: Chao Yu, Yan Luo, Renqi Chen, Rui Wang

Abstract: In this letter, a cooperative sensing framework based on millimeter wave (mmWave) communication systems is proposed to detect tiny motions with a millimeter-level resolution. Particularly, the cooperative sensing framework is facilitated with one transmitter and two receivers. There are two radio frequency (RF) chains at each receiver. Hence, the Doppler effect due to the tiny motions can be detected via passive sensing respectively at the receivers, and the velocities of the motions can be estimated by integrating the Doppler frequencies. It is demonstrated that the proposed cooperative sensing system is able to track the handwriting with 90% error below 6 mm. Moreover, the proposed cooperative sensing is robust to the strength of received signal. For example, it works even without the line-of-sight paths from the transmitter to the receivers or the sensing target, where the received signal strength is not sufficient for timing synchronization or demodulation.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2311.01003 [pdf, other]

Minimum Snap Trajectory Generation and Control for an Under-actuated Flapping Wing Aerial Vehicle

Authors: Chen Qian, Rui Chen, Peiyao Shen, Yongchun Fang, Jifu Yan, Tiefeng Li

Abstract: This paper presents both the trajectory generation and tracking control strategies for an underactuated flapping wing aerial vehicle (FWAV). First, the FWAV dynamics are analyzed from a practical perspective. Then, based on these analyses, we demonstrate the differential flatness of the FWAV system and develop a general-purpose trajectory generation strategy. Subsequently, the trajectory tracking controller is developed with the help of robust control and switch control techniques. After that, the overall system asymptotic stability is guaranteed by Lyapunov stability analysis. To make the controller applicable in real flight, we also provide several instructions. Finally, a series of experimental results demonstrate the successful implementation of the proposed trajectory generation strategy and tracking control strategy. This work is the first to achieve the closed-loop integration of trajectory generation and control for real 3-dimensional flight of an underactuated FWAV at a practical level.

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.18498 [pdf, ps, other]

GPT-4 Vision on Medical Image Classification -- A Case Study on COVID-19 Dataset

Authors: Ruibo Chen, Tianyi Xiong, Yihan Wu, Guodong Liu, Zhengmian Hu, Lichang Chen, Yanshuo Chen, Chenxi Liu, Heng Huang

Abstract: This technical report delves into the application of GPT-4 Vision (GPT-4V) in the nuanced realm of COVID-19 image classification, leveraging the transformative potential of in-context learning to enhance diagnostic processes.

Submitted 27 October, 2023; originally announced October 2023.

arXiv:2310.17974

FaultSeg Swin-UNETR: Transformer-Based Self-Supervised Pretraining Model for Fault Recognition

Authors: Zeren Zhang, Ran Chen, Jinwen Ma

Abstract: This paper introduces an approach to enhance seismic fault recognition through self-supervised pretraining. Seismic fault interpretation holds great significance in the fields of geophysics and geology. However, conventional methods for seismic fault recognition encounter various issues, including dependence on data quality and quantity, as well as susceptibility to interpreter subjectivity. Currently, automated fault recognition methods proposed based on small synthetic datasets experience performance degradation when applied to actual seismic data. To address these challenges, we have introduced the concept of self-supervised learning, utilizing a substantial amount of relatively easily obtainable unlabeled seismic data for pretraining. Specifically, we have employed the Swin Transformer model as the core network and employed the SimMIM pretraining task to capture unique features related to discontinuities in seismic data. During the fine-tuning phase, inspired by edge detection techniques, we have also refined the structure of the Swin-UNETR model, enabling multiscale decoding and fusion for more effective fault detection. Experimental results demonstrate that our proposed method attains state-of-the-art performance on the Thebe dataset, as measured by the OIS and ODS metrics.

Submitted 8 January, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

Comments: The logical flow and background of the article need significant revisions

arXiv:2310.03297 [pdf, other]

Passive Respiration Detection via mmWave Communication Signal Under Interference

Authors: Kehan Wu, Renqi Chen, Haiyu Wang, Chenqing Ji, Jiayuan Zhu, Guang Wu

Abstract: Recent research has highlighted the detection of human respiration rate using commodity WiFi devices. Nevertheless, these devices encounter challenges in accurately discerning human respiration amidst the prevailing human motion interference encountered in daily life. To tackle this predicament, this paper introduces a passive sensing and communication system designed specifically for respiration detection in the presence of robust human motion interference. Operating within the 60.48 GHz band, the proposed system aims to detect human respiration even when confronted with substantial human motion interference within close proximity. Subsequently, a neural network is trained on the data we collected to enable human respiration detection. The experimental results demonstrate a consistently high accuracy of over 90% for human respiration detection under interference, given an adequate sensing duration. Finally, an empirical model is derived analytically to achieve respiratory rate counting in 10 seconds.

Submitted 4 January, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: Submitted to WCNC2024 Workshop

arXiv:2309.15415 [pdf]

Formation Wing-Beat Modulation (FWM): A Tool for Quantifying Bird Flocks Using Radar Micro-Doppler Signals

Authors: Jiangkun Gong, Jun Yan, Deyong Kong, Ruizhi Chen, Deren Li

Abstract: Radar echoes from bird flocks contain modulation signals, which we find are produced by the flapping gaits of birds in the flock, resulting in a group of spectral peaks with similar amplitudes spaced at a specific interval. We call this the formation wing-beat modulation (FWM) effect. FWM signals are micro-Doppler modulated by flapping wings and are related to the bird number, wing-beat frequency, and flight phasing strategy. Our X-band radar data show that FWM signals exist in radar signals of a seagull flock, providing tools for quantifying the bird number and estimating the mean wingbeat rate of birds. This new finding could aid in research on the quantification of bird migration numbers and estimation of bird flight behavior in radar ornithology and aero-ecology.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.13456 [pdf, other]

An Optimal Control Framework for Influencing Human Driving Behavior in Mixed-Autonomy Traffic

Authors: Anirudh Chari, Rui Chen, Jaskaran Grover, Changliu Liu

Abstract: As autonomous vehicles (AVs) become increasingly prevalent, their interaction with human drivers presents a critical challenge. Current AVs lack social awareness, causing behavior that is often awkward or unsafe. To combat this, social AVs, which are proactive rather than reactive in their behavior, have been explored in recent years. With knowledge of robot-human interaction dynamics, a social AV can influence a human driver to exhibit desired behaviors by strategically altering its own behaviors. In this paper, we present a novel framework for achieving human influence. The foundation of our framework lies in an innovative use of control barrier functions to formulate the desired objectives of influence as constraints in an optimal control problem. The computed controls gradually push the system state toward satisfaction of the objectives, e.g. slowing the human down to some desired speed. We demonstrate the proposed framework's feasibility in a variety of scenarios related to car-following and lane changes, including multi-robot and multi-human configurations. In two case studies, we validate the framework's effectiveness when applied to the problems of traffic flow optimization and aggressive behavior mitigation. Given these results, the main contribution of our framework is its versatility in a wide spectrum of influence objectives and mixed-autonomy configurations.

Submitted 22 March, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

Comments: Accepted to American Control Conference (ACC) 2024

arXiv:2309.12406 [pdf, other]

Safety Index Synthesis with State-dependent Control Space

Authors: Rui Chen, Weiye Zhao, Changliu Liu

Abstract: This paper introduces an approach for synthesizing feasible safety indices to derive safe control laws under state-dependent control spaces. The problem, referred to as Safety Index Synthesis (SIS), is challenging because it requires the existence of feasible control input in all states and leads to an infinite number of constraints. The proposed method leverages Positivstellensatz to formulate SIS as a nonlinear programming (NP) problem. We formally prove that the NP solutions yield safe control laws with two imperative guarantees: forward invariance within user-defined safe regions and finite-time convergence to those regions. A numerical study validates the effectiveness of our approach.

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2309.09797 [pdf, other]

A Read Margin Enhancement Circuit with Dynamic Bias Optimization for MRAM

Authors: Renhe Chen, Albert Lee, Zirui Wang, Di Wu, Xufeng Kou

Abstract: This brief introduces a read bias circuit to improve readout yield of magnetic random access memories (MRAMs). A dynamic bias optimization (DBO) circuit is proposed to enable the real-time tracking of the optimal read voltage across process-voltage-temperature (PVT) variations within an MRAM array. It optimizes read performance by adjusting the read bias voltage dynamically for maximum sensing margin. Simulation results on a 28-nm 1Mb MRAM macro show that the tracking accuracy of the proposed DBO circuit remains above 90% even when the optimal sensing voltage varies up to 50%. Such dynamic tracking strategy further results in up to two orders of magnitude reduction in the bit error rate with respect to different variations, highlighting its effectiveness in enhancing MRAM performance and reliability.

Submitted 18 September, 2023; originally announced September 2023.

arXiv:2309.04672 [pdf, other]

SSHNN: Semi-Supervised Hybrid NAS Network for Echocardiographic Image Segmentation

Authors: Renqi Chen, Jingjing Luo, Fan Nian, Yuhui Cen, Yiheng Peng, Zekuan Yu

Abstract: Accurate medical image segmentation especially for echocardiographic images with unmissable noise requires elaborate network design. Compared with manual design, Neural Architecture Search (NAS) realizes better segmentation results due to larger search space and automatic optimization, but most of the existing methods are weak in layer-wise feature aggregation and adopt a "strong encoder, weak decoder" structure, insufficient to handle global relationships and local details. To resolve these issues, we propose a novel semi-supervised hybrid NAS network for accurate medical image segmentation termed SSHNN. In SSHNN, we creatively use convolution operation in layer-wise feature fusion instead of normalized scalars to avoid losing details, making NAS a stronger encoder. Moreover, Transformers are introduced for the compensation of global context and U-shaped decoder is designed to efficiently connect global context with local features. Specifically, we implement a semi-supervised algorithm Mean-Teacher to overcome the limited volume problem of labeled medical image dataset. Extensive experiments on CAMUS echocardiography dataset demonstrate that SSHNN outperforms state-of-the-art approaches and realizes accurate segmentation. Code will be made publicly available.

Submitted 27 December, 2023; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: Accepted by ICASSP2024

arXiv:2308.14553 [pdf, other]

Rep2wav: Noise Robust text-to-speech Using self-supervised representations

Authors: Qiushi Zhu, Yu Gu, Rilin Chen, Chao Weng, Yuchen Hu, Lirong Dai, Jie Zhang

Abstract: Benefiting from the development of deep learning, text-to-speech (TTS) techniques using clean speech have achieved significant performance improvements. The data collected from real scenes often contains noise and generally needs to be denoised by speech enhancement models. Noise-robust TTS models are often trained using the enhanced speech, which thus suffer from speech distortion and background noise that affect the quality of the synthesized speech. Meanwhile, it was shown that self-supervised pre-trained models exhibit excellent noise robustness on many speech tasks, implying that the learned representation has a better tolerance for noise perturbations. In this work, we therefore explore pre-trained models to improve the noise robustness of TTS models. Based on HiFi-GAN, we first propose a representation-to-waveform vocoder, which aims to learn to map the representation of pre-trained models to the waveform. We then propose a text-to-representation FastSpeech2 model, which aims to learn to map text to pre-trained model representations. Experimental results on the LJSpeech and LibriTTS datasets show that our method outperforms those using speech enhancement methods in both subjective and objective metrics. Audio samples are available at: https://zqs01.github.io/rep2wav.

Submitted 3 September, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

Comments: 5 pages,2 figures

arXiv:2308.13790 [pdf, other]

FFPN: Fourier Feature Pyramid Network for Ultrasound Image Segmentation

Authors: Chaoyu Chen, Xin Yang, Rusi Chen, Junxuan Yu, Liwei Du, Jian Wang, Xindi Hu, Yan Cao, Yingying Liu, Dong Ni

Abstract: Ultrasound (US) image segmentation is an active research area that requires real-time and highly accurate analysis in many scenarios. The detect-to-segment (DTS) frameworks have been recently proposed to balance accuracy and efficiency. However, existing approaches may suffer from inadequate contour encoding or fail to effectively leverage the encoded results. In this paper, we introduce a novel Fourier-anchor-based DTS framework called Fourier Feature Pyramid Network (FFPN) to address the aforementioned issues. The contributions of this paper are twofold. First, the FFPN utilizes Fourier Descriptors to adequately encode contours. Specifically, it maps Fourier series with similar amplitudes and frequencies into the same layer of the feature map, thereby effectively utilizing the encoded Fourier information. Second, we propose a Contour Sampling Refinement (CSR) module based on the contour proposals and refined features produced by the FFPN. This module extracts rich features around the predicted contours to further capture detailed information and refine the contours. Extensive experimental results on three large and challenging datasets demonstrate that our method outperforms other DTS methods in terms of accuracy and efficiency. Furthermore, our framework can generalize well to other detection or segmentation tasks.

Submitted 26 August, 2023; originally announced August 2023.

Comments: 10 pages, 5 figures, Accepted by MLMI 2023

arXiv:2308.08269 [pdf, other]

OnUVS: Online Feature Decoupling Framework for High-Fidelity Ultrasound Video Synthesis

Authors: Han Zhou, Dong Ni, Ao Chang, Xinrui Zhou, Rusi Chen, Yanlin Chen, Lian Liu, Jiamin Liang, Yuhao Huang, Tong Han, Zhe Liu, Deng-Ping Fan, Xin Yang

Abstract: Ultrasound (US) imaging is indispensable in clinical practice. To diagnose certain diseases, sonographers must observe corresponding dynamic anatomic structures to gather comprehensive information. However, the limited availability of specific US video cases causes teaching difficulties in identifying corresponding diseases, which potentially impacts the detection rate of such cases. The synthesis of US videos may represent a promising solution to this issue. Nevertheless, it is challenging to accurately animate the intricate motion of dynamic anatomic structures while preserving image fidelity. To address this, we present a novel online feature-decoupling framework called OnUVS for high-fidelity US video synthesis. Our highlights can be summarized by four aspects. First, we introduced anatomic information into keypoint learning through a weakly-supervised training strategy, resulting in improved preservation of anatomical integrity and motion while minimizing the labeling burden. Second, to better preserve the integrity and textural information of US images, we implemented a dual-decoder that decouples the content and textural features in the generator. Third, we adopted a multiple-feature discriminator to extract a comprehensive range of visual cues, thereby enhancing the sharpness and fine details of the generated videos. Fourth, we constrained the motion trajectories of keypoints during online learning to enhance the fluidity of generated videos. Our validation and user studies on in-house echocardiographic and pelvic floor US videos showed that OnUVS synthesizes US videos with high fidelity.

Submitted 16 August, 2023; originally announced August 2023.

Comments: 14 pages, 13 figures and 6 tables

arXiv:2308.07342 [pdf, other]

Emergent communication for AR

Authors: Ruxiao Chen, Shuaishuai Guo

Abstract: Mobile augmented reality (MAR) is widely acknowledged as one of the ubiquitous interfaces to the digital twin and Metaverse, demanding unparalleled levels of latency, computational power, and energy efficiency. The existing solutions for realizing MAR combine multiple technologies like edge, cloud computing, and fifth-generation (5G) networks. However, the inherent communication latency of visual data imposes apparent limitations on the quality of experience (QoE). To address the challenge, we propose an emergent semantic communication framework to learn the communication protocols in MAR. Specifically, we train two agents through a modified Lewis signaling game to emerge a discrete communication protocol spontaneously. Based on this protocol, two agents can communicate about the abstract idea of visual data through messages with extremely small data sizes in a noisy channel, which leads to message errors. To better simulate real-world scenarios, we incorporate channel uncertainty into our training process. Experiments have shown that the proposed scheme has better generalization on unseen objects than traditional object recognition used in MAR and can effectively enhance communication efficiency through the utilization of small-size messages.

Submitted 12 August, 2023; originally announced August 2023.

arXiv:2308.02782 [pdf]

doi 10.1364/OL.501622

Non-line-of-sight reconstruction via structure sparsity regularization

Authors: Duolan Huang, Quan Chen, Zhun Wei, Rui Chen

Abstract: Non-line-of-sight (NLOS) imaging allows for the imaging of objects around a corner, which enables potential applications in various fields such as autonomous driving, robotic vision, medical imaging, security monitoring, etc. However, the quality of reconstruction is challenged by low signal-noise-ratio (SNR) measurements. In this study, we present a regularization method, referred to as structure sparsity (SS) regularization, for denoising in NLOS reconstruction. By exploiting the prior knowledge of structure sparseness, we incorporate nuclear norm penalization into the cost function of directional light-cone transform (DLCT) model for NLOS imaging system. This incorporation effectively integrates the neighborhood information associated with the directional albedo, thereby facilitating the denoising process. Subsequently, the reconstruction is achieved by optimizing a directional albedo model with SS regularization using fast iterative shrinkage-thresholding algorithm. Notably, the robust reconstruction of occluded objects is observed. Through comprehensive evaluations conducted on both synthetic and experimental datasets, we demonstrate that the proposed approach yields high-quality reconstructions, surpassing the state-of-the-art reconstruction algorithms, especially in scenarios involving short exposure and low SNR measurements.

Submitted 4 August, 2023; originally announced August 2023.

Comments: 8 pages, 5 figures

arXiv:2305.19558 [pdf, other]

Look-Ahead Task Offloading for Multi-User Mobile Augmented Reality in Edge-Cloud Computing

Authors: Ruxiao Chen, Shuaishuai Guo

Abstract: Mobile augmented reality (MAR) blends a real scenario with overlaid virtual content, which has been envisioned as one of the ubiquitous interfaces to the Metaverse. Due to the limited computing power and battery life of MAR devices, it is common to offload the computation tasks to edge or cloud servers in close proximity. However, existing offloading solutions developed for MAR tasks suffer from high migration overhead, poor scalability, and short-sightedness when applied in provisioning multi-user MAR services. To address these issues, a MAR service-oriented task offloading scheme is designed and evaluated in edge-cloud computing networks. Specifically, the task interdependency of MAR applications is firstly analyzed and modeled by using directed acyclic graphs. Then, we propose a look-ahead offloading scheme based on a modified Monte Carlo tree (MMCT) search, which can run several multi-step executions in advance to get an estimate of the long-term effect of immediate action. Experiment results show that the proposed offloading scheme can effectively improve the quality of service (QoS) in provisioning multi-user MAR services, compared to four benchmark schemes. Furthermore, it is also shown that the proposed solution is stable and suitable for applications in a highly volatile environment.

Submitted 31 May, 2023; originally announced May 2023.

Comments: Accepted by IEEE Network

arXiv:2304.14660 [pdf, other]

doi 10.1016/j.media.2023.103061

Segment Anything Model for Medical Images?

Authors: Yuhao Huang, Xin Yang, Lian Liu, Han Zhou, Ao Chang, Xinrui Zhou, Rusi Chen, Junxuan Yu, Jiongquan Chen, Chaoyu Chen, Sijing Liu, Haozhe Chi, Xindi Hu, Kejuan Yue, Lei Li, Vicente Grau, Deng-Ping Fan, Fajin Dong, Dong Ni

Abstract: The Segment Anything Model (SAM) is the first foundation model for general image segmentation. It has achieved impressive results on various natural image segmentation tasks. However, medical image segmentation (MIS) is more challenging because of the complex modalities, fine anatomical structures, uncertain and complex object boundaries, and wide-range object scales. To fully validate SAM's performance on medical data, we collected and sorted 53 open-source datasets and built a large medical segmentation dataset with 18 modalities, 84 objects, 125 object-modality paired targets, 1050K 2D images, and 6033K masks. We comprehensively analyzed different models and strategies on the so-called COSMOS 1050K dataset. Our findings mainly include the following: 1) SAM showed remarkable performance in some specific objects but was unstable, imperfect, or even totally failed in other situations. 2) SAM with the large ViT-H showed better overall performance than that with the small ViT-B. 3) SAM performed better with manual hints, especially box, than the Everything mode. 4) SAM could help human annotation with high labeling quality and less time. 5) SAM was sensitive to the randomness in the center point and tight box prompts, and may suffer from a serious performance drop. 6) SAM performed better than interactive methods with one or a few points, but will be outpaced as the number of points increases. 7) SAM's performance correlated to different factors, including boundary complexity, intensity differences, etc. 8) Finetuning the SAM on specific medical tasks could improve its average DICE performance by 4.39% and 6.68% for ViT-B and ViT-H, respectively. We hope that this comprehensive report can help researchers explore the potential of SAM applications in MIS, and guide how to appropriately use and develop SAM.

Submitted 17 January, 2024; v1 submitted 28 April, 2023; originally announced April 2023.

Comments: Accepted by Medical Image Analysis. 23 pages, 18 figures, 8 tables

arXiv:2302.14257 [pdf, ps, other]

Beamforming Design for RIS-Aided AF Relay Networks

Authors: Xuehui Wang, Feng Shu, Riqing Chen, Peng Zhang, Qi Zhang, Guiyang Xia, Wei** shi, Jiangzhou Wang

Abstract: Since reconfigurable intelligent surface (RIS) is considered to be a passive reflector for rate performance enhancement, a RIS-aided amplify-and-forward (AF) relay network is presented. By jointly optimizing the beamforming matrix at AF relay and the phase shift matrices at RIS, two schemes are put forward to address a signal-to-noise ratio (SNR) maximization problem. Firstly, aiming at achieving a high rate, a high-performance alternating optimization (AO) method based on Charnes-Cooper transformation and semidefinite programming (CCT-SDP) is proposed, where the optimization problem is decomposed into three subproblems solved by CCT-SDP and rank-one solutions can be recovered by Gaussian randomization. However, the optimization variables in the CCT-SDP method are matrices, which leads to extremely high complexity. In order to reduce the complexity, a low-complexity AO scheme based on Dinkelbach's transformation and successive convex approximation (DT-SCA) is put forward, where matrix variables are transformed to vector variables and three decoupled subproblems are solved by DT-SCA. Simulation results verify that compared to two benchmarks (i.e. a RIS-assisted AF relay network with random phase and an AF relay network without RIS), the proposed CCT-SDP and DT-SCA schemes can harvest better rate performance. Furthermore, it is revealed that the rate of the low-complexity DT-SCA method is close to that of the CCT-SDP method.

Submitted 27 February, 2023; originally announced February 2023.

arXiv:2301.02858 [pdf, ps, other]

Two Efficient Beamforming Methods for Hybrid IRS-aided AF Relay Wireless Networks

Authors: Xuehui Wang, Feng Shu, Mengxing Huang, Fuhui Zhou, Riqing Chen, Cunhua Pan, Yongpeng Wu, Jiangzhou Wang

Abstract: Due to the double fading effect caused by conventional passive intelligent reflecting surface (IRS), the signal via the reflection link is weak. To enhance the received signal, active elements with the ability to amplify the reflected signal are introduced to the passive IRS, forming a hybrid IRS. In this paper, we propose a hybrid IRS-aided amplify-and-forward (AF) relay wireless network, where an optimization problem is formulated, which is subject to the constraints of transmit power budgets at the source/AF relay/hybrid IRS and that of unit modulus for passive IRS elements. By alternately designing the beamforming matrix at AF relay and the reflecting coefficient matrices at IRS, signal-to-noise ratio can be maximized. To achieve high rate performance and extend the coverage range, a high-performance method based on semidefinite relaxation and fractional programming (HP-SDR-FP) algorithm is presented. Due to its extremely high complexity, a low-complexity method based on whitening filter, general power iterative and generalized Rayleigh-Ritz (WF-GPI-GRR) is proposed, which is different from the HP-SDR-FP method. It is assumed that the amplifying coefficient of each active IRS element is equal, and the corresponding analytical solution of the amplifying coefficient can be obtained according to the transmit powers at AF relay and hybrid IRS. Simulation results show that the proposed two methods can greatly improve the rate performance compared to the existing networks, such as the passive IRS-aided AF relay and only AF relay network. In particular, a rate gain of approximately 50.0% over the existing networks is achieved in the high power budget region of hybrid IRS. Moreover, it is verified that the proposed HP-SDR-FP method performs better than the WF-GPI-GRR method in terms of rate performance.

Submitted 23 November, 2023; v1 submitted 7 January, 2023; originally announced January 2023.

arXiv:2212.08391 [pdf, ps, other]

Enhanced-rate Iterative Beamformers for Active IRS-assisted Wireless Communications

Authors: Yeqing Lin, Feng Shu, Rongen Dong, Riqing Chen, Siling Feng, Wei** Shi, **g Liu, Jiangzhou Wang

Abstract: Compared to passive intelligent reflecting surface (IRS), active IRS is viewed as a more efficient promising technique to combat the double-fading impact in IRS-aided wireless network. In this paper, in order to boost the achievable rate of user in such a wireless network, three enhanced-rate iterative beamforming methods are proposed by designing the amplifying factors and the corresponding phases at active IRS. The first method, maximizing the simplified signal-to-noise ratio (Max-SSNR) is designed by omitting the cross-term in the definition of rate. Using the Rayleigh-Ritz (RR) theorem, Max-SSNR-RR is proposed to iteratively optimize the norm of beamforming vector and its associated normalized vector. In addition, generalized maximum ratio reflection (GMRR) is presented with a closed-form expression, which is motivated by the maximum ratio combining. To further improve rate, maximizing SNR (Max-SNR) is designed by fractional programming (FP), which is called Max-SNR-FP. Simulation results show that the proposed three methods make an obvious rate enhancement over Max-reflecting signal-to-noise ratio (Max-RSNR), maximum ratio reflection (MRR), selective ratio reflecting (SRR), equal gain reflection (EGR) and passive IRS, and are in increasing order of rate performance as follows: Max-SSNR-RR, GMRR, and Max-SNR-FP.

Submitted 14 May, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

arXiv:2211.14361 [pdf, other]

doi 10.1109/IROS55552.2023.10341790

gatekeeper: Online Safety Verification and Control for Nonlinear Systems in Dynamic Environments

Authors: Devansh R Agrawal, Ruichang Chen, Dimitra Panagou

Abstract: This paper presents the gatekeeper algorithm, a real-time and computationally-lightweight method that ensures that trajectories of a nonlinear system satisfy safety constraints despite sensing limitations. gatekeeper integrates with existing path planners and feedback controllers by introducing an additional verification step to ensure that proposed trajectories can be executed safely, despite nonlinear dynamics subject to bounded disturbances, input constraints and partial knowledge of the environment. Our key contribution is that (A) we propose an algorithm to recursively construct safe trajectories by numerically forward propagating the system over a (short) finite horizon, and (B) we prove that tracking such a trajectory ensures the system remains safe for all future time, i.e., beyond the finite horizon. We demonstrate the method in a simulation of a dynamic firefighting mission, and in physical experiments of a quadrotor navigating in an obstacle environment that is sensed online. We also provide comparisons against the state-of-the-art techniques for similar problems.

Submitted 27 March, 2024; v1 submitted 25 November, 2022; originally announced November 2022.

Comments: Accepted at IROS 2023, 8 pages, 4 figures, Conditional Accept at IEEE T-RO

arXiv:2211.05309 [pdf]

Generic Cryo-CMOS Device Modeling and EDA-Compatible Platform for Reliable Cryogenic IC Design

Authors: Zhidong Tang, Zewei Wang, Yumeng Yuan, Chang He, Xin Luo, Ao Guo, Renhe Chen, Yongqi Hu, Longfei Yang, Chengwei Cao, Linlin Liu, Liujiang Yu, Ganbing Shang, Yongfeng Cao, Shoumian Chen, Yuhang Zhao, Shaojian Hu, Xufeng Kou

Abstract: This paper outlines the establishment of a generic cryogenic CMOS database in which key electrical parameters and transfer characteristics of the MOSFETs are quantified as functions of device size, temperature/frequency responses. Meanwhile, comprehensive device statistical study is conducted to evaluate the influence of variation and mismatch effects at low temperatures. Furthermore, by incorporating the Cryo-CMOS compact model into the process design kit (PDK), the cryogenic 4 Kb SRAM, 5-bit flash ADC and 8-bit current steering DAC are designed, and their performance is readily investigated and optimized on the EDA-compatible platform, hence laying a solid foundation for large-scale cryogenic IC design.

Submitted 9 February, 2024; v1 submitted 9 November, 2022; originally announced November 2022.

arXiv:2209.13647 [pdf]

Deep learning based sferics recognition for AMT data processing in the dead band

Authors: Enhua Jiang, Rujun Chen, Xinming Wu, Jianxin Liu, Debin Zhu, Weiqiang Liu

Abstract: In audio magnetotellurics (AMT) sounding data processing, the absence of sferic signals in some time ranges typically results in a lack of energy in the AMT dead band, which may cause unreliable resistivity estimates. We propose a deep convolutional neural network (CNN) to automatically recognize sferic signals from redundantly recorded data in a long time range and use them to compensate for the resistivity estimation. We train the CNN by using field time series data with different signal-to-noise ratios that were acquired from different regions in mainland China. To solve the potential overfitting problem due to the limited number of sferic labels, we propose a training strategy that randomly generates training samples (with random data augmentations) while optimizing the CNN model parameters. We stop the training process and data generation once the training loss converges. In addition, we use a weighted binary cross-entropy loss function to solve the sample imbalance problem to better optimize the network, use multiple reasonable metrics to evaluate network performance, and carry out ablation experiments to optimally choose the model hyperparameters. Extensive field data applications show that our trained CNN can robustly recognize sferic signals from noisy time series for subsequent impedance estimation. The subsequent processing results show that our method can significantly improve S/N and effectively solve the problem of lack of energy in the dead band. Compared to the traditional processing method without sferic compensation, our method can generate smoother and more reasonable apparent resistivity-phase curves and depolarized phase tensor, correct the estimation error of sudden drop of high-frequency apparent resistivity and abnormal behavior of phase reversal, and finally better restore the real shallow subsurface resistivity structure.

Submitted 21 September, 2022; originally announced September 2022.

arXiv:2208.13954 [pdf]

doi 10.1109/TCSVT.2022.3197420

Video-based Cross-modal Auxiliary Network for Multimodal Sentiment Analysis

Authors: Rongfei Chen, Wenju Zhou, Yang Li, Huiyu Zhou

Abstract: Multimodal sentiment analysis has a wide range of applications due to its information complementarity in multimodal interactions. Previous works focus more on investigating efficient joint representations, but they rarely consider the insufficient unimodal feature extraction and data redundancy of multimodal fusion. In this paper, a Video-based Cross-modal Auxiliary Network (VCAN) is proposed, which is comprised of an audio features map module and a cross-modal selection module. The first module is designed to substantially increase feature diversity in audio feature extraction, aiming to improve classification accuracy by providing more comprehensive acoustic representations. To empower the model to handle redundant visual features, the second module is designed to efficiently filter the redundant visual frames when integrating audiovisual data. Moreover, a classifier group consisting of several image classification networks is introduced to predict sentiment polarities and emotion categories. Extensive experimental results on RAVDESS, CMU-MOSI, and CMU-MOSEI benchmarks indicate that VCAN is significantly superior to the state-of-the-art methods for improving the classification accuracy of multimodal sentiment analysis.

Submitted 29 August, 2022; originally announced August 2022.

arXiv:2207.03127 [pdf]

5G for Railways: the Next Generation Railway Dedicated Communications

Authors: Ruisi He, Bo Ai, Zhangdui Zhong, Mi Yang, Ruifeng Chen, Jianwen Ding, Zhangfeng Ma, Guiqi Sun, Changzhu Liu

Abstract: To overcome increasing traffic, provide various new services, further ensure safety and security, and significantly improve travel comfort, a new communication system for railways is required. Since 2019, public networks have been evolving to the fifth generation communication (5G) worldwide, whereas the main communication system of railways is still based on the second generation communication (2G). It is thus necessary for railways to replace the current 2G-based technology with the next generation railway dedicated communication system with improved capacity and capability, and the 5G for railways (5G-R) technology is a promising solution for further intelligent railways. This article gives a review of the current developments of the next generation railway communications, followed by a discussion of the typical services that the 5G-R can provide to intelligent railways. Then, main application scenarios of 5G-R are summarized and system configurations are compared. Some key technologies of 5G-R such as network architecture, massive MIMO, millimeter-wave, multiple access scheme, ultra-reliable low latency communication, and advanced video processing are presented and analyzed. Finally, some challenges of 5G-R are highlighted.

Submitted 7 July, 2022; originally announced July 2022.

arXiv:2206.11996 [pdf, other]

The Real Deal: A Review of Challenges and Opportunities in Moving Reinforcement Learning-Based Traffic Signal Control Systems Towards Reality

Authors: Rex Chen, Fei Fang, Norman Sadeh

Abstract: Traffic signal control (TSC) is a high-stakes domain that is growing in importance as traffic volume grows globally. An increasing number of works are applying reinforcement learning (RL) to TSC; RL can draw on an abundance of traffic data to improve signalling efficiency. However, RL-based signal controllers have never been deployed. In this work, we provide the first review of challenges that must be addressed before RL can be deployed for TSC. We focus on four challenges involving (1) uncertainty in detection, (2) reliability of communications, (3) compliance and interpretability, and (4) heterogeneous road users. We show that the literature on RL-based TSC has made some progress towards addressing each challenge. However, more work should take a systems thinking approach that considers the impacts of other pipeline components on RL.

Submitted 3 October, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

Comments: 26 pages; accepted version, with shortened version published at the 12th International Workshop on Agents in Traffic and Transportation (ATT '22) at IJCAI 2022

arXiv:2206.08885 [pdf, other]

Incorporating intratumoral heterogeneity into weakly-supervised deep learning models via variance pooling

Authors: Iain Carmichael, Andrew H. Song, Richard J. Chen, Drew F. K. Williamson, Tiffany Y. Chen, Faisal Mahmood

Abstract: Supervised learning tasks such as cancer survival prediction from gigapixel whole slide images (WSIs) are a critical challenge in computational pathology that requires modeling complex features of the tumor microenvironment. These learning tasks are often solved with deep multi-instance learning (MIL) models that do not explicitly capture intratumoral heterogeneity. We develop a novel variance pooling architecture that enables a MIL model to incorporate intratumoral heterogeneity into its predictions. Two interpretability tools based on representative patches are illustrated to probe the biological signals captured by these models. An empirical study with 4,479 gigapixel WSIs from the Cancer Genome Atlas shows that adding variance pooling onto MIL frameworks improves survival prediction performance for five cancer types.

Submitted 19 November, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

Comments: MICCAI 2022

arXiv:2205.05651 [pdf, other]

Joint OAM Radar-Communication Systems: Target Recognition and Beam Optimization

Authors: Wen-Xuan Long, Rui Chen, Marco Moretti, Wei Zhang, Jiandong Li

Abstract: Orbital angular momentum (OAM) radars are able to estimate the azimuth angle and the rotation velocity of multiple targets without relative motion or beam scanning. Moreover, OAM wireless communications can achieve high spectral efficiency (SE) by utilizing a set of information-bearing modes on the same frequency channel. Benefitting from the above advantages, in this paper, we design a novel radar-centric joint OAM radar-communication (RadCom) scheme based on uniform circular arrays (UCAs), which modulates information signals on the existing OAM radar waveform. In detail, we first propose an OAM-based three-dimensional (3-D) super-resolution position estimation and rotation velocity detection method, which can accurately estimate the 3-D position and rotation velocity of multiple targets. Then, we derive the posterior Cramer-Rao bound (PCRB) of the OAM-based estimates and, finally, we analyze the transmission rate of the integrated communication system. To achieve the best trade-off between imaging and communication, the transmitted integrated OAM beams are optimized by means of an exhaustive search method. Both mathematical analysis and simulation results show that the proposed radar-centric joint OAM RadCom scheme can accurately estimate the 3-D position and rotation velocity of multiple targets while ensuring the transmission rate of the communication receiver, which can be regarded as an effective supplement to existing joint RadCom schemes.

Submitted 11 May, 2022; originally announced May 2022.

arXiv:2204.03213 [pdf]

MC-UNet Multi-module Concatenation based on U-shape Network for Retinal Blood Vessels Segmentation

Authors: Ting Zhang, Jun Li, Yi Zhao, Nan Chen, Han Zhou, Hongtao Xu, Zihao Guan, Changcai Yang, Lanyan Xue, Riqing Chen, Lifang Wei

Abstract: Accurate segmentation of the blood vessels of the retina is an important step in clinical diagnosis of ophthalmic diseases. Many deep learning frameworks have been proposed for retinal blood vessel segmentation tasks. However, the complex vascular structure and uncertain pathological features make blood vessel segmentation still very challenging. A novel U-shaped network named Multi-module Concatenation, which is based on atrous convolution and multi-kernel pooling, is put forward for retinal vessel segmentation in this paper. The proposed network structure retains three layers of the essential structure of U-Net, in which the atrous convolution combined with the multi-kernel pooling blocks is designed to obtain more contextual information. The spatial attention module is concatenated with the dense atrous convolution module and the multi-kernel pooling module to form a multi-module concatenation. Different dilation rates are selected by cascading to acquire a larger receptive field in atrous convolution. Adequate comparative experiments are conducted on these public retinal datasets: DRIVE, STARE and CHASE_DB1. The results show that the proposed method is effective, especially for microvessels. The code will be put out at https://github.com/Rebeccala/MC-UNet

Submitted 7 April, 2022; originally announced April 2022.

Comments: 13 pages, 3957

MSC Class: 65D19 ACM Class: I.4.6

arXiv:2204.03038 [pdf, other]

Safe Interactive Industrial Robots using Jerk-based Safe Set Algorithm

Authors: Ruixuan Liu, Rui Chen, Changliu Liu

Abstract: The need to increase the flexibility of production lines is calling for robots to collaborate with human workers. However, existing interactive industrial robots only guarantee intrinsic safety (reduce collision impact), but not interactive safety (collision avoidance), which greatly limits their flexibility. The issue arises from two limitations in existing control software for industrial robots: 1) lack of support for real-time trajectory modification; 2) lack of intelligent safe control algorithms with guaranteed collision avoidance under robot dynamics constraints. To address the first limitation, a jerk-bounded position controller (JPC) was developed previously. This paper addresses the second limitation, on top of the JPC. Specifically, we introduce a jerk-based safe set algorithm (JSSA) to ensure collision avoidance while considering the robot dynamics constraints. The JSSA greatly extends the scope of the original safe set algorithm, which has only been applied for second-order systems with unbounded accelerations. The JSSA is implemented on the FANUC LR Mate 200id/7L robot and validated with HRI tasks. Experiments show that the JSSA can consistently keep the robot at a safe distance from the human while executing the designated task.

Submitted 19 September, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

arXiv:2203.17102 [pdf, other]

Sequential Cooperative Energy and Time-Optimal Lane Change Maneuvers for Highway Traffic

Authors: Andres S. Chavez Armijos, Rui Chen, Christos G. Cassandras, Yasir K. Al-Nadawi, Hossein Noukhiz Mahjoub, Hidekazu Araki

Abstract: We derive optimal control policies for a Connected Automated Vehicle (CAV) and cooperating neighboring CAVs to carry out a lane change maneuver consisting of a longitudinal phase where the CAV properly positions itself relative to the cooperating neighbors and a lateral phase where it safely changes lanes. In contrast to prior work on this problem, where the CAV "selfishly" seeks to minimize its maneuver time, we seek to ensure that the fast-lane traffic flow is minimally disrupted (through a properly defined metric) and that highway throughput is improved by optimally selecting the cooperating vehicles. We show that analytical solutions for the optimal trajectories can be derived and are guaranteed to satisfy safety constraints for all vehicles involved in the maneuver. When feasible solutions do not exist, we include a time relaxation method trading off a longer maneuver time with reduced disruption. Our analysis is also extended to multiple sequential maneuvers. Simulation results where the controllers are implemented show their effectiveness in terms of safety guarantees and up to 35% throughput improvement compared to maneuvers with no vehicle cooperation.

Submitted 31 March, 2022; originally announced March 2022.

arXiv:2203.05784 [pdf]

AI-enabled Automatic Multimodal Fusion of Cone-Beam CT and Intraoral Scans for Intelligent 3D Tooth-Bone Reconstruction and Clinical Applications

Authors: ** Hao, Jiaxiang Liu, ** Li, Wei Pan, Ruizhe Chen, Huimin Xiong, Kaiwei Sun, Hangzheng Lin, Wanlu Liu, Wanghui Ding, Jianfei Yang, Haoji Hu, Yueling Zhang, Yang Feng, Zeyu Zhao, Huikai Wu, Youyi Zheng, Bing Fang, Zuozhu Liu, Zhihe Zhao

Abstract: A critical step in virtual dental treatment planning is to accurately delineate all tooth-bone structures from CBCT with high fidelity and accurate anatomical information. Previous studies have established several methods for CBCT segmentation using deep learning. However, the inherent resolution discrepancy of CBCT and the loss of occlusal and dentition information largely limited its clinical applicability. Here, we present a Deep Dental Multimodal Analysis (DDMA) framework consisting of a CBCT segmentation model, an intraoral scan (IOS) segmentation model (the most accurate digital dental model), and a fusion model to generate 3D fused crown-root-bone structures with high fidelity and accurate occlusal and dentition information. Our model was trained with a large-scale dataset with 503 CBCT and 28,559 IOS meshes manually annotated by experienced human experts. For CBCT segmentation, we use a five-fold cross validation test, each with 50 CBCT, and our model achieves an average Dice coefficient and IoU of 93.99% and 88.68%, respectively, significantly outperforming the baselines. For IOS segmentations, our model achieves an mIoU of 93.07% and 95.70% on the maxillary and mandible on a test set of 200 IOS meshes, which are 1.77% and 3.52% higher than the state-of-the-art method. Our DDMA framework takes about 20 to 25 minutes to generate the fused 3D mesh model following the sequential processing order, compared to over 5 hours by human experts. Notably, our framework has been incorporated into a software by a clear aligner manufacturer, and real-world clinical cases demonstrate that our model can visualize crown-root-bone structures during the entire orthodontic treatment and can predict risks like dehiscence and fenestration. These findings demonstrate the potential of multi-modal deep learning to improve the quality of digital dental models and help dentists make better clinical decisions.

Submitted 11 March, 2022; originally announced March 2022.

Comments: 30 pages, 6 figures, 3 tables

arXiv:2202.11693 [pdf, other]

Hybrid Mechanical and Electronic Beam Steering for Maximizing OAM Channel Capacity

Authors: Rui Chen, Zhenyang Tian, Wen-Xuan Long, Xiaodong Wang, Wei Zhang

Abstract: Radio frequency-orbital angular momentum (RF-OAM) is a novel approach of multiplexing a set of orthogonal modes on the same frequency channel to achieve high spectrum efficiencies. Since OAM requires precise alignment of the transmit and the receive antennas, the electronic beam steering approach has been proposed for the uniform circular array (UCA)-based OAM communication system to circumvent large performance degradation induced by small antenna misalignment in practical environments. However, in the case of large-angle misalignment, the OAM channel capacity cannot be effectively compensated by electronic beam steering alone. To solve this problem, we propose a hybrid mechanical and electronic beam steering scheme, in which mechanical rotating devices controlled by pulse width modulation (PWM) signals as the execution unit are utilized to eliminate the large misalignment angle, while electronic beam steering is in charge of the remaining small misalignment angle caused by perturbations. Furthermore, due to the interferometry, the receive signal-to-noise ratios (SNRs) are not uniform at the elements of the receive UCA. Therefore, a rotatable UCA structure is proposed for the OAM receiver to maximize the channel capacity, in which the simulated annealing algorithm is first adopted to obtain the optimal rotation angle, then the servo system performs mechanical rotation, and finally the electronic beam steering is adjusted accordingly. Both mathematical analysis and simulation results validate that the proposed hybrid mechanical and electronic beam steering scheme can effectively eliminate the effect of diverse misalignment errors of any practical OAM channel and maximize the OAM channel capacity.

Submitted 5 August, 2022; v1 submitted 28 January, 2022; originally announced February 2022.

Showing 1–50 of 93 results for author: Chen, R