-
Equity-aware Load Shedding Optimization
Authors:
Xin Fang,
Wenbo Wang,
Fei Ding
Abstract:
Load shedding is usually the last resort to balance generation and demand to maintain stable operation of the electric grid after major disturbances. Current load-shedding optimization practices focus mainly on the physical optimality of the network power flow. This might lead to an uneven allocation of load curtailment, disadvantaging some loads more than others. Addressing this oversight, this paper introduces an innovative equity-aware load-shedding optimization model that emphasizes a fair allocation of load curtailment across the network. By proposing a novel equity indicator for load shedding and integrating it into an ACOPF-based optimization framework, we offer grid operators a more balanced and equitable load shedding strategy. Case studies highlight the importance of equity considerations in determining optimal load curtailment between buses.
Submitted 25 June, 2024;
originally announced June 2024.
-
Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios
Authors:
Ya Jiang,
Qing Wang,
Jun Du,
Maocheng Hu,
Pengfei Hu,
Zeyan Liu,
Shi Cheng,
Zhaoxu Nian,
Yuxuan Dong,
Mingqi Cai,
Xin Fang,
Chin-Hui Lee
Abstract:
This study presents an audio-visual information fusion approach to sound event localization and detection (SELD) in low-resource scenarios. We aim at utilizing audio and video modality information through cross-modal learning and multi-modal fusion. First, we propose a cross-modal teacher-student learning (TSL) framework to transfer information from an audio-only teacher model, trained on a rich collection of audio data with multiple data augmentation techniques, to an audio-visual student model trained with only a limited set of multi-modal data. Next, we propose a two-stage audio-visual fusion strategy, consisting of an early feature fusion and a late video-guided decision fusion to exploit synergies between audio and video modalities. Finally, we introduce an innovative video pixel swapping (VPS) technique to extend an audio channel swapping (ACS) method to an audio-visual joint augmentation. Evaluation results on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2023 Challenge data set demonstrate significant improvements in SELD performances. Furthermore, our submission to the SELD task of the DCASE 2023 Challenge ranks first place by effectively integrating the proposed techniques into a model ensemble.
Submitted 21 June, 2024;
originally announced June 2024.
-
A Local Gaussian Process Regression Approach to Frequency Response Function Estimation
Authors:
Xiaozhu Fang,
Yu Xu,
Tianshi Chen
Abstract:
Frequency response function (FRF) estimation is a classical subject in system identification. In the past two decades, there have been remarkable advances in developing local methods for this subject, e.g., the local polynomial method, local rational method, and iterative local rational method. Recent work on local methods concentrates on two issues: the model order selection and the identification of lightly damped systems. To address these two issues, we propose a new local method called local Gaussian process regression (LGPR). We show that the frequency response function locally is either analytic or resonant, and this prior knowledge can be embedded into a kernel-based regularized estimate through a dot-product kernel plus a resonance kernel induced by a second-order resonant system. The LGPR provides a new route to tackle the aforementioned issues. In the numerical simulations, the LGPR shows the best FRF estimation accuracy compared with the existing local methods, and moreover, the LGPR is more robust with respect to sample size and noise level.
Submitted 21 May, 2024;
originally announced May 2024.
-
Deep Learning-Based Residual Useful Lifetime Prediction for Assets with Uncertain Failure Modes
Authors:
Yuqi Su,
Xiaolei Fang
Abstract:
Industrial prognostics focuses on utilizing degradation signals to forecast and continually update the residual useful life of complex engineering systems. However, existing prognostic models for systems with multiple failure modes face several challenges in real-world applications, including overlapping degradation signals from multiple components, the presence of unlabeled historical data, and the similarity of signals across different failure modes. To tackle these issues, this research introduces two prognostic models that integrate the mixture (log)-location-scale distribution with deep learning. This integration facilitates the modeling of overlapping degradation signals, eliminates the need for explicit failure mode identification, and utilizes deep learning to capture complex nonlinear relationships between degradation signals and residual useful lifetimes. Numerical studies validate the superior performance of these proposed models compared to existing methods.
Submitted 9 May, 2024;
originally announced May 2024.
-
An Integrated Communication and Computing Scheme for Wi-Fi Networks based on Generative AI and Reinforcement Learning
Authors:
Xinyang Du,
Xuming Fang
Abstract:
The continuous evolution of future mobile communication systems is heading towards the integration of communication and computing, with Mobile Edge Computing (MEC) emerging as a crucial means of implementing Artificial Intelligence (AI) computation. MEC could enhance the computational performance of wireless edge networks by offloading computing-intensive tasks to MEC servers. However, in edge computing scenarios, the sparse sample problem may lead to high costs of time-consuming model training. This paper proposes an MEC offloading decision and resource allocation solution that combines generative AI and deep reinforcement learning (DRL) for the communication-computing integration scenario in the 802.11ax Wi-Fi network. Initially, the optimal offloading policy is determined by the joint use of the Generative Diffusion Model (GDM) and the Twin Delayed DDPG (TD3) algorithm. Subsequently, resource allocation is accomplished by using the Hungarian algorithm. Simulation results demonstrate that the introduction of Generative AI significantly reduces model training costs, and the proposed solution exhibits significant reductions in system task processing latency and total energy consumption costs.
Submitted 21 April, 2024;
originally announced April 2024.
-
Multitask frame-level learning for few-shot sound event detection
Authors:
Liang Zou,
Genwei Yan,
Ruoyu Wang,
Jun Du,
Meng Lei,
Tian Gao,
Xin Fang
Abstract:
This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples. However, prevailing methods in few-shot SED predominantly rely on segment-level predictions, which often struggle to provide detailed, fine-grained predictions, particularly for events of brief duration. Although frame-level prediction strategies have been proposed to overcome these limitations, these strategies commonly face difficulties with prediction truncation caused by background noise. To alleviate this issue, we introduce an innovative multitask frame-level SED framework. In addition, we introduce TimeFilterAug, a linear timing mask for data augmentation, to increase the model's robustness and adaptability to diverse acoustic environments. The proposed method achieves an F-score of 63.8%, securing the 1st rank in the few-shot bioacoustic event detection category of the Detection and Classification of Acoustic Scenes and Events Challenge 2023.
Submitted 17 March, 2024;
originally announced March 2024.
-
Radio Map-Based Spectrum Sharing for Joint Communication and Sensing
Authors:
Xionran Fang,
Wei Feng,
Yunfei Chen,
Dingxi Yang,
Ning Ge,
Zhiyong Feng,
Yue Gao
Abstract:
The sixth-generation (6G) network is expected to provide both communication and sensing (C&S) services. However, spectrum scarcity poses a major challenge to the harmonious coexistence of C&S systems. Without effective cooperation, the interference resulting from spectrum sharing impairs the performance of both systems. This paper addresses C&S interference within a distributed network. Different from traditional schemes that require pilot-based high-frequency interactions between C&S systems, we introduce a third party named the radio map to provide the large-scale channel state information (CSI). With large-scale CSI, we optimize the transmit power of C&S systems to maximize the signal-to-interference-plus-noise ratio (SINR) for the radar detection, while meeting the ergodic rate requirement of the interfered user. Given the non-convexity of both the objective and constraint, we employ the techniques of auxiliary-function-based scaling and fractional programming for simplification. Subsequently, we propose an iterative algorithm to solve this problem. Simulation results corroborate our idea that the extrinsic information, i.e., positions and surroundings, is effective to decouple C&S interference.
Submitted 27 June, 2024; v1 submitted 4 January, 2024;
originally announced January 2024.
-
Stable Relay Learning Optimization Approach for Fast Power System Production Cost Minimization Simulation
Authors:
Zishan Guo,
Qinran Hu,
Tao Qian,
Xin Fang,
Renjie Hu,
Zaijun Wu
Abstract:
Production cost minimization (PCM) simulation is commonly employed for assessing the operational efficiency, economic viability, and reliability, providing valuable insights for power system planning and operations. However, solving a PCM problem is time-consuming, consisting of numerous binary variables for simulation horizon extending over months and years. This hinders rapid assessment of modern energy systems with diverse planning requirements. Existing methods for accelerating PCM tend to sacrifice accuracy for speed. In this paper, we propose a stable relay learning optimization (s-RLO) approach within the Branch and Bound (B&B) algorithm. The proposed approach offers rapid and stable performance, and ensures optimal solutions. The two-stage s-RLO involves an imitation learning (IL) phase for accurate policy initialization and a reinforcement learning (RL) phase for time-efficient fine-tuning. When implemented on the popular SCIP solver, s-RLO returns the optimal solution up to 2 times faster than the default relpscost rule and 1.4 times faster than IL, or exhibits a smaller gap at the predefined time limit. The proposed approach shows stable performance, reducing fluctuations by approximately 50% compared with IL. The efficacy of the proposed s-RLO approach is supported by numerical results.
Submitted 19 December, 2023;
originally announced December 2023.
-
Angle-Displacement Rigidity Theory with Application to Distributed Network Localization
Authors:
Xu Fang,
Xiaolei Li,
Lihua Xie
Abstract:
This paper investigates the localization problem of a network in 2-D and 3-D spaces given the positions of anchor nodes in a global frame and inter-node relative measurements in local coordinate frames. It is assumed that the local frames of different nodes have different unknown orientations. First, an angle-displacement rigidity theory is developed, which can be used to localize all the free nodes by the known positions of the anchor nodes and local relative measurements (local relative position, distance, local relative bearing, angle, or ratio-of-distance measurements). Then, necessary and sufficient conditions for network localizability are given. Finally, a distributed network localization protocol is proposed, which can globally estimate the locations of all the free nodes of a network if the network is infinitesimally angle-displacement rigid. The proposed method unifies local-relative-position-based, distance-based, local-relative-bearing-based, angle-based, and ratio-of-distance-based distributed network localization approaches. The novelty of this work is that the proposed method can be applied in both generic and non-generic configurations with an unknown global coordinate frame in both 2-D and 3-D spaces.
Submitted 19 December, 2023;
originally announced December 2023.
-
Distributed Semi-global Output Feedback Formation Maneuver Control of High-order Multi-agent Systems
Authors:
Xu Fang,
Lihua Xie
Abstract:
This paper addresses the formation maneuver control problem of leader-follower multi-agent systems with high-order integrator dynamics. A distributed output feedback formation maneuver controller is proposed to achieve desired maneuvers so that the scale, orientation, translation, and shape of formation can be manipulated continuously, where the followers do not need to know or estimate the time-varying maneuver parameters only known to the leaders. Compared with existing relative-measurement-based formation maneuver control, the advantages of the proposed method are that it is output (relative output) feedback based and shows how to realize different types of formation shape. In addition, it can be applied to non-generic and non-convex nominal configurations and the leaders are allowed to be maneuvered. It is worth noting that the proposed method can also be extended to general linear multi-agent systems under some additional conditions. The theoretical results are demonstrated by a simulation example.
Submitted 18 December, 2023;
originally announced December 2023.
-
3-D Distributed Localization with Mixed Local Relative Measurements
Authors:
Xu Fang,
Xiaolei Li,
Lihua Xie
Abstract:
This paper studies 3-D distributed network localization using mixed types of local relative measurements. Each node holds a local coordinate frame without a common orientation and can only measure one type of information (relative position, distance, relative bearing, angle, or ratio-of-distance measurements) about its neighboring nodes in its local coordinate frame. A novel rigidity-theory-based distributed localization is developed to overcome the challenge due to the absence of a global coordinate frame. The main idea is to construct displacement constraints for the positions of the nodes by using mixed local relative measurements. Then, a linear distributed localization algorithm is proposed for each free node to estimate its position by solving the displacement constraints. The algebraic condition and graph condition are obtained to guarantee the global convergence of the proposed distributed localization algorithm.
Submitted 18 December, 2023;
originally announced December 2023.
-
Distributed Localization in Dynamic Networks via Complex Laplacian
Authors:
Xu Fang,
Lihua Xie,
Xiaolei Li
Abstract:
Different from most existing distributed localization approaches in static networks where the agents in a network are static, this paper addresses the distributed localization problem in dynamic networks where the positions of the agents are time-varying. Firstly, complex constraints for the positions of the agents are constructed based on local relative position (distance and local bearing) measurements. Secondly, both algebraic condition and graph condition of network localizability in dynamic networks are given. Thirdly, a distributed localization protocol is proposed such that all the agents can cooperatively find their positions by solving the complex constraints in dynamic networks. Fourthly, the proposed method is extended to address the problem of integrated distributed localization and formation control. It is worth mentioning that the proposed algorithm can also be applied in the case that only distance and sign of direction measurements are available, where the sign of direction measurement is a kind of one bit local relative measurement and has less information than local bearing.
Submitted 18 December, 2023;
originally announced December 2023.
-
SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma
Authors:
Xiangde Luo,
Jia Fu,
Yunxin Zhong,
Shuolin Liu,
Bing Han,
Mehdi Astaraki,
Simone Bendazzoli,
Iuliana Toma-Dasu,
Yiwen Ye,
Ziyang Chen,
Yong Xia,
Yanzhou Su,
** Ye,
Junjun He,
Zhaohu Xing,
Hongqiu Wang,
Lei Zhu,
Kaixiang Yang,
Xin Fang,
Zhiwei Wang,
Chan Woong Lee,
Sang Joon Park,
Jaehee Chun,
Constantin Ulrich,
Klaus H. Maier-Hein
, et al. (17 additional authors not shown)
Abstract:
Radiation therapy is a primary and effective NasoPharyngeal Carcinoma (NPC) treatment strategy. The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis. Previously, the delineation of GTVs and OARs was performed by experienced radiation oncologists. Recently, deep learning has achieved promising results in many medical image segmentation tasks. However, for NPC OARs and GTVs segmentation, few public datasets are available for model development and evaluation. To alleviate this problem, the SegRap2023 challenge was organized in conjunction with MICCAI2023 and presented a large-scale benchmark for OAR and GTV segmentation with 400 Computed Tomography (CT) scans from 200 NPC patients, each with a pair of pre-aligned non-contrast and contrast-enhanced CT scans. The challenge's goal was to segment 45 OARs and 2 GTVs from the paired CT scans. In this paper, we detail the challenge and analyze the solutions of all participants. The average Dice similarity coefficient scores for all submissions ranged from 76.68% to 86.70%, and 70.42% to 73.44% for OARs and GTVs, respectively. We conclude that the segmentation of large-size OARs is well-addressed, and more efforts are needed for GTVs and small-size or thin-structure OARs. The benchmark will remain publicly available here: https://segrap2023.grand-challenge.org
Submitted 15 December, 2023;
originally announced December 2023.
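For readers unfamiliar with the metric quoted in the SegRap2023 abstract, the Dice similarity coefficient of a predicted mask A and a reference mask B is 2|A∩B| / (|A| + |B|). The following is a minimal sketch on flat binary masks; the function name and the choice to score two empty masks as 1.0 are assumptions made for illustration, not the challenge's evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two flat binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground count across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total
```

A score of 1.0 means the masks coincide and 0.0 means they share no foreground voxel; the challenge reports this value averaged over structures and expressed as a percentage.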
-
Federated Multilinear Principal Component Analysis with Applications in Prognostics
Authors:
Chengyu Zhou,
Yuqi Su,
Tangbin Xia,
Xiaolei Fang
Abstract:
Multilinear Principal Component Analysis (MPCA) is a widely utilized method for the dimension reduction of tensor data. However, the integration of MPCA into federated learning remains unexplored in existing research. To tackle this gap, this article proposes a Federated Multilinear Principal Component Analysis (FMPCA) method, which enables multiple users to collaboratively reduce the dimension of their tensor data while keeping each user's data local and confidential. The proposed FMPCA method is guaranteed to have the same performance as traditional MPCA. An application of the proposed FMPCA in industrial prognostics is also demonstrated. Simulated data and a real-world data set are used to validate the performance of the proposed method.
Submitted 28 April, 2024; v1 submitted 10 December, 2023;
originally announced December 2023.
-
Distributed Formation Maneuver Control Using Complex Laplacian
Authors:
Xu Fang,
Lihua Xie
Abstract:
This paper studies the problem of distributed formation maneuver control of multi-agent systems via complex Laplacian. We will show how to change the translation, scaling, rotation, and also the shape of formation continuously by only tuning the positions of the leaders in both 2-D and 3-D spaces, where the rotation of formation in 3-D space is realized by changing the yaw angle, pitch angle, and roll angle of formation sequentially. Compared with real-Laplacian-based methods, the first advantage of the proposed complex-Laplacian-based approach is that each follower requires fewer neighbors and less communication. The second advantage is that non-convex and non-generic nominal configurations are allowed and the uniqueness of the complex-constraint-based target formation can be guaranteed by the non-collocated nominal agents. The third advantage is that more formation shapes can be realized by only tuning the positions of the leaders. Two simulation examples are given to illustrate the theoretical results.
Submitted 7 December, 2023;
originally announced December 2023.
-
A Federated Data Fusion-Based Prognostic Model for Applications with Multi-Stream Incomplete Signals
Authors:
Madi Arabi,
Xiaolei Fang
Abstract:
Most prognostic methods require a decent amount of data for model training. In reality, however, the amount of historical data owned by a single organization might be small or not large enough to train a reliable prognostic model. To address this challenge, this article proposes a federated prognostic model that allows multiple users to jointly construct a failure time prediction model using their multi-stream, high-dimensional, and incomplete data while keeping each user's data local and confidential. The prognostic model first employs multivariate functional principal component analysis to fuse the multi-stream degradation signals. Then, the fused features coupled with the times-to-failure are utilized to build a (log)-location-scale regression model for failure prediction. To estimate parameters using distributed datasets and keep the data privacy of all participants, we propose a new federated algorithm for feature extraction. Numerical studies indicate that the performance of the proposed model is the same as that of classic non-federated prognostic models and is better than that of the models constructed by each user itself.
Submitted 9 April, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Integrated Relative-Measurement-Based Network Localization and Formation Maneuver Control (Extended Version)
Authors:
Xu Fang,
Lihua Xie,
Xiaolei Li
Abstract:
This paper studies the problem of integrated distributed network localization and formation maneuver control. We develop an integrated relative-measurement-based scheme, which only uses relative positions, distances, bearings, angles, ratio-of-distances, or their combination to achieve distributed network localization and formation maneuver control in $\mathbb{R}^d (d \ge 2)$. By exploring the localizability and invariance of the target formation, the scale, rotation, and translation of the formation can be controlled simultaneously by only tuning the leaders' positions, i.e., the followers do not need to know parameters of the scale, rotation, and translation of the target formation. The proposed method can globally drive the formation errors to zero in finite time over multi-layer $d\!+\!1$-rooted graphs. A simulation example is given to illustrate the theoretical results.
Submitted 13 November, 2023; v1 submitted 28 October, 2023;
originally announced October 2023.
-
Frequency-Aware Re-Parameterization for Over-Fitting Based Image Compression
Authors:
Yun Ye,
Yanjie Pan,
Qually Jiang,
Ming Lu,
Xiaoran Fang,
Beryl Xu
Abstract:
Over-fitting-based image compression requires weights compactness for compression and fast convergence for practical use, posing challenges for deep convolutional neural networks (CNNs) based methods. This paper presents a simple re-parameterization method to train CNNs with reduced weights storage and accelerated convergence. The convolution kernels are re-parameterized as a weighted sum of discrete cosine transform (DCT) kernels enabling direct optimization in the frequency domain. Combined with L1 regularization, the proposed method surpasses vanilla convolutions by achieving a significantly improved rate-distortion with low computational cost. The proposed method is verified with extensive experiments of over-fitting-based image restoration on various datasets, achieving up to -46.12% BD-rate on top of HEIF with only 200 iterations.
Submitted 12 October, 2023;
originally announced October 2023.
-
ChinaTelecom System Description to VoxCeleb Speaker Recognition Challenge 2023
Authors:
Mengjie Du,
Xiang Fang,
Jie Li
Abstract:
This technical report describes the ChinaTelecom system for Track 1 (closed) of the VoxCeleb2023 Speaker Recognition Challenge (VoxSRC 2023). Our system consists of several ResNet variants trained only on VoxCeleb2, which were later fused for better performance. Score calibration was also applied to each variant and the fused system. The final submission achieved a minDCF of 0.1066 and an EER of 1.980%.
Submitted 16 August, 2023;
originally announced August 2023.
-
Control-Oriented Deep Space Communications For Unmanned Space Exploration
Authors:
Xinran Fang,
Wei Feng,
Yunfei Chen,
Ning Ge,
Gan Zheng
Abstract:
In unmanned space exploration, the cooperation among space robots requires advanced communication techniques. In this paper, we propose a communication optimization scheme for a specific cooperation system named the "mother-daughter system". In this setup, the mother spacecraft orbits the planet, while daughter probes are distributed across the planetary surface. During each control cycle, the mother spacecraft senses the environment, computes control commands and distributes them to daughter probes for actions. They synergistically form sensing-communication-computing-control ($\mathbf{SC^3}$) loops. Given the indivisibility of the $\mathbf{SC^3}$ loop, we optimize the mother-daughter downlink for closed-loop control. The optimization objective is the linear quadratic regulator (LQR) cost, and the optimization parameters are the block length and transmit power. To solve the nonlinear mixed-integer problem, we first identify the optimal block length and then transform the power allocation problem into a tractable convex problem. We further derive the approximate closed-form solutions for the proposed scheme and two communication-oriented schemes: the max-sum rate scheme and the max-min rate scheme. On this basis, we analyze their power allocation principles. In particular, for time-insensitive control tasks, we find that the proposed scheme demonstrates equivalence to the max-min rate scheme. These findings are verified through simulations.
Submitted 27 June, 2024; v1 submitted 7 August, 2023;
originally announced August 2023.
-
On Kernel Design for Regularized Non-Causal System Identification
Authors:
Xiaozhu Fang,
Tianshi Chen
Abstract:
Through one decade's development, the kernel-based regularization method (KRM) has become a complement to the classical maximum likelihood/prediction error method and an emerging new system identification paradigm. One recent example is its application in the non-causal system identification, and the key issue lies in the design and analysis of kernels for non-causal systems. In this paper, we develop systematic ways to deal with this issue. In particular, we first introduce the guidelines for kernel design and then extend the system theoretic framework to design the so-called non-causal simulation-induced (NCSI) kernel, and we also study its structural properties, including stability and semiseparability. Finally, we consider some special cases of the NCSI kernel and show their advantage over the existing kernels through numerical simulations.
Submitted 26 July, 2023;
originally announced July 2023.
-
Boosting Convolution with Efficient MLP-Permutation for Volumetric Medical Image Segmentation
Authors:
Yi Lin,
Xiao Fang,
Dong Zhang,
Kwang-Ting Cheng,
Hao Chen
Abstract:
Recently, the advent of vision Transformer (ViT) has brought substantial advancements in 3D dataset benchmarks, particularly in 3D volumetric medical image segmentation (Vol-MedSeg). Concurrently, multi-layer perceptron (MLP) network has regained popularity among researchers due to their comparable results to ViT, albeit with the exclusion of the resource-intensive self-attention module. In this work, we propose a novel permutable hybrid network for Vol-MedSeg, named PHNet, which capitalizes on the strengths of both convolution neural networks (CNNs) and MLP. PHNet addresses the intrinsic isotropy problem of 3D volumetric data by employing a combination of 2D and 3D CNNs to extract local features. Besides, we propose an efficient multi-layer permute perceptron (MLPP) module that captures long-range dependence while preserving positional information. This is achieved through an axis decomposition operation that permutes the input tensor along different axes, thereby enabling the separate encoding of the positional information. Furthermore, MLPP tackles the resolution sensitivity issue of MLP in Vol-MedSeg with a token segmentation operation, which divides the feature into smaller tokens and processes them individually. Extensive experimental results validate that PHNet outperforms the state-of-the-art methods with lower computational costs on the widely-used yet challenging COVID-19-20 and Synapse benchmarks. The ablation study also demonstrates the effectiveness of PHNet in harnessing the strengths of both CNNs and MLP.
Submitted 24 August, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
Kernel-based Regularized Iterative Learning Control of Repetitive Linear Time-varying Systems
Authors:
Xian Yu,
Xiaozhu Fang,
Biqiang Mu,
Tianshi Chen
Abstract:
For data-driven iterative learning control (ILC) methods, both the model estimation and controller design problems are converted to parameter estimation problems for some chosen model structures. It is well known that if the model order is not chosen carefully, models with either large variance or large bias will result, which is one of the obstacles to further improving the modeling and tracking performance of data-driven ILC in practice. An emerging trend in the system identification community to deal with this issue is to use regularization instead of statistical tests, e.g., AIC and BIC, and one of the representatives is the so-called kernel-based regularization method (KRM). In this paper, we integrate KRM into data-driven ILC to handle a class of repetitive linear time-varying systems, and moreover, we show that the proposed method has ultimately bounded tracking error in the iteration domain. The numerical simulation results show that, in contrast with the least squares method and some existing data-driven ILC methods, the proposed one can give faster convergence speed, better accuracy and robustness in terms of the tracking performance.
Submitted 7 March, 2023;
originally announced March 2023.
-
AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer
Authors:
Kang Li,
Yan Song,
Li-Rong Dai,
Ian McLoughlin,
Xin Fang,
Lin Liu
Abstract:
In this paper, we propose an effective sound event detection (SED) method based on the audio spectrogram transformer (AST) model, pretrained on the large-scale AudioSet for audio tagging (AT) task, termed AST-SED. Pretrained AST models have recently shown promise on DCASE2022 challenge task4 where they help mitigate a lack of sufficient real annotated data. However, mainly due to differences between the AT and SED tasks, it is suboptimal to directly utilize outputs from a pretrained AST model. Hence the proposed AST-SED adopts an encoder-decoder architecture to enable effective and efficient fine-tuning without needing to redesign or retrain the AST model. Specifically, the Frequency-wise Transformer Encoder (FTE) consists of transformers with self attention along the frequency axis to address multiple overlapped audio events issue in a single clip. The Local Gated Recurrent Units Decoder (LGD) consists of nearest-neighbor interpolation (NNI) and Bidirectional Gated Recurrent Units (Bi-GRU) to compensate for temporal resolution loss in the pretrained AST model output. Experimental results on DCASE2022 task4 development set have demonstrated the superiority of the proposed AST-SED with FTE-LGD architecture. Specifically, the Event-Based F1-score (EB-F1) of 59.60% and Polyphonic Sound detection Score scenario1 (PSDS1) score of 0.5140 significantly outperform CRNN and other pretrained AST-based systems.
Submitted 7 March, 2023;
originally announced March 2023.
-
Technology Report : Smartphone-Based Pedestrian Dead Reckoning Integrated with Data-Fusion-Adopted Visible Light Positioning
Authors:
Shangsheng Wen,
Ziyang Ge,
Danlan Yuan,
Yingcong Chen,
Xuecong Fang
Abstract:
Pedestrian dead-reckoning (PDR) is a potential indoor localization technology that obtains location estimation with the inertial measurement unit (IMU). However, one of its most significant drawbacks is the accumulation of its measurement error. This paper proposes a visible light positioning (VLP)-integrated PDR system, which could achieve real-time and accurate indoor positioning using IMU and the camera sensor of our smartphone. A multi-frame fusion method is proposed in the encoding and decoding process of the system, reaching 98.5% decoding accuracy with a 20-bit-long ID at the height of 2.1 m, which allows the variation in the shutter speeds of cameras and heights of the LED. Meanwhile, absolute locations and step length could be calibrated with the help of a single light-emitting diode (LED), promising average accuracy within 0.5 meters in a 108-meter walk.
Submitted 5 January, 2023;
originally announced January 2023.
-
A deep local attention network for pre-operative lymph node metastasis prediction in pancreatic cancer via multiphase CT imaging
Authors:
Zhilin Zheng,
Xu Fang,
Jiawen Yao,
Mengmeng Zhu,
Le Lu,
Lingyun Huang,
Jing Xiao,
Yu Shi,
Hong Lu,
Jianping Lu,
Ling Zhang,
Chengwei Shao,
Yun Bian
Abstract:
Lymph node (LN) metastasis status is one of the most critical prognostic and cancer staging factors for patients with resectable pancreatic ductal adenocarcinoma (PDAC), or in general, for any type of solid malignant tumor. Preoperative prediction of LN metastasis from non-invasive CT imaging is highly desired, as it might be straightforwardly used to guide the following neoadjuvant treatment decision and surgical planning. Most studies only capture the tumor characteristics in CT imaging to implicitly infer LN metastasis, and very few works directly exploit the LNs' CT imaging information. To the best of our knowledge, this is the first work to propose a fully-automated LN segmentation and identification network to directly facilitate the LN metastasis status prediction task. Nevertheless, LN segmentation/detection is very challenging since LNs can be easily confused with other hard negative anatomic structures (e.g., vessels) in radiological images. We explore the anatomical spatial context priors of pancreatic LN locations by generating a guiding attention map from related organs and vessels to assist segmentation and infer LN status. As such, LN segmentation is impelled to focus on regions that are anatomically adjacent or plausible with respect to the specific organs and vessels. The metastasized LN identification network is trained to classify the segmented LN instances into positives or negatives by reusing the segmentation network as a pre-trained backbone and padding a new classification head. More importantly, we develop an LN metastasis status prediction network that combines the patient-wise aggregation results of LN segmentation/identification and deep imaging features extracted from the tumor region. Extensive quantitative nested five-fold cross-validation is conducted on a discovery dataset of 749 patients with PDAC.
Submitted 4 January, 2023;
originally announced January 2023.
-
Beamforming Design and Trajectory Optimization for UAV-Empowered Adaptable Integrated Sensing and Communication
Authors:
Cailian Deng,
Xuming Fang,
Xianbin Wang
Abstract:
Unmanned aerial vehicle (UAV) has high flexibility and controllable mobility, therefore it is considered as a promising enabler for future integrated sensing and communication (ISAC). In this paper, we propose a novel adaptable ISAC (AISAC) mechanism in the UAV-enabled system, where the UAV performs sensing on demand during communication and the sensing duration is configured flexibly according to the application requirements rather than keeping the same with the communication duration. Our designed mechanism avoids the excessive sensing and waste of radio resources, therefore improving the resource utilization and system performance. In the UAV-enabled AISAC system, we aim at maximizing the average system throughput by optimizing the communication and sensing beamforming as well as UAV trajectory while guaranteeing the quality-of-service requirements of communication and sensing. To efficiently solve the considered non-convex optimization problem, we first propose an efficient alternating optimization algorithm to optimize the communication and sensing beamforming for a given UAV location, and then develop a low-complexity joint beamforming and UAV trajectory optimization algorithm that sequentially searches the optimal UAV location until reaching the final location. Numerical results validate the superiority of the proposed adaptable mechanism and the effectiveness of the designed algorithm.
Submitted 4 October, 2022;
originally announced October 2022.
-
Joint Optimization of Active and Passive Beamforming in Multi-IRS Aided mmWave Communications
Authors:
Renlong Wei,
Qing Xue,
Shaodan Ma,
Yongjun Xu,
Li Yan,
Xuming Fang
Abstract:
Intelligent reflecting surface (IRS) has been considered as a promising technology to alleviate the blockage effect and enhance coverage in millimeter wave (mmWave) communication. To explore the impact of IRS on the performance of mmWave communication, we investigate a multi-IRS assisted mmWave communication network and formulate a sum rate maximization problem by jointly optimizing the active and passive beamforming and the set of IRSs for assistance. The optimization problem is intractable due to the lack of convexity of the objective function and the binary nature of the IRS selection variables. To tackle the complex non-convex problem, an alternating iterative approach is proposed. In particular, the fractional programming method is used to optimize the active and passive beamforming, and the IRS selection is solved by enumeration. Simulation results demonstrate the performance gain of our proposed approach.
Submitted 3 October, 2022;
originally announced October 2022.
-
Joint Optimization of Resource Allocation and Trajectory Control for Mobile Group Users in Fixed-Wing UAV-Enabled Wireless Network
Authors:
Xuezhen Yan,
Xuming Fang,
Cailian Deng,
Xianbin Wang
Abstract:
Owing to the controlling flexibility and cost-effectiveness, fixed-wing unmanned aerial vehicles (UAVs) are expected to serve as flying base stations (BSs) in the air-ground integrated network. By exploiting the mobility of UAVs, controllable coverage can be provided for mobile group users (MGUs) under challenging scenarios or even somewhere without communication infrastructure. However, in such dual mobility scenario where the UAV and MGUs are all moving, both the non-hovering feature of the fixed-wing UAV and the movement of MGUs will exacerbate the dynamic changes of user scheduling, which eventually leads to the degradation of MGUs' quality-of-service (QoS). In this paper, we propose a fixed-wing UAV-enabled wireless network architecture to provide moving coverage for MGUs. In order to achieve fairness among MGUs, we maximize the minimum average throughput between all users by jointly optimizing the user scheduling, resource allocation, and UAV trajectory control under the constraints on users' QoS requirements, communication resources, and UAV trajectory switching. Considering the optimization problem is mixed-integer non-convex, we decompose it into three optimization subproblems. An efficient algorithm is proposed to solve these three subproblems alternately till the convergence is realized. Simulation results demonstrate that the proposed algorithm can significantly improve the minimum average throughput of MGUs.
Submitted 28 September, 2022;
originally announced September 2022.
-
A Unified Analytical Method to Quantify Three Types of Fast Frequency Response from Inverter-based Resources
Authors:
Shuan Dong,
Xin Fang,
Jin Tan,
Ningchao Gao,
Xiaofan Cui,
Anderson Hoke
Abstract:
With more inverter-based resources (IBRs), our power systems have lower frequency nadirs following N-1 contingencies, and undesired under-frequency load shedding (UFLS) can occur. To address this challenge, IBRs can be programmed to provide at least three types of fast frequency response (FFR), e.g., step response, proportional response (P/f droop response), and derivative response (synthetic inertia). However, these heterogeneous FFR challenge the study of power system frequency dynamics. Thus, this paper develops an analytical frequency nadir prediction method that allows for the consideration of all three potential forms of FFR provided by IBRs. The proposed method provides fast and accurate frequency nadir estimation after N-1 generation tripping contingencies. Our method is grounded on the closed-form solution for the frequency nadir, which is solved from the second-order system frequency response model considering the governor dynamics and three types of FFR. The simulation results in the IEEE 39-bus system with different types of FFR demonstrate that the proposed method provides an accurate and fast prediction of the frequency nadir under various disturbances.
Submitted 25 August, 2023; v1 submitted 19 September, 2022;
originally announced September 2022.
-
Low-Complexity Acoustic Echo Cancellation with Neural Kalman Filtering
Authors:
Dong Yang,
Fei Jiang,
Wei Wu,
Xuefei Fang,
Muyong Cao
Abstract:
The Kalman filter has been adopted in acoustic echo cancellation due to its robustness to double-talk, fast convergence, and good steady-state performance. The performance of the Kalman filter is closely related to the estimation accuracy of the state noise covariance and the observation noise covariance. The estimation error may lead to unacceptable results; in particular, when the echo path suffers abrupt changes, the tracking performance of the Kalman filter could be degraded significantly. In this paper, we propose the neural Kalman filtering (NKF), which uses neural networks to implicitly model the covariance of the state noise and observation noise and to output the Kalman gain in real-time. Experimental results on both synthetic test sets and real-recorded test sets show that the proposed NKF has superior convergence and re-convergence performance while ensuring low near-end speech degradation compared with the state-of-the-art model-based methods. Moreover, the model size of the proposed NKF is merely 5.3 K and the RTF is as low as 0.09, which indicates that it can be deployed in low-resource platforms.
Submitted 29 October, 2022; v1 submitted 22 July, 2022;
originally announced July 2022.
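To make concrete where the state noise covariance and observation noise covariance enter a Kalman-based echo canceller, here is a minimal textbook-style time-domain sketch. It is not the NKF model from the abstract above: the function name `kalman_step`, the random-walk echo-path model, and the FIR formulation are assumptions for illustration. In the abstract's approach, a neural network would produce the gain instead of the explicit `Q` and `r` used here.

```python
import numpy as np

def kalman_step(w, P, x, d, Q, r):
    """One Kalman update of an FIR echo-path estimate (illustrative sketch).

    w: (n,) current echo-path estimate      P: (n, n) state error covariance
    x: (n,) far-end reference window        d: microphone sample (scalar)
    Q: (n, n) state noise covariance        r: observation noise variance
    """
    P_pred = P + Q                    # predict step: echo path as a random walk
    e = d - x @ w                     # prior error, i.e. the echo-cancelled output
    s = x @ P_pred @ x + r            # innovation variance
    k = P_pred @ x / s                # Kalman gain (depends directly on Q and r)
    w_new = w + k * e                 # correct the echo-path estimate
    P_new = P_pred - np.outer(k, x) @ P_pred
    return w_new, P_new, e

# Usage: identify a hypothetical 4-tap echo path from noiseless observations.
rng = np.random.default_rng(0)
h_true = np.array([0.5, -0.3, 0.2, 0.1])
w_est, P_est = np.zeros(4), np.eye(4)
for _ in range(500):
    x_ref = rng.standard_normal(4)
    w_est, P_est, _ = kalman_step(w_est, P_est, x_ref, x_ref @ h_true,
                                  1e-8 * np.eye(4), 1e-6)
```

Because the gain `k` is computed from `Q` and `r`, a poor estimate of either one directly distorts the update, which is the sensitivity the abstract refers to.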
-
A Supervised Tensor Dimension Reduction-Based Prognostics Model for Applications with Incomplete Imaging Data
Authors:
Chengyu Zhou,
Xiaolei Fang
Abstract:
This paper proposes a supervised dimension reduction methodology for tensor data which has two advantages over most image-based prognostic models. First, the model does not require tensor data to be complete which expands its application to incomplete data. Second, it utilizes time-to-failure (TTF) to supervise the extraction of low-dimensional features which makes the extracted features more effective for the subsequent prognostic. Besides, an optimization algorithm is proposed for parameter estimation and closed-form solutions are derived under certain distributions.
Submitted 4 June, 2023; v1 submitted 22 July, 2022;
originally announced July 2022.
-
DLMP of Competitive Markets in Active Distribution Networks: Models, Solutions, Applications, and Visions
Authors:
Xiaofei Wang,
Fangxing Li,
Linquan Bai,
Xin Fang
Abstract:
Traditionally, the electric distribution system operates with uniform energy prices across all system nodes. However, as the adoption of distributed energy resources (DERs) propels a shift from passive to active distribution network (ADN) operation, a distribution-level electricity market has been proposed to manage new complexities efficiently. In addition, distribution locational marginal price (DLMP) has been established in the literature as the primary pricing mechanism. The DLMP inherits the LMP concept in the transmission-level wholesale market, but incorporates characteristics of the distribution system, such as high R/X ratios and power losses, system imbalance, and voltage regulation needs. The DLMP provides a solution that can be essential for competitive market operation in future distribution systems. This paper first provides an overview of the current distribution-level market architectures and their early implementations. Next, the general clearing model, model relaxations, and DLMP formulation are comprehensively reviewed. The state-of-the-art solution methods for distribution market clearing are summarized and categorized into centralized, distributed, and decentralized methods. Then, DLMP applications for the operation and planning of DERs and distribution system operators (DSOs) are discussed in detail. Finally, visions of future research directions and possible barriers and challenges are presented.
Submitted 27 May, 2022;
originally announced May 2022.
-
Joint Communication and Sensing: Models and Potential of Using MIMO
Authors:
Xinran Fang,
Wei Feng,
Yunfei Chen,
Ning Ge,
Yan Zhang
Abstract:
The sixth-generation (6G) network is envisioned to integrate communication and sensing functions, so as to improve the spectrum efficiency (SE) and support explosive novel applications. Although the similarities of wireless communication and radio sensing lay the foundation for their combination, there is still considerable incompatible interest between them. To simultaneously guarantee the communication capacity and the sensing accuracy, the multiple-input and multiple-output (MIMO) technique plays an important role due to its unique capability of spatial beamforming and waveform shaping. However, the configuration of MIMO also brings high hardware cost, high power consumption, and high signal processing complexity. How to efficiently apply MIMO to achieve balanced communication and sensing performance is still open. In this survey, we discuss joint communication and sensing (JCAS) in the context of MIMO. We first outline the roles of MIMO in the process of wireless communication and radar sensing. Then, we present current advances in both communication and sensing coexistence and integration in detail. Three novel JCAS MIMO models are subsequently discussed by combining cutting-edge technologies, i.e., cloud random access networks (C-RANs), unmanned aerial vehicles (UAVs) and reconfigurable intelligent surfaces (RISs). Examined from the practical perspective, the potential and challenges of MIMO in JCAS are summarized, and promising solutions are provided. Motivated by the great potential of the Internet of Things (IoT), we also specify JCAS in IoT scenarios and discuss the uniqueness of applying JCAS to IoT. In the end, open issues are outlined to envisage a ubiquitous, intelligent and secure JCAS network in the near future.
Submitted 4 December, 2022; v1 submitted 19 May, 2022;
originally announced May 2022.
-
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition
Authors:
Ye-Qian Du,
Jie Zhang,
Qiu-Shi Zhu,
Li-Rong Dai,
Ming-Hui Wu,
Xin Fang,
Zhou-Wang Yang
Abstract:
Unpaired data has been shown to be beneficial for low-resource automatic speech recognition (ASR), which can be involved in the design of hybrid models with multi-task training or language model dependent pre-training. In this work, we leverage unpaired data to train a general sequence-to-sequence model. Unpaired speech and text are used in the form of data pairs by generating the corresponding missing parts prior to model training. Inspired by the complementarity of speech-PseudoLabel pair and SynthesizedAudio-text pair in both acoustic features and linguistic features, we propose a complementary joint training (CJT) method that trains a model alternately with two data pairs. Furthermore, label masking for pseudo-labels and gradient restriction for synthesized audio are proposed to further cope with the deviations from real data, termed as CJT++. Experimental results show that compared to speech-only training, the proposed basic CJT achieves great performance improvements on clean/other test sets, and the CJT++ re-training yields further performance enhancements. It is also apparent that the proposed method outperforms the wav2vec2.0 model with the same model size and beam size, particularly in extreme low-resource cases.
Submitted 5 April, 2022;
originally announced April 2022.
-
Logistics in the Sky: A Two-phase Optimization Approach for the Drone Package Pickup and Delivery System
Authors:
Fangyu Hong,
Guohua Wu,
Qizhang Luo,
Huan Liu,
Xiaoping Fang,
Witold Pedrycz
Abstract:
The application of drones in last-mile distribution has been a research hotspot in recent years. Different from the previous urban distribution mode that depends on trucks, this paper proposes a novel package pick-up and delivery mode and system in which multiple drones collaborate with automatic devices. The proposed mode uses free areas on the top of residential buildings to set automatic devices as delivery and pick-up points of packages, and employs drones to transport packages between buildings and depots. The integrated scheduling problem of package drop-pickup considering m-drone, m-depot, m-customer is crucial for the system. We propose a simulated-annealing-based two-phase optimization approach (SATO) to solve this problem. In the first phase, tasks are allocated to depots for serving, such that the initial problem is decomposed into multiple single depot scheduling problems with m-drone. In the second phase, considering the drone capability constraints and task demand constraints, we generate the route planning scheme for drones in each depot. Concurrently, an improved variable neighborhood descent algorithm (IVND) is designed in the first phase to reallocate tasks, and a local search algorithm (LS) is proposed to search for high-quality solutions in the second phase. Finally, extensive experiments and comparative studies are conducted to test the effectiveness of the proposed approach. Experiments indicate that the proposed SATO-IVND can reduce the cost by more than 14% in a reasonable time compared with several other peer algorithms.
Submitted 4 April, 2022;
originally announced April 2022.
-
Model Predictive Control with Preview: Recursive Feasibility and Stability
Authors:
Xing Fang,
Wen-Hua Chen
Abstract:
This paper proposes a stabilising model predictive control (MPC) scheme with preview information of disturbance for nonlinear systems. The proposed MPC algorithm is able to not only reject disturbance by making use of disturbance preview information as necessary, but also take advantage of the disturbance if it is good for a control task. This is realised by taking into account both the task (e.g. reference trajectory) and disturbance preview in the prediction horizon when performing online optimisation. Conditions are established to ensure recursive feasibility and stability under the disturbance. First the disturbance within the horizon is augmented with the state to form a new composite system and then the stage cost function is modified accordingly. With the help of input-to-state stability theory, a terminal cost and a terminal constraint are constructed and added to the MPC algorithm with preview to guarantee its recursive feasibility and stability under a pre-bounded disturbance. Numerical simulation results demonstrate the effectiveness of the proposed MPC algorithm.
Submitted 25 February, 2022;
originally announced February 2022.
-
Learning Contextually Fused Audio-visual Representations for Audio-visual Speech Recognition
Authors:
Zi-Qiang Zhang,
Jie Zhang,
Jian-Shu Zhang,
Ming-Hui Wu,
Xin Fang,
Li-Rong Dai
Abstract:
With the advance in self-supervised learning for audio and visual modalities, it has become possible to learn a robust audio-visual speech representation. This would be beneficial for improving the audio-visual speech recognition (AVSR) performance, as the multi-modal inputs contain more fruitful information in principle. In this paper, based on existing self-supervised representation learning methods for audio modality, we therefore propose an audio-visual representation learning approach. The proposed approach explores both the complementarity of audio-visual modalities and long-term context dependency using a transformer-based fusion module and a flexible masking strategy. After pre-training, the model is able to extract fused representations required by AVSR. Without loss of generality, it can be applied to single-modal tasks, e.g. audio/visual speech recognition by simply masking out one modality in the fusion module. The proposed pre-trained model is evaluated on speech recognition and lipreading tasks using one or two modalities, where the superiority is revealed.
Submitted 10 July, 2022; v1 submitted 15 February, 2022;
originally announced February 2022.
-
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
Authors:
Qiu-Shi Zhu,
Jie Zhang,
Zi-Qiang Zhang,
Ming-Hui Wu,
Xin Fang,
Li-Rong Dai
Abstract:
Wav2vec2.0 is a popular self-supervised pre-training framework for learning speech representations in the context of automatic speech recognition (ASR). It was shown that wav2vec2.0 has good robustness against domain shift, while its noise robustness is still unclear. In this work, we therefore first analyze the noise robustness of wav2vec2.0 via experiments. We observe that wav2vec2.0 pre-trained on noisy data can obtain good representations and thus improve the ASR performance on the noisy test set, which, however, brings a performance degradation on the clean test set. To avoid this issue, in this work we propose an enhanced wav2vec2.0 model. Specifically, the noisy speech and the corresponding clean version are fed into the same feature encoder, where the clean speech provides training targets for the model. Experimental results reveal that the proposed method not only improves the ASR performance on the noisy test set, surpassing the original wav2vec2.0, but also incurs only a tiny performance decrease on the clean test set. In addition, the effectiveness of the proposed method is demonstrated under different types of noise conditions.
Submitted 21 January, 2022;
originally announced January 2022.
-
Ultralow complexity long short-term memory network for fiber nonlinearity mitigation in coherent optical communication systems
Authors:
Hao Ming,
Xinyu Chen,
Xiansong Fang,
Lei Zhang,
Chenjia Li,
Fan Zhang
Abstract:
Fiber Kerr nonlinearity is a fundamental limitation to the achievable capacity of long-distance optical fiber communication. Digital back-propagation (DBP) is a primary methodology to mitigate both linear and nonlinear impairments by solving the inverse-propagating nonlinear Schrödinger equation (NLSE), which requires detailed link information. Recently, the paradigms based on neural network (NN) were proposed to mitigate nonlinear transmission impairments in optical communication systems. However, almost all neural network-based equalization schemes yield high computation complexity, which prevents the practical implementation in commercial transmission systems. In this paper, we propose a center-oriented long short-term memory network (Co-LSTM) incorporating a simplified mode with a recycling mechanism in the equalization operation, which can mitigate fiber nonlinearity in coherent optical communication systems with ultralow complexity. To validate the proposed methodology, we carry out an experiment of ten-channel wavelength division multiplexing (WDM) transmission with 64 Gbaud polarization-division-multiplexed 16-ary quadrature amplitude modulation (16-QAM) signals. Co-LSTM and DBP achieve a comparable performance of nonlinear mitigation. However, the complexity of Co-LSTM with a simplified mode is almost independent of the transmission distance, which is much lower than that of the DBP. The proposed Co-LSTM methodology presents an attractive approach for low complexity nonlinearity mitigation with neural networks.
Submitted 12 August, 2021;
originally announced August 2021.
-
Impact of DER Communication Delay in AGC: Cyber-Physical Dynamic Simulation
Authors:
Wenbo Wang,
Xin Fang,
Anthony Florita
Abstract:
Distributed energy resource (DER) frequency regulations are promising technologies for future grid operation. Unlike conventional generators, DERs might require open communication networks to exchange signals with control centers, possibly through DER aggregators; therefore, the impacts of the communication variations on the system stability need to be investigated. This paper develops a cyber-physical dynamic simulation model based on the Hierarchical Engine for Large-Scale Co-Simulation (HELICS) to evaluate the impact of the communication variations, such as delays in DER frequency regulations. The feasible delay range can be obtained under different parameter settings. The results show that the risk of instability generally increases with the communication delay.
Submitted 7 May, 2021;
originally announced May 2021.
-
Rethinking Annotation Granularity for Overcoming Shortcuts in Deep Learning-based Radiograph Diagnosis: A Multicenter Study
Authors:
Luyang Luo,
Hao Chen,
Yongjie Xiao,
Yanning Zhou,
Xi Wang,
Varut Vardhanabhuti,
Mingxiang Wu,
Chu Han,
Zaiyi Liu,
Xin Hao Benjamin Fang,
Efstratios Tsougenis,
Huangjing Lin,
Pheng-Ann Heng
Abstract:
Two DL models were developed using radiograph-level annotations (yes or no disease) and fine-grained lesion-level annotations (lesion bounding boxes), respectively named CheXNet and CheXDet. The models' internal classification performance and lesion localization performance were compared on a testing set (n=2,922), external classification performance was compared on NIH-Google (n=4,376) and PadChest (n=24,536) datasets, and external lesion localization performance was compared on NIH-ChestX-ray14 dataset (n=880). The models were also compared to radiologists on a subset of the internal testing set (n=496). Given sufficient training data, both models performed comparably to radiologists. CheXDet achieved significant improvement for external classification, such as in classifying fracture on NIH-Google (CheXDet area under the ROC curve [AUC]: 0.67, CheXNet AUC: 0.51; p<.001) and PadChest (CheXDet AUC: 0.78, CheXNet AUC: 0.55; p<.001). CheXDet achieved higher lesion detection performance than CheXNet for most abnormalities on all datasets, such as in detecting pneumothorax on the internal set (CheXDet jackknife alternative free-response ROC-figure of merit [JAFROC-FOM]: 0.87, CheXNet JAFROC-FOM: 0.13; p<.001) and NIH-ChestX-ray14 (CheXDet JAFROC-FOM: 0.55, CheXNet JAFROC-FOM: 0.04; p<.001). To summarize, fine-grained annotations overcame shortcut learning and enabled DL models to identify correct lesion patterns, improving the models' generalizability.
Submitted 8 November, 2022; v1 submitted 21 April, 2021;
originally announced April 2021.
-
USTC-NELSLIP System Description for DIHARD-III Challenge
Authors:
Yuxuan Wang,
Maokui He,
Shutong Niu,
Lei Sun,
Tian Gao,
Xin Fang,
Jia Pan,
Jun Du,
Chin-Hui Lee
Abstract:
This system description describes our submission system to the Third DIHARD Speech Diarization Challenge. Besides the traditional clustering based system, the innovation of our system lies in the combination of various front-end techniques to solve the diarization problem, including speech separation and target-speaker based voice activity detection (TS-VAD), combined with iterative data purification. We also adopted audio domain classification to design domain-dependent processing. Finally, we performed post processing to do system fusion and selection. Our best system achieved DERs of 11.30% in track 1 and 16.78% in track 2 on evaluation set, respectively.
Submitted 19 March, 2021;
originally announced March 2021.
-
XLST: Cross-lingual Self-training to Learn Multilingual Representation for Low Resource Speech Recognition
Authors:
Zi-Qiang Zhang,
Yan Song,
Ming-Hui Wu,
Xin Fang,
Li-Rong Dai
Abstract:
In this paper, we propose a weakly supervised multilingual representation learning framework, called cross-lingual self-training (XLST). XLST is able to utilize a small amount of annotated data from high-resource languages to improve the representation learning on multilingual un-annotated data. Specifically, XLST uses a supervised trained model to produce initial representations and another model to learn from them, by maximizing the similarity between output embeddings of these two models. Furthermore, the moving average mechanism and multi-view data augmentation are employed, which are experimentally shown to be crucial to XLST. Comprehensive experiments have been conducted on the CommonVoice corpus to evaluate the effectiveness of XLST. Results on 5 downstream low-resource ASR tasks show that our multilingual pretrained model achieves a relative 18.6% PER reduction over the state-of-the-art self-supervised method by leveraging an additional 100 hours of annotated English data.
Submitted 15 March, 2021;
originally announced March 2021.
-
Transmission-and-Distribution Frequency Dynamic Co-Simulation Framework for Distributed Energy Resources Frequency Response
Authors:
Wenbo Wang,
Xin Fang,
Hantao Cui,
Fangxing Li
Abstract:
The rapid deployment of distributed energy resources (DERs) in distribution networks has brought challenges to balance the system and stabilize frequency. DERs have the ability to provide frequency regulation; however, existing dynamic frequency simulation tools, which were developed mainly for the transmission system, lack the capability to simulate distribution network dynamics with high penetrations of DERs. Although electromagnetic transient (EMT) simulation tools can simulate distribution network dynamics, the computation efficiency limits their use for large-scale transmission-and-distribution (T&D) simulations. This paper presents an efficient T&D dynamic frequency co-simulation framework for DER frequency response based on the HELICS platform and existing off-the-shelf simulators. The challenge of synchronizing frequency between the transmission network and DERs hosted in the distribution network is approached by detailed modeling of DERs in frequency dynamic models while DER phasor models are also preserved in the distribution networks. Thereby, local voltage constraints can be respected when dispatching the DER power for frequency response. The DER frequency responses (primary and secondary) are simulated in case studies to validate the proposed framework. Lastly, a fault-induced delayed voltage recovery (FIDVR) event of a large system is presented to demonstrate the efficiency and effectiveness of the overall framework.
Submitted 14 January, 2021;
originally announced January 2021.
-
Effective Parallelism for Equation and Jacobian Evaluation in Power Flow Calculation
Authors:
Hantao Cui,
Fangxing Li,
Xin Fang
Abstract:
This letter investigates parallelism approaches for equation and Jacobian evaluations in large-scale power flow calculation. Two levels of parallelism are proposed and analyzed: inter-model parallelism, which evaluates models in parallel, and intra-model parallelism, which evaluates calculations within each model in parallel. Parallelism techniques such as multi-threading and single instruction multiple data (SIMD) vectorization are discussed, implemented, and benchmarked as six calculation workflows. Case studies on the 70,000-bus synthetic grid show that equation evaluations can be accelerated by ten times, and the overall Newton power flow advances the state of the art by 20%.
Submitted 21 August, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
Polynomial Chaos-Based Flight Control Optimization with Guaranteed Probabilistic Performance
Authors:
Dalong Shi,
Xiang Fang,
Florian Holzapfel
Abstract:
A probabilistic performance-oriented controller design approach based on polynomial chaos expansion and optimization is proposed for flight dynamic systems. Unlike robust control techniques where uncertainties are conservatively handled, the proposed method aims at propagating uncertainties effectively and optimizing control parameters to satisfy the probabilistic requirements directly. To achieve this, the sensitivities of violation probabilities are evaluated by the expansion coefficients and the fourth moment method for reliability analysis, after which an optimization that minimizes failure probability under chance constraints is conducted. Afterward, a time-dependent polynomial chaos expansion is performed to validate the results. With this approach, the failure probability is reduced while guaranteeing the closed-loop performance, thus increasing the safety margin. Simulations are carried out on a longitudinal model subject to uncertain parameters to demonstrate the effectiveness of this approach.
Submitted 10 November, 2020;
originally announced November 2020.
-
BER Performance of Spatial Modulation Systems Under a Non-Stationary Massive MIMO Channel Model
Authors:
Yu Fu,
Cheng-Xiang Wang,
Xuming Fang,
Li Yan,
Stephen McLaughlin
Abstract:
In this paper, the bit error rate (BER) performance of spatial modulation (SM) systems is investigated both theoretically and by simulation in a non-stationary Kronecker-based massive multiple-input-multiple-output (MIMO) channel model in multi-user (MU) scenarios. Massive MIMO SM systems are considered in this paper using both a time-division multiple access (TDMA) scheme and a block diagonalization (BD) based precoding scheme, for different system settings. Their performance is compared with a vertical Bell labs layered space-time (V-BLAST) architecture based system and a conventional channel inversion system. It is observed that a higher cluster evolution factor can result in better BER performance of SM systems due to the low correlation among sub-channels. Compared with the BD-SM system, the SM system using the TDMA scheme obtains a better BER performance but with a much lower total system data rate. The BD-MU-SM system achieves the best trade-off between the data rate and the BER performance among all of the systems considered. When compared with the V-BLAST system and the channel inversion system, SM approaches offer advantages in performance for MU massive MIMO systems.
Submitted 28 July, 2020;
originally announced July 2020.
-
IEEE 802.11be-Wi-Fi 7: New Challenges and Opportunities
Authors:
Cailian Deng,
Xuming Fang,
Xiao Han,
Xianbin Wang,
Li Yan,
Rong He,
Yan Long,
Yuchen Guo
Abstract:
With the emergence of 4k/8k video, the throughput requirement of video delivery will keep growing to tens of Gbps. Other new high-throughput and low-latency video applications, including augmented reality (AR), virtual reality (VR), and online gaming, are also proliferating. Due to the related stringent requirements, supporting these applications over wireless local area network (WLAN) is far beyond the capabilities of the new WLAN standard -- IEEE 802.11ax. To meet these emerging demands, the IEEE 802.11 will release a new amendment standard IEEE 802.11be -- Extremely High Throughput (EHT), also known as Wireless-Fidelity (Wi-Fi) 7. This article provides a comprehensive survey on the key medium access control (MAC) layer techniques and physical layer (PHY) techniques being discussed in the EHT task group, including the channelization and tone plan, multiple resource units (multi-RU) support, 4096 quadrature amplitude modulation (4096-QAM), preamble designs, multiple link operations (e.g., multi-link aggregation and channel access), multiple input multiple output (MIMO) enhancement, multiple access point (multi-AP) coordination (e.g., multi-AP joint transmission), enhanced link adaptation and retransmission protocols (e.g., hybrid automatic repeat request (HARQ)). This survey covers both the critical technologies being discussed in the EHT standard and the latest related progress from worldwide research. Besides, the potential developments beyond EHT are discussed to provide some possible future research directions for WLAN.
Submitted 3 August, 2020; v1 submitted 27 July, 2020;
originally announced July 2020.
-
Technical details of distributed localization
Authors:
Xu Fang,
Xiaolei Li,
Lihua Xie
Abstract:
This file proves the properties of the angle constraints and shows how to construct displacement constraints by various kinds of relative measurements.
Submitted 29 July, 2020; v1 submitted 21 July, 2020;
originally announced July 2020.