Search | arXiv e-print repository

Interference Analysis for Coexistence of UAVs and Civil Aircrafts Based on Automatic Dependent Surveillance-Broadcast

Authors: Yiyang Liao, Ziye Jia, Chao Dong, Lei Zhang, Qihui Wu, Huiling Hu, Zhu Han

Abstract: Due to the advantages of high mobility and easy deployment, unmanned aerial vehicles (UAVs) are widely applied in both military and civilian fields. In order to strengthen the flight surveillance of UAVs and guarantee the airspace safety, UAVs can be equipped with the automatic dependent surveillance-broadcast (ADS-B) system, which periodically sends flight information to other aircraft and ground stations (GSs). However, due to the limited channel capacity, UAVs equipped with ADS-B result in interference between UAVs and civil aircraft (CAs), which further impacts the accuracy of the information received at GSs. In detail, the channel capacity is mainly affected by the density of aircraft and the transmitting power of ADS-B. Hence, based on the three-dimensional Poisson point process, this work leverages stochastic geometry theory to build a model of the coexistence of UAVs and CAs and analyze the interference performance of the ADS-B monitoring system. From simulation results, we reveal the effects of transmitting power, density, threshold and pathloss on the performance of the ADS-B monitoring system. Besides, we provide the suggested transmitting power and density for the safe coexistence of UAVs and CAs.

Submitted 12 June, 2024; originally announced June 2024.
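
As a rough illustration of the stochastic-geometry setup described in the abstract above, the following sketch draws interferers from a homogeneous three-dimensional Poisson point process in a ball around a receiver and estimates how often the signal-to-interference-plus-noise ratio clears a threshold. It is a minimal Monte Carlo sketch, not the authors' model: the function names, the absence of fading, the equal transmit power for all aircraft, and the 1-unit near-field clamp are all assumptions made here for illustration.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate exponential arrivals in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def sample_ppp_ball(density, radius, rng):
    """Homogeneous 3D PPP in a ball centred on the receiver; returns point distances."""
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    n = sample_poisson(density * volume, rng)
    # The distance of a uniform point in a ball is distributed as R * U^(1/3).
    return [radius * rng.random() ** (1.0 / 3.0) for _ in range(n)]

def coverage_probability(density, radius, link_dist, tx_power, alpha,
                         noise, threshold, trials, seed=0):
    """Fraction of trials in which SINR >= threshold (hypothetical path-loss-only model)."""
    rng = random.Random(seed)
    signal = tx_power * link_dist ** (-alpha)
    covered = 0
    for _ in range(trials):
        distances = sample_ppp_ball(density, radius, rng)
        # Assumption: clamp distances below 1 unit to avoid the path-loss singularity.
        interference = sum(tx_power * max(d, 1.0) ** (-alpha) for d in distances)
        if signal / (noise + interference) >= threshold:
            covered += 1
    return covered / trials
```

Sweeping `density`, `tx_power`, `threshold`, or `alpha` in such a sketch mirrors the four factors the abstract lists, although any numbers it produces depend entirely on the assumed parameters.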

arXiv:2405.14905 [pdf, other]

Structural Entities Extraction and Patient Indications Incorporation for Chest X-ray Report Generation

Authors: Kang Liu, Zhuoqi Ma, Xiaolu Kang, Zhusi Zhong, Zhicheng Jiao, Grayson Baird, Harrison Bai, Qiguang Miao

Abstract: The automated generation of imaging reports proves invaluable in alleviating the workload of radiologists. A clinically applicable report generation algorithm should demonstrate its effectiveness in producing reports that accurately describe radiology findings and attend to patient-specific indications. In this paper, we introduce a novel method, Structural Entities extraction and patient indications Incorporation (SEI), for chest X-ray report generation. Specifically, we employ a structural entities extraction (SEE) approach to eliminate presentation-style vocabulary in reports and improve the quality of factual entity sequences. This reduces the noise in the following cross-modal alignment module by aligning X-ray images with factual entity sequences in reports, thereby enhancing the precision of cross-modal alignment and further aiding the model in gradient-free retrieval of similar historical cases. Subsequently, we propose a cross-modal fusion network to integrate information from X-ray images, similar historical cases, and patient-specific indications. This process allows the text decoder to attend to discriminative features of X-ray images, assimilate historical diagnostic information from similar cases, and understand the examination intention of patients. This, in turn, assists in triggering the text decoder to produce high-quality reports. Experiments conducted on MIMIC-CXR validate the superiority of SEI over state-of-the-art approaches on both natural language generation and clinical efficacy metrics.

Submitted 22 May, 2024; originally announced May 2024.

Comments: The code is available at https://github.com/mk-runner/SEI-Temp or https://github.com/mk-runner/SEI

arXiv:2405.14113 [pdf, other]

Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation

Authors: Zhusi Zhong, Jie Li, John Sollee, Scott Collins, Harrison Bai, Paul Zhang, Terrence Healey, Michael Atalay, Xinbo Gao, Zhicheng Jiao

Abstract: In response to the worldwide COVID-19 pandemic, advanced automated technologies have emerged as valuable tools to aid healthcare professionals in managing an increased workload by improving radiology report generation and prognostic analysis. This study proposes Multi-modality Regional Alignment Network (MRANet), an explainable model for radiology report generation and survival prediction that focuses on high-risk regions. By learning spatial correlation in the detector, MRANet visually grounds region-specific descriptions, providing robust anatomical regions with a completion strategy. The visual features of each region are embedded using a novel survival attention mechanism, offering spatially and risk-aware features for sentence encoding while maintaining global coherence across tasks. A cross LLMs alignment is employed to enhance the image-to-text transfer process, resulting in sentences rich with clinical detail and improved explainability for radiologists. Multi-center experiments validate both MRANet's overall performance and each module's composition within the model, encouraging further advancements in radiology report generation research emphasizing clinical interpretation and trustworthiness in AI models applied to medical studies. The code is available at https://github.com/zzs95/MRANet.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.09586 [pdf, other]

Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation

Authors: Kang Liu, Zhuoqi Ma, Mengmeng Liu, Zhicheng Jiao, Xiaolu Kang, Qiguang Miao, Kun Xie

Abstract: The automation of writing imaging reports is a valuable tool for alleviating the workload of radiologists. Crucial steps in this process involve the cross-modal alignment between medical images and reports, as well as the retrieval of similar historical cases. However, the presence of presentation-style vocabulary (e.g., sentence structure and grammar) in reports poses challenges for cross-modal alignment. Additionally, existing methods for similar historical cases retrieval face suboptimal performance owing to the modal gap issue. In response, this paper introduces a novel method, named Factual Serialization Enhancement (FSE), for chest X-ray report generation. FSE begins with the structural entities approach to eliminate presentation-style vocabulary in reports, providing specific input for our model. Then, uni-modal features are learned through cross-modal alignment between images and factual serialization in reports. Subsequently, we present a novel approach to retrieve similar historical cases from the training set, leveraging aligned image features. These features implicitly preserve semantic similarity with their corresponding reference reports, enabling us to calculate similarity solely among aligned features. This effectively eliminates the modal gap issue for knowledge retrieval without the requirement for disease labels. Finally, the cross-modal fusion network is employed to query valuable information from these cases, enriching image features and aiding the text decoder in generating high-quality reports. Experiments on MIMIC-CXR and IU X-ray datasets from both specific and general scenarios demonstrate the superiority of FSE over state-of-the-art approaches in both natural language generation and clinical efficacy metrics.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.00733 [pdf, other]

Joint ADS-B in 5G for Hierarchical Aerial Networks: Performance Analysis and Optimization

Authors: Ziye Jia, Yiyang Liao, Chao Dong, Lijun He, Qihui Wu, Lei Zhang

Abstract: Unmanned aerial vehicles (UAVs) are widely applied in multiple fields, which emphasizes the challenge of obtaining UAV flight information to ensure the airspace safety. UAVs equipped with automatic dependent surveillance-broadcast (ADS-B) devices are capable of sending flight information to nearby aircraft and ground stations (GSs). However, the saturation of the limited frequency bands of ADS-B leads to interference among UAVs and impairs the GS monitoring of civil planes. To address this issue, the integration of the 5th generation mobile communication technology (5G) with ADS-B is proposed for UAV operations in this paper. Specifically, a hierarchical structure is proposed, in which the high-altitude central UAV is equipped with ADS-B and the low-altitude central UAV utilizes 5G modules to transmit flight information. Meanwhile, based on the mobile edge computing technique, the flight information of sub-UAVs is offloaded to the central UAV for further processing, and then transmitted to GS. We present the deterministic model and stochastic geometry based model to build the air-to-ground channel and air-to-air channel, respectively. The effectiveness of the proposed monitoring system is verified via simulations and experiments. This research contributes to improving the airspace safety and advancing the air traffic flow management.

Submitted 29 April, 2024; originally announced May 2024.

arXiv:2404.18436 [pdf, other]

Three-Dimension Collision-Free Trajectory Planning of UAVs Based on ADS-B Information in Low-Altitude Urban Airspace

Authors: Chao Dong, Yifan Zhang, Ziye Jia, Yiyang Liao, Lei Zhang, Qihui Wu

Abstract: The environment of low-altitude urban airspace is complex and variable due to numerous obstacles, non-cooperative aircraft, and birds. Unmanned aerial vehicles (UAVs) leveraging environmental information to achieve three-dimension collision-free trajectory planning is the prerequisite to ensure airspace security. However, timely information about the surrounding situation is difficult for UAVs to acquire, which further brings security risks. As a mature technology leveraged in traditional civil aviation, the automatic dependent surveillance-broadcast (ADS-B) realizes continuous surveillance of the information of aircraft. Consequently, we leverage ADS-B for surveillance and information broadcasting, and divide the aerial airspace into multiple sub-airspaces to improve flight safety in UAV trajectory planning. In detail, we propose the secure sub-airspaces planning (SSP) algorithm and particle swarm optimization rapidly-exploring random trees (PSO-RRT) algorithm for the UAV trajectory planning in low-altitude airspace. The performance of the proposed algorithm is verified by simulations and the results show that SSP reduces both the maximum number of UAVs in the sub-airspace and the length of the trajectory, and PSO-RRT reduces the cost of UAV trajectory in the sub-airspace.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2402.01138 [pdf, other]

Graph Neural Networks in EEG-based Emotion Recognition: A Survey

Authors: Chenyu Liu, Xinliang Zhou, Yihao Wu, Ruizhi Yang, Liming Zhai, Ziyu Jia, Yang Liu

Abstract: Compared to other modalities, EEG-based emotion recognition can intuitively respond to the emotional patterns in the human brain and, therefore, has become one of the most concerning tasks in the brain-computer interfaces field. Since dependencies within brain regions are closely related to emotion, a significant trend is to develop Graph Neural Networks (GNNs) for EEG-based emotion recognition. However, brain region dependencies in emotional EEG have physiological bases that distinguish GNNs in this field from those in other time series fields. Besides, there is neither a comprehensive review nor guidance for constructing GNNs in EEG-based emotion recognition. In the survey, our categorization reveals the commonalities and differences of existing approaches under a unified framework of graph construction. We analyze and categorize methods from three stages in the framework to provide clear guidance on constructing GNNs in EEG-based emotion recognition. In addition, we discuss several open challenges and future directions, such as Temporal full-connected graph and Graph condensation.

Submitted 1 February, 2024; originally announced February 2024.

arXiv:2401.06419 [pdf, other]

Energy-Efficient Data Offloading for Earth Observation Satellite Networks

Authors: Lijun He, Ziye Jia, Juncheng Wang, Feng Wang, Erick Lansard, Chau Yuen

Abstract: In Earth Observation Satellite Networks (EOSNs) with a large number of battery-carrying satellites, proper power allocation and task scheduling are crucial to improving the data offloading efficiency. As such, we jointly optimize power allocation and task scheduling to achieve energy-efficient data offloading in EOSNs, aiming to balance the objectives of reducing the total energy consumption and increasing the sum weights of tasks. First, we derive the optimal power allocation solution to the joint optimization problem when the task scheduling policy is given. Second, leveraging the conflict graph model, we transform the original joint optimization problem into a maximum weight independent set problem when the power allocation strategy is given. Finally, we utilize the genetic framework to combine the above special solutions as a two-layer solution for the joint optimization problem. Simulation results demonstrate that our proposed solution can properly balance the sum weights of tasks and the total energy consumption, achieving superior system performance over the current best alternatives.

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2312.15721 [pdf, ps, other]

UAV Trajectory Tracking via RNN-enhanced IMM-KF with ADS-B Data

Authors: Yian Zhu, Ziye Jia, Qihui Wu, Chao Dong, Zirui Zhuang, Huiling Hu, Qi Cai

Abstract: With the increasing use of autonomous unmanned aerial vehicles (UAVs), it is critical to ensure that they are continuously tracked and controlled, especially when UAVs operate beyond the communication range of ground stations (GSs). Conventional surveillance methods for UAVs, such as satellite communications, ground mobile networks and radars are subject to high costs and latency. The automatic dependent surveillance-broadcast (ADS-B) emerges as a promising method to monitor UAVs, due to the advantages of real-time capabilities, easy deployment and affordable cost. Therefore, we employ the ADS-B for UAV trajectory tracking in this work. However, the inherent noise in the transmitted data poses an obstacle for precisely tracking UAVs. Hence, we propose the algorithm of recurrent neural network-enhanced interacting multiple model-Kalman filter (RNN-enhanced IMM-KF) for UAV trajectory filtering. Specifically, the algorithm utilizes the RNN to capture the maneuvering behavior of UAVs and the noise level in the ADS-B data. Moreover, accurate UAV tracking is achieved by adaptively adjusting the process noise matrix and observation noise matrix of IMM-KF with the assistance of the RNN. The proposed algorithm can facilitate GSs to make timely decisions during trajectory deviations of UAVs and improve the airspace safety. Finally, via comprehensive simulations, the total root mean square error of the proposed algorithm decreases by 28.56%, compared to the traditional IMM-KF.

Submitted 25 December, 2023; originally announced December 2023.
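
To make the roles of the process noise and observation noise in the abstract above concrete, here is a minimal scalar Kalman filter sketch. It is a plain single-model filter, not the RNN-enhanced IMM-KF the abstract describes; the function name `kalman_1d` and the fixed values `q` and `r` are assumptions for illustration, whereas the paper adjusts the corresponding noise matrices adaptively.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a nearly constant state.

    q: process noise variance (hypothetical fixed value).
    r: observation noise variance (hypothetical fixed value).
    Returns the filtered estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q              # predict: uncertainty grows by the process noise
        k = p / (p + r)        # Kalman gain: trust in the new measurement
        x = x + k * (z - x)    # update the state with the weighted innovation
        p = (1.0 - k) * p      # update the estimate variance
        estimates.append(x)
    return estimates
```

A larger `q` makes the filter follow maneuvers more quickly at the cost of passing more noise through, while a larger `r` smooths more heavily; this is the trade-off that adaptive tuning of the two noise terms targets.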

arXiv:2312.15633 [pdf, other]

MuLA-GAN: Multi-Level Attention GAN for Enhanced Underwater Visibility

Authors: Ahsan Baidar Bakht, Zikai Jia, Muhayy ud Din, Waseem Akram, Lyes Saad Soud, Lakmal Seneviratne, Defu Lin, Shaoming He, Irfan Hussain

Abstract: The underwater environment presents unique challenges, including color distortions, reduced contrast, and blurriness, hindering accurate analysis. In this work, we introduce MuLA-GAN, a novel approach that leverages the synergistic power of Generative Adversarial Networks (GANs) and Multi-Level Attention mechanisms for comprehensive underwater image enhancement. The integration of Multi-Level Attention within the GAN architecture significantly enhances the model's capacity to learn discriminative features crucial for precise image restoration. By selectively focusing on relevant spatial and multi-level features, our model excels in capturing and preserving intricate details in underwater imagery, essential for various applications. Extensive qualitative and quantitative analyses on diverse datasets, including UIEB test dataset, UIEB challenge dataset, U45, and UCCS dataset, highlight the superior performance of MuLA-GAN compared to existing state-of-the-art methods. Experimental evaluations on a specialized dataset tailored for bio-fouling and aquaculture applications demonstrate the model's robustness in challenging environmental conditions. On the UIEB test dataset, MuLA-GAN achieves exceptional PSNR (25.59) and SSIM (0.893) scores, surpassing Water-Net, the second-best model, with scores of 24.36 and 0.885, respectively. This work not only addresses a significant research gap in underwater image enhancement but also underscores the pivotal role of Multi-Level Attention in enhancing GANs, providing a novel and comprehensive framework for restoring underwater image quality.

Submitted 25 December, 2023; originally announced December 2023.

arXiv:2309.11992 [pdf, other]

UAV Swarm Deployment and Trajectory for 3D Area Coverage via Reinforcement Learning

Authors: Jia He, Ziye Jia, Chao Dong, Junyu Liu, Qihui Wu, **gxian Liu

Abstract: Unmanned aerial vehicles (UAVs) are recognized as promising technologies for area coverage due to the flexibility and adaptability. However, the ability of a single UAV is limited, and as for the large-scale three-dimensional (3D) scenario, UAV swarms can establish seamless wireless communication services. Hence, in this work, we consider a scenario of UAV swarm deployment and trajectory to satisfy 3D coverage considering the effects of obstacles. In detail, we propose a hierarchical swarm framework to efficiently serve the large-area users. Then, the problem is formulated to minimize the total trajectory loss of the UAV swarm. However, the problem is intractable due to the non-convex property, and we decompose it into smaller issues of users clustering, UAV swarm hovering points selection, and swarm trajectory determination. Moreover, we design a Q-learning based algorithm to accelerate the solution efficiency. Finally, we conduct extensive simulations to verify the proposed mechanisms, and the designed algorithm outperforms other referred methods.

Submitted 21 September, 2023; originally announced September 2023.
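
The abstract above mentions a Q-learning based algorithm without giving details. As a generic reminder of what a tabular Q-learning step looks like, the sketch below applies the standard update rule to a dictionary-backed table. The function name `q_update`, the state and action encoding, and the default value of 0.0 for unseen entries are assumptions for illustration and are not taken from the paper.

```python
def q_update(q, state, action, reward, next_state, actions, alpha, gamma):
    """Apply one tabular Q-learning update and return the new Q-value.

    q: dict mapping (state, action) to a value; missing entries count as 0.0.
    alpha: learning rate, gamma: discount factor.
    """
    # Greedy value of the next state over all available actions.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    # Standard rule: Q <- Q + alpha * (r + gamma * max_a' Q(s', a') - Q)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

In a trajectory-planning setting, a state might encode the current hovering point and an action the next one to visit, with the reward penalizing trajectory cost; that mapping is a guess about usage, not a description of the authors' design.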

arXiv:2307.09729 [pdf, other]

NTIRE 2023 Quality Assessment of Video Enhancement Challenge

Authors: Xiaohong Liu, Xiongkuo Min, Wei Sun, Yulun Zhang, Kai Zhang, Radu Timofte, Guangtao Zhai, Yixuan Gao, Yuqin Cao, Tengchuan Kou, Yunlong Dong, Ziheng Jia, Yilin Li, Wei Wu, Shuming Hu, Sibin Deng, Pengxiang Xiao, Ying Chen, Kai Li, Kai Zhao, Kun Yuan, Ming Sun, Heng Cong, Hao Wang, Lingzhi Fu, et al. (47 additional authors not shown)

Abstract: This paper reports on the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2023. This challenge is to address a major challenge in the field of video processing, namely, video quality assessment (VQA) for enhanced videos. The challenge uses the VQA Dataset for Perceptual Video Enhancement (VDPVE), which has a total of 1211 enhanced videos, including 600 videos with color, brightness, and contrast enhancements, 310 videos with deblurring, and 301 deshaked videos. The challenge has a total of 167 registered participants. 61 participating teams submitted their prediction results during the development phase, with a total of 3168 submissions. A total of 176 submissions were submitted by 37 participating teams during the final testing phase. Finally, 19 participating teams submitted their models and fact sheets, and detailed the methods they used. Some methods have achieved better results than baseline methods, and the winning methods have demonstrated superior prediction performance.

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2307.09008 [pdf, other]

Soft-IntroVAE for Continuous Latent space Image Super-Resolution

Authors: Zhi-Song Liu, Zijia Wang, Zhen Jia

Abstract: Continuous image super-resolution (SR) has recently received a lot of attention from researchers for its practical and flexible image scaling for various displays. Local implicit image representation is one of the methods that can map the coordinates and 2D features for latent space interpolation. Inspired by Variational AutoEncoder, we propose a Soft-introVAE for continuous latent space image super-resolution (SVAE-SR). A novel latent space adversarial training is achieved for photo-realistic image restoration. To further improve the quality, a positional encoding scheme is used to extend the original pixel coordinates by aggregating frequency information over the pixel areas. We show the effectiveness of the proposed SVAE-SR through quantitative and qualitative comparisons, and further, illustrate its generalization in denoising and real-image super-resolution.

Submitted 18 July, 2023; originally announced July 2023.

Comments: 5 pages, 4 figures

Journal ref: 2023 IEEE International Conference on Image Processing

arXiv:2307.01534 [pdf, other]

Impact of UAVs Equipped with ADS-B on the Civil Aviation Monitoring System

Authors: Yiyang Liao, Lei Zhang, Ziye Jia, Chao Dong, Yifan Zhang, Qihui Wu, Huiling Hu, Bin Wang

Abstract: In recent years, there has been an increasing demand for unmanned aerial vehicles (UAVs) to complete multiple applications. However, as unmanned equipment, UAVs pose some security risks to general civil aviation. In order to strengthen the flight management of UAVs and guarantee the safety, UAVs can be equipped with automatic dependent surveillance-broadcast (ADS-B) devices. In addition, as an automatic system, ADS-B can periodically broadcast flight information to the nearby aircraft or the ground stations, and the technology is already used in civil aviation systems. However, due to the limited frequency of the ADS-B technique, UAVs equipped with ADS-B devices result in the loss of packets to both UAVs and civil aviation. Further, the operation of civil aviation is seriously disrupted. Hence, this paper firstly examines the packet loss of civil planes at different distances, then analyzes the impact of UAVs equipped with ADS-B on the packet updating of civil planes. The result indicates that the 1090MHz band blocking is affected by the density of UAVs. Besides, the frequency capacity is affected by the requirement of updating interval of civil planes. The position updating probability within 3s is 92.3% if there are 200 planes within 50km and 20 UAVs within 5km. The position updating probability within 3s is 86.9% if there are 200 planes within 50km and 40 UAVs within 5km.

Submitted 4 July, 2023; originally announced July 2023.

arXiv:2307.00234 [pdf, ps, other]

The Potential of LEO Satellites in 6G Space-Air-Ground Enabled Access Networks

Authors: Ziye Jia, Chao Dong, Kun Guo, Qihui Wu

Abstract: Space-air-ground integrated networks (SAGINs) help enhance the service performance in the sixth generation communication system. SAGIN is basically composed of satellites, aerial vehicles, ground facilities, as well as multiple terrestrial users. Therein, the low earth orbit (LEO) satellites are popular in recent years due to the low cost of development and launch, global coverage and delay-enabled services. Moreover, LEO satellites can support various applications, e.g., direct access, relay, caching and computation. In this work, we firstly provide the preliminaries and framework of SAGIN, in which the characteristics of LEO satellites, high altitude platforms, as well as unmanned aerial vehicles are analyzed. Then, the roles and potentials of LEO satellite in SAGIN are analyzed for access services. A couple of advanced techniques such as multi-access edge computing (MEC) and network function virtualization are introduced to enhance the LEO-based access service abilities as hierarchical MEC and network slicing in SAGIN. In addition, corresponding use cases are provided to verify the propositions. Besides, we also discuss the open issues and promising directions in LEO-enabled SAGIN access services for the future research.

Submitted 1 July, 2023; originally announced July 2023.

arXiv:2306.14105 [pdf, other]

Sequential Manipulation Planning for Over-actuated Unmanned Aerial Manipulators

Authors: Yao Su, Jiarui Li, Ziyuan Jiao, Meng Wang, Chi Chu, Hang Li, Yixin Zhu, Hangxin Liu

Abstract: We investigate the sequential manipulation planning problem for unmanned aerial manipulators (UAMs). Unlike prior work that primarily focuses on one-step manipulation tasks, sequential manipulations require coordinated motions of a UAM's floating base, the manipulator, and the object being manipulated, entailing a unified kinematics and dynamics model for motion planning under designated constraints. By leveraging a virtual kinematic chain (VKC)-based motion planning framework that consolidates components' kinematics into one chain, the sequential manipulation task of a UAM can be planned as a whole, yielding more coordinated motions. Integrating the kinematics and dynamics models with a hierarchical control framework, we demonstrate, for the first time, an over-actuated UAM achieves a series of new sequential manipulation capabilities in both simulation and experiment.

Submitted 10 July, 2023; v1 submitted 24 June, 2023; originally announced June 2023.

Journal ref: IROS 2023

arXiv:2305.16115 [pdf]

Development of high accurate family-use digital refractometer based on CMOS

Authors: Zhenxing Wang, Zhenyuan Jia

Abstract: This study aims to develop a low-cost refractometer for measuring the sucrose content of fruit juice, which is an important factor affecting human health. While laboratory-grade refractometers are expensive and unsuitable for personal use, existing low-cost commercial options lack stability and accuracy. To address this gap, we propose a refractometer that replaces the expensive CCD sensor and light source with a conventional LED and a reasonably priced CMOS sensor. By analyzing the output waveform pattern of the CMOS sensor, we achieve high precision with a personal-use-appropriate accuracy of 0.1%. We tested the proposed refractometer by conducting 100 repeated measurements on various fruit juice samples, and the results demonstrate its reliability and consistency. Running on a 48 MHz ARM processor, the algorithm can acquire data within 0.2 seconds. Our low-cost refractometer is suitable for personal health management and small-scale production, providing an affordable and reliable method for measuring sucrose concentration in fruit juice. It improves upon the existing low-cost options by offering better stability and accuracy. This accessible tool has potential applications in optimizing the sucrose content of fruit juice for better health and quality control.

Submitted 25 May, 2023; originally announced May 2023.

arXiv:2305.05105 [pdf, other]

TinyML Design Contest for Life-Threatening Ventricular Arrhythmia Detection

Authors: Zhenge Jia, Dawei Li, Cong Liu, Liqi Liao, Xiaowei Xu, Lichuan Ping, Yiyu Shi

Abstract: The first ACM/IEEE TinyML Design Contest (TDC) held at the 41st International Conference on Computer-Aided Design (ICCAD) in 2022 is a challenging, multi-month, research and development competition. TDC'22 focuses on real-world medical problems that require the innovation and implementation of artificial intelligence/machine learning (AI/ML) algorithms on implantable devices. The challenge problem of TDC'22 is to develop a novel AI/ML-based real-time detection algorithm for life-threatening ventricular arrhythmia over low-power microcontrollers utilized in Implantable Cardioverter-Defibrillators (ICDs). The dataset contains more than 38,000 5-second intracardiac electrograms (IEGMs) segments over 8 different types of rhythm from 90 subjects. The dedicated hardware platform is NUCLEO-L432KC manufactured by STMicroelectronics. TDC'22, which is open to multi-person teams world-wide, attracted more than 150 teams from over 50 organizations. This paper first presents the medical problem, dataset, and evaluation procedure in detail. It further demonstrates and discusses the designs developed by the leading teams as well as representative results. This paper concludes with the direction of improvement for the future TinyML design for health monitoring applications.

Submitted 26 August, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: The paper is about the first TinyML design contest for healthcare

arXiv:2304.14920 [pdf, other]

An EEG Channel Selection Framework for Driver Drowsiness Detection via Interpretability Guidance

Authors: Xinliang Zhou, Dan Lin, Ziyu Jia, Chenyu Liu, Liming Zhai, Yang Liu

Abstract: Drowsy driving has a crucial influence on driving safety, creating an urgent demand for driver drowsiness detection. Electroencephalogram (EEG) signal can accurately reflect the mental fatigue state and thus has been widely studied in drowsiness monitoring. However, the raw EEG data is inherently noisy and redundant, which is neglected by existing works that just use single-channel EEG data or full-head channel EEG data for model training, resulting in limited performance of driver drowsiness detection. In this paper, we are the first to propose an Interpretability-guided Channel Selection (ICS) framework for the driver drowsiness detection task. Specifically, we design a two-stage training strategy to progressively select the key contributing channels with the guidance of interpretability. We first train a teacher network in the first stage using full-head channel EEG data. Then we apply the class activation mapping (CAM) to the trained teacher model to highlight the high-contributing EEG channels and further propose a channel voting scheme to select the top N contributing EEG channels. Finally, we train a student network with the selected channels of EEG data in the second stage for driver drowsiness detection. Experiments are designed on a public dataset, and the results demonstrate that our method is highly applicable and can significantly improve the performance of cross-subject driver drowsiness detection.

Submitted 26 April, 2023; originally announced April 2023.

arXiv:2304.10755 [pdf, other]

Interpretable and Robust AI in EEG Systems: A Survey

Authors: Xinliang Zhou, Chenyu Liu, Liming Zhai, Ziyu Jia, Cuntai Guan, Yang Liu

Abstract: The close coupling of artificial intelligence (AI) and electroencephalography (EEG) has substantially advanced human-computer interaction (HCI) technologies in the AI era. Different from traditional EEG systems, the interpretability and robustness of AI-based EEG systems are becoming particularly crucial. The interpretability clarifies the inner working mechanisms of AI models and thus can gain the trust of users. The robustness reflects the AI's reliability against attacks and perturbations, which is essential for sensitive and fragile EEG signals. Thus the interpretability and robustness of AI in EEG systems have attracted increasing attention, and their research has achieved great progress recently. However, there is still no survey covering recent advances in this field. In this paper, we present the first comprehensive survey and summarize the interpretable and robust AI techniques for EEG systems. Specifically, we first propose a taxonomy of interpretability by characterizing it into three types: backpropagation, perturbation, and inherently interpretable methods. Then we classify the robustness mechanisms into four classes: noise and artifacts, human variability, data acquisition instability, and adversarial attacks. Finally, we identify several critical and unresolved challenges for interpretable and robust AI in EEG systems and further discuss their future directions.

Submitted 30 August, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

arXiv:2303.06933 [pdf, ps, other]

Distributionally Robust Chance-Constrained Optimization for Hierarchical UAV-based MEC

Authors: Can Cui, Ziye Jia, Chao Dong, Zhuang Ling, Jiahao You, Qihui Wu

Abstract: Multi-access edge computing (MEC) is regarded as a promising technology in the sixth-generation communication. However, the antenna gain is always affected by the environment when unmanned aerial vehicles (UAVs) are served as MEC platforms, resulting in unexpected channel errors. In order to deal with the problem and reduce the power consumption in the UAV-based MEC, we jointly optimize the access scheme and power allocation in the hierarchical UAV-based MEC. Specifically, UAVs are deployed in the lower layer to collect data from ground users. Moreover, a UAV with powerful computation ability is deployed in the upper layer to assist with computing. The goal is to guarantee the quality of service and minimize the total power consumption. We consider the errors caused by various perturbations in realistic circumstances and formulate a distributionally robust chance-constrained optimization problem with an uncertainty set. The problem with chance constraints is intractable. To tackle this issue, we utilize the conditional value-at-risk method to reformulate the problem into a semidefinite programming form. Then, a joint algorithm for access scheme and power allocation is designed. Finally, we conduct simulations to demonstrate the efficiency of the proposed algorithm.

Submitted 13 March, 2023; originally announced March 2023.

arXiv:2303.01020 [pdf, other]

SFC Deployment in Space-Air-Ground Integrated Networks Based on Matching Game

Authors: Yilu Cao, Ziye Jia, Chao Dong, Yanting Wang, Jiahao You, Qihui Wu

Abstract: The space-air-ground integrated network (SAGIN) is dynamic and flexible, which can support transmitting data in environments lacking ground communication facilities. However, the nodes of SAGIN are heterogeneous and it is intractable to share the resources to provide multiple services. Therefore, in this paper, we consider using network function virtualization technology to handle the problem of agile resource allocation. In particular, the service function chains (SFCs) are constructed to deploy multiple virtual network functions of different tasks. To depict the dynamic model of SAGIN, we propose the reconfigurable time extension graph. Then, an optimization problem is formulated to maximize the number of completed tasks, i.e., the successful deployed SFC. It is a mixed integer linear programming problem, which is hard to solve in limited time complexity. Hence, we transform it as a many-to-one two-sided matching game problem. Then, we design a Gale-Shapley based algorithm. Finally, via abundant simulations, it is verified that the designed algorithm can effectively deploy SFCs with efficient resource utilization.

Submitted 2 March, 2023; originally announced March 2023.

arXiv:2212.08624 [pdf, ps, other]

Development of A Real-time POCUS Image Quality Assessment and Acquisition Guidance System

Authors: Zhenge Jia, Yiyu Shi, Jingtong Hu, Lei Yang, Benjamin Nti

Abstract: Point-of-care ultrasound (POCUS) is one of the most commonly applied tools for cardiac function imaging in the clinical routine of the emergency department and pediatric intensive care unit. The prior studies demonstrate that AI-assisted software can guide nurses or novices without prior sonography experience to acquire POCUS by recognizing the interest region, assessing image quality, and providing instructions. However, these AI algorithms cannot simply replace the role of skilled sonographers in acquiring diagnostic-quality POCUS. Unlike chest X-ray, CT, and MRI, which have standardized imaging protocols, POCUS can be acquired with high inter-observer variability. Despite this variability, the acquired images are usually all clinically acceptable and interpretable. In challenging clinical environments, sonographers employ novel heuristics to acquire POCUS in complex scenarios. To help novice learners expedite the training process while reducing the dependency on experienced sonographers in the curriculum implementation, we will develop a framework to perform real-time AI-assisted quality assessment and probe position guidance to provide training process for novice learners with less manual intervention.

Submitted 18 December, 2022; v1 submitted 16 December, 2022; originally announced December 2022.

arXiv:2211.05256 [pdf, other]

Power Efficient Video Super-Resolution on Mobile NPUs with Deep Learning, Mobile AI & AIM 2022 challenge: Report

Authors: Andrey Ignatov, Radu Timofte, Cheng-Ming Chiang, Hsien-Kai Kuo, Yu-Syuan Xu, Man-Yu Lee, Allen Lu, Chia-Ming Cheng, Chih-Cheng Chen, Jia-Ying Yong, Hong-Han Shuai, Wen-Huang Cheng, Zhuang Jia, Tianyu Xu, Yijian Zhang, Long Bao, Heng Sun, Diankai Zhang, Si Gao, Shaoli Liu, Biao Wu, Xiaofeng Zhang, Chengjian Zheng, Kaidi Lu, Ning Wang , et al. (29 additional authors not shown)

Abstract: Video super-resolution is one of the most popular tasks on mobile devices, being widely used for an automatic improvement of low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and power efficiency on mobile devices. In this Mobile AI challenge, we address this problem and propose the participants to design an end-to-end real-time video super-resolution solution for mobile NPUs optimized for low energy consumption. The participants were provided with the REDS training dataset containing video sequences for a 4X video upscaling task. The runtime and power efficiency of all models was evaluated on the powerful MediaTek Dimensity 9000 platform with a dedicated AI processing unit capable of accelerating floating-point and quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 500 FPS rate and 0.2 [Watt / 30 FPS] power consumption. A detailed description of all models developed in the challenge is provided in this paper.

Submitted 7 November, 2022; originally announced November 2022.

Comments: arXiv admin note: text overlap with arXiv:2105.08826, arXiv:2105.07809, arXiv:2211.04470, arXiv:2211.03885

arXiv:2210.15126 [pdf, other]

SWheg: A Wheel-Leg Transformable Robot With Minimalist Actuator Realization

Authors: Cunxi Dai, Xiaohan Liu, Jianxiang Zhou, Zhengtao Liu, Zhenzhong Jia

Abstract: This article presents the design, implementation, and performance evaluation of SWheg, a novel modular wheel-leg transformable robot family with minimalist actuator realization. SWheg takes advantage of both wheeled and legged locomotion by seamlessly integrating them on a single platform. In contrast to other designs that use multiple actuators, SWheg uses only one actuator to drive the transformation of all the wheel-leg modules in sync. This means an N-legged SWheg robot requires only N+1 actuators, which can significantly reduce the cost and malfunction rate of the platform. The tendon-driven wheel-leg transformation mechanism based on a four-bar linkage can perform fast morphology transitions between wheels and legs. We validated the design principle with two SWheg robots with four and six wheel-leg modules separately, namely Quadrupedal SWheg and Hexapod SWheg. The design process, mechatronics infrastructure, and the gait behavioral development of both platforms were discussed. The performance of the robot was evaluated in various scenarios, including driving and turning in wheeled mode, step crossing, irregular terrain passing, and stair climbing in legged mode. The comparison between these two platforms was also discussed.

Submitted 26 October, 2022; originally announced October 2022.

arXiv:2209.08337 [pdf]

doi 10.1016/j.knosys.2022.109824

Lightweight Spatial-Channel Adaptive Coordination of Multilevel Refinement Enhancement Network for Image Reconstruction

Authors: Yuxi Cai, Huicheng Lai, Zhenghong Jia

Abstract: Benefiting from the vigorous development of deep learning, many CNN-based image super-resolution methods have emerged and achieved better results than traditional algorithms. However, it is difficult for most algorithms to adaptively adjust the spatial region and channel features at the same time, let alone the information exchange between them. In addition, the exchange of information between attention modules is even less visible to researchers. To solve these problems, we put forward a lightweight spatial-channel adaptive coordination of multilevel refinement enhancement networks (MREN). Specifically, we construct a space-channel adaptive coordination block, which enables the network to learn the spatial region and channel feature information of interest under different receptive fields. In addition, the information of the corresponding feature processing level between the spatial part and the channel part is exchanged with the help of jump connection to achieve the coordination between the two. We establish a communication bridge between attention modules through a simple linear combination operation, so as to more accurately and continuously guide the network to pay attention to the information of interest. Extensive experiments on several standard test sets have shown that our MREN achieves superior performance over other advanced algorithms with a very small number of parameters and very low computational complexity.

Submitted 17 September, 2022; originally announced September 2022.

arXiv:2209.03272 [pdf]

Compact and Robust Deep Learning Architecture for Fluorescence Lifetime Imaging and FPGA Implementation

Authors: Zhenya Zang, Dong Xiao, Quan Wang, Ziao Jiao, Chen Yu, David Day-Uei Li

Abstract: This paper reported a bespoke adder-based deep learning network for time-domain fluorescence lifetime imaging (FLIM). By leveraging the l1-norm extraction method, we propose a 1-D Fluorescence Lifetime AdderNet (FLAN) without multiplication-based convolutions to reduce the computational complexity. Further, we compressed fluorescence decays in temporal dimension using a log-scale merging technique to discard redundant temporal information derived as log-scaling FLAN (FLAN+LS). FLAN+LS achieves 0.11 and 0.23 compression ratios compared with FLAN and a conventional 1-D convolutional neural network (1-D CNN) while maintaining high accuracy in retrieving lifetimes. We extensively evaluated FLAN and FLAN+LS using synthetic and real data. A traditional fitting method and other non-fitting, high-accuracy algorithms were compared with our networks for synthetic data. Our networks attained a minor reconstruction error in different photon-count scenarios. For real data, we used fluorescent beads' data acquired by a confocal microscope to validate the effectiveness of real fluorophores, and our networks can differentiate beads with different lifetimes. Additionally, we implemented the network architecture on a field-programmable gate array (FPGA) with a post-quantization technique to shorten the bit-width, thereby improving computing efficiency. FLAN+LS on hardware achieves the highest computing efficiency compared to 1-D CNN and FLAN. We also discussed the applicability of our network and hardware architecture for other time-resolved biomedical applications using photon-efficient, time-resolved sensors.

Submitted 9 September, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

Comments: 13 pages, 14 figures

arXiv:2209.00436 [pdf, other]

Recurrent LSTM-based UAV Trajectory Prediction with ADS-B Information

Authors: Yifan Zhang, Ziye Jia, Chao Dong, Yuntian Liu, Lei Zhang, Qihui Wu

Abstract: Recently, unmanned aerial vehicles (UAVs) are gathering increasing attention from both academia and industry. The ever-growing number of UAVs brings challenges for air traffic control (ATC), and thus trajectory prediction plays a vital role in ATC, especially for avoiding collisions among UAVs. However, the dynamic flight of UAVs aggravates the complexity of trajectory prediction. Different from civil aviation aircraft, the most intractable difficulty for UAV trajectory prediction lies in acquiring effective location information. Fortunately, the automatic dependent surveillance-broadcast (ADS-B) is an effective technique to help obtain positioning information. It is widely used in civil aviation aircraft, due to its high data update frequency and the low cost of constructing the corresponding ground stations. Hence, in this work, we consider leveraging ADS-B to help UAV trajectory prediction. However, even with the ADS-B information for a UAV, an efficient mechanism to predict the UAV trajectory is still lacking. It is noted that the recurrent neural network (RNN) is available for the UAV trajectory prediction, in which the long short-term memory (LSTM) is specialized in dealing with the time-series data. As above, in this work, we design a system of UAV trajectory prediction with the ADS-B information, and propose the recurrent LSTM (RLSTM) based algorithm to achieve the accurate prediction. Finally, extensive simulations are conducted by Python to evaluate the proposed algorithms, and the results show that the average trajectory prediction error is satisfied, which is in line with expectations.

Submitted 1 September, 2022; originally announced September 2022.

arXiv:2208.14600 [pdf, other]

ELSR: Extreme Low-Power Super Resolution Network For Mobile Devices

Authors: Tianyu Xu, Zhuang Jia, Yijian Zhang, Long Bao, Heng Sun

Abstract: With the popularity of mobile devices, e.g., smartphones and wearable devices, a lighter and faster model is crucial for the application of video super resolution. However, most previous lightweight models tend to concentrate on reducing the latency of model inference on desktop GPUs, which may not be energy efficient on current mobile devices. In this paper, we proposed the Extreme Low-Power Super Resolution (ELSR) network, which only consumes a small amount of energy on mobile devices. Pretraining and finetuning methods are applied to boost the performance of the extremely tiny model. Extensive experiments show that our method achieves an excellent balance between restoration quality and power consumption. Finally, we achieve a competitive score of 90.9 with PSNR 27.34 dB and power 0.09 W/30FPS on the target MediaTek Dimensity 9000 platform, ranking 1st place in the Mobile AI & AIM 2022 Real-Time Video Super-Resolution Challenge.

Submitted 30 August, 2022; originally announced August 2022.

arXiv:2208.11184 [pdf, other]

AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results

Authors: Ren Yang, Radu Timofte, Xin Li, Qi Zhang, Lin Zhang, Fanglong Liu, Dongliang He, Fu li, He Zheng, Weihang Yuan, Pavel Ostyakov, Dmitry Vyal, Magauiya Zhussip, Xueyi Zou, Youliang Yan, Lei Li, Jingzhu Tang, Ming Chen, Shijie Zhao, Yu Zhu, Xiaoran Qin, Chenghua Li, Cong Leng, Jian Cheng, Claudio Rota, et al. (28 additional authors not shown)

Abstract: This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. This challenge includes two tracks. Track 1 aims at the super-resolution of compressed image, and Track 2 targets the super-resolution of compressed video. In Track 1, we use the popular dataset DIV2K as the training, validation and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos, including the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, there are 12 teams and 2 teams that submitted the final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state-of-the-art of super-resolution on compressed image and video. The proposed LDV 3.0 dataset is available at https://github.com/RenYang-home/LDV_dataset. The homepage of this challenge is at https://github.com/RenYang-home/AIM22_CompressSR.

Submitted 25 August, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: Camera-ready version

arXiv:2206.01739 [pdf]

Mutual- and Self- Prototype Alignment for Semi-supervised Medical Image Segmentation

Authors: Zhenxi Zhang, Chunna Tian, Zhicheng Jiao

Abstract: Semi-supervised learning methods have been explored in medical image segmentation tasks due to the scarcity of pixel-level annotation in the real scenario. Prototype alignment based consistency constraint is an intuitional and plausible solution to explore the useful information in the unlabeled data. In this paper, we propose a mutual- and self- prototype alignment (MSPA) framework to better utilize the unlabeled data. In specific, mutual-prototype alignment enhances the information interaction between labeled and unlabeled data. The mutual-prototype alignment imposes two consistency constraints in reverse directions between the unlabeled and labeled data, which enables the consistent embedding and model discriminability on unlabeled data. The proposed self-prototype alignment learns more stable region-wise features within unlabeled images, which optimizes the classification margin in semi-supervised segmentation by boosting the intra-class compactness and inter-class separation on the feature space. Extensive experimental results on three medical datasets demonstrate that with a small amount of labeled data, MSPA achieves large improvements by leveraging the unlabeled data. Our method also outperforms seven state-of-the-art semi-supervised segmentation methods on all three datasets.

Submitted 2 June, 2022; originally announced June 2022.

Comments: 11 pages, 3 figures

arXiv:2202.06046 [pdf, ps, other]

Hierarchical Aerial Computing for Internet of Things via Cooperation of HAPs and UAVs

Authors: Ziye Jia, Qihui Wu, Chao Dong, Chau Yuen, Zhu Han

Abstract: With the explosive increment of computation requirements, the multi-access edge computing (MEC) paradigm appears as an effective mechanism. Besides, as for the Internet of Things (IoT) in disasters or remote areas requiring MEC services, unmanned aerial vehicles (UAVs) and high altitude platforms (HAPs) are available to provide aerial computing services for these IoT devices. In this paper, we develop the hierarchical aerial computing framework composed of HAPs and UAVs, to provide MEC services for various IoT applications. In particular, the problem is formulated to maximize the total IoT data computed by the aerial MEC platforms, restricted by the delay requirement of IoT and multiple resource constraints of UAVs and HAPs, which is an integer programming problem and intractable to solve. Due to the prohibitive complexity of exhaustive search, we handle the problem by presenting the matching game theory based algorithm to deal with the offloading decisions from IoT devices to UAVs, as well as a heuristic algorithm for the offloading decisions between UAVs and HAPs. The external effect affected by interplay of different IoT devices in the matching is tackled by the externality elimination mechanism. Besides, an adjustment algorithm is also proposed to make the best of aerial resources. The complexity of proposed algorithms is analyzed and extensive simulation results verify the efficiency of the proposed algorithms, and the system performances are also analyzed by the numerical results.

Submitted 12 February, 2022; originally announced February 2022.

arXiv:2112.11779 [pdf, other]

Exploring Inter-frequency Guidance of Image for Lightweight Gaussian Denoising

Authors: Zhuang Jia

Abstract: Image denoising is of vital importance in many imaging or computer vision related areas. With the convolutional neural networks showing strong capability in computer vision tasks, the performance of image denoising has also been brought up by CNN based methods. Though CNN based image denoisers show promising results on this task, most of the current CNN based methods try to learn the mapping from noisy image to clean image directly, which lacks the explicit exploration of prior knowledge of images and noises. Natural images are observed to obey the reciprocal power law, implying that the low-frequency band of an image tends to occupy most of the energy. Thus, under AWGN (additive white Gaussian noise) deterioration, the low-frequency band tends to preserve a higher PSNR than the high-frequency band. Considering the spatial morphological consistency of different frequency bands, the low-frequency band with more fidelity can be used as a guidance to refine the more contaminated high-frequency bands. Based on this thought, we proposed a novel network architecture denoted as IGNet, in order to refine the frequency bands from low to high in a progressive manner. Firstly, it decomposes the feature maps into high- and low-frequency subbands using DWT (discrete wavelet transform) iteratively, and then the low band features are used to refine the high band features. Finally, the refined feature maps are processed by a decoder to recover the clean result. With this design, more inter-frequency prior and information are utilized, thus the model size can be lightened while still preserving competitive results. Experiments on several public datasets show that our model obtains competitive performance compared with other state-of-the-art methods yet with a lightweight structure.

Submitted 22 December, 2021; originally announced December 2021.

arXiv:2109.01824 [pdf, other]

Multi-View Spatial-Temporal Graph Convolutional Networks with Domain Generalization for Sleep Stage Classification

Authors: Ziyu Jia, Youfang Lin, Jing Wang, Xiaojun Ning, Yuanlai He, Ronghao Zhou, Yuhan Zhou, Li-wei H. Lehman

Abstract: Sleep stage classification is essential for sleep assessment and disease diagnosis. Although previous attempts to classify sleep stages have achieved high classification performance, several challenges remain open: 1) How to effectively utilize time-varying spatial and temporal features from multi-channel brain signals remains challenging. Prior works have not been able to fully utilize the spatial topological information among brain regions. 2) Due to the many differences found in individual biological signals, how to overcome the differences of subjects and improve the generalization of deep neural networks is important. 3) Most deep learning methods ignore the interpretability of the model to the brain. To address the above challenges, we propose a multi-view spatial-temporal graph convolutional networks (MSTGCN) with domain generalization for sleep stage classification. Specifically, we construct two brain view graphs for MSTGCN based on the functional connectivity and physical distance proximity of the brain regions. The MSTGCN consists of graph convolutions for extracting spatial features and temporal convolutions for capturing the transition rules among sleep stages. In addition, attention mechanism is employed for capturing the most relevant spatial-temporal information for sleep stage classification. Finally, domain generalization and MSTGCN are integrated into a unified framework to extract subject-invariant sleep features. Experiments on two public datasets demonstrate that the proposed model outperforms the state-of-the-art baselines.

Submitted 4 September, 2021; originally announced September 2021.

Comments: Accepted by IEEE Transactions on Neural Systems and Rehabilitation Engineering (TNSRE)

arXiv:2105.13864 [pdf, other]

SalientSleepNet: Multimodal Salient Wave Detection Network for Sleep Staging

Authors: Ziyu Jia, Youfang Lin, Jing Wang, Xuehui Wang, Peiyi Xie, Yingbin Zhang

Abstract: Sleep staging is fundamental for sleep assessment and disease diagnosis. Although previous attempts to classify sleep stages have achieved high classification performance, several challenges remain open: 1) How to effectively extract salient waves in multimodal sleep data; 2) How to capture the multi-scale transition rules among sleep stages; 3) How to adaptively seize the key role of specific modality for sleep staging. To address these challenges, we propose SalientSleepNet, a multimodal salient wave detection network for sleep staging. Specifically, SalientSleepNet is a temporal fully convolutional network based on the $\rm U^2$-Net architecture that is originally proposed for salient object detection in computer vision. It is mainly composed of two independent $\rm U^2$-like streams to extract the salient features from multimodal data, respectively. Meanwhile, the multi-scale extraction module is designed to capture multi-scale transition rules among sleep stages. Besides, the multimodal attention module is proposed to adaptively capture valuable information from multimodal data for the specific sleep stage. Experiments on the two datasets demonstrate that SalientSleepNet outperforms the state-of-the-art baselines. It is worth noting that this model has the least amount of parameters compared with the existing deep neural network models.

Submitted 24 May, 2021; originally announced May 2021.

arXiv:2104.10611 [pdf, other]

FourierNets enable the design of highly non-local optical encoders for computational imaging

Authors: Diptodip Deb, Zhenfei Jiao, Ruth Sims, Alex B. Chen, Michael Broxton, Misha B. Ahrens, Kaspar Podgorski, Srinivas C. Turaga

Abstract: Differentiable simulations of optical systems can be combined with deep learning-based reconstruction networks to enable high performance computational imaging via end-to-end (E2E) optimization of both the optical encoder and the deep decoder. This has enabled imaging applications such as 3D localization microscopy, depth estimation, and lensless photography via the optimization of local optical encoders. More challenging computational imaging applications, such as 3D snapshot microscopy which compresses 3D volumes into single 2D images, require a highly non-local optical encoder. We show that existing deep network decoders have a locality bias which prevents the optimization of such highly non-local optical encoders. We address this with a decoder based on a shallow neural network architecture using global kernel Fourier convolutional neural networks (FourierNets). We show that FourierNets surpass existing deep network based decoders at reconstructing photographs captured by the highly non-local DiffuserCam optical encoder. Further, we show that FourierNets enable E2E optimization of highly non-local optical encoders for 3D snapshot microscopy. By combining FourierNets with a large-scale multi-GPU differentiable optical simulation, we are able to optimize non-local optical encoders 170$\times$ to 7372$\times$ larger than prior state of the art, and demonstrate the potential for ROI-type specific optical encoding with a programmable microscope.

Submitted 2 November, 2022; v1 submitted 21 April, 2021; originally announced April 2021.

Comments: Accepted to NeurIPS 2022

arXiv:2102.07107 [pdf, other]

Distributed Estimation, Control and Coordination of Quadcopter Swarm Robots

Authors: Zheng Jia, Michael Hamer, Raffaello D'Andrea

Abstract: In this thesis we are interested in applying distributed estimation, control and optimization techniques to enable a group of quadcopters to fly through openings. The quadcopters are assumed to be equipped with a simulated bearing and distance sensor for localization. Some quadcopters are designated as leaders who carry global position sensors. We assume quadcopters can communicate information with each other.

Submitted 14 February, 2021; originally announced February 2021.

arXiv:2011.13539 [pdf]

Decoding PPP Corrections from BDS B2b Signals Using a Software-defined Receiver: an Initial Performance Evaluation

Authors: Xiangchen Lu, Liang Chen, Nan Shen, Lei Wang, Zhenhang Jiao, Ruizhi Chen

Abstract: With the rapid development of China's BeiDou Navigation Satellite System (BDS), the application of real-time precise point positioning (RTPPP) based on BDS has become an active research area in the field of Global Navigation Satellite System (GNSS). BDS has provided the service of broadcasting RTPPP information. It indicates that BDS has become the second satellite system that provides RTPPP services, following Galileo among the GNSS, but work based on this direction has yet to be explored. Therefore, this paper evaluates the performance of precise point positioning (PPP) service using a software-defined receiver (SDR). An experiment was carried out to verify the feasibility of the SDR. The PPP-B2b signal was processed to obtain PPP service information, including orbit corrections, clock corrections, and differential code bias corrections. The time-varying attributes of these corrections of BDS and GPS are evaluated, and the integrity and stability of the PPP service were analyzed. The results show the PPP-B2b signal can stably provide PPP services for satellites in the Asia-Pacific region, including centimeter to decimeter-level orbit corrections and meter-level clock correction for GPS satellites. Finally, a detection tip for bitstream availability in SDR is proposed. Some content which is not defined in the official document, such as the PPP-B2b frame arrangement, various correction update cycles and the progress of PPP service, is discussed.

Submitted 26 November, 2020; originally announced November 2020.

arXiv:2008.08060 [pdf]

Personalized Deep Learning for Ventricular Arrhythmias Detection on Medical IoT Systems

Authors: Zhenge Jia, Zhepeng Wang, Feng Hong, Lichuan Ping, Yiyu Shi, Jingtong Hu

Abstract: Life-threatening ventricular arrhythmias (VA) are the leading cause of sudden cardiac death (SCD), which is the most significant cause of natural death in the US. The implantable cardioverter defibrillator (ICD) is a small device implanted to patients under high risk of SCD as a preventive treatment. The ICD continuously monitors the intracardiac rhythm and delivers shock when detecting the life-threatening VA. Traditional methods detect VA by setting criteria on the detected rhythm. However, those methods suffer from a high inappropriate shock rate and require a regular follow-up to optimize criteria parameters for each ICD recipient. To ameliorate the challenges, we propose the personalized computing framework for deep learning based VA detection on medical IoT systems. The system consists of intracardiac and surface rhythm monitors, and the cloud platform for data uploading, diagnosis, and CNN model personalization. We equip the system with real-time inference on both intracardiac and surface rhythm monitors. To improve the detection accuracy, we enable the monitors to detect VA collaboratively by proposing the cooperative inference. We also introduce the CNN personalization for each patient based on the computing framework to tackle the unlabeled and limited rhythm data problem. When compared with the traditional detection algorithm, the proposed method achieves comparable accuracy on VA rhythm detection and 6.6% reduction in inappropriate shock rate, while the average inference latency is kept at 71ms.

Submitted 18 August, 2020; originally announced August 2020.

arXiv:2006.14345 [pdf]

doi 10.1016/j.patcog.2021.108515

Collaborative Boundary-aware Context Encoding Networks for Error Map Prediction

Authors: Zhenxi Zhang, Chunna Tian, Jie Li, Zhusi Zhong, Zhicheng Jiao, Xinbo Gao

Abstract: Medical image segmentation is usually regarded as one of the most important intermediate steps in clinical situations and medical imaging research. Thus, accurately assessing the segmentation quality of the automatically generated predictions is essential for guaranteeing the reliability of the results of the computer-assisted diagnosis (CAD). Many researchers apply neural networks to train segmentation quality regression models to estimate the segmentation quality of a new data cohort without labeled ground truth. Recently, a novel idea was proposed that transforms the segmentation quality assessment (SQA) problem into the pixel-wise error map prediction task in the form of segmentation. However, the simple application of vanilla segmentation structures in medical images fails to detect some small and thin error regions of the auto-generated masks with complex anatomical structures. In this paper, we propose collaborative boundary-aware context encoding networks called AEP-Net for the error prediction task. Specifically, we propose a collaborative feature transformation branch for better feature fusion between images and masks, and precise localization of error regions. Further, we propose a context encoding module to utilize the global predictor from the error map to enhance the feature representation and regularize the networks. We perform experiments on IBSR v2.0 dataset and ACDC dataset. The AEP-Net achieves an average DSC of 0.8358, 0.8164 for the error prediction task, and shows a high Pearson correlation coefficient of 0.9873 between the actual segmentation accuracy and the predicted accuracy inferred from the predicted error map on IBSR v2.0 dataset, which verifies the efficacy of our AEP-Net.

Submitted 25 June, 2020; originally announced June 2020.

Journal ref: Pattern Recognition PR_108515, 2022

arXiv:2006.00250 [pdf]

Sequence to Point Learning Based on Bidirectional Dilated Residual Network for Non Intrusive Load Monitoring

Authors: Ziyue Jia, Linfeng Yang, Zhenrong Zhang, Hui Liu, Fannie Kong

Abstract: Non Intrusive Load Monitoring (NILM), or Energy Disaggregation (ED), seeks to save energy by decomposing the corresponding appliances' power readings from an aggregate power reading of the whole house. It is a single channel blind source separation (SCBSS) problem and a difficult prediction problem because it is unidentifiable. Recent research shows that deep learning has become increasingly popular for the NILM problem. The ability of neural networks to extract load features is closely related to their depth. However, deep neural networks are difficult to train because of exploding gradient, vanishing gradient and network degradation. To solve these problems, we propose a sequence to point learning framework based on bidirectional (non-causal) dilated convolution for NILM. To be more convincing, we compare our method with the state-of-the-art method, Seq2point (Zhang), directly and compare with existing algorithms indirectly via the same two datasets and metrics. Experiments based on REDD and UK-DALE data sets show that our proposed approach is far superior to existing approaches in all appliances.

Submitted 30 May, 2020; originally announced June 2020.

arXiv:2004.08586 [pdf]

doi 10.1038/s41377-021-00610-w

0.75 Gbit/s high-speed classical key distribution with mode-shift keying chaos synchronization of Fabry-Perot lasers

Authors: Hua Gao, Anbang Wang, Longsheng Wang, Zhiwei Jia, Yuanyuan Guo, Zhensen Gao, Lianshan Yan, Yuwen Qin, Yuncai Wang

Abstract: High-speed physical key distribution is diligently pursued for secure communication. In this paper, we propose and experimentally demonstrate a scheme of high-speed key distribution using mode-shift keying chaos synchronization between two multi-longitudinal-mode Fabry-Perot lasers commonly driven by a super-luminescent diode. Legitimate users dynamically select one of the longitudinal modes according to private control codes to achieve mode-shift keying chaos synchronization. The two remote chaotic light waveforms are quantized to generate two raw random bit streams, and then those bits corresponding to chaos synchronization are sifted as shared keys by comparing the control codes. In this method, the transition time, i.e., the chaos synchronization recovery time, is determined by the rising time of the control codes rather than the laser transition response time, so the key distribution rate is improved greatly. Our experiment achieved 0.75-Gbit/s key distribution rate with a bit error rate of $3.8\times10^{-3}$ over 160-km fiber transmission with dispersion compensation. The entropy rate of the laser chaos is evaluated as 16 Gbit/s, which determines the ultimate final key rate together with key generation ratio. It is therefore believed that the method paves the way for Gbit/s physical key distribution.

Submitted 12 September, 2021; v1 submitted 18 April, 2020; originally announced April 2020.

Comments: 9 pages, 5 figures

Journal ref: Light: Science & Applications (2021)10:172

arXiv:2002.02040 [pdf, other]

doi 10.1109/TGRS.2020.2992043

Extracting dispersion curves from ambient noise correlations using deep learning

Authors: Xiaotian Zhang, Zhe Jia, Zachary E. Ross, Robert W. Clayton

Abstract: We present a machine-learning approach to classifying the phases of surface wave dispersion curves. Standard FTAN analysis of surfaces observed on an array of receivers is converted to an image, of which, each pixel is classified as fundamental mode, first overtone, or noise. We use a convolutional neural network (U-net) architecture with a supervised learning objective and incorporate transfer learning. The training is initially performed with synthetic data to learn coarse structure, followed by fine-tuning of the network using approximately 10% of the real data based on human classification. The results show that the machine classification is nearly identical to the human picked phases. Expanding the method to process multiple images at once did not improve the performance. The developed technique will facilitate automated processing of large dispersion curve datasets.

Submitted 5 February, 2020; originally announced February 2020.

arXiv:2001.07698 [pdf]

Intelligent Bandwidth Allocation for Latency Management in NG-EPON using Reinforcement Learning Methods

Authors: Qi Zhou, Jingjie Zhu, Junwen Zhang, Zhensheng Jia, Bernardo Huberman, Gee-Kung Chang

Abstract: A novel intelligent bandwidth allocation scheme in NG-EPON using reinforcement learning is proposed and demonstrated for latency management. We verify the capability of the proposed scheme under both fixed and dynamic traffic loads scenarios to achieve <1ms average latency. The RL agent demonstrates an efficient intelligent mechanism to manage the latency, which provides a promising IBA solution for the next-generation access network.

Submitted 21 January, 2020; originally announced January 2020.

arXiv:1911.11397 [pdf, other]

doi 10.1016/j.neucom.2021.04.134

Adaptive dynamic programming for nonaffine nonlinear optimal control problem with state constraints

Authors: Jingliang Duan, Zhengyu Liu, Shengbo Eben Li, Qi Sun, Zhenzhong Jia, Bo Cheng

Abstract: This paper presents a constrained adaptive dynamic programming (CADP) algorithm to solve general nonlinear nonaffine optimal control problems with known dynamics. Unlike previous ADP algorithms, it can directly deal with problems with state constraints. Firstly, a constrained generalized policy iteration (CGPI) framework is developed to handle state constraints by transforming the traditional policy improvement process into a constrained policy optimization problem. Next, we propose an actor-critic variant of CGPI, called CADP, in which both policy and value functions are approximated by multi-layer neural networks to directly map the system states to control inputs and value function, respectively. CADP linearizes the constrained optimization problem locally into a quadratically constrained linear programming problem, and then obtains the optimal update of the policy network by solving its dual problem. A trust region constraint is added to prevent excessive policy update, thus ensuring linearization accuracy. We determine the feasibility of the policy optimization problem by calculating the minimum trust region boundary and update the policy using two recovery rules when infeasible. The vehicle control problem in the path-tracking task is used to demonstrate the effectiveness of this proposed method.

Submitted 8 April, 2022; v1 submitted 26 November, 2019; originally announced November 2019.

Journal ref: Neurocomputing 484 (2022) 128-141

arXiv:1910.09773 [pdf]

Trident Segmentation CNN: A Spatiotemporal Transformation CNN for Punctate White Matter Lesions Segmentation in Preterm Neonates

Authors: Yalong Liu, Jie Li, Miaomiao Wang, Zhicheng Jiao, Jian Yang, Xianjun Li

Abstract: Accurate segmentation of punctate white matter lesions (PWML) in preterm neonates by an automatic algorithm can better assist doctors in diagnosis. However, the existing algorithms have many limitations, such as low detection accuracy and large resource consumption. In this paper, a novel spatiotemporal transformation deep learning method called Trident Segmentation CNN (TS-CNN) is proposed to segment PWML in MR images. It can convert spatial information into temporal information, which reduces the consumption of computing resources. Furthermore, a new improved training loss called Self-balancing Focal Loss (SBFL) is proposed to balance the loss during the training process. The whole model is evaluated on a dataset of 704 MR images. Overall the method achieves median DSC, sensitivity, specificity, and Hausdorff distance of 0.6355, 0.7126, 0.9998, and 24.5836 mm which outperforms the state-of-the-art algorithm. (The code is now available on https://github.com/YalongLiu/Trident-Segmentation-CNN)

Submitted 22 October, 2019; originally announced October 2019.

arXiv:1906.09684 [pdf]

Refined-Segmentation R-CNN: A Two-stage Convolutional Neural Network for Punctate White Matter Lesion Segmentation in Preterm Infants

Authors: Yalong Liu, Jie Li, Ying Wang, Miaomiao Wang, Xianjun Li, Zhicheng Jiao, Jian Yang, Xingbo Gao

Abstract: Accurate segmentation of punctate white matter lesion (PWML) in infantile brains by an automatic algorithm can reduce the potential risk of postnatal development. How to segment PWML effectively has become one of the active topics in medical image segmentation in recent years. In this paper, we construct an efficient two-stage PWML semantic segmentation network based on the characteristics of the lesion, called refined segmentation R-CNN (RS RCNN). We propose a heuristic RPN (H-RPN) which can utilize surrounding information around the PWMLs for heuristic segmentation. Also, we design a lightweight segmentation network to segment the lesion in a fast way. Densely connected conditional random field (DCRF) is used to optimize the segmentation results. We only use T1w MRIs to segment PWMLs. The result shows that our model can well segment the lesion of ordinary size or even pixel size. The Dice similarity coefficient reaches 0.6616, the sensitivity is 0.7069, the specificity is 0.9997, and the Hausdorff distance is 52.9130. The proposed method outperforms the state-of-the-art algorithm. (The code of this paper is available on https://github.com/YalongLiu/Refined-Segmentation-R-CNN)

Submitted 30 June, 2019; v1 submitted 23 June, 2019; originally announced June 2019.

arXiv:1906.07549 [pdf]

doi 10.1007/978-3-030-32226-7_60

An Attention-Guided Deep Regression Model for Landmark Detection in Cephalograms

Authors: Zhusi Zhong, Jie Li, Zhenxi Zhang, Zhicheng Jiao, Xinbo Gao

Abstract: Cephalometric tracing method is usually used in orthodontic diagnosis and treatment planning. In this paper, we propose a deep learning based framework to automatically detect anatomical landmarks in cephalometric X-ray images. We train the deep encoder-decoder for landmark detection, and combine global landmark configuration with local high-resolution feature responses. The proposed framework is based on 2-stage u-net, regressing the multi-channel heatmaps for landmark detection. In this framework, we embed attention mechanism with global stage heatmaps, guiding the local stage inferring, to regress the local heatmap patches in a high resolution. Besides, the Expansive Exploration strategy improves robustness while inferring, expanding the searching scope without increasing model complexity. We have evaluated our framework in the most widely-used public dataset of landmark detection in cephalometric X-ray images. With less computation and manually tuning, our framework achieves state-of-the-art results.

Submitted 27 September, 2020; v1 submitted 17 June, 2019; originally announced June 2019.

arXiv:1906.07367 [pdf]

A sparse annotation strategy based on attention-guided active learning for 3D medical image segmentation

Authors: Zhenxi Zhang, Jie Li, Zhusi Zhong, Zhicheng Jiao, Xinbo Gao

Abstract: 3D image segmentation is one of the most important and ubiquitous problems in medical image processing. It provides detailed quantitative analysis for accurate disease diagnosis, abnormal detection, and classification. Currently, deep learning algorithms are widely used in medical image segmentation, and most algorithms train models with fully annotated datasets. However, obtaining medical image datasets is very difficult and expensive, and full annotation of 3D medical images is monotonous and time-consuming work. Partially labelling informative slices in 3D images will be a great relief of manual annotation. Sample selection strategies based on active learning have been proposed in the field of 2D images, but few strategies focus on 3D images. In this paper, we propose a sparse annotation strategy based on attention-guided active learning for 3D medical image segmentation. Attention mechanism is used to improve segmentation accuracy and estimate the segmentation accuracy of each slice. The comparative experiments with three different strategies using datasets from the developing human connectome project (dHCP) show that our strategy only needs 15% to 20% annotated slices in the brain extraction task and 30% to 35% annotated slices in the tissue segmentation task to achieve results comparable to full annotation.

Submitted 18 June, 2019; originally announced June 2019.

arXiv:1903.04267 [pdf, other]

Energy Harvesting Powered Embedded Systems

Authors: Yawen Wu, Zhenge Jia, Jingtong Hu

Abstract: Historically, the battery has been the power source for mobile, embedded and remote system applications. However, the development of battery techniques does not follow Moore's Law. The large physical size, limited electric quantity and high-cost replacement process always restrict the performance of applications such as embedded systems, wireless sensor networks and low-power electronics. Energy harvesting is a technique which enables applications to scavenge energy from RF signals from TV towers, solar energy, piezoelectric energy driven by the motion of people, and thermal energy from temperature differences, which could dramatically extend the operating lifetime of applications. Thus, energy harvesting is important for the sustainable operation of an application.

Submitted 7 March, 2019; originally announced March 2019.

Comments: 2 pages, 1 figure

Showing 1–50 of 55 results for author: Jia, Z