Search | arXiv e-print repository

Arctic Sea Ice Image Super-Resolution Based on Multi-Scale Convolution and Dual-Gating Mechanism

Authors: Zhaomin Fang, Wankun Chen, Feng Gao, Yanhai Gan, Junyu Dong, Yang Zhou

Abstract: Arctic Sea Ice Concentration (SIC) is the ratio of ice-covered area to the total sea area of the Arctic Ocean, which is a key indicator for maritime activities. Nowadays, passive microwave images are often used to display SIC, but they have low spatial resolution, and most existing super-resolution methods for Arctic SIC neither take the integration of spatial and channel features into account nor effectively integrate multi-scale features. To overcome these issues, we propose MFM-Net for Arctic SIC super-resolution, which concurrently aggregates multi-scale information while integrating spatial and channel features. Extensive experiments on the Arctic SIC dataset from the AMSR-E/AMSR-2 SIC DT-ASI products from Ocean University of China validate the effectiveness of the proposed MFM-Net.

Submitted 3 June, 2024; originally announced June 2024.

Comments: Accepted by IEEE IGARSS 2024

arXiv:2406.01235 [pdf, other]

Boosting Spatial-Spectral Masked Auto-Encoder Through Mining Redundant Spectra for HSI-SAR/LiDAR Classification

Authors: Junyan Lin, Xuepeng **, Feng Gao, Junyu Dong, Hui Yu

Abstract: Although recent masked image modeling (MIM)-based HSI-LiDAR/SAR classification methods have gradually recognized the importance of the spectral information, they have not adequately addressed the redundancy among different spectra, resulting in information leakage during the pretraining stage. This issue directly impairs the representation ability of the model. To tackle the problem, we propose a new strategy, named Mining Redundant Spectra (MRS). Unlike randomly masking spectral bands, MRS selectively masks them by similarity to increase the reconstruction difficulty. Specifically, a random spectral band is chosen during pretraining, and the selected and highly similar bands are masked. Experimental results demonstrate that employing the MRS strategy during the pretraining stage effectively improves the accuracy of existing MIM-based methods on the Berlin and Houston 2018 datasets.

Submitted 3 June, 2024; originally announced June 2024.

Comments: Accepted by IGARSS 2024
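
The masking rule described in the abstract above (pick a random band, then mask it together with the bands most similar to it) can be illustrated with a minimal sketch. The abstract does not say which similarity measure or how many bands are masked, so the cosine similarity, the `num_masked` parameter, and the function names here are all assumptions for illustration, not the authors' implementation.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two flattened bands (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_band_mask(bands, anchor, num_masked):
    """Return sorted indices of the anchor band plus its most similar bands."""
    others = [i for i in range(len(bands)) if i != anchor]
    # Rank the remaining bands from most to least similar to the anchor.
    others.sort(key=lambda i: cosine(bands[anchor], bands[i]), reverse=True)
    return sorted([anchor] + others[: num_masked - 1])


def mine_redundant_spectra(bands, num_masked, rng=random):
    """Choose a random anchor band, then mask it and its nearest neighbours."""
    anchor = rng.randrange(len(bands))
    return similar_band_mask(bands, anchor, num_masked)


# Four toy bands, each flattened to three pixels; bands 0 and 1 are redundant.
toy_bands = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(similar_band_mask(toy_bands, anchor=0, num_masked=2))
```

With band 0 as the anchor, its near-duplicate band 1 is masked as well, so the model cannot reconstruct band 0 by copying a redundant neighbour, which is the leakage the abstract refers to.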

arXiv:2406.00449 [pdf, other]

Dual Hyperspectral Mamba for Efficient Spectral Compressive Imaging

Authors: Jiahua Dong, Hui Yin, Hongliu Li, Wenbo Li, Yulun Zhang, Salman Khan, Fahad Shahbaz Khan

Abstract: Deep unfolding methods have made impressive progress in restoring 3D hyperspectral images (HSIs) from 2D measurements through convolutional neural networks or Transformers in spectral compressive imaging. However, they cannot efficiently capture long-range dependencies using global receptive fields, which significantly limits their performance in HSI reconstruction. Moreover, these methods may suffer from local context neglect if we directly utilize Mamba to unfold a 2D feature map as a 1D sequence for modeling global long-range dependencies. To address these challenges, we propose a novel Dual Hyperspectral Mamba (DHM) to explore both global long-range dependencies and local contexts for efficient HSI reconstruction. After learning informative parameters to estimate degradation patterns of the CASSI system, we use them to scale the linear projection and provide the noise level for the denoiser (i.e., our proposed DHM). Specifically, our DHM consists of multiple dual hyperspectral S4 blocks (DHSBs) to restore original HSIs. Each DHSB contains a global hyperspectral S4 block (GHSB) to model long-range dependencies across the entire high-resolution HSIs using global receptive fields, and a local hyperspectral S4 block (LHSB) to address local context neglect by establishing structured state-space sequence (S4) models within local windows. Experiments verify the benefits of our DHM for HSI reconstruction. The source codes and models will be available at https://github.com/JiahuaDong/DHM.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 13 pages, 6 figures

arXiv:2404.16484 [pdf, other]

Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, **shan Pan, Jiangxin Dong, **hui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi **, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF codec, instead of JPEG. All the proposed methods improve PSNR fidelity over Lanczos interpolation, and process images in under 10 ms. Out of the 160 participants, 25 teams submitted their code and models. The solutions present novel designs tailored for memory-efficiency and runtime on edge devices. This survey describes the best solutions for real-time SR of compressed high-resolution images.

Submitted 25 April, 2024; originally announced April 2024.

Comments: CVPR 2024, AI for Streaming (AIS) Workshop

arXiv:2404.10343 [pdf, other]

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such as runtime, parameters, and FLOPs, while still maintaining a peak signal-to-noise ratio (PSNR) of approximately 26.90 dB on the DIV2K_LSDIR_valid dataset and 26.99 dB on the DIV2K_LSDIR_test dataset. In addition, this challenge has 4 tracks including the main track (overall performance), sub-track 1 (runtime), sub-track 2 (FLOPs), and sub-track 3 (parameters). In the main track, all three metrics (i.e., runtime, FLOPs, and parameter count) were considered. The ranking of the main track is calculated based on a weighted sum of the scores of all other sub-tracks. In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking. In sub-track 2, the number of FLOPs was considered. The score calculated based on the corresponding FLOPs was used to determine the ranking. In sub-track 3, the number of parameters was considered. The score calculated based on the corresponding parameters was used to determine the ranking. RLFN is set as the baseline for efficiency measurement. The challenge had 262 registered participants, and 34 teams made valid submissions. They gauge the state-of-the-art in efficient single-image super-resolution. To facilitate the reproducibility of the challenge and enable other researchers to build upon these findings, the code and the pre-trained model of validated solutions are made publicly available at https://github.com/Amazingren/NTIRE2024_ESR/.

Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024
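
The PSNR thresholds quoted in the abstract above (about 26.90 dB and 26.99 dB) rely on the standard definition PSNR = 10 log10(MAX^2 / MSE). A minimal sketch of that formula follows. The abstract does not describe the challenge's evaluation protocol (color space, border cropping, and so on), so the flat pixel lists and the 8-bit peak value of 255 are assumptions made only for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(restored):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by exactly one grey level, so the MSE is 1.
print(round(psnr([10, 20], [11, 21]), 2))
```

Because PSNR is logarithmic in the MSE, a fixed dB floor such as the one used in the challenge bounds the allowed mean squared error while leaving runtime, FLOPs, and parameter count free to be optimized.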

arXiv:2403.10067 [pdf, other]

Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising

Authors: Shuai Hu, Feng Gao, Xiaowei Zhou, Junyu Dong, Qian Du

Abstract: Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data. However, simultaneously modeling global and local features is rarely explored to enhance HSI denoising. In this letter, we propose a hybrid convolution and attention network (HCANet), which leverages the strengths of both convolutional neural networks (CNNs) and Transformers. To enhance the modeling of both global and local features, we have devised a convolution and attention fusion module aimed at capturing long-range dependencies and neighborhood spectral correlations. Furthermore, to improve multi-scale information aggregation, we design a multi-scale feed-forward network to enhance denoising performance by extracting features at different scales. Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet. The proposed model is effective in removing various types of complex noise. Our codes are available at https://github.com/summitgao/HCANet.

Submitted 15 March, 2024; originally announced March 2024.

Comments: IEEE GRSL 2024

arXiv:2401.09673 [pdf, other]

Artwork Protection Against Neural Style Transfer Using Locally Adaptive Adversarial Color Attack

Authors: Zhongliang Guo, Junhao Dong, Yifei Qian, Kaixuan Wang, Weiye Li, Ziheng Guo, Yuheng Wang, Yanli Li, Ognjen Arandjelović, Lei Fang

Abstract: Neural style transfer (NST) generates new images by combining the style of one image with the content of another. However, unauthorized NST can exploit artwork, raising concerns about artists' rights and motivating the development of proactive protection methods. We propose Locally Adaptive Adversarial Color Attack (LAACA), empowering artists to protect their artwork from unauthorized style transfer by processing it before public release. By delving into the intricacies of human visual perception and the role of different frequency components, our method strategically introduces frequency-adaptive perturbations in the image. These perturbations significantly degrade the generation quality of NST while maintaining an acceptable level of visual change in the original image, ensuring that potential infringers are discouraged from using the protected artworks because of their poor NST generation quality. Additionally, existing metrics often overlook the importance of color fidelity in evaluating color-mattered tasks, such as the quality of NST-generated images, which is crucial in the context of artistic works. To comprehensively assess the color-mattered tasks, we propose the Adversarial Color Distance Metric (ACDM), designed to quantify the color difference of images pre- and post-manipulation. Experimental results confirm that attacking NST using LAACA results in visually inferior style transfer, and the ACDM can efficiently measure color-mattered tasks. By providing artists with a tool to safeguard their intellectual property, our work relieves the socio-technical challenges posed by the misuse of NST in the art community.

Submitted 19 April, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

Comments: 9 pages, 5 figures, 4 tables

arXiv:2401.01496 [pdf, other]

From Pixel to Slide image: Polarization Modality-based Pathological Diagnosis Using Representation Learning

Authors: Jia Dong, Yao Yao, Yang Dong, Hui Ma

Abstract: Thyroid cancer is the most common endocrine malignancy, and accurately distinguishing between benign and malignant thyroid tumors is crucial for developing effective treatment plans in clinical practice. Pathologically, thyroid tumors pose diagnostic challenges due to improper specimen sampling. In this study, we have designed a three-stage model using representation learning to integrate pixel-level and slice-level annotations for distinguishing thyroid tumors. This structure includes a pathology structure recognition method to predict structures related to thyroid tumors, an encoder-decoder network to extract pixel-level annotation information by learning the feature representations of image blocks, and an attention-based learning mechanism for the final classification task. This mechanism learns the importance of different image blocks in a pathological region, globally considering the information from each block. In the third stage, all information from the image blocks in a region is aggregated using attention mechanisms, followed by classification to determine the category of the region. Experimental results demonstrate that our proposed method can predict microscopic structures more accurately. After color-coding, the method achieves results on unstained pathology slides that approximate the quality of hematoxylin and eosin staining, reducing the need for stained pathology slides. Furthermore, by leveraging the concept of indirect measurement and extracting polarized features from structures correlated with lesions, the proposed method can also classify samples where membrane structures cannot be obtained through sampling, providing a potential objective and highly accurate indirect diagnostic technique for thyroid tumors.

Submitted 2 January, 2024; originally announced January 2024.

arXiv:2312.16607 [pdf, other]

A Polarization and Radiomics Feature Fusion Network for the Classification of Hepatocellular Carcinoma and Intrahepatic Cholangiocarcinoma

Authors: Jia Dong, Yao Yao, Liyan Lin, Yang Dong, Jiachen Wan, Ran Peng, Chao Li, Hui Ma

Abstract: Classifying hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (ICC) is a critical step in treatment selection and prognosis evaluation for patients with liver diseases. Traditional histopathological diagnosis poses challenges in this context. In this study, we introduce a novel polarization and radiomics feature fusion network, which combines polarization features obtained from Mueller matrix images of liver pathological samples with radiomics features derived from corresponding pathological images to classify HCC and ICC. Our fusion network integrates a two-tier fusion approach, comprising early feature-level fusion and late classification-level fusion. By harnessing the strengths of polarization imaging techniques and image feature-based machine learning, our proposed fusion network significantly enhances classification accuracy. Notably, even at reduced imaging resolutions, the fusion network maintains robust performance due to the additional information provided by polarization features, which may not align with human visual perception. Our experimental results underscore the potential of this fusion network as a powerful tool for computer-aided diagnosis of HCC and ICC, showcasing the benefits and prospects of integrating polarization imaging techniques into the current image-intensive digital pathological diagnosis. We aim to contribute this innovative approach to top-tier journals, offering fresh insights and valuable tools in the fields of medical imaging and cancer diagnosis. By introducing polarization imaging into liver cancer classification, we demonstrate its interdisciplinary potential in addressing challenges in medical image analysis, promising advancements in medical imaging and cancer diagnosis.

Submitted 27 December, 2023; originally announced December 2023.

arXiv:2312.10921 [pdf, other]

AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis

Authors: Dongze Li, Kang Zhao, Wei Wang, Bo Peng, Yingya Zhang, **g Dong, Tieniu Tan

Abstract: Audio-driven talking head synthesis is a promising topic with wide applications in digital humans, film making and virtual reality. Recent NeRF-based approaches have shown superiority in quality and fidelity compared to previous studies. However, when it comes to few-shot talking head generation, a practical scenario where only a few seconds of talking video are available for one identity, two limitations emerge: 1) they either have no base model, which serves as a facial prior for fast convergence, or ignore the importance of audio when building the prior; 2) most of them overlook the degree of correlation between different face regions and audio, e.g., the mouth is audio related, while the ear is audio independent. In this paper, we present Audio Enhanced Neural Radiance Field (AE-NeRF) to tackle the above issues, which can generate realistic portraits of a new speaker with a few-shot dataset. Specifically, we introduce an Audio Aware Aggregation module into the feature fusion stage of the reference scheme, where the weight is determined by the similarity of audio between the reference and target images. Then, an Audio-Aligned Face Generation strategy is proposed to model the audio related and audio independent regions respectively, with a dual-NeRF framework. Extensive experiments have shown that AE-NeRF surpasses the state-of-the-art on image fidelity, audio-lip synchronization, and generalization ability, even with a limited training set or limited training iterations.

Submitted 17 December, 2023; originally announced December 2023.

Comments: Accepted by AAAI 2024

arXiv:2311.10118 [pdf, other]

Now and Future of Artificial Intelligence-based Signet Ring Cell Diagnosis: A Survey

Authors: Zhu Meng, Junhao Dong, Limei Guo, Fei Su, Guangxi Wang, Zhicheng Zhao

Abstract: Since signet ring cells (SRCs) are associated with a high peripheral metastasis rate and dismal survival, they play an important role in determining surgical approaches and prognosis, yet they are easily missed even by experienced pathologists. Although automatic diagnosis of SRCs based on deep learning has received increasing attention as a way to help pathologists improve diagnostic efficiency and accuracy, the existing works have not been systematically reviewed, which has hindered the evaluation of the gap between algorithms and clinical applications. In this paper, we provide a survey on SRC analysis driven by deep learning from 2008 to August 2023. Specifically, the biological characteristics of SRCs and the challenges of automatic identification are systematically summarized. Then, the representative algorithms are analyzed and compared by dividing them into classification, detection, and segmentation. Finally, considering both the performance of existing methods and the requirements for clinical assistance, we discuss the open issues and future trends of SRC analysis. This retrospective will help researchers in related fields, particularly those without a medical science background, not only to clearly grasp the outline of SRC analysis but also to gain a view of the prospects of intelligent diagnosis, thereby accelerating the practice and application of intelligent algorithms.

Submitted 16 November, 2023; originally announced November 2023.

arXiv:2311.04442 [pdf, other]

SS-MAE: Spatial-Spectral Masked Auto-Encoder for Multi-Source Remote Sensing Image Classification

Authors: Junyan Lin, Feng Gao, Xiaocheng Shi, Junyu Dong, Qian Du

Abstract: Masked image modeling (MIM) is a highly popular and effective self-supervised learning method for image understanding. Existing MIM-based methods mostly focus on spatial feature modeling, neglecting spectral feature modeling. Meanwhile, existing MIM-based methods use Transformers for feature extraction, so some local or high-frequency information may get lost. To this end, we propose a spatial-spectral masked auto-encoder (SS-MAE) for HSI and LiDAR/SAR data joint classification. Specifically, SS-MAE consists of a spatial-wise branch and a spectral-wise branch. The spatial-wise branch masks random patches and reconstructs missing pixels, while the spectral-wise branch masks random spectral channels and reconstructs missing channels. Our SS-MAE fully exploits the spatial and spectral representations of the input data. Furthermore, to complement local features in the training stage, we add two lightweight CNNs for feature extraction. Both global and local features are taken into account for feature modeling. To demonstrate the effectiveness of the proposed SS-MAE, we conduct extensive experiments on three publicly available datasets. Extensive experiments on three multi-source datasets verify the superiority of our SS-MAE compared with several state-of-the-art baselines. The source codes are available at https://github.com/summitgao/SS-MAE.

Submitted 7 November, 2023; originally announced November 2023.

Comments: IEEE TGRS 2023
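
The two masking branches described in the abstract above (random patches for the spatial-wise branch, random channels for the spectral-wise branch) can be sketched as two independent index draws. The patch count, channel count, and mask ratios below are placeholder values, and `random_mask` is a hypothetical helper; none of them come from the paper.

```python
import random


def random_mask(num_items, mask_ratio, rng):
    """Split item indices into (masked, visible) lists at the given ratio."""
    num_masked = int(round(num_items * mask_ratio))
    masked = sorted(rng.sample(range(num_items), num_masked))
    masked_set = set(masked)
    visible = [i for i in range(num_items) if i not in masked_set]
    return masked, visible


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Spatial-wise branch: mask random patches (16 patches, 75% masked, both assumed).
masked_patches, visible_patches = random_mask(16, 0.75, rng)

# Spectral-wise branch: mask random spectral channels (8 channels, 50% masked, both assumed).
masked_channels, visible_channels = random_mask(8, 0.5, rng)

print(len(masked_patches), len(visible_patches))
print(len(masked_channels), len(visible_channels))
```

In a masked auto-encoder, only the visible indices would be passed to the encoder, and the reconstruction loss would be computed on the masked ones.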

arXiv:2310.04922 [pdf, ps, other]

Robust Multivariate Detection and Estimation with Fault Frequency Content Information

Authors: **gwei Dong, Kaikai Pan, Sergio Pequito, Peyman Mohajerin Esfahani

Abstract: This paper studies the problem of fault detection and estimation (FDE) for linear time-invariant (LTI) systems with a particular focus on frequency content information of faults, possibly as multiple disjoint continuum ranges, and under both disturbances and stochastic noise. To ensure the worst-case fault sensitivity in the considered frequency ranges and mitigate the effects of disturbances and noise, an optimization framework incorporating a mixed $\mathcal{H}_-/\mathcal{H}_2$ performance index is developed to compute the optimal detection filter. Moreover, a thresholding rule is proposed to guarantee both the false alarm rate (FAR) and the fault detection rate (FDR). Next, shifting attention to fault estimation in specific frequency ranges, an exact reformulation of the optimal estimation filter design using the restricted $\mathcal{H}_\infty$ performance index is derived, which is inherently non-convex. However, focusing on finite frequency samples and fixed poles, a lower bound is established via a highly tractable quadratic programming (QP) problem. This lower bound together with an alternating optimization (AO) approach to the original estimation problem leads to a suboptimality gap for the overall estimation filter design. The effectiveness of the proposed approaches is validated through a synthetic non-minimum phase system and an application of the multi-area power system.

Submitted 15 May, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

Comments: 32 pages, 15 figures

arXiv:2309.12010 [pdf, other]

Convolution and Attention Mixer for Synthetic Aperture Radar Image Change Detection

Authors: Haopeng Zhang, Zi**g Lin, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Synthetic aperture radar (SAR) image change detection is a critical task and has received increasing attention in the remote sensing community. However, existing SAR change detection methods are mainly based on convolutional neural networks (CNNs), with limited consideration of global attention mechanisms. In this letter, we explore a Transformer-like architecture for SAR change detection to incorporate global attention. To this end, we propose a convolution and attention mixer (CAMixer). First, to compensate for the Transformer's lack of inductive bias, we combine self-attention with shift convolution in a parallel way. The parallel design effectively captures the global semantic information via self-attention and simultaneously performs local feature extraction through shift convolution. Second, we adopt a gating mechanism in the feed-forward network to enhance the non-linear feature transformation. The gating mechanism is formulated as the element-wise multiplication of two parallel linear layers. Important features can be highlighted, leading to high-quality representations against speckle noise. Extensive experiments conducted on three SAR datasets verify the superior performance of the proposed CAMixer. The source codes will be publicly available at https://github.com/summitgao/CAMixer.

Submitted 21 September, 2023; originally announced September 2023.

Comments: Accepted by IEEE GRSL
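
The gating mechanism in the abstract above is described as the element-wise multiplication of two parallel linear layers. A minimal sketch of that one operation follows, with tiny hand-written weights standing in for learned parameters. The function names and weight values are hypothetical, and whether the paper applies an activation or a further projection around the product is not stated in the abstract, so none is included here.

```python
def linear(x, weight, bias):
    """Dense layer y = W x + b for a single input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def gated_feed_forward(x, w_value, b_value, w_gate, b_gate):
    """Multiply the outputs of two parallel linear layers element-wise."""
    value = linear(x, w_value, b_value)  # first branch
    gate = linear(x, w_gate, b_gate)     # second, parallel branch
    return [v * g for v, g in zip(value, gate)]


# Toy input and illustrative weights only.
x = [1.0, 2.0]
w_value, b_value = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
w_gate, b_gate = [[0.0, 1.0], [1.0, 0.0]], [1.0, 0.0]
print(gated_feed_forward(x, w_value, b_value, w_gate, b_gate))  # [3.0, 2.0]
```

The product of the two branches is quadratic in the input, which is one way such a gate adds non-linearity and lets one branch scale features produced by the other.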

arXiv:2309.08803 [pdf, other]

Robust Indoor Localization with Ranging-IMU Fusion

Authors: Fan Jiang, David Caruso, Ashutosh Dhekne, Qi Qu, Jakob Julian Engel, **g Dong

Abstract: Indoor wireless ranging localization is a promising approach for low-power and high-accuracy localization of wearable devices. A primary challenge in this domain stems from non-line-of-sight propagation of radio waves. This study tackles a fundamental issue in wireless ranging: the unpredictability of real-time multipath determination, especially in challenging conditions such as when there is no direct line of sight. We achieve this by fusing range measurements with inertial measurements obtained from a low-cost Inertial Measurement Unit (IMU). For this purpose, we introduce a novel asymmetric noise model crafted specifically for non-Gaussian multipath disturbances. Additionally, we present a novel Levenberg-Marquardt (LM)-family trust-region adaptation of the iSAM2 fusion algorithm, which is optimized for robust performance for our ranging-IMU fusion problem. We evaluate our solution in a densely occupied real office environment. Our proposed solution can achieve temporally consistent localization with an average absolute accuracy of $\sim$0.3 m in real-world settings. Furthermore, our results indicate that we can achieve comparable accuracy even with infrequent (1 Hz) range measurements.

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2309.04755 [pdf, other]

Towards Real-time Training of Physics-informed Neural Networks: Applications in Ultrafast Ultrasound Blood Flow Imaging

Authors: Haotian Guan, **** Dong, Wei-Ning Lee

Abstract: Physics-informed Neural Network (PINN) is one of the most preeminent solvers of Navier-Stokes equations, which are widely used as the governing equation of blood flow. However, current approaches, relying on full Navier-Stokes equations, are impractical for ultrafast Doppler ultrasound, the state-of-the-art technique for depicting complex blood flow dynamics in vivo by acquiring thousands of frames (or, timestamps) per second. In this article, we first propose a novel training framework of PINN for solving Navier-Stokes equations by discretizing Navier-Stokes equations into steady state and sequentially solving steady-state Navier-Stokes equations with transfer learning. The novel training framework is coined SeqPINN. Building on the success of SeqPINN, we adopt the idea of averaged constant stochastic gradient descent (SGD) as initialization and propose a parallel training scheme for all timestamps. To ensure an initialization that generalizes well, we borrow the concept of Stochastic Weight Averaging Gaussian to perform uncertainty estimation as an indicator of generalizability of the initialization. This algorithm, named SP-PINN, further expedites training of PINN while achieving comparable accuracy with SeqPINN. Finite-element simulations and in vitro phantoms of single-branch and trifurcate blood vessels are used to evaluate the performance of SeqPINN and SP-PINN. Results show that both SeqPINN and SP-PINN are manyfold faster than the original design of PINN, while respectively achieving Root Mean Square Errors (RMSEs) of 1.01 cm/s and 1.26 cm/s on the straight vessel and 1.91 cm/s and 2.56 cm/s on the trifurcate blood vessel when recovering blood flow velocities.

Submitted 9 September, 2023; originally announced September 2023.

arXiv:2309.02733 [pdf, ps, other]

Existence and Completeness of Bounded Disturbance Observers: A Set-Membership Viewpoint

Authors: Yudong Li, Yirui Cong, Jiuxiang Dong

Abstract: This paper investigates the boundedness of the Disturbance Observer (DO) for linear discrete-time systems. In contrast to previous studies that focus on analyzing and/or designing observer gains, our analysis and synthesis approach is based on a set-membership viewpoint. From this viewpoint, a necessary and sufficient existence condition of bounded DOs is first established, which can be easily verified. Furthermore, a set-membership filter-based DO is developed, and its completeness is proved; thus, our proposed DO is bounded if and only if bounded DOs exist. We also prove that the proposed DO has the capability to achieve the worst-case optimality, which can provide a benchmark for the design of DOs. Finally, numerical simulations are performed to corroborate the effectiveness of the theoretical results.

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2307.16173 [pdf]

doi 10.1109/TIE.2023.3265027

Data-Driven Modeling with Experimental Augmentation for the Modulation Strategy of the Dual-Active-Bridge Converter

Authors: Xinze Li, Josep Pou, Jiaxin Dong, Fanfan Lin, Changyun Wen, Suvajit Mukherjee, Xin Zhang

Abstract: For the performance modeling of power converters, the mainstream approaches are essentially knowledge-based, suffering from heavy manpower burden and low modeling accuracy. Recently emerging data-driven techniques greatly relieve human reliance by automatic modeling from simulation data. However, model discrepancy may occur due to unmodeled parasitics, deficient thermal and magnetic models, unpredictable ambient conditions, etc. These inaccurate data-driven models based on pure simulation cannot represent the practical performance in the physical world, hindering their applications in power converter modeling. To alleviate model discrepancy and improve accuracy in practice, this paper proposes a novel data-driven modeling with experimental augmentation (D2EA), leveraging both simulation data and experimental data. In D2EA, simulation data aims to establish the basic functional landscape, and experimental data focuses on matching actual performance in the real world. The D2EA approach is instantiated for the efficiency optimization of a hybrid modulation for a neutral-point-clamped dual-active-bridge (NPC-DAB) converter. The proposed D2EA approach realizes 99.92% efficiency modeling accuracy, and its feasibility is comprehensively validated in 2-kW hardware experiments, where a peak efficiency of 98.45% is attained. Overall, D2EA is data-light, can achieve highly accurate and highly practical data-driven models in one shot, and is readily scalable to other applications.

Submitted 2 August, 2023; v1 submitted 30 July, 2023; originally announced July 2023.

Comments: 11 pages

Journal ref: IEEE.Trans.Ind.Electron. Early Access (2023) 1-11

arXiv:2306.11021 [pdf, other]

CloudBrain-MRS: An Intelligent Cloud Computing Platform for in vivo Magnetic Resonance Spectroscopy Preprocessing, Quantification, and Analysis

Authors: Xiaodie Chen, Jiayu Li, Dicheng Chen, Yirong Zhou, Zhangren Tu, Mei** Lin, Taishan Kang, Jianzhong Lin, Tao Gong, Liuhong Zhu, Jianjun Zhou, Lin Ou-yang, Jiefeng Guo, Jiyang Dong, Di Guo, Xiaobo Qu

Abstract: Magnetic resonance spectroscopy (MRS) is an important clinical imaging method for diagnosis of diseases. The MRS spectrum is used to observe the signal intensity of metabolites or further infer their concentrations. Although the magnetic resonance vendors commonly provide basic functions of spectra plots and metabolite quantification, the widespread clinical research of MRS is still limited due to the lack of easy-to-use processing software or platforms. To address this issue, we have developed CloudBrain-MRS, a cloud-based online platform that provides powerful hardware and advanced algorithms. The platform can be accessed simply through a web browser, without the need for any program installation on the user side. CloudBrain-MRS also integrates the classic LCModel and advanced artificial intelligence algorithms and supports batch preprocessing, quantification, and analysis of MRS data from different vendors. Additionally, the platform offers useful functions: 1) Automatic statistical analysis to find biomarkers for diseases; 2) Consistency verification between the classic and artificial intelligence quantification algorithms; 3) Colorful three-dimensional visualization for easy observation of individual metabolite spectra. Last, both healthy and mild cognitive impairment patient data are used to demonstrate the functions of the platform. To the best of our knowledge, this is the first cloud computing platform for in vivo MRS with artificial intelligence processing. We have shared our cloud platform at MRSHub, providing free access and service for two years. Please visit https://mrshub.org/software_all/#CloudBrain-MRS or https://csrc.xmu.edu.cn/CloudBrain.html.

Submitted 6 September, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

Comments: 11 pages, 12 figures

arXiv:2306.01952 [pdf, other]

Learning in Domain Randomization via Continuous Time Non-Stochastic Control

Authors: **gwei Li, **g Dong, Can Chang, Baoxiang Wang, **gzhao Zhang

Abstract: Domain randomization is a popular method for robustly training agents to adapt to diverse environments and real-world tasks. In this paper, we examine how to train an agent in domain randomization environments from a nonstochastic control perspective. We first theoretically study online control of continuous-time linear systems under nonstochastic noises. We present a novel two-level online algorithm, by integrating a higher-level learning strategy and a lower-level feedback control strategy. This method offers a practical solution, and for the first time achieves sublinear regret in continuous-time nonstochastic systems. Compared to standard online learning algorithms, our algorithm features a stack and skip procedure. By applying stack and skip to the SAC (Soft Actor-Critic) algorithm, we achieved improved results in multiple reinforcement learning tasks within domain randomization environments. Our work provides new insights into nonasymptotic analyses of controlling continuous-time systems. Further, our work justifies the importance of stacking and skipping in controller learning under nonstochastic environments.

Submitted 14 December, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

arXiv:2304.12445 [pdf, other]

Real-Time Ground Fault Detection for Inverter-Based Microgrid Systems

Authors: **gwei Dong, Yucheng Liao, Haiwei Xie, Jochen Cremer, Peyman Mohajerin Esfahani

Abstract: Ground fault detection in inverter-based microgrid (IBM) systems is challenging, particularly in a real-time setting, as the fault current deviates slightly from the nominal value. This difficulty is reinforced when there are partially decoupled disturbances and modeling uncertainties. The conventional solution of installing more relays to obtain additional measurements is costly and also increases the complexity of the system. In this paper, we propose a data-assisted diagnosis scheme based on an optimization-based fault detection filter with the output current as the only measurement. Modeling the microgrid dynamics and the diagnosis filter, we formulate the filter design as a quadratic programming (QP) problem that accounts for decoupling partial disturbances, robustness to non-decoupled disturbances and modeling uncertainties by training with data, and ensuring fault sensitivity simultaneously. To ease the computational effort, we also provide an approximate but analytical solution to this QP. Additionally, we use classical statistical results to provide a thresholding mechanism that enjoys probabilistic false-alarm guarantees. Finally, we implement the IBM system with Simulink and Real Time Digital Simulator (RTDS) to verify the effectiveness of the proposed method through simulations.

Submitted 3 April, 2024; v1 submitted 24 April, 2023; originally announced April 2023.

Comments: 18 pages, 7 figures

arXiv:2304.11320 [pdf, other]

doi 10.1109/LGRS.2023.3270183

SAWU-Net: Spatial Attention Weighted Unmixing Network for Hyperspectral Images

Authors: Lin Qi, Xuewen Qin, Feng Gao, Junyu Dong, Xinbo Gao

Abstract: Hyperspectral unmixing is a critical yet challenging task in hyperspectral image interpretation. Recently, great efforts have been made to solve the hyperspectral unmixing task via deep autoencoders. However, existing networks mainly focus on extracting spectral features from mixed pixels, and the employment of spatial feature prior knowledge is still insufficient. To this end, we put forward a spatial attention weighted unmixing network, dubbed SAWU-Net, which learns a spatial attention network and a weighted unmixing network in an end-to-end manner for better spatial feature exploitation. In particular, we design a spatial attention module, which consists of a pixel attention block and a window attention block to efficiently model pixel-based spectral information and patch-based spatial information, respectively. In the weighted unmixing framework, the central pixel abundance is dynamically weighted by the coarse-grained abundances of surrounding pixels. In addition, SAWU-Net generates dynamically adaptive spatial weights through the spatial attention mechanism, so as to integrate surrounding pixels more effectively. Experimental results on real and synthetic datasets demonstrate the superior accuracy of SAWU-Net, which reflects the effectiveness of the proposed spatial attention mechanism.

Submitted 22 April, 2023; originally announced April 2023.

Comments: IEEE GRSL 2023

arXiv:2304.09376 [pdf, other]

doi 10.1109/TGRS.2023.3257039

Physical Knowledge Enhanced Deep Neural Network for Sea Surface Temperature Prediction

Authors: Yuxin Meng, Feng Gao, Eric Rigall, Ran Dong, Junyu Dong, Qian Du

Abstract: Traditionally, numerical models have been deployed in oceanography studies to simulate ocean dynamics by representing physical equations. However, many factors pertaining to ocean dynamics seem to be ill-defined. We argue that transferring physical knowledge from observed data could further improve the accuracy of numerical models when predicting Sea Surface Temperature (SST). Recently, the advances in earth observation technologies have yielded a monumental growth of data. Consequently, it is imperative to explore ways in which to improve and supplement numerical models utilizing the ever-increasing amounts of historical observational data. To this end, we introduce a method for SST prediction that transfers physical knowledge from historical observations to numerical models. Specifically, we use a combination of an encoder and a generative adversarial network (GAN) to capture physical knowledge from the observed data. The numerical model data is then fed into the pre-trained model to generate physics-enhanced data, which can then be used for SST prediction. Experimental results demonstrate that the proposed method considerably enhances SST prediction performance when compared to several state-of-the-art baselines.

Submitted 18 April, 2023; originally announced April 2023.

Comments: IEEE TGRS 2023

arXiv:2304.09373 [pdf, other]

Multi-scale Adaptive Fusion Network for Hyperspectral Image Denoising

Authors: Haodong Pan, Feng Gao, Junyu Dong, Qian Du

Abstract: Removing the noise and improving the visual quality of hyperspectral images (HSIs) is challenging in academia and industry. Great efforts have been made to leverage local, global or spectral context information for HSI denoising. However, existing methods still have limitations in feature interaction exploitation among multiple scales and rich spectral structure preservation. In view of this, we propose a novel solution to investigate HSI denoising using a Multi-scale Adaptive Fusion Network (MAFNet), which can learn the complex nonlinear mapping between clean and noisy HSI. Two key components contribute to improving the hyperspectral image denoising: a progressively multiscale information aggregation network and a co-attention fusion module. Specifically, we first generate a set of multiscale images and feed them into a coarse-fusion network to exploit the contextual texture correlation. Thereafter, a fine fusion network follows to exchange the information across the parallel multiscale subnetworks. Furthermore, we design a co-attention fusion module to adaptively emphasize informative features from different scales, and thereby enhance the discriminative learning capability for denoising. Extensive experiments on synthetic and real HSI datasets demonstrate that the proposed MAFNet has achieved better denoising performance than other state-of-the-art techniques. Our codes are available at https://github.com/summitgao/MAFNet.

Submitted 18 April, 2023; originally announced April 2023.

Comments: IEEE JSTARS 2023, code at: https://github.com/summitgao/MAFNet

arXiv:2303.14133 [pdf, other]

Adversarial Attack and Defense for Medical Image Analysis: Methods and Applications

Authors: Junhao Dong, Junxi Chen, Xiaohua Xie, Jianhuang Lai, Hao Chen

Abstract: Deep learning techniques have achieved superior performance in computer-aided medical image analysis, yet they are still vulnerable to imperceptible adversarial attacks, resulting in potential misdiagnosis in clinical practice. Conversely, recent years have also witnessed remarkable progress in defense against these tailored adversarial examples in deep medical diagnosis systems. In this exposition, we present a comprehensive survey on recent advances in adversarial attack and defense for medical image analysis with a novel taxonomy in terms of the application scenario. We also provide a unified theoretical framework for different types of adversarial attack and defense methods for medical image analysis. For a fair comparison, we establish a new benchmark for adversarially robust medical diagnosis models obtained by adversarial training under various scenarios. To the best of our knowledge, this is the first survey paper that provides a thorough evaluation of adversarially robust medical diagnosis models. By analyzing qualitative and quantitative results, we conclude this survey with a detailed discussion of current challenges for adversarial attack and defense in medical image analysis systems to shed light on future research directions.

Submitted 24 March, 2023; originally announced March 2023.

arXiv:2303.06019 [pdf, other]

Scatter-based common spatial patterns -- a unified spatial filtering framework

Authors: **long Dong, Milana Komosar, Johannes Vorwerk, Daniel Baumgarten, Jens Haueisen

Abstract: The common spatial pattern (CSP) approach is known as one of the most popular spatial filtering techniques for EEG classification in motor imagery (MI) based brain-computer interfaces (BCIs). However, it still suffers from some drawbacks such as sensitivity to noise, non-stationarity, and limitation to binary classification. Therefore, we propose a novel spatial filtering framework called scaCSP based on the scatter matrices of spatial covariances of EEG signals, which works generally in both binary and multi-class problems, whereas CSP can be cast into our framework as a special case when only the range space of the between-class scatter matrix is used in binary cases. We further propose subspace enhanced scaCSP algorithms which easily permit incorporating more discriminative information contained in other range spaces and null spaces of the between-class and within-class scatter matrices in two scenarios: a nullspace components reduction scenario and an additional spatial filter learning scenario. The proposed algorithms are evaluated on two data sets including 4 MI tasks. The classification performance is compared against state-of-the-art competing algorithms: CSP, Tikhonov regularized CSP (TRCSP), stationary CSP (sCSP) and stationary TRCSP (sTRCSP) in the binary problems, and multi-class extensions of CSP based on pair-wise and one-versus-rest techniques in the multi-class problems. The results show that the proposed framework outperforms all the competing algorithms in terms of average classification accuracy and computational efficiency in both binary and multi-class problems. The proposed scaCSP works as a unified framework for general multi-class problems and is promising for improving the performance of MI-BCIs.

Submitted 7 March, 2023; originally announced March 2023.

arXiv:2303.03279 [pdf, other]

Online functional connectivity analysis of large all-to-all networks

Authors: Lorenz Esch, **long Dong, Matti Hämäläinen, Daniel Baumgarten, Jens Haueisen, Johannes Vorwerk

Abstract: The analysis of EEG/MEG functional connectivity has become an important tool in neural research. In particular, the high time resolution of EEG/MEG enables important insight into the functioning of the human brain. To date, functional connectivity is commonly estimated offline, i.e., after the conclusion of the experiment. However, online computation of functional connectivity has the potential to enable unique experimental paradigms. For example, changes of functional connectivity due to learning processes could be tracked in real time and the experiment be adjusted based on these observations. Furthermore, the connectivity estimates can be used for neurofeedback applications or the instantaneous inspection of measurement results. In this study, we present the implementation and evaluation of online sensor and source space functional connectivity estimation in the open-source software MNE Scan. Online capable implementations of several functional connectivity metrics were established in the Connectivity library within MNE-CPP and made available as a plugin in MNE Scan. Online capability was achieved by enforcing multithreading and high efficiency for all computations, so that repeated computations were avoided wherever possible, which allows for a major speed-up in the case of overlapping intervals. We present comprehensive performance evaluations of these implementations proving the online capability for the computation of large all-to-all functional connectivity networks. As a proof of principle, we demonstrate the feasibility of online functional connectivity estimation in the evaluation of somatosensory evoked brain activity.

Submitted 6 March, 2023; originally announced March 2023.

arXiv:2303.01291 [pdf, other]

Robust, High-Precision GNSS Carrier-Phase Positioning with Visual-Inertial Fusion

Authors: Erqun Dong, Sheroze Sheriffdeen, Shichao Yang, **g Dong, Renzo De Nardi, Carl Ren, Xiao-Wen Chang, Xue Liu, Zijian Wang

Abstract: Robust, high-precision global localization is fundamental to a wide range of outdoor robotics applications. Conventional fusion methods use low-accuracy pseudorange based GNSS measurements ($\gg 5$ m errors) and can only yield a coarse registration to the global earth-centered-earth-fixed (ECEF) frame. In this paper, we leverage high-precision GNSS carrier-phase positioning and aid it with local visual-inertial odometry (VIO) tracking using an extended Kalman filter (EKF) framework that better resolves the integer ambiguity concerned with GNSS carrier-phase. We also propose an algorithm for accurate GNSS-antenna-to-IMU extrinsics calibration to accurately align VIO to the ECEF frame. Together, our system achieves robust global positioning demonstrated by real-world hardware experiments in severely occluded urban canyons, and outperforms the state-of-the-art RTKLIB by a significant margin in terms of integer ambiguity solution fix rate and positioning RMSE accuracy.

Submitted 2 March, 2023; originally announced March 2023.

arXiv:2301.03335 [pdf, other]

doi 10.1109/TGRS.2023.3236154

Nearest Neighbor-Based Contrastive Learning for Hyperspectral and LiDAR Data Classification

Authors: Meng Wang, Feng Gao, Junyu Dong, Heng-Chao Li, Qian Du

Abstract: The joint hyperspectral image (HSI) and LiDAR data classification aims to interpret ground objects at a more detailed and precise level. Although deep learning methods have shown remarkable success in the multisource data classification task, self-supervised learning has rarely been explored. It is commonly nontrivial to build a robust self-supervised learning model for multisource data classification, due to the fact that the semantic similarities of neighborhood regions are not exploited in existing contrastive learning frameworks. Furthermore, the heterogeneous gap induced by the inconsistent distribution of multisource data impedes the classification performance. To overcome these disadvantages, we propose a Nearest Neighbor-based Contrastive Learning Network (NNCNet), which takes full advantage of large amounts of unlabeled data to learn discriminative feature representations. Specifically, we propose a nearest neighbor-based data augmentation scheme to use enhanced semantic relationships among nearby regions, so that the intermodal semantic alignments can be captured more accurately. In addition, we design a bilinear attention module to exploit the second-order and even high-order feature interactions between the HSI and LiDAR data. Extensive experiments on four public datasets demonstrate the superiority of our NNCNet over state-of-the-art methods. The source codes are available at https://github.com/summitgao/NNCNet.

Submitted 9 January, 2023; originally announced January 2023.

Comments: IEEE TGRS 2023

arXiv:2212.02007 [pdf, other]

Mixed Cloud Control Testbed: Validating Vehicle-Road-Cloud Integration via Mixed Digital Twin

Authors: Jianghong Dong, Qing Xu, Jiawei Wang, Chunying Yang, Mengchi Cai, Chaoyi Chen, Jianqiang Wang, Keqiang Li

Abstract: Reliable and efficient validation technologies are critical for the recent development of multi-vehicle cooperation and vehicle-road-cloud integration. In this paper, we introduce our miniature experimental platform, Mixed Cloud Control Testbed (MCCT), developed based on a new notion of Mixed Digital Twin (mixedDT). Combining Mixed Reality with Digital Twin, mixedDT integrates the virtual and physical spaces into a mixed one, where physical entities coexist and interact with virtual entities via their digital counterparts. Under the framework of mixedDT, MCCT contains three major experimental platforms in the physical, virtual and mixed spaces respectively, and provides unified access for various human-machine interfaces and external devices such as driving simulators. A cloud unit, where the mixed experimental platform is deployed, is responsible for fusing multi-platform information and assigning control instructions, contributing to synchronous operation and real-time cross-platform interaction. Particularly, MCCT allows for multi-vehicle coordination composed of different multi-source vehicles (e.g., physical vehicles, virtual vehicles and human-driven vehicles). Validations on vehicle platooning demonstrate the flexibility and scalability of MCCT.

Submitted 4 December, 2022; originally announced December 2022.

Comments: 13 pages, 13 figures

arXiv:2212.01878 [pdf]

CloudBrain-ReconAI: An Online Platform for MRI Reconstruction and Image Quality Evaluation

Authors: Yirong Zhou, Chen Qian, Jiayu Li, Zi Wang, Yu Hu, Biao Qu, Liuhong Zhu, Jianjun Zhou, Taishan Kang, Jianzhong Lin, Qing Hong, Jiyang Dong, Di Guo, Xiaobo Qu

Abstract: Efficient collaboration between engineers and radiologists is important for image reconstruction algorithm development and image quality evaluation in magnetic resonance imaging (MRI). Here, we develop CloudBrain-ReconAI, an online cloud computing platform, for algorithm deployment and fast, blind reader studies. This platform supports online image reconstruction using state-of-the-art artificial intelligence and compressed sensing algorithms with applications to fast imaging and high-resolution diffusion imaging. By visiting the website, radiologists can easily score and mark the images. Then, automatic statistical analysis will be provided. CloudBrain-ReconAI is now openly accessible at https://csrc.xmu.edu.cn/CloudBrain.html and will be continually improved to serve the MRI research community.

Submitted 4 December, 2022; originally announced December 2022.

Comments: 8 pages, 11 figures

arXiv:2208.04481 [pdf, other]

doi 10.1109/LGRS.2022.3198088

Synthetic Aperture Radar Image Change Detection via Layer Attention-Based Noise-Tolerant Network

Authors: Desen Meng, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Recently, change detection methods for synthetic aperture radar (SAR) images based on convolutional neural networks (CNN) have gained increasing research attention. However, existing CNN-based methods neglect the interactions among multilayer convolutions, and errors involved in the preclassification restrict the network optimization. To this end, we propose a layer attention-based noise-tolerant network, termed LANTNet. In particular, we design a layer attention module that adaptively weights the features of different convolution layers. In addition, we design a noise-tolerant loss function that effectively suppresses the impact of noisy labels. Therefore, the model is insensitive to noisy labels in the preclassification results. The experimental results on three SAR datasets show that the proposed LANTNet performs better compared to several state-of-the-art methods. The source codes are available at https://github.com/summitgao/LANTNet

Submitted 8 August, 2022; originally announced August 2022.

Comments: Accepted by IEEE Geoscience and Remote Sensing Letters (GRSL) 2022, code is available at https://github.com/summitgao/LANTNet

arXiv:2206.02507 [pdf, other]

Learning to Control under Time-Varying Environment

Authors: Yuzhen Han, Ruben Solozabal, Jing Dong, Xingyu Zhou, Martin Takac, Bin Gu

Abstract: This paper investigates the problem of regret minimization in linear time-varying (LTV) dynamical systems. Due to the simultaneous presence of uncertainty and non-stationarity, designing online control algorithms for unknown LTV systems remains a challenging task. At a cost of NP-hard offline planning, prior works have introduced online convex optimization algorithms, although they suffer from a nonparametric rate of regret. In this paper, we propose the first computationally tractable online algorithm with regret guarantees that avoids offline planning over the state linear feedback policies. Our algorithm is based on the optimism in the face of uncertainty (OFU) principle, in which we optimistically select the best model in a high confidence region. Our algorithm is then more explorative when compared to previous approaches. To overcome non-stationarity, we propose either a restarting strategy (R-OFU) or a sliding window (SW-OFU) strategy. With proper configuration, our algorithm attains sublinear regret $O(T^{2/3})$. These algorithms utilize data from the current phase for tracking variations in the system dynamics. We corroborate our theoretical findings with numerical experiments, which highlight the effectiveness of our methods. To the best of our knowledge, our study establishes the first model-based online algorithm with regret guarantees under LTV dynamical systems.

Submitted 6 June, 2022; originally announced June 2022.

arXiv:2206.00487 [pdf, other]

Physics-based neural network for non-invasive control of coherent light in scattering media

Authors: Alexandra d'Arco, Fei Xia, Antoine Boniface, Jonathan Dong, Sylvain Gigan

Abstract: Optical imaging through complex media, such as biological tissues or fog, is challenging due to light scattering. In the multiple scattering regime, wavefront shaping provides an effective method to retrieve information; it relies on measuring how the propagation of different optical wavefronts is impacted by scattering. Based on this principle, several wavefront shaping techniques were successfully developed, but most of them are highly invasive and limited to proof-of-principle experiments. Here, we propose to use a neural network approach to non-invasively characterize and control light scattering inside the medium and also to retrieve information of hidden objects buried within it. Unlike most of the recently-proposed approaches, the architecture of our neural network with its layers, connected nodes and activation functions has a true physical meaning as it mimics the propagation of light in our optical system. It is trained with an experimentally-measured input/output dataset built from a series of incident light patterns and corresponding camera snapshots. We apply our physics-based neural network to a fluorescence microscope in epi-configuration and demonstrate its performance through numerical simulations and experiments. This flexible method can include physical priors and we show that it can be applied to other systems, for example those based on non-linear or coherent contrast mechanisms.

Submitted 1 June, 2022; originally announced June 2022.

Comments: 15 pages, 11 figures

arXiv:2205.13489 [pdf, other]

Measuring Perceptual Color Differences of Smartphone Photographs

Authors: Zhihua Wang, Keshuo Xu, Yang Yang, Jianlei Dong, Shuhang Gu, Lihao Xu, Yuming Fang, Kede Ma

Abstract: Measuring perceptual color differences (CDs) is of great importance in modern smartphone photography. Despite the long history, most CD measures have been constrained by psychophysical data of homogeneous color patches or a limited number of simplistic natural photographic images. It is thus questionable whether existing CD measures generalize in the age of smartphone photography characterized by greater content complexities and learning-based image signal processors. In this paper, we put together the largest image dataset so far for perceptual CD assessment, in which the photographic images are 1) captured by six flagship smartphones, 2) altered by Photoshop, 3) post-processed by built-in filters of the smartphones, and 4) reproduced with incorrect color profiles. We then conduct a large-scale psychophysical experiment to gather perceptual CDs of 30,000 image pairs in a carefully controlled laboratory environment. Based on the newly established dataset, we make one of the first attempts to construct an end-to-end learnable CD formula based on a lightweight neural network, as a generalization of several previous metrics. Extensive experiments demonstrate that the optimized formula outperforms 33 existing CD measures by a large margin, offers reasonable local CD maps without the use of dense supervision, generalizes well to homogeneous color patch data, and empirically behaves as a proper metric in the mathematical sense. Our dataset and code are publicly available at https://github.com/hellooks/CDNet.

Submitted 31 March, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

Comments: 10 figures, 8 tables, 14 pages

arXiv:2204.03747 [pdf, other]

Implementation and Experimental Validation of Data-Driven Predictive Control for Dissipating Stop-and-Go Waves in Mixed Traffic

Authors: Jiawei Wang, Yang Zheng, Jianghong Dong, Chaoyi Chen, Mengchi Cai, Keqiang Li, Qing Xu

Abstract: In this paper, we present the first experimental results of data-driven predictive control for connected and autonomous vehicles (CAVs) in dissipating traffic waves. In particular, we consider a recent strategy of Data-EnablEd Predictive Leading Cruise Control (DeeP-LCC), which bypasses the need to identify the driving behaviors of surrounding vehicles and directly relies on measurable traffic data to achieve safe and optimal CAV control in mixed traffic. We present the implementation details of DeeP-LCC, including data collection, equilibrium estimation, and control execution. Based on a miniature experiment platform, we reproduce the phenomenon of stop-and-go waves in two typical traffic scenarios: 1) open straight-road scenario under external disturbances and 2) closed ring-road scenario with no bottlenecks. Our experiments clearly demonstrate that DeeP-LCC enables one or a few CAVs to dissipate the traffic waves in both traffic scenarios. These experimental findings validate the great potential of DeeP-LCC in smoothing practical traffic flow in the presence of noisy data, uncertain low-level vehicle dynamics, and communication and computation delays. The code and videos of our experimental results are available at https://github.com/soc-ucsd/DeeP-LCC.

Submitted 23 November, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

Comments: 14 pages, 11 figures

arXiv:2203.10078 [pdf, other]

Bayesian Inversion for Nonlinear Imaging Models using Deep Generative Priors

Authors: Pakshal Bohra, Thanh-an Pham, Jonathan Dong, Michael Unser

Abstract: Most modern imaging systems incorporate a computational pipeline to infer the image of interest from acquired measurements. The Bayesian approach to solve such ill-posed inverse problems involves the characterization of the posterior distribution of the image. It depends on the model of the imaging system and on prior knowledge on the image of interest. In this work, we present a Bayesian reconstruction framework for nonlinear imaging models where we specify the prior knowledge on the image through a deep generative model. We develop a tractable posterior-sampling scheme based on the Metropolis-adjusted Langevin algorithm for the class of nonlinear inverse problems where the forward model has a neural-network-like structure. This class includes most practical imaging modalities. We introduce the notion of augmented deep generative priors in order to suitably handle the recovery of quantitative images. We illustrate the advantages of our framework by applying it to two nonlinear imaging modalities: phase retrieval and optical diffraction tomography.

Submitted 25 May, 2023; v1 submitted 18 March, 2022; originally announced March 2022.

arXiv:2203.06543 [pdf, other]

Change Detection from Synthetic Aperture Radar Images via Dual Path Denoising Network

Authors: Junjie Wang, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Benefiting from the rapid and sustainable development of synthetic aperture radar (SAR) sensors, change detection from SAR images has received increasing attention over the past few years. Existing unsupervised deep learning-based methods have made great efforts to exploit robust feature representations, but they take considerable time to optimize parameters. Besides, these methods use clustering to obtain pseudo-labels for training, and the pseudo-labeled samples often involve errors, which can be considered as "label noise". To address these issues, we propose a Dual Path Denoising Network (DPDNet) for SAR image change detection. In particular, we introduce random label propagation to clean the label noise involved in preclassification. We also propose the distinctive patch convolution for feature representation learning to reduce the time consumption. Specifically, the attention mechanism is used to select distinctive pixels in the feature maps, and patches around these pixels are selected as convolution kernels. Consequently, the DPDNet does not require a great number of training samples for parameter optimization, and its computational efficiency is greatly enhanced. Extensive experiments have been conducted on five SAR datasets to verify the proposed DPDNet. The experimental results demonstrate that our method outperforms several state-of-the-art methods in change detection results.

Submitted 12 March, 2022; originally announced March 2022.

Comments: Accepted by IEEE JSTARS

arXiv:2203.06375 [pdf, ps, other]

doi 10.1109/TGRS.2022.3150970

SSCU-Net: Spatial-Spectral Collaborative Unmixing Network for Hyperspectral Images

Authors: Lin Qi, Feng Gao, Junyu Dong, Xinbo Gao, Qian Du

Abstract: Linear spectral unmixing is an essential technique in hyperspectral image processing and interpretation. In recent years, deep learning-based approaches have shown great promise in hyperspectral unmixing; in particular, unsupervised unmixing methods based on autoencoder networks are a recent trend. The autoencoder model, which automatically learns low-dimensional representations (abundances) and reconstructs data with their corresponding bases (endmembers), has achieved superior performance in hyperspectral unmixing. In this article, we explore the effective utilization of spatial and spectral information in autoencoder-based unmixing networks. Important findings on the use of spatial and spectral information in the autoencoder framework are discussed. Inspired by these findings, we propose a spatial-spectral collaborative unmixing network, called SSCU-Net, which learns a spatial autoencoder network and a spectral autoencoder network in an end-to-end manner to more effectively improve the unmixing performance. SSCU-Net is a two-stream deep network and shares an alternating architecture, where the two autoencoder networks are efficiently trained in a collaborative way for estimation of endmembers and abundances. Meanwhile, we propose a new spatial autoencoder network by introducing a superpixel segmentation method based on abundance information, which greatly facilitates the employment of spatial information and improves the accuracy of the unmixing network. Moreover, extensive ablation studies are carried out to investigate the performance gain of SSCU-Net. Experimental results on both synthetic and real hyperspectral data sets illustrate the effectiveness and competitiveness of the proposed SSCU-Net compared with several state-of-the-art hyperspectral unmixing methods.

Submitted 8 August, 2022; v1 submitted 12 March, 2022; originally announced March 2022.

Comments: IEEE TGRS 2022

arXiv:2202.12940 [pdf]

Fully-integrated multipurpose microwave frequency identification system on a single chip

Authors: Yuhan Yao, Yuhe Zhao, Yanxian Wei, Feng Zhou, Daigao Chen, Yuguang Zhang, Xi Xiao, Ming Li, Jianji Dong, Shaohua Yu, Xinliang Zhang

Abstract: We demonstrate a fully-integrated multipurpose microwave frequency identification system on a silicon-on-insulator platform. Thanks to its multipurpose features, the chip is able to identify different types of microwave signals, including single-frequency, multiple-frequency, chirped and frequency-hopping microwave signals, as well as discriminate instantaneous frequency variation among the frequency-modulated signals. This demonstration exhibits a fully integrated solution and fully functional microwave frequency identification, which can meet the requirements for reduced size, weight and power in future advanced microwave photonic processors.

Submitted 16 February, 2022; originally announced February 2022.

Comments: 23 pages, 6 figures

arXiv:2201.08954 [pdf, other]

Change Detection from Synthetic Aperture Radar Images via Graph-Based Knowledge Supplement Network

Authors: Junjie Wang, Feng Gao, Junyu Dong, Shan Zhang, Qian Du

Abstract: Synthetic aperture radar (SAR) image change detection is a vital yet challenging task in the field of remote sensing image analysis. Most previous works adopt a self-supervised method which uses pseudo-labeled samples to guide subsequent training and testing. However, deep networks commonly require many high-quality samples for parameter optimization. The noise in pseudo-labels inevitably affects the final change detection performance. To solve the problem, we propose a Graph-based Knowledge Supplement Network (GKSNet). To be more specific, we extract discriminative information from the existing labeled dataset as additional knowledge, to suppress the adverse effects of noisy samples to some extent. Afterwards, we design a graph transfer module to distill contextual information attentively from the labeled dataset to the target dataset, which bridges feature correlation between datasets. To validate the proposed method, we conducted extensive experiments on four SAR datasets, which demonstrated the superiority of the proposed GKSNet as compared to several state-of-the-art baselines. Our codes are available at https://github.com/summitgao/SAR_CD_GKSNet.

Submitted 9 February, 2022; v1 submitted 21 January, 2022; originally announced January 2022.

Comments: Accepted by IEEE JSTARS

arXiv:2201.08938 [pdf, other]

doi 10.1109/TGRS.2020.3015843

Adaptive DropBlock Enhanced Generative Adversarial Networks for Hyperspectral Image Classification

Authors: Junjie Wang, Feng Gao, Junyu Dong, Qian Du

Abstract: In recent years, hyperspectral image (HSI) classification based on generative adversarial networks (GAN) has achieved great progress. GAN-based classification methods can mitigate the limited training sample dilemma to some extent. However, several studies have pointed out that existing GAN-based HSI classification methods are heavily affected by the imbalanced training data problem. The discriminator in GAN always contradicts itself and tries to associate fake labels to the minority-class samples, and thus impairs the classification performance. Another critical issue is the mode collapse in GAN-based methods. The generator is only capable of producing samples within a narrow scope of the data space, which severely hinders the advancement of GAN-based HSI classification methods. In this paper, we propose Adaptive DropBlock-enhanced Generative Adversarial Networks (ADGAN) for HSI classification. First, to solve the imbalanced training data problem, we adjust the discriminator to be a single classifier, so that it will not contradict itself. Second, an adaptive DropBlock (AdapDrop) is proposed as a regularization method employed in the generator and discriminator to alleviate the mode collapse issue. AdapDrop generates drop masks with adaptive shapes instead of a fixed size region, and it alleviates the limitations of DropBlock in dealing with ground objects with various shapes. Experimental results on three HSI datasets demonstrated that the proposed ADGAN achieved superior performance over state-of-the-art GAN-based methods. Our codes are available at https://github.com/summitgao/HC_ADGAN

Submitted 21 January, 2022; originally announced January 2022.

Journal ref: in IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 6, pp. 5040-5053, June 2021

arXiv:2201.08935 [pdf, other]

doi 10.1109/LGRS.2020.2977838

SAR Image Change Detection Based on Multiscale Capsule Network

Authors: Yunhao Gao, Feng Gao, Junyu Dong, Heng-Chao Li

Abstract: Traditional synthetic aperture radar image change detection methods based on convolutional neural networks (CNNs) face the challenges of speckle noise and deformation sensitivity. To mitigate these issues, we propose a Multiscale Capsule Network (Ms-CapsNet) to extract the discriminative information between the changed and unchanged pixels. On the one hand, the multiscale capsule module is employed to exploit the spatial relationship of features. Therefore, equivariant properties can be achieved by aggregating the features from different positions. On the other hand, an adaptive fusion convolution (AFC) module is designed for the proposed Ms-CapsNet. Higher semantic features can be captured for the primary capsules. Features extracted by the AFC module significantly improve the robustness to speckle noise. The effectiveness of the proposed Ms-CapsNet is verified on three real SAR datasets. The comparison experiments with four state-of-the-art methods demonstrate the efficiency of the proposed method. Our codes are available at https://github.com/summitgao/SAR_CD_MS_CapsNet.

Submitted 21 January, 2022; originally announced January 2022.

Journal ref: in IEEE Geoscience and Remote Sensing Letters, vol. 18, no. 3, pp. 484-488, March 2021

arXiv:2112.12602 [pdf, other]

doi 10.1364/OL.457144

Artifacts in optical projection tomography due to refractive index mismatch: model and correction

Authors: Yan Liu, Jonathan Dong, Cédric Schmidt, Aleix Boquet-Pujadas, Jérôme Extermann, Michael Unser

Abstract: Optical Projection Tomography (OPT) is a powerful tool for 3D imaging of mesoscopic samples, thus of great importance to image whole organs for the study of various disease models in life sciences. OPT is able to achieve resolution at a few tens of microns over a large sample volume of several cubic centimeters. However, the reconstructed OPT images often suffer from artifacts caused by different kinds of physical miscalibration. This work focuses on the refractive index (RI) mismatch between the rotating object and the surrounding medium. We derive a 3D cone beam forward model to approximate the effect of RI mismatch and implement a fast and efficient reconstruction method to correct the induced seagull-shaped artifacts on experimental images of fluorescent beads.

Submitted 19 September, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

Journal ref: Optics Letters, 47(11), 2618-2621 (2022)

arXiv:2112.00312 [pdf, other]

Experimental Validation of Multi-lane Formation Control for Connected and Automated Vehicles in Multiple Scenarios

Authors: Mengchi Cai, Qing Xu, Chunying Yang, Jianghong Dong, Chaoyi Chen, Jiawei Wang, Jianqiang Wang, Keqiang Li

Abstract: Formation control methods of connected and automated vehicles have been proposed to smoothly switch the structure of vehicular formations in different scenarios. In previous research, simulations are often conducted to verify the performance of formation control methods. This paper presents the experimental results of multi-lane formation control for connected and automated vehicles. The coordinated formation control framework and specific methods utilized for different scenarios are introduced. The details of the experimental platform and vehicle control strategy are provided. Simulations and experiments are conducted in different scenarios, and the results indicate that the formation control method is applicable to multiple traffic scenarios and able to improve formation-structure-switching efficiency compared with benchmark methods.

Submitted 1 December, 2021; originally announced December 2021.

arXiv:2111.03064 [pdf, other]

Physics-Guided Generative Adversarial Networks for Sea Subsurface Temperature Prediction

Authors: Yuxin Meng, Eric Rigall, Xueen Chen, Feng Gao, Junyu Dong, Sheng Chen

Abstract: Sea subsurface temperature, an essential component of aquatic wildlife, underwater dynamics and heat transfer with the sea surface, is affected by global warming in climate change. Existing research is commonly based on either physics-based numerical models or data-based models. Physical modeling and machine learning are traditionally considered as two unrelated fields for the sea subsurface temperature prediction task, with very different scientific paradigms (physics-driven and data-driven). However, we believe both methods are complementary to each other. Physical modeling methods can offer the potential for extrapolation beyond observational conditions, while data-driven methods are flexible in adapting to data and are capable of detecting unexpected patterns. The combination of both approaches is very attractive and offers potential performance improvement. In this paper, we propose a novel framework based on a generative adversarial network (GAN) combined with a numerical model to predict sea subsurface temperature. First, a GAN-based model is used to learn the simplified physics between the surface temperature and the target subsurface temperature in the numerical model. Then, observation data are used to calibrate the GAN-based model parameters to obtain better prediction. We evaluate the proposed framework by predicting daily sea subsurface temperature in the South China Sea. Extensive experiments demonstrate the effectiveness of the proposed framework compared to existing state-of-the-art methods.

Submitted 4 November, 2021; originally announced November 2021.

Comments: This work has been accepted by IEEE TNNLS for publication. Our codes and datasets are available at https://github.com/mengyuxin520/PGGAN

arXiv:2110.12859 [pdf, other]

Multi-vehicle experiment platform: A Digital Twin Realization Method

Authors: Chunying Yang, Jianghong Dong, Qing Xu, Mengchi Cai, Hongmao Qin, Jianqiang Wang, Keqiang Li

Abstract: With the development of V2X technology, multi-vehicle cooperative control has been widely studied. However, field testing is rarely conducted due to financial and safety considerations. To solve this problem, this study proposes a digital twin method to carry out multi-vehicle experiments, which uses a combination of physical and virtual vehicles to perform coordination tasks. To confirm the effectiveness of this method, a prototype system is developed, which consists of a sand table testbed, its twin system and the cloud. Several aspects are quantified to describe system performance, including time delay and localization accuracy. Finally, a vehicle-level experiment in a platoon scenario is carried out, and the experimental results confirm the effectiveness of this method.

Submitted 25 October, 2021; originally announced October 2021.

arXiv:2110.11253 [pdf, ps, other]

Multimode Diagnosis for Switched Affine Systems with Noisy Measurement

Authors: Jingwei Dong, Arman Sharifi Kolarijani, Peyman Mohajerin Esfahani

Abstract: We study a diagnosis scheme to reliably detect the active mode of discrete-time, switched affine systems in the presence of measurement noise and asynchronous switching. The proposed scheme consists of two parts: (i) the construction of a bank of filters, and (ii) the introduction of a residual/threshold-based diagnosis rule. We develop an exact finite optimization-based framework to numerically solve for an optimal bank of filters in which the contribution of measurement noise to the residual is minimized. The design problem is safely approximated through linear matrix inequalities and thus becomes tractable. We further propose a thresholding policy along with probabilistic false-alarm guarantees to estimate the active system mode in real-time. In comparison with the existing results, the guarantees improve from a polynomial dependency in the probability of false alarm to a logarithmic form. This improvement is achieved under the additional assumption of sub-Gaussianity, which is expected in many applications. The performance of the proposed approach is validated through a numerical example and an application to a building radiant system.

Submitted 30 December, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

Comments: 25 pages, 15 figures

arXiv:2110.09049 [pdf, other]

Synthetic Aperture Radar Image Change Detection via Siamese Adaptive Fusion Network

Authors: Yunhao Gao, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Synthetic aperture radar (SAR) image change detection is a critical yet challenging task in the field of remote sensing image analysis. The task is non-trivial due to the following challenges: Firstly, intrinsic speckle noise of SAR images inevitably degrades the neural network because of error gradient accumulation. Furthermore, the correlation among various levels or scales of feature maps is difficult to capture through summation or concatenation. Toward this end, we propose a siamese adaptive fusion network for SAR image change detection. To be more specific, a two-branch CNN is utilized to extract high-level semantic features of multitemporal SAR images. Besides, an adaptive fusion module is designed to adaptively combine multiscale responses in convolutional layers. Therefore, the complementary information is exploited, and feature learning in change detection is further improved. Moreover, a correlation layer is designed to further explore the correlation between multitemporal images. Thereafter, robust feature representation is utilized for classification through a fully-connected layer with softmax. Experimental results on four real SAR datasets demonstrate that the proposed method exhibits superior performance against several state-of-the-art methods. Our codes are available at https://github.com/summitgao/SAR_CD_SAFNet.

Submitted 18 October, 2021; originally announced October 2021.

Comments: This work has been accepted by IEEE JSTARS for publication. Our codes are available at https://github.com/summitgao/SAR_CD_SAFNet

arXiv:2110.05556 [pdf]

Addressing crash-imminent situations caused by human driven vehicle errors in a mixed traffic stream: a model-based reinforcement learning approach for CAV

Authors: Jiqian Dong, Sikai Chen, Samuel Labi

Abstract: It is anticipated that the era of fully autonomous vehicle operations will be preceded by a lengthy "Transition Period" where the traffic stream will be mixed, that is, consisting of connected autonomous vehicles (CAVs), human-driven vehicles (HDVs) and connected human-driven vehicles (CHDVs). In recognition of the fact that public acceptance of CAVs will hinge on safety performance of automated driving systems, and that there will likely be safety challenges in the early part of the transition period, significant research efforts have been expended in the development of safety-conscious automated driving systems. Yet still, there appears to be a lacuna in the literature regarding the handling of the crash-imminent situations that are caused by errant human-driven vehicles (HDVs) in the vicinity of the CAV during operations on the roadway. In this paper, we develop a simple model-based Reinforcement Learning (RL) system that can be deployed in the CAV to generate trajectories that anticipate and avoid potential collisions caused by drivers of the HDVs. The model involves an end-to-end data-driven approach that contains a motion prediction model based on deep learning, and a fast trajectory planning algorithm based on model predictive control (MPC). The proposed system requires no prior knowledge or assumption about the physical environment including the vehicle dynamics, and therefore represents a general approach that can be deployed on any type of vehicle (e.g., truck, bus, motorcycle, etc.). The framework is trained and tested in the CARLA simulator with multiple collision-imminent scenarios, and the results indicate the proposed model can avoid the collision at a high success rate (>85%) even in highly compact and dangerous situations.

Submitted 11 October, 2021; originally announced October 2021.

Comments: Under review for presentation at TRB 2022 Annual Meeting

Showing 1–50 of 91 results for author: Dong, J