Search | arXiv e-print repository

Object-Attribute-Relation Representation based Video Semantic Communication

Authors: Qiyuan Du, Yi** Duan, Qianqian Yang, Xiaoming Tao, Mérouane Debbah

Abstract: With the rapid growth of multimedia data volume, there is an increasing need for efficient video transmission in applications such as virtual reality and future video streaming services. Semantic communication is emerging as a vital technique for ensuring efficient and reliable transmission in low-bandwidth, high-noise settings. However, most current approaches focus on joint source-channel coding (JSCC) that depends on end-to-end training. These methods often lack an interpretable semantic representation and struggle with adaptability to various downstream tasks. In this paper, we introduce the use of object-attribute-relation (OAR) as a semantic framework for videos to facilitate low bit-rate coding and enhance the JSCC process for more effective video transmission. We utilize OAR sequences for both low bit-rate representation and generative video reconstruction. Additionally, we incorporate OAR into the image JSCC model to prioritize communication resources for areas more critical to downstream tasks. Our experiments on traffic surveillance video datasets assess the effectiveness of our approach in terms of video transmission performance. The empirical findings demonstrate that our OAR-based video coding method not only outperforms H.265 coding at lower bit-rates but also synergizes with JSCC to deliver robust and efficient video transmission.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.02190 [pdf, ps, other]

Age of Trust (AoT): A Continuous Verification Framework for Wireless Networks

Authors: Yuquan Xiao, Qinghe Du, Wenchi Cheng, Panagiotis D. Diamantoulakis, George K. Karagiannidis

Abstract: Zero Trust is a new security vision for 6G networks that emphasises the philosophy of never trust and always verify. However, there is a fundamental trade-off between the wireless transmission efficiency and the trust level, which is reflected by the verification interval and its adaptation strategy. More importantly, the mathematical framework to characterise the trust level of the adaptive verification strategy is still missing. Inspired by this vision, we propose a concept called age of trust (AoT) to capture the characteristics of the trust level degrading over time, with the definition of the time elapsed since the last verification of the target user's trust plus the initial age, which depends on the trust level evaluated at that verification. The higher the trust level, the lower the initial age. To evaluate the trust level in the long term, the average AoT is used. We then investigate how to find a compromise between average AoT and wireless transmission efficiency with limited resources. In particular, we address the bi-objective optimization (BOO) problem between average AoT and throughput over a single link with arbitrary service process, where the identity of the receiver is constantly verified, and we devise a periodic verification scheme and a Q-learning-based scheme for constant process and random process, respectively. We also tackle the BOO problem in a multiple random access scenario, where a trust-enhanced frame-slotted ALOHA is designed. Finally, the numerical results show that our proposals can achieve a fair compromise between trust level and wireless transmission efficiency, and thus have a wide application prospect in various zero-trust architectures.

Submitted 4 June, 2024; originally announced June 2024.
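
The AoT definition quoted in the abstract above (time elapsed since the last verification plus an initial age) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the closed form for the average assumes a simple sawtooth model in which verification happens every `period` time units and the same initial age is assigned each time; the paper's actual schemes are not reproduced here.

```python
def age_of_trust(now, last_verification_time, initial_age):
    """AoT = time elapsed since the last verification + initial age."""
    if now < last_verification_time:
        raise ValueError("now must not precede the last verification")
    return (now - last_verification_time) + initial_age


def average_aot_periodic(period, initial_age):
    """Average AoT over one period under the assumed sawtooth model.

    Integrating (initial_age + s) for s in [0, period] and dividing by
    period gives initial_age + period / 2.
    """
    return initial_age + period / 2.0


def numeric_average_aot(period, initial_age, steps=1000):
    """Midpoint-rule check of the closed form above."""
    dt = period / steps
    total = sum(age_of_trust((i + 0.5) * dt, 0.0, initial_age) for i in range(steps))
    return total / steps
```

Under these assumptions a shorter verification period lowers the average AoT, which is the quantity the abstract trades off against throughput.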

arXiv:2406.02139 [pdf, other]

Statistical Age of Information: A Risk-Aware Metric and Its Applications in Status Updates

Authors: Yuquan Xiao, Qinghe Du, George K. Karagiannidis

Abstract: Age of information (AoI) is an effective measure to quantify the information freshness in wireless status update systems. It has been further validated that the peak AoI has the potential to capture the core characteristics of the aging process, and thus the average peak AoI is widely used to evaluate the long-term performance of information freshness. However, the average peak AoI is a risk-insensitive metric and therefore may not be well suited for evaluating critical status update services. Motivated by this concern, and following the spirit of entropic value-at-risk (EVaR) in the field of risk analysis, in this paper we present a concept, termed Statistical AoI, for providing a unified framework to guarantee various requirements of risk-sensitive status-update services with the demand on the violation probability of the peak age. In particular, as the constraint on the violation probability of the peak age varies from loose to strict, the statistical AoI evolves from the average peak AoI to the maximum peak AoI. We then investigate the statistical AoI minimization problem for status updates over wireless fading channels. It is interesting to note that the corresponding optimal sampling scheme varies from step to constant functions of the channel power gain as the peak age violation probability goes from one to zero. We also address the maximum statistical AoI minimization problem for multi-status updates with time division multiple access (TDMA), where a longer transmission time can improve reliability but may also cause a larger age. By solving this problem, we derive the optimal transmission time allocation scheme. Numerical results show that our proposals can better satisfy the diverse requirements of various risk-sensitive status update services, and demonstrate the great potential of improving information freshness compared to baseline approaches.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2403.10067 [pdf, other]

Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising

Authors: Shuai Hu, Feng Gao, Xiaowei Zhou, Junyu Dong, Qian Du

Abstract: Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data. However, simultaneously modeling global and local features is rarely explored to enhance HSI denoising. In this letter, we propose a hybrid convolution and attention network (HCANet), which leverages the strengths of both convolutional neural networks (CNNs) and Transformers. To enhance the modeling of both global and local features, we have devised a convolution and attention fusion module aimed at capturing long-range dependencies and neighborhood spectral correlations. Furthermore, to improve multi-scale information aggregation, we design a multi-scale feed-forward network to enhance denoising performance by extracting features at different scales. Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet. The proposed model is effective in removing various types of complex noise. Our codes are available at https://github.com/summitgao/HCANet.

Submitted 15 March, 2024; originally announced March 2024.

Comments: IEEE GRSL 2024

arXiv:2311.04442 [pdf, other]

SS-MAE: Spatial-Spectral Masked Auto-Encoder for Multi-Source Remote Sensing Image Classification

Authors: Junyan Lin, Feng Gao, Xiaocheng Shi, Junyu Dong, Qian Du

Abstract: Masked image modeling (MIM) is a highly popular and effective self-supervised learning method for image understanding. Existing MIM-based methods mostly focus on spatial feature modeling, neglecting spectral feature modeling. Meanwhile, because existing MIM-based methods use Transformers for feature extraction, some local or high-frequency information may get lost. To this end, we propose a spatial-spectral masked auto-encoder (SS-MAE) for HSI and LiDAR/SAR data joint classification. Specifically, SS-MAE consists of a spatial-wise branch and a spectral-wise branch. The spatial-wise branch masks random patches and reconstructs missing pixels, while the spectral-wise branch masks random spectral channels and reconstructs missing channels. Our SS-MAE fully exploits the spatial and spectral representations of the input data. Furthermore, to complement local features in the training stage, we add two lightweight CNNs for feature extraction. Both global and local features are taken into account for feature modeling. To demonstrate the effectiveness of the proposed SS-MAE, we conduct extensive experiments on three publicly available datasets. Extensive experiments on three multi-source datasets verify the superiority of our SS-MAE compared with several state-of-the-art baselines. The source codes are available at https://github.com/summitgao/SS-MAE.

Submitted 7 November, 2023; originally announced November 2023.

Comments: IEEE TGRS 2023

arXiv:2309.12010 [pdf, other]

Convolution and Attention Mixer for Synthetic Aperture Radar Image Change Detection

Authors: Haopeng Zhang, Zi**g Lin, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Synthetic aperture radar (SAR) image change detection is a critical task and has received increasing attention in the remote sensing community. However, existing SAR change detection methods are mainly based on convolutional neural networks (CNNs), with limited consideration of global attention mechanisms. In this letter, we explore a Transformer-like architecture for SAR change detection to incorporate global attention. To this end, we propose a convolution and attention mixer (CAMixer). First, to compensate for the inductive bias of the Transformer, we combine self-attention with shift convolution in a parallel way. The parallel design effectively captures the global semantic information via the self-attention and performs local feature extraction through shift convolution simultaneously. Second, we adopt a gating mechanism in the feed-forward network to enhance the non-linear feature transformation. The gating mechanism is formulated as the element-wise multiplication of two parallel linear layers. Important features can be highlighted, leading to high-quality representations against speckle noise. Extensive experiments conducted on three SAR datasets verify the superior performance of the proposed CAMixer. The source codes will be publicly available at https://github.com/summitgao/CAMixer.

Submitted 21 September, 2023; originally announced September 2023.

Comments: Accepted by IEEE GRSL
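
The gating mechanism described in the abstract above ("the element-wise multiplication of two parallel linear layers") can be sketched in a few lines of plain Python. This is only an illustration of that one sentence: the helper names, the weight shapes, and the absence of any activation function on either branch are assumptions, and the CAMixer repository should be consulted for the actual layer.

```python
def linear(x, weights, bias):
    """Apply one linear layer: y[i] = sum_j weights[i][j] * x[j] + bias[i]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def gated_projection(x, w_a, b_a, w_b, b_b):
    """Run two parallel linear layers on x and multiply them element-wise."""
    branch_a = linear(x, w_a, b_a)  # feature branch
    branch_b = linear(x, w_b, b_b)  # gate branch (no activation assumed)
    return [a * b for a, b in zip(branch_a, branch_b)]


# Toy usage with hand-picked weights (hypothetical values).
example_output = gated_projection(
    [1.0, 2.0],
    [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
    [[2.0, 0.0], [0.0, 3.0]], [1.0, 1.0],
)
```

Because the two branches are multiplied, a small value on the gate branch suppresses the corresponding feature, which is one way to read the abstract's claim that important features can be highlighted.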

arXiv:2308.13906 [pdf, other]

A Two-Dimensional Deep Network for RF-based Drone Detection and Identification Towards Secure Coverage Extension

Authors: Zixiao Zhao, Qinghe Du, Xiang Yao, Lei Lu, Shijiao Zhang

Abstract: As drones become increasingly prevalent in human life, they also raise security concerns such as unauthorized access and control, as well as collisions and interference with manned aircraft. Therefore, ensuring the ability to accurately detect and identify different drones holds significant implications for coverage extension. Assisted by machine learning, radio frequency (RF) detection can recognize the type and flight mode of drones based on the sampled drone signals. In this paper, we first utilize the Short-Time Fourier Transform (STFT) to extract two-dimensional features from the raw signals, which contain both time-domain and frequency-domain information. Then, we employ a Convolutional Neural Network (CNN) built with ResNet structure to achieve multi-class classification. Our experimental results show that the proposed ResNet-STFT can achieve higher accuracy and faster convergence on the extended dataset. Additionally, it exhibits balanced performance compared to other baselines on the raw dataset.

Submitted 26 August, 2023; originally announced August 2023.
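
The STFT step mentioned in the abstract above turns a one-dimensional signal into a two-dimensional time-frequency array. A minimal pure-Python sketch follows; the frame length, hop size, and periodic Hann window are illustrative assumptions and are not taken from the paper, and a real pipeline would normally use an FFT library rather than this direct DFT.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=16, hop=8):
    """Return a list of frames, each a list of DFT magnitudes (bins 0..frame_len/2)."""
    # Periodic Hann window to reduce spectral leakage between bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            row.append(abs(coeff))
        spectrogram.append(row)
    return spectrogram


# Toy usage: a pure tone that sits exactly on DFT bin 4 of a 16-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 16) for n in range(64)]
spec = stft_magnitude(tone)
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

Each row of `spec` is one time step and each column one frequency bin, so the array can be fed to an image-style classifier such as the CNN the abstract describes.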

arXiv:2304.09376 [pdf, other]

doi 10.1109/TGRS.2023.3257039

Physical Knowledge Enhanced Deep Neural Network for Sea Surface Temperature Prediction

Authors: Yuxin Meng, Feng Gao, Eric Rigall, Ran Dong, Junyu Dong, Qian Du

Abstract: Traditionally, numerical models have been deployed in oceanography studies to simulate ocean dynamics by representing physical equations. However, many factors pertaining to ocean dynamics seem to be ill-defined. We argue that transferring physical knowledge from observed data could further improve the accuracy of numerical models when predicting Sea Surface Temperature (SST). Recently, the advances in earth observation technologies have yielded a monumental growth of data. Consequently, it is imperative to explore ways in which to improve and supplement numerical models utilizing the ever-increasing amounts of historical observational data. To this end, we introduce a method for SST prediction that transfers physical knowledge from historical observations to numerical models. Specifically, we use a combination of an encoder and a generative adversarial network (GAN) to capture physical knowledge from the observed data. The numerical model data is then fed into the pre-trained model to generate physics-enhanced data, which can then be used for SST prediction. Experimental results demonstrate that the proposed method considerably enhances SST prediction performance when compared to several state-of-the-art baselines.

Submitted 18 April, 2023; originally announced April 2023.

Comments: IEEE TGRS 2023

arXiv:2304.09373 [pdf, other]

Multi-scale Adaptive Fusion Network for Hyperspectral Image Denoising

Authors: Haodong Pan, Feng Gao, Junyu Dong, Qian Du

Abstract: Removing the noise and improving the visual quality of hyperspectral images (HSIs) is challenging in academia and industry. Great efforts have been made to leverage local, global or spectral context information for HSI denoising. However, existing methods still have limitations in feature interaction exploitation among multiple scales and rich spectral structure preservation. In view of this, we propose a novel solution to investigate HSI denoising using a Multi-scale Adaptive Fusion Network (MAFNet), which can learn the complex nonlinear mapping between clean and noisy HSI. Two key components contribute to improving hyperspectral image denoising: a progressively multiscale information aggregation network and a co-attention fusion module. Specifically, we first generate a set of multiscale images and feed them into a coarse-fusion network to exploit the contextual texture correlation. Thereafter, a fine fusion network follows to exchange the information across the parallel multiscale subnetworks. Furthermore, we design a co-attention fusion module to adaptively emphasize informative features from different scales, and thereby enhance the discriminative learning capability for denoising. Extensive experiments on synthetic and real HSI datasets demonstrate that the proposed MAFNet has achieved better denoising performance than other state-of-the-art techniques. Our codes are available at https://github.com/summitgao/MAFNet.

Submitted 18 April, 2023; originally announced April 2023.

Comments: IEEE JSTARS 2023, code at: https://github.com/summitgao/MAFNet

arXiv:2303.11153 [pdf, ps, other]

doi 10.1109/TVT.2023.3336728

Statistical Age-of-Information Optimization for Status Update over Multi-State Fading Channels

Authors: Yuquan Xiao, Qinghe Du

Abstract: Age of information (AoI) is a powerful metric to evaluate the freshness of information, where minimization of average statistics, such as the average AoI and average peak AoI, currently prevails in guiding freshness optimization for related applications. Although minimizing the statistics does improve the received information's freshness for status update systems in the sense of average, the time-varying fading characteristics of wireless channels often cause uncertain yet frequent age violations. The recently proposed statistical AoI metric can better characterize more features of AoI dynamics, as it evaluates the achievable minimum peak AoI under a certain constraint on the age violation probability. In this paper, we study the statistical AoI minimization problem for status update systems over multi-state fading channels, which can effectively upper-bound the AoI violation probability but introduces prohibitively high computational complexity. To resolve this issue, we tackle the problem with a two-fold approach. For a small AoI exponent, the problem is approximated via a fractional programming problem. For a large AoI exponent, the problem is converted to a convex problem. Solving the two problems respectively, we derive the near-optimal sampling interval for diverse status update systems. Insightful observations are obtained on how the sampling interval should be tuned as a decreasing function of channel state information (CSI). Surprisingly, for the extremely stringent AoI requirement, the sampling interval converges to a constant regardless of the CSI's variation. Numerical results verify the effectiveness as well as the superiority of our proposed scheme.

Submitted 27 November, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

Comments: This paper has been accepted by IEEE Transactions on Vehicular Technology

arXiv:2302.14764 [pdf, other]

Robust Secrecy via Aerial Reflection and Jamming: Joint Optimization of Deployment and Transmission

Authors: Xiao Tang, Hongliang He, Limeng Dong, Lixin Li, Qinghe Du, Zhu Han

Abstract: Reconfigurable intelligent surfaces (RISs) are recognized with great potential to strengthen wireless security, yet the performance gain largely depends on the deployment location of RISs in the network topology. In this paper, we consider the anti-eavesdropping communication established through a RIS at a fixed location, as well as an aerial platform mounting another RIS and a friendly jammer to further improve the secrecy. The aerial RIS helps enhance the legitimate signal and the aerial cooperative jamming is strengthened through the fixed RIS. The security gain with aerial reflection and jamming is further improved with the optimized deployment of the aerial platform. We particularly consider the imperfect channel state information issue and address the worst-case secrecy for robust performance. The formulated robust secrecy rate maximization problem is decomposed into two layers, where the inner layer solves for reflection and jamming with robust optimization, and the outer layer tackles the aerial deployment through deep reinforcement learning. Simulation results show the deployment under different network topologies and demonstrate the performance superiority of our proposal in terms of the worst-case security provisioning as compared with the baselines.

Submitted 28 February, 2023; originally announced February 2023.

Comments: 14 pages, 10 figures, accepted at IEEE IoTJ

arXiv:2301.03335 [pdf, other]

doi 10.1109/TGRS.2023.3236154

Nearest Neighbor-Based Contrastive Learning for Hyperspectral and LiDAR Data Classification

Authors: Meng Wang, Feng Gao, Junyu Dong, Heng-Chao Li, Qian Du

Abstract: The joint hyperspectral image (HSI) and LiDAR data classification aims to interpret ground objects at a more detailed and precise level. Although deep learning methods have shown remarkable success in the multisource data classification task, self-supervised learning has rarely been explored. It is commonly nontrivial to build a robust self-supervised learning model for multisource data classification, due to the fact that the semantic similarities of neighborhood regions are not exploited in existing contrastive learning frameworks. Furthermore, the heterogeneous gap induced by the inconsistent distribution of multisource data impedes the classification performance. To overcome these disadvantages, we propose a Nearest Neighbor-based Contrastive Learning Network (NNCNet), which takes full advantage of large amounts of unlabeled data to learn discriminative feature representations. Specifically, we propose a nearest neighbor-based data augmentation scheme to use enhanced semantic relationships among nearby regions. The intermodal semantic alignments can be captured more accurately. In addition, we design a bilinear attention module to exploit the second-order and even high-order feature interactions between the HSI and LiDAR data. Extensive experiments on four public datasets demonstrate the superiority of our NNCNet over state-of-the-art methods. The source codes are available at https://github.com/summitgao/NNCNet.

Submitted 9 January, 2023; originally announced January 2023.

Comments: IEEE TGRS 2023

arXiv:2208.04481 [pdf, other]

doi 10.1109/LGRS.2022.3198088

Synthetic Aperture Radar Image Change Detection via Layer Attention-Based Noise-Tolerant Network

Authors: Desen Meng, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Recently, change detection methods for synthetic aperture radar (SAR) images based on convolutional neural networks (CNN) have gained increasing research attention. However, existing CNN-based methods neglect the interactions among multilayer convolutions, and errors involved in the preclassification restrict the network optimization. To this end, we propose a layer attention-based noise-tolerant network, termed LANTNet. In particular, we design a layer attention module that adaptively weights the features of different convolution layers. In addition, we design a noise-tolerant loss function that effectively suppresses the impact of noisy labels. Therefore, the model is insensitive to noisy labels in the preclassification results. The experimental results on three SAR datasets show that the proposed LANTNet performs better compared to several state-of-the-art methods. The source codes are available at https://github.com/summitgao/LANTNet

Submitted 8 August, 2022; originally announced August 2022.

Comments: Accepted by IEEE Geoscience and Remote Sensing Letters (GRSL) 2022, code is available at https://github.com/summitgao/LANTNet

arXiv:2206.05641 [pdf, ps, other]

An Unsupervised Deep-Learning Method for Bone Age Assessment

Authors: Hao Zhu, Wan-**g Nie, Yue-Jie Hou, Qi-Meng Du, Si-**g Li, Chi-Chun Zhou

Abstract: The bone age, reflecting the degree of development of the bones, can be used to predict the adult height and detect endocrine diseases of children. Both examinations of radiologists and variability of operators have a significant impact on bone age assessment. To decrease human intervention, machine learning algorithms are used to assess the bone age automatically. However, conventional supervised deep-learning methods need pre-labeled data. In this paper, based on the convolutional auto-encoder with constraints (CCAE), an unsupervised deep-learning model originally proposed for fingerprint classification, we adapt this model to the classification of bone age and name it BA-CCAE. In the proposed BA-CCAE model, the key regions of the raw X-ray images of the bone age are encoded, yielding the latent vectors. The K-means clustering algorithm is used to obtain the final classifications by grouping the latent vectors of the bone images. A set of experiments on the Radiological Society of North America pediatric bone age dataset (RSNA) shows that the accuracy of classifications at 48-month intervals is 76.15%. Although the accuracy is currently lower than that of most existing supervised models, the proposed BA-CCAE model can establish the classification of bone age without any pre-labeled data, and to the best of our knowledge, the proposed BA-CCAE is one of the few trials using an unsupervised deep-learning method for bone age assessment.

Submitted 11 June, 2022; originally announced June 2022.

arXiv:2205.09933 [pdf, other]

Hyperspectral Unmixing Based on Nonnegative Matrix Factorization: A Comprehensive Review

Authors: Xin-Ru Feng, Heng-Chao Li, Rui Wang, Qian Du, ** Jia, Antonio Plaza

Abstract: Hyperspectral unmixing has been an important technique that estimates a set of endmembers and their corresponding abundances from a hyperspectral image (HSI). Nonnegative matrix factorization (NMF) plays an increasingly significant role in solving this problem. In this article, we present a comprehensive survey of the NMF-based methods proposed for hyperspectral unmixing. Taking the NMF model as a baseline, we show how to improve NMF by utilizing the main properties of HSIs (e.g., spectral, spatial, and structural information). We categorize three important development directions including constrained NMF, structured NMF, and generalized NMF. Furthermore, several experiments are conducted to illustrate the effectiveness of associated algorithms. Finally, we conclude the article with possible future directions with the purposes of providing guidelines and inspiration to promote the development of hyperspectral unmixing.

Submitted 19 May, 2022; originally announced May 2022.
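
As a rough illustration of the baseline NMF model that the survey above starts from, the sketch below factors a nonnegative matrix V (bands x pixels) into W (bands x endmembers) and H (endmembers x pixels) with the classic multiplicative update rules for the squared Frobenius error. It is plain Python for readability; the function names, the toy data, and the choice of update rule are assumptions for illustration, and none of the constrained, structured, or generalized variants the survey covers (for example an abundance sum-to-one constraint) are included.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF; returns W, H and the error history."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive initialisation keeps every later update nonnegative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[r][j] * num[r][j] / (den[r][j] + eps)
              for j in range(n_cols)] for r in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][r] * num[i][r] / (den[i][r] + eps)
              for r in range(rank)] for i in range(n_rows)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors


# Toy data: 3 "bands", 4 "pixels", mixed from 2 hypothetical endmembers.
w_true = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
h_true = [[0.2, 0.8, 0.5, 1.0], [0.8, 0.2, 0.5, 0.0]]
v_example = matmul(w_true, h_true)
w_est, h_est, error_history = nmf(v_example, rank=2)
```

In unmixing terms the columns of `w_est` play the role of endmember spectra and the columns of `h_est` the per-pixel abundances, although the factorization is not unique and the recovered factors need not match `w_true` and `h_true`.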

arXiv:2205.08839 [pdf, other]

A Survey on Hyperspectral Image Restoration: From the View of Low-Rank Tensor Approximation

Authors: Na Liu, Wei Li, Yinjian Wang, Rao Tao, Qian Du, Jocelyn Chanussot

Abstract: The ability to capture fine spectral discriminative information enables hyperspectral images (HSIs) to observe, detect and identify objects with subtle spectral discrepancy. However, the captured HSIs may not represent the true distribution of ground objects and the received reflectance at imaging instruments may be degraded, owing to environmental disturbances, atmospheric effects and sensors' hardware limitations. These degradations include but are not limited to: complex noise (i.e., Gaussian noise, impulse noise, sparse stripes, and their mixtures), heavy stripes, deadlines, cloud and shadow occlusion, blurring and spatial-resolution degradation and spectral absorption, etc. These degradations dramatically reduce the quality and usefulness of HSIs. Low-rank tensor approximation (LRTA) is an emerging technique that has gained much attention in the HSI restoration community, with an ever-growing theoretical foundation and pivotal technological innovation. Compared to low-rank matrix approximation (LRMA), LRTA is capable of characterizing more complex intrinsic structure of high-order data and has more efficient learning abilities, being established to address convex and non-convex inverse optimization problems induced by HSI restoration. This survey mainly attempts to present a sophisticated, cutting-edge, and comprehensive technical survey of LRTA toward HSI restoration, specifically focusing on the following six topics: Denoising, Destriping, Inpainting, Deblurring, Super-resolution and Fusion. The theoretical development and variants of LRTA techniques are also elaborated. For each topic, the state-of-the-art restoration methods are compared by assessing their performance both quantitatively and visually. Open issues and challenges are also presented, including model formulation, algorithm design, prior exploration and application concerning the interpretation requirements.

Submitted 18 May, 2022; originally announced May 2022.

Comments: Under review in Science China Information Sciences

arXiv:2204.04462 [pdf, ps, other]

doi 10.1109/TNNLS.2020.3028945

A3CLNN: Spatial, Spectral and Multiscale Attention ConvLSTM Neural Network for Multisource Remote Sensing Data Classification

Authors: Heng-Chao Li, Wen-Shuai Hu, Wei Li, Jun Li, Qian Du, Antonio Plaza

Abstract: The problem of effectively exploiting the information from multiple data sources has become a relevant but challenging research topic in remote sensing. In this paper, we propose a new approach to exploit the complementarity of two data sources: hyperspectral images (HSIs) and light detection and ranging (LiDAR) data. Specifically, we develop a new dual-channel spatial, spectral and multiscale attention convolutional long short-term memory neural network (called dual-channel A3CLNN) for feature extraction and classification of multisource remote sensing data. Spatial, spectral and multiscale attention mechanisms are first designed for HSI and LiDAR data in order to learn spectral- and spatial-enhanced feature representations, and to represent multiscale information for different classes. In the designed fusion network, a novel composite attention learning mechanism (combined with a three-level fusion strategy) is used to fully integrate the features in these two data sources. Finally, inspired by the idea of transfer learning, a novel stepwise training strategy is designed to yield a final classification result. Our experimental results, conducted on several multisource remote sensing data sets, demonstrate that the newly proposed dual-channel A3CLNN exhibits better feature representation ability (leading to more competitive classification performance) than other state-of-the-art methods.

Submitted 9 April, 2022; originally announced April 2022.

Comments: 16 pages, 10 figures

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 2, pp. 747-761, Feb. 2022

arXiv:2204.00260 [pdf, other]

doi 10.1109/TGRS.2022.3193109

MS-HLMO: Multi-scale Histogram of Local Main Orientation for Remote Sensing Image Registration

Authors: Chenzhong Gao, Wei Li, Ran Tao, Qian Du

Abstract: Multi-source image registration is challenging due to intensity, rotation, and scale differences among the images. Considering the characteristics and differences of multi-source remote sensing images, a feature-based registration algorithm named Multi-scale Histogram of Local Main Orientation (MS-HLMO) is proposed. Harris corner detection is first adopted to generate feature points. The HLMO feature of each Harris feature point is extracted on a Partial Main Orientation Map (PMOM) with a Generalized Gradient Location and Orientation Histogram-like (GGLOH) feature descriptor, which provides high intensity, rotation, and scale invariance. The feature points are matched through a multi-scale matching strategy. Comprehensive experiments on 17 multi-source remote sensing scenes demonstrate that the proposed MS-HLMO and its simplified version MS-HLMO$^+$ outperform other competitive registration algorithms in terms of effectiveness and generalization.

Submitted 1 April, 2022; originally announced April 2022.

arXiv:2203.06543 [pdf, other]

Change Detection from Synthetic Aperture Radar Images via Dual Path Denoising Network

Authors: Junjie Wang, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Benefiting from the rapid and sustainable development of synthetic aperture radar (SAR) sensors, change detection from SAR images has received increasing attention over the past few years. Existing unsupervised deep learning-based methods have made great efforts to exploit robust feature representations, but they consume much time to optimize parameters. Besides, these methods use clustering to obtain pseudo-labels for training, and the pseudo-labeled samples often involve errors, which can be considered as "label noise". To address these issues, we propose a Dual Path Denoising Network (DPDNet) for SAR image change detection. In particular, we introduce random label propagation to clean the label noise involved in preclassification. We also propose distinctive patch convolution for feature representation learning to reduce the time consumption. Specifically, the attention mechanism is used to select distinctive pixels in the feature maps, and patches around these pixels are selected as convolution kernels. Consequently, the DPDNet does not require a great number of training samples for parameter optimization, and its computational efficiency is greatly enhanced. Extensive experiments have been conducted on five SAR datasets to verify the proposed DPDNet. The experimental results demonstrate that our method outperforms several state-of-the-art methods in change detection results.

Submitted 12 March, 2022; originally announced March 2022.

Comments: Accepted by IEEE JSTARS

arXiv:2203.06375 [pdf, ps, other]

doi 10.1109/TGRS.2022.3150970

SSCU-Net: Spatial-Spectral Collaborative Unmixing Network for Hyperspectral Images

Authors: Lin Qi, Feng Gao, Junyu Dong, Xinbo Gao, Qian Du

Abstract: Linear spectral unmixing is an essential technique in hyperspectral image processing and interpretation. In recent years, deep learning-based approaches have shown great promise in hyperspectral unmixing; in particular, unsupervised unmixing methods based on autoencoder networks are a recent trend. The autoencoder model, which automatically learns low-dimensional representations (abundances) and reconstructs data with their corresponding bases (endmembers), has achieved superior performance in hyperspectral unmixing. In this article, we explore the effective utilization of spatial and spectral information in autoencoder-based unmixing networks. Important findings on the use of spatial and spectral information in the autoencoder framework are discussed. Inspired by these findings, we propose a spatial-spectral collaborative unmixing network, called SSCU-Net, which learns a spatial autoencoder network and a spectral autoencoder network in an end-to-end manner to more effectively improve the unmixing performance. SSCU-Net is a two-stream deep network and shares an alternating architecture, where the two autoencoder networks are efficiently trained in a collaborative way for estimation of endmembers and abundances. Meanwhile, we propose a new spatial autoencoder network by introducing a superpixel segmentation method based on abundance information, which greatly facilitates the employment of spatial information and improves the accuracy of the unmixing network. Moreover, extensive ablation studies are carried out to investigate the performance gain of SSCU-Net. Experimental results on both synthetic and real hyperspectral data sets illustrate the effectiveness and competitiveness of the proposed SSCU-Net compared with several state-of-the-art hyperspectral unmixing methods.

Submitted 8 August, 2022; v1 submitted 12 March, 2022; originally announced March 2022.

Comments: IEEE TGRS 2022

arXiv:2202.10635 [pdf, other]

Online Learning of Trellis Diagram Using Neural Network for Robust Detection and Decoding

Authors: Jie Yang, Qinghe Du, Yi Jiang

Abstract: This paper studies machine learning-assisted maximum likelihood (ML) and maximum a posteriori (MAP) receivers for a communication system with memory, which can be modelled by a trellis diagram. The prerequisite of the ML/MAP receiver is to obtain the likelihood of the received samples under different state transitions of the trellis diagram, which relies on the channel state information (CSI) and the distribution of the channel noise. We propose to learn the trellis diagram in real time using an artificial neural network (ANN) trained by a pilot sequence. This approach, termed the online learning of trellis diagram (OLTD), requires neither the CSI nor statistics of the noise, and can be incorporated into the classic Viterbi and BCJR algorithms. It is shown to significantly outperform the model-based methods in non-Gaussian channels. It requires much less training overhead than the state-of-the-art methods, and hence is more feasible for real implementations. As an illustrative example, the OLTD-based BCJR is applied to a Bluetooth low energy (BLE) receiver trained only by a 256-sample pilot sequence. Moreover, the OLTD-based BCJR can accommodate turbo equalization, while the state-of-the-art BCJRNet/ViterbiNet cannot. As an interesting by-product, we propose an enhancement to the BLE standard by introducing a bit interleaver to its physical layer; the resultant improvement of the receiver sensitivity can make it a better fit for some Internet of Things (IoT) communications.

Submitted 21 February, 2022; originally announced February 2022.

Comments: 10 pages

arXiv:2201.08954 [pdf, other]

Change Detection from Synthetic Aperture Radar Images via Graph-Based Knowledge Supplement Network

Authors: Junjie Wang, Feng Gao, Junyu Dong, Shan Zhang, Qian Du

Abstract: Synthetic aperture radar (SAR) image change detection is a vital yet challenging task in the field of remote sensing image analysis. Most previous works adopt a self-supervised method which uses pseudo-labeled samples to guide subsequent training and testing. However, deep networks commonly require many high-quality samples for parameter optimization. The noise in pseudo-labels inevitably affects the final change detection performance. To solve the problem, we propose a Graph-based Knowledge Supplement Network (GKSNet). To be more specific, we extract discriminative information from the existing labeled dataset as additional knowledge, to suppress the adverse effects of noisy samples to some extent. Afterwards, we design a graph transfer module to distill contextual information attentively from the labeled dataset to the target dataset, which bridges feature correlation between datasets. To validate the proposed method, we conducted extensive experiments on four SAR datasets, which demonstrated the superiority of the proposed GKSNet as compared to several state-of-the-art baselines. Our codes are available at https://github.com/summitgao/SAR_CD_GKSNet.

Submitted 9 February, 2022; v1 submitted 21 January, 2022; originally announced January 2022.

Comments: Accepted by IEEE JSTARS

arXiv:2201.08938 [pdf, other]

doi 10.1109/TGRS.2020.3015843

Adaptive DropBlock Enhanced Generative Adversarial Networks for Hyperspectral Image Classification

Authors: Junjie Wang, Feng Gao, Junyu Dong, Qian Du

Abstract: In recent years, hyperspectral image (HSI) classification based on generative adversarial networks (GAN) has achieved great progress. GAN-based classification methods can mitigate the limited training sample dilemma to some extent. However, several studies have pointed out that existing GAN-based HSI classification methods are heavily affected by the imbalanced training data problem. The discriminator in GAN always contradicts itself and tries to associate fake labels to the minority-class samples, and thus impairs the classification performance. Another critical issue is the mode collapse in GAN-based methods. The generator is only capable of producing samples within a narrow scope of the data space, which severely hinders the advancement of GAN-based HSI classification methods. In this paper, we propose Adaptive DropBlock-enhanced Generative Adversarial Networks (ADGAN) for HSI classification. First, to solve the imbalanced training data problem, we adjust the discriminator to be a single classifier, so that it will not contradict itself. Second, an adaptive DropBlock (AdapDrop) is proposed as a regularization method employed in the generator and discriminator to alleviate the mode collapse issue. AdapDrop generates drop masks with adaptive shapes instead of a fixed size region, and it alleviates the limitations of DropBlock in dealing with ground objects with various shapes. Experimental results on three HSI datasets demonstrated that the proposed ADGAN achieved superior performance over state-of-the-art GAN-based methods. Our codes are available at https://github.com/summitgao/HC_ADGAN

Submitted 21 January, 2022; originally announced January 2022.

Journal ref: in IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 6, pp. 5040-5053, June 2021

arXiv:2112.14608 [pdf, other]

HPRN: Holistic Prior-embedded Relation Network for Spectral Super-Resolution

Authors: Chaoxiong Wu, Jiaojiao Li, Rui Song, Yunsong Li, Qian Du

Abstract: Spectral super-resolution (SSR) refers to the hyperspectral image (HSI) recovery from an RGB counterpart. Due to the one-to-many nature of the SSR problem, a single RGB image can be reprojected to many HSIs. The key to tackling this ill-posed problem is to plug in multi-source prior information such as the natural spatial context-prior of RGB images, deep feature-prior or inherent statistical-prior of HSIs, etc., so as to effectively alleviate the degree of ill-posedness. However, most current approaches only consider the general and limited priors in their customized convolutional neural networks (CNNs), which leads to the inability to guarantee the confidence and fidelity of reconstructed spectra. In this paper, we propose a novel holistic prior-embedded relation network (HPRN) to integrate comprehensive priors to regularize and optimize the solution space of SSR. Basically, the core framework is delicately assembled by several multi-residual relation blocks (MRBs) that fully facilitate the transmission and utilization of the low-frequency content prior of RGBs. Innovatively, the semantic prior of RGB inputs is introduced to mark category attributes, and a semantic-driven spatial relation module (SSRM) is invented to perform the feature aggregation of clustered similar range for refining recovered characteristics. Additionally, we develop a transformer-based channel relation module (TCRM), which breaks the habit of employing scalars as the descriptors of channel-wise relations in the previous deep feature-prior, and replaces them with certain vectors to make the mapping function more robust and smoother. In order to maintain the mathematical correlation and spectral consistency between hyperspectral bands, the second-order prior constraints (SOPC) are incorporated into the loss function to guide the HSI reconstruction.

Submitted 8 February, 2022; v1 submitted 29 December, 2021; originally announced December 2021.

arXiv:2112.10755 [pdf, other]

Discovering State Variables Hidden in Experimental Data

Authors: Boyuan Chen, Kuang Huang, Sunand Raghupathi, Ishaan Chandratreya, Qiang Du, Hod Lipson

Abstract: All physical laws are described as relationships between state variables that give a complete and non-redundant description of the relevant system dynamics. However, despite the prevalence of computing power and AI, the process of identifying the hidden state variables themselves has resisted automation. Most data-driven methods for modeling physical phenomena still assume that observed data streams already correspond to relevant state variables. A key challenge is to identify the possible sets of state variables from scratch, given only high-dimensional observational data. Here we propose a new principle for determining how many state variables an observed system is likely to have, and what these variables might be, directly from video streams. We demonstrate the effectiveness of this approach using video recordings of a variety of physical dynamical systems, ranging from elastic double pendulums to fire flames. Without any prior knowledge of the underlying physics, our algorithm discovers the intrinsic dimension of the observed dynamics and identifies candidate sets of state variables. We suggest that this approach could help catalyze the understanding, prediction and control of increasingly complex systems. Project website is at: https://www.cs.columbia.edu/~bchen/neural-state-variables

Submitted 20 December, 2021; originally announced December 2021.

Comments: Project website with code, data, and overview video is at: https://www.cs.columbia.edu/~bchen/neural-state-variables

arXiv:2110.09744 [pdf, ps, other]

doi 10.1109/TGRS.2022.3169228

Spectral Variability Augmented Sparse Unmixing of Hyperspectral Images

Authors: Ge Zhang, Shaohui Mei, Mingyang Ma, Yan Feng, Qian Du

Abstract: Spectral unmixing (SU) expresses the mixed pixels existing in hyperspectral images as the product of endmember and abundance, and has been widely used in hyperspectral imagery analysis. However, the influence of light, acquisition conditions and the inherent properties of materials causes the identified endmembers to vary spectrally within a given image (construed as spectral variability). To address this issue, recent methods usually use an a priori obtained spectral library to represent multiple characteristic spectra of the same object, but few of them extract the spectral variability explicitly. In this paper, a spectral variability augmented sparse unmixing model (SVASU) is proposed, in which the spectral variability is extracted for the first time. The variable spectra are divided into two parts, intrinsic spectrum and spectral variability, for spectral reconstruction, and modeled synchronously in the SU model by adding regularization terms that restrict the sparsity of abundance and the generalization of the variability coefficient. It is noted that the spectral variability library and the intrinsic spectral library are both constructed from the in-situ observed image. Experimental results over both synthetic and real-world data sets demonstrate that the augmented decomposition by spectral variability significantly improves the unmixing performance over decomposition by spectral library alone, as well as over state-of-the-art algorithms.

Submitted 21 October, 2021; v1 submitted 19 October, 2021; originally announced October 2021.

arXiv:2110.09049 [pdf, other]

Synthetic Aperture Radar Image Change Detection via Siamese Adaptive Fusion Network

Authors: Yunhao Gao, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Synthetic aperture radar (SAR) image change detection is a critical yet challenging task in the field of remote sensing image analysis. The task is non-trivial due to the following challenges: Firstly, intrinsic speckle noise of SAR images inevitably degrades the neural network because of error gradient accumulation. Furthermore, the correlation among various levels or scales of feature maps is difficult to achieve through summation or concatenation. Toward this end, we propose a siamese adaptive fusion network for SAR image change detection. To be more specific, a two-branch CNN is utilized to extract high-level semantic features of multitemporal SAR images. Besides, an adaptive fusion module is designed to adaptively combine multiscale responses in convolutional layers. Therefore, the complementary information is exploited, and feature learning in change detection is further improved. Moreover, a correlation layer is designed to further explore the correlation between multitemporal images. Thereafter, robust feature representation is utilized for classification through a fully-connected layer with softmax. Experimental results on four real SAR datasets demonstrate that the proposed method exhibits superior performance against several state-of-the-art methods. Our codes are available at https://github.com/summitgao/SAR_CD_SAFNet.

Submitted 18 October, 2021; originally announced October 2021.

Comments: This work has been accepted by IEEE JSTARS for publication. Our codes are available at https://github.com/summitgao/SAR_CD_SAFNet

arXiv:2105.07364 [pdf, other]

doi 10.1109/TGRS.2021.3080580

BDANet: Multiscale Convolutional Neural Network with Cross-directional Attention for Building Damage Assessment from Satellite Images

Authors: Yu Shen, Sijie Zhu, Taojiannan Yang, Chen Chen, Delu Pan, Jianyu Chen, Liang Xiao, Qian Du

Abstract: Fast and effective responses are required when a natural disaster (e.g., earthquake, hurricane, etc.) strikes. Building damage assessment from satellite imagery is critical before relief effort is deployed. With a pair of pre- and post-disaster satellite images, building damage assessment aims at predicting the extent of damage to buildings. With the powerful ability of feature representation, deep neural networks have been successfully applied to building damage assessment. Most existing works simply concatenate pre- and post-disaster images as input of a deep neural network without considering their correlations. In this paper, we propose a novel two-stage convolutional neural network for Building Damage Assessment, called BDANet. In the first stage, a U-Net is used to extract the locations of buildings. Then the network weights from the first stage are shared in the second stage for building damage assessment. In the second stage, a two-branch multi-scale U-Net is employed as backbone, where pre- and post-disaster images are fed into the network separately. A cross-directional attention module is proposed to explore the correlations between pre- and post-disaster images. Moreover, CutMix data augmentation is exploited to tackle the challenge of difficult classes. The proposed method achieves state-of-the-art performance on a large-scale dataset -- xBD. The code is available at https://github.com/ShaneShen/BDANet-Building-Damage-Assessment.

Submitted 16 May, 2021; originally announced May 2021.

Comments: arXiv admin note: text overlap with arXiv:2010.14014

arXiv:2104.06699 [pdf, other]

doi 10.1109/LGRS.2021.3073900

Change Detection in Synthetic Aperture Radar Images Using a Dual-Domain Network

Authors: Xiaofan Qu, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Change detection from synthetic aperture radar (SAR) imagery is a critical yet challenging task. Existing methods mainly focus on feature extraction in the spatial domain, and little attention has been paid to the frequency domain. Furthermore, in patch-wise feature analysis, some noisy features in the marginal region may be introduced. To tackle the above two challenges, we propose a Dual-Domain Network. Specifically, we take features from the discrete cosine transform domain into consideration and the reshaped DCT coefficients are integrated into the proposed model as the frequency domain branch. Feature representations from both the frequency and spatial domains are exploited to alleviate the speckle noise. In addition, we further propose a multi-region convolution module, which emphasizes the central region of each patch. The contextual information and central region features are modeled adaptively. The experimental results on three SAR datasets demonstrate the effectiveness of the proposed model. Our codes are available at https://github.com/summitgao/SAR_CD_DDNet.

Submitted 14 April, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: Accepted by IEEE Geoscience and Remote Sensing Letters, Code: https://github.com/summitgao/SAR_CD_DDNet

arXiv:2103.15502 [pdf, other]

doi 10.1109/LGRS.2021.3068558

Remote Sensing Image Translation via Style-Based Recalibration Module and Improved Style Discriminator

Authors: Tiange Zhang, Feng Gao, Junyu Dong, Qian Du

Abstract: Existing remote sensing change detection methods are heavily affected by seasonal variation. Since vegetation colors are different between winter and summer, such variations are inclined to be falsely detected as changes. In this letter, we propose an image translation method to solve the problem. A style-based recalibration module is introduced to capture seasonal features effectively. Then, a new style discriminator is designed to improve the translation performance. The discriminator can not only produce a decision for the fake or real sample, but also return a style vector according to the channel-wise correlations. Extensive experiments are conducted on a season-varying dataset. The experimental results show that the proposed method can effectively perform image translation, thereby consistently improving the season-varying image change detection performance. Our codes and data are available at https://github.com/summitgao/RSIT_SRM_ISD.

Submitted 29 March, 2021; originally announced March 2021.

Comments: Accepted by IEEE Geoscience and Remote Sensing Letters, Code: https://github.com/summitgao/RSIT_SRM_ISD

arXiv:2012.08388 [pdf, other]

doi 10.1016/j.trc.2021.103189

Dynamic driving and routing games for autonomous vehicles on networks: A mean field game approach

Authors: Kuang Huang, Xu Chen, Xuan Di, Qiang Du

Abstract: This paper aims to answer the research question as to optimal design of decision-making processes for autonomous vehicles (AVs), including dynamical selection of driving velocity and route choices on a transportation network. Dynamic traffic assignment (DTA) has been widely used to model travelers' route choice and/or departure-time choice and predict dynamic traffic flow evolution in the short term. However, the existing DTA models do not explicitly describe one's selection of driving velocity on a road link. Driving velocity choice may not be crucial for modeling the movement of human drivers but it is a must-have control to maneuver AVs. In this paper, we aim to develop a game-theoretic model to solve for AVs' optimal driving strategies of velocity control in the interior of a road link and route choice at a junction node. To this end, we will first reinterpret the DTA problem as an N-car differential game and show that this game can be tackled with a general mean field game-theoretic framework. The developed mean field game is challenging to solve because of the forward and backward structure for velocity control and the complementarity conditions for route choice. An efficient algorithm is developed to address these challenges. The model and the algorithm are illustrated on the Braess network and the OW network with a single destination. On the Braess network, we first compare the LWR based DTA model with the proposed game and find that the driving and routing control navigates AVs with overall lower costs. We then compare the total travel cost without and with the middle link and find that the Braess paradox may still arise under certain conditions. We also test our proposed model and solution algorithm on the OW network.

Submitted 15 June, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

Comments: 32 pages, 13 figures

Journal ref: Transportation Research Part C: Emerging Technologies, 128, 103189 (2021)

arXiv:2008.05457 [pdf, other]

doi 10.1109/TGRS.2020.3016820

More Diverse Means Better: Multimodal Deep Learning Meets Remote Sensing Imagery Classification

Authors: Danfeng Hong, Lianru Gao, Naoto Yokoya, Jing Yao, Jocelyn Chanussot, Qian Du, Bing Zhang

Abstract: Classification and identification of the materials lying over or beneath the Earth's surface have long been a fundamental but challenging research topic in geoscience and remote sensing (RS) and have garnered growing attention owing to the recent advancements of deep learning techniques. Although deep networks have been successfully applied in single-modality-dominated classification tasks, their performance inevitably meets a bottleneck in complex scenes that need to be finely classified, due to the limitation of information diversity. In this work, we provide a baseline solution to the aforementioned difficulty by developing a general multimodal deep learning (MDL) framework. In particular, we also investigate a special case of multi-modality learning (MML) -- cross-modality learning (CML) that exists widely in RS image classification applications. By focusing on "what", "where", and "how" to fuse, we show different fusion strategies as well as how to train deep networks and build the network architecture. Specifically, five fusion architectures are introduced and developed, further being unified in our MDL framework. More significantly, our framework is not limited to pixel-wise classification tasks but is also applicable to spatial information modeling with convolutional neural networks (CNNs). To validate the effectiveness and superiority of the MDL framework, extensive experiments related to the settings of MML and CML are conducted on two different multimodal RS datasets. Furthermore, the codes and datasets will be available at https://github.com/danfenghong/IEEE_TGRS_MDL-RS, contributing to the RS community.

Submitted 12 August, 2020; originally announced August 2020.

Journal ref: IEEE Transactions on Geoscience and Remote Sensing, 2020

arXiv:2008.00542 [pdf, other]

doi 10.1109/TGRS.2020.3014286

Efficient Deep Learning of Non-local Features for Hyperspectral Image Classification

Authors: Yu Shen, Sijie Zhu, Chen Chen, Qian Du, Liang Xiao, Jianyu Chen, Delu Pan

Abstract: Deep learning based methods, such as the convolutional neural network (CNN), have demonstrated their efficiency in hyperspectral image (HSI) classification. These methods can automatically learn spectral-spatial discriminative features within local patches. However, each pixel in an HSI is related not only to its nearby pixels but also to pixels far away from itself. Therefore, to incorporate the long-range contextual information, a deep fully convolutional network (FCN) with an efficient non-local module, named ENL-FCN, is proposed for HSI classification. In the proposed framework, a deep FCN considers an entire HSI as input and extracts spectral-spatial information in a local receptive field. The efficient non-local module is embedded in the network as a learning unit to capture the long-range contextual information. Different from the traditional non-local neural networks, the long-range contextual information is extracted in a specially designed criss-cross path for computation efficiency. Furthermore, by using a recurrent operation, each pixel's response is aggregated from all pixels of the HSI. The benefits of our proposed ENL-FCN are threefold: 1) the long-range contextual information is incorporated effectively, 2) the efficient module can be freely embedded in a deep neural network in a plug-and-play fashion, and 3) it has far fewer learning parameters and requires fewer computational resources. The experiments conducted on three popular HSI datasets demonstrate that the proposed method achieves state-of-the-art classification performance with lower computational cost in comparison with several leading deep neural networks for HSI.

Submitted 2 August, 2020; originally announced August 2020.

arXiv:2004.09392 [pdf, other]

doi 10.1016/j.cma.2020.113514

A non-cooperative meta-modeling game for automated third-party calibrating, validating, and falsifying constitutive laws with parallelized adversarial attacks

Authors: Kun Wang, WaiChing Sun, Qiang Du

Abstract: The evaluation of constitutive models, especially for high-risk and high-regret engineering applications, requires efficient and rigorous third-party calibration, validation and falsification. While there are numerous efforts to develop paradigms and standard procedures to validate models, difficulties may arise due to the sequential, manual and often biased nature of the commonly adopted calibration and validation processes, thus slowing down data collection, hampering the progress towards discovering new physics, increasing expenses and possibly leading to misinterpretations of the credibility and application ranges of proposed models. This work attempts to introduce concepts from game theory and machine learning techniques to overcome many of these existing difficulties. We introduce an automated meta-modeling game where two competing AI agents systematically generate experimental data to calibrate a given constitutive model and to explore its weakness, in order to improve experiment design and model robustness through competition. The two agents automatically search for the Nash equilibrium of the meta-modeling game in an adversarial reinforcement learning framework without human intervention. By capturing all possible design options of the laboratory experiments into a single decision tree, we recast the design of experiments as a game of combinatorial moves that can be resolved through deep reinforcement learning by the two competing players. Our adversarial framework emulates idealized scientific collaborations and competitions among researchers to achieve a better understanding of the application range of the learned material laws and prevent misinterpretations caused by conventional AI-based third-party validation.

Submitted 13 April, 2020; originally announced April 2020.

MSC Class: 35Q70; 74C05; 65G20; 68T42

arXiv:1906.01554 [pdf, other]

doi 10.1109/ITSC.2019.8917021

Stabilizing Traffic via Autonomous Vehicles: A Continuum Mean Field Game Approach

Authors: Kuang Huang, Xuan Di, Qiang Du, Xi Chen

Abstract: This paper presents scalable traffic stability analysis for both pure autonomous vehicle (AV) traffic and mixed traffic based on continuum traffic flow models. Human vehicles are modeled by a non-equilibrium traffic flow model, i.e., Aw-Rascle-Zhang (ARZ), which is unstable. AVs are modeled by the mean field game which assumes AVs are rational agents with anticipation capacities. It is shown from linear stability analysis and numerical experiments that AVs help stabilize the traffic. Further, we quantify the impact of the AV penetration rate and controller design on the traffic stability. The results may provide insights for AV manufacturers and city planners.

Submitted 22 January, 2020; v1 submitted 4 June, 2019; originally announced June 2019.

Comments: 6 pages

Journal ref: 2019 22nd Intelligent Transportation Systems Conference (ITSC)

arXiv:1905.01662 [pdf, other]

doi 10.1109/TGRS.2018.2849692

GETNET: A General End-to-end Two-dimensional CNN Framework for Hyperspectral Image Change Detection

Authors: Qi Wang, Zhenghang Yuan, Qian Du, Xuelong Li

Abstract: Change detection (CD) is an important application of remote sensing, which provides timely change information about the large-scale Earth surface. With the emergence of hyperspectral imagery, CD technology has been greatly promoted, as hyperspectral data with high spectral resolution are capable of detecting finer changes than the traditional multispectral imagery. Nevertheless, the high dimension of hyperspectral data makes it difficult to implement traditional CD algorithms. Besides, endmember abundance information at the subpixel level is often not fully utilized. In order to better handle the high-dimension problem and explore abundance information, this paper presents a General End-to-end Two-dimensional CNN (GETNET) framework for hyperspectral image change detection (HSI-CD). The main contributions of this work are threefold: 1) A mixed-affinity matrix that integrates subpixel representation is introduced to mine more cross-channel gradient features and fuse multi-source information; 2) A 2-D CNN is designed to learn the discriminative features effectively from multi-source data at a higher level and enhance the generalization ability of the proposed CD algorithm; 3) A new HSI-CD data set is designed for the objective comparison of different methods. Experimental results on real hyperspectral data sets demonstrate that the proposed method outperforms most state-of-the-art methods.

Submitted 5 May, 2019; originally announced May 2019.

arXiv:1903.06053 [pdf, other]

doi 10.3934/dcdsb.2020131

A Game-Theoretic Framework for Autonomous Vehicles Velocity Control: Bridging Microscopic Differential Games and Macroscopic Mean Field Games

Authors: Kuang Huang, Xuan Di, Qiang Du, Xi Chen

Abstract: This paper proposes an efficient computational framework for longitudinal velocity control of a large number of autonomous vehicles (AVs) and develops a traffic flow theory for AVs. Instead of hypothesizing explicitly how AVs drive, our goal is to design future AVs as rational, utility-optimizing agents that continuously select optimal velocity over a planning horizon. With a large number of interacting AVs, this design problem can become computationally intractable. This paper aims to tackle such a challenge by employing mean field approximation and deriving a mean field game (MFG) as the limiting differential game with an infinite number of agents. The proposed micro-macro model allows one to define individuals on a microscopic level as utility-optimizing agents while translating rich microscopic behaviors to macroscopic models. Different from existing studies on the application of MFG to traffic flow models, the present study offers a systematic framework to apply MFG to autonomous vehicle velocity control. The MFG-based AV controller is shown to mitigate traffic jams faster than the LWR-based controller. MFG also embodies classical traffic flow models with behavioral interpretation, thereby providing a new traffic flow theory for AVs.

Submitted 10 December, 2020; v1 submitted 14 March, 2019; originally announced March 2019.

Comments: 31 pages, 11 figures

MSC Class: Primary: 49N90; 90B20; Secondary: 35Q91

Journal ref: Discrete & Continuous Dynamical Systems - B,22,11,0,0,2020-4-26

Showing 1–37 of 37 results for author: Du, Q