Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans

Authors: Lixing Tan, Shuang Song, Kangneng Zhou, Chengbo Duan, Lanying Wang, Huayang Ren, Linlin Liu, Wei Zhang, Ruoxiu Xiao

Abstract: X-ray images play a vital role in the intraoperative processes due to their high resolution and fast imaging speed and greatly promote the subsequent segmentation, registration and reconstruction. However, over-dosed X-rays pose potential risks to human health to some extent. Data-driven algorithms from volume scans to X-ray images are restricted by the scarcity of paired X-ray and volume data. Existing methods are mainly realized by modelling the whole X-ray imaging procedure. In this study, we propose a learning-based approach termed CT2X-GAN to synthesize the X-ray images in an end-to-end manner using the content and style disentanglement from three different image domains. Our method decouples the anatomical structure information from CT scans and style information from unpaired real X-ray images/digitally reconstructed radiography (DRR) images via a series of decoupling encoders. Additionally, we introduce a novel consistency regularization term to improve the stylistic resemblance between synthesized X-ray images and real X-ray images. Meanwhile, we also impose a supervised process by computing the similarity of computed real DRR and synthesized DRR images. We further develop a pose attention module to fully strengthen the comprehensive information in the decoupled content code from CT scans, facilitating high-quality multi-view image synthesis in the lower 2D space. Extensive experiments were conducted on the publicly available CTSpine1K dataset and achieved 97.8350, 0.0842 and 3.0938 in terms of FID, KID and defined user-scored X-ray similarity, respectively. In comparison with 3D-aware methods ($π$-GAN, EG3D), CT2X-GAN achieves better synthesis quality and greater realism relative to real X-ray images.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 13 pages, 10 figures

arXiv:2402.03245 [pdf, other]

On the Popov-Belevitch-Hautus tests for functional observability and output controllability

Authors: Arthur N. Montanari, Chao Duan, Adilson E. Motter

Abstract: Functional observability and output controllability are properties that establish the conditions respectively for the partial estimation and partial control of the system state. In the special case of full-state observability and controllability, the Popov-Belevitch-Hautus (PBH) tests provide conditions for the properties to hold based on the system eigenspace. Generalizations of the Popov-Belevitch-Hautus (PBH) test have been recently proposed for functional observability and output controllability but were proved to be valid only for diagonalizable systems thus far. Here, we rigorously establish a more general class of systems based on their Jordan decomposition under which a generalized PBH test for functional observability is valid. Likewise, we determine the class of systems under which the generalized PBH test is sufficient and necessary for output controllability. These results have immediate implications for observer and controller design, pole assignment, and optimal placement of sensors and drivers.

Submitted 5 February, 2024; originally announced February 2024.
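
For readers unfamiliar with the test named in the abstract above, the following is a minimal sketch of the classical full-state PBH observability test only, not the generalized test the paper establishes. It assumes NumPy is available; the function name `pbh_observable`, the tolerance, and the example matrices are illustrative choices, not taken from the paper.

```python
import numpy as np

def pbh_observable(A, C, tol=1e-9):
    """Classical PBH test: (A, C) is observable iff
    rank([A - lam*I; C]) == n for every eigenvalue lam of A."""
    A = np.asarray(A, dtype=float)
    C = np.atleast_2d(np.asarray(C, dtype=float))
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        # Stack (A - lam*I) on top of C and check for a rank drop.
        M = np.vstack([A - lam * np.eye(n), C])
        if np.linalg.matrix_rank(M, tol=tol) < n:
            return False
    return True

# Hypothetical example systems (not from the paper).
observable_case = pbh_observable([[0, 1], [-2, -3]], [[1, 0]])
# Diagonal A with a sensor that only sees the first state:
# the mode at eigenvalue 2 is invisible to the output.
unobservable_case = pbh_observable([[1, 0], [0, 2]], [[1, 0]])
```

Because eigenvalues are computed numerically, the rank check depends on the tolerance; this sketch is meant for small, well-conditioned examples.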

arXiv:2402.01115 [pdf, other]

Interpretation of Intracardiac Electrograms Through Textual Representations

Authors: William Jongwon Han, Diana Gomez, Avi Alok, Chaojing Duan, Michael A. Rosenberg, Douglas Weber, Emerson Liu, Ding Zhao

Abstract: Understanding the irregular electrical activity of atrial fibrillation (AFib) has been a key challenge in electrocardiography. For serious cases of AFib, catheter ablations are performed to collect intracardiac electrograms (EGMs). EGMs offer intricately detailed and localized electrical activity of the heart and are an ideal modality for interpretable cardiac studies. Recent advancements in artificial intelligence (AI) have allowed some works to utilize deep learning frameworks to interpret EGMs during AFib. Additionally, language models (LMs) have shown exceptional performance in being able to generalize to unseen domains, especially in healthcare. In this study, we are the first to leverage pretrained LMs for finetuning of EGM interpolation and AFib classification via masked language modeling. We formulate the EGM as a textual sequence and present competitive performances on AFib classification compared against other representations. Lastly, we provide a comprehensive interpretability study to provide a multi-perspective intuition of the model's behavior, which could greatly benefit clinical use.

Submitted 11 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

Comments: 18 pages, 9 figures; Accepted to CHIL 2024

ACM Class: I.2.7; J.3

arXiv:2401.16372 [pdf, other]

Duality between controllability and observability for target control and estimation in networks

Authors: Arthur N. Montanari, Chao Duan, Adilson E. Motter

Abstract: Controllability and observability are properties that establish the existence of full-state controllers and observers, respectively. The notions of output controllability and functional observability are generalizations that enable respectively the control and estimation of part of the state vector. These generalizations are of utmost importance in applications to high-dimensional systems, such as large-scale networks, in which only a target subset of variables (nodes) are sought to be controlled or estimated. Although the duality between controllability and observability is well established, the characterization of the duality between their generalized counterparts remains an outstanding problem. Here, we establish both the weak and the strong duality between output controllability and functional observability. Specifically, we show that functional observability of a system implies output controllability of a dual system (weak duality), and that under a certain condition the converse also holds (strong duality). As an application of the strong duality principle, we derive a necessary and sufficient condition for target control via static feedback. This allows us to establish a separation principle between the design of a feedback target controller and the design of a functional observer in closed-loop systems. These results generalize the well-known duality and separation principles in modern control theory.

Submitted 29 January, 2024; originally announced January 2024.
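
The well-established full-state duality that the abstract above takes as its starting point can be checked numerically. The sketch below illustrates only that classical statement (observability of (A, C) is equivalent to controllability of the dual pair (Aᵀ, Cᵀ)), not the weak or strong duality for the generalized properties. It assumes NumPy; the helper names `ctrb` and `obsv` and the example matrices are hypothetical.

```python
import numpy as np

def ctrb(A, B):
    """Kalman controllability matrix [B, AB, ..., A^(n-1) B]."""
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def obsv(A, C):
    """Kalman observability matrix [C; CA; ...; C A^(n-1)]."""
    blocks = [C]
    for _ in range(A.shape[0] - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

# Hypothetical 3-state system with a single sensor on the first state.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-1.0, -2.0, -3.0]])
C = np.array([[1.0, 0.0, 0.0]])

O = obsv(A, C)          # observability matrix of (A, C)
Cd = ctrb(A.T, C.T)     # controllability matrix of the dual pair (A^T, C^T)

# The two matrices are transposes of each other, so their ranks agree.
rank_obs = np.linalg.matrix_rank(O)
rank_ctrb_dual = np.linalg.matrix_rank(Cd)
```

Since the two Kalman matrices are transposes of one another, any rank-based algorithm written for one property applies directly to the other in the full-state case.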

arXiv:2309.14263 [pdf, other]

doi 10.1109/LCSYS.2023.3289827

Target Controllability and Target Observability of Structured Network Systems

Authors: Arthur N. Montanari, Chao Duan, Adilson E. Motter

Abstract: The duality between controllability and observability enables methods developed for full-state control to be applied to full-state estimation, and vice versa. In applications in which control or estimation of all state variables is unfeasible, the generalized notions of output controllability and functional observability establish the minimal conditions for the control and estimation of a target subset of state variables, respectively. Given the seemingly unrelated nature of these properties, thus far methods for target control and target estimation have been developed independently in the literature. Here, we characterize the graph-theoretic conditions for target controllability and target observability (which are, respectively, special cases of output controllability and functional observability for structured systems). This allows us to rigorously establish a weak and strong duality between these generalized properties. When both properties are equivalent (strongly dual), we show that efficient algorithms developed for target controllability can be used for target observability, and vice versa, for the optimal placement of sensors and drivers. These results are applicable to large-scale networks, in which control and monitoring are often sought for small subsets of nodes.

Submitted 25 September, 2023; originally announced September 2023.

Comments: Codes are available in GitHub (https://github.com/montanariarthur/TargetCtrb)

Journal ref: IEEE Control Systems Letters, vol. 7, pp. 3060-3065 (2023)

arXiv:2306.11977 [pdf]

Encoding Enhanced Complex CNN for Accurate and Highly Accelerated MRI

Authors: Zimeng Li, Sa Xiao, Cheng Wang, Haidong Li, Xiuchao Zhao, Caohui Duan, Qian Zhou, Qiuchen Rao, Yuan Fang, Junshuai Xie, Lei Shi, Fumin Guo, Chaohui Ye, Xin Zhou

Abstract: Magnetic resonance imaging (MRI) using hyperpolarized noble gases provides a way to visualize the structure and function of the human lung, but the long imaging time limits its broad research and clinical applications. Deep learning has demonstrated great potential for accelerating MRI by reconstructing images from undersampled data. However, most existing deep convolutional neural networks (CNNs) directly apply square convolution to k-space data without considering the inherent properties of k-space sampling, limiting k-space learning efficiency and image reconstruction quality. In this work, we propose an encoding enhanced (EN2) complex CNN for highly undersampled pulmonary MRI reconstruction. EN2 employs convolution along either the frequency or phase-encoding direction, resembling the mechanisms of k-space sampling, to maximize the utilization of the encoding correlation and integrity within a row or column of k-space. We also employ complex convolution to learn rich representations from the complex k-space data. In addition, we develop a feature-strengthened modularized unit to further boost the reconstruction performance. Experiments demonstrate that our approach can accurately reconstruct hyperpolarized 129Xe and 1H lung MRI from 6-fold undersampled k-space data and provide lung function measurements with minimal biases compared with fully sampled images. These results demonstrate the effectiveness of the proposed algorithmic components and indicate that the proposed approach could be used for accelerated pulmonary MRI in research and clinical lung disease patient care.

Submitted 13 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

arXiv:2304.06286 [pdf, other]

Automated Cardiovascular Record Retrieval by Multimodal Learning between Electrocardiogram and Clinical Report

Authors: Jielin Qiu, Jiacheng Zhu, Shiqi Liu, William Han, Jingqi Zhang, Chaojing Duan, Michael Rosenberg, Emerson Liu, Douglas Weber, Ding Zhao

Abstract: Automated interpretation of electrocardiograms (ECG) has garnered significant attention with the advancements in machine learning methodologies. Despite the growing interest, most current studies focus solely on classification or regression tasks, which overlook a crucial aspect of clinical cardio-disease diagnosis: the diagnostic report generated by experienced human clinicians. In this paper, we introduce a novel approach to ECG interpretation, leveraging recent breakthroughs in Large Language Models (LLMs) and Vision-Transformer (ViT) models. Rather than treating ECG diagnosis as a classification or regression task, we propose an alternative method of automatically identifying the most similar clinical cases based on the input ECG data. Also, since interpreting ECG as images is more affordable and accessible, we process ECG as encoded images and adopt a vision-language learning paradigm to jointly learn vision-language alignment between encoded ECG images and ECG diagnosis reports. Encoding ECG into images can result in an efficient ECG retrieval system, which will be highly practical and useful in clinical applications. More importantly, our findings could serve as a crucial resource for providing diagnostic services in underdeveloped regions.

Submitted 6 November, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

Comments: Accepted to the ML4H 2023 Proceedings track

arXiv:2208.05980 [pdf, other]

doi 10.1073/pnas.2122566119

Prevalence and scalable control of localized networks

Authors: Chao Duan, Takashi Nishikawa, Adilson E. Motter

Abstract: The ability to control network dynamics is essential for ensuring desirable functionality of many technological, biological, and social systems. Such systems often consist of a large number of network elements, and controlling large-scale networks remains challenging because the computation and communication requirements increase prohibitively fast with network size. Here, we introduce a notion of network locality that can be exploited to make the control of networks scalable even when the dynamics are nonlinear. We show that network locality is captured by an information metric and is almost universally observed across real and model networks. In localized networks, the optimal control actions and system responses are both shown to be necessarily concentrated in small neighborhoods induced by the information metric. This allows us to develop localized algorithms for determining network controllability and optimizing the placement of driver nodes. This also allows us to develop a localized algorithm for designing local feedback controllers that approach the performance of the corresponding best global controllers while incurring a computational cost orders-of-magnitude lower. We validate the locality, performance, and efficiency of the algorithms in Kuramoto oscillator networks as well as three large empirical networks: synchronization dynamics in the Eastern U.S. power grid, epidemic spreading mediated by the global air transportation network, and Alzheimer's disease dynamics in a human brain network. Taken together, our results establish that large networks can be controlled with computation and communication costs comparable to those for small networks.

Submitted 11 August, 2022; originally announced August 2022.

Comments: Codes are available on GitHub (https://github.com/cduan2020/LocalizedControl/)

Journal ref: Proc. Natl. Acad. Sci. U.S.A. 119, e2122566119 (2022)

arXiv:2207.07680 [pdf, other]

doi 10.1126/sciadv.abm8310

Network structural origin of instabilities in large complex systems

Authors: Chao Duan, Takashi Nishikawa, Deniz Eroglu, Adilson E. Motter

Abstract: A central issue in the study of large complex network systems, such as power grids, financial networks, and ecological systems, is to understand their response to dynamical perturbations. Recent studies recognize that many real networks show nonnormality and that nonnormality can give rise to reactivity--the capacity of a linearly stable system to amplify its response to perturbations, oftentimes exciting nonlinear instabilities. Here, we identify network structural properties underlying the pervasiveness of nonnormality and reactivity in real directed networks, which we establish using the most extensive data set of such networks studied in this context to date. The identified properties are imbalances between incoming and outgoing network links and paths at each node. Based on this characterization, we develop a theory that quantitatively predicts nonnormality and reactivity and explains the observed pervasiveness. We suggest that these results can be used to design, upgrade, control, and manage networks to avoid or promote network instabilities.

Submitted 19 July, 2022; v1 submitted 15 July, 2022; originally announced July 2022.

Comments: Includes Supplementary Materials

Journal ref: Science Advances 8, eabm8310 (2022)
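
To make the terms in the abstract above concrete, here is a small numerical sketch of a linearly stable but reactive system. It assumes the standard definition of reactivity as the largest eigenvalue of the symmetric part (A + Aᵀ)/2, which may differ from the exact measures used in the paper; the 2x2 matrix is a hypothetical example, and NumPy is assumed.

```python
import numpy as np

# Hypothetical directed two-node system with a strong one-way coupling.
A = np.array([[-1.0, 10.0],
              [0.0, -2.0]])

# Linear stability: all eigenvalues have negative real part.
spectral_abscissa = max(np.linalg.eigvals(A).real)

# Nonnormality: A does not commute with its transpose.
is_normal = np.allclose(A @ A.T, A.T @ A)

# Reactivity (assumed definition): largest eigenvalue of the symmetric part.
# A positive value means some perturbations grow initially before decaying.
H = (A + A.T) / 2
reactivity = max(np.linalg.eigvalsh(H))
```

Here the eigenvalues of A are both negative, yet the symmetric part has a positive eigenvalue, so a suitably chosen perturbation is transiently amplified even though it eventually decays.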

arXiv:2201.07256 [pdf, other]

doi 10.1073/pnas.2113750119

Functional observability and target state estimation in large-scale networks

Authors: Arthur N. Montanari, Chao Duan, Luis A. Aguirre, Adilson E. Motter

Abstract: The quantitative understanding and precise control of complex dynamical systems can only be achieved by observing their internal states via measurement and/or estimation. In large-scale dynamical networks, it is often difficult or physically impossible to have enough sensor nodes to make the system fully observable. Even if the system is in principle observable, high-dimensionality poses fundamental limits on the computational tractability and performance of a full-state observer. To overcome the curse of dimensionality, we instead require the system to be functionally observable, meaning that a targeted subset of state variables can be reconstructed from the available measurements. Here, we develop a graph-based theory of functional observability, which leads to highly scalable algorithms to i) determine the minimal set of required sensors and ii) design the corresponding state observer of minimum order. Compared to the full-state observer, the proposed functional observer achieves the same estimation quality with substantially less sensing and computational resources, making it suitable for large-scale networks. We apply the proposed methods to the detection of cyber-attacks in power grids from limited phase measurement data and the inference of the prevalence rate of infection during an epidemic under limited testing conditions. The applications demonstrate that the functional observer can significantly scale up our ability to explore otherwise inaccessible dynamical processes on complex networks.

Submitted 18 January, 2022; originally announced January 2022.

Comments: Codes are available in GitHub (https://github.com/montanariarthur/FunctionalObservability)

Journal ref: Proc. Natl. Acad. Sci. U.S.A. 119, e2113750119 (2022)

arXiv:2108.05898 [pdf, other]

doi 10.1109/TCNS.2021.3070665

Hierarchical Power Flow Control in Smart Grids: Enhancing Rotor Angle and Frequency Stability with Demand-Side Flexibility

Authors: Chao Duan, Pratyush Chakraborty, Takashi Nishikawa, Adilson E. Motter

Abstract: Large-scale integration of renewables in power systems gives rise to new challenges for keeping synchronization and frequency stability in volatile and uncertain power flow states. To ensure the safety of operation, the system must maintain adequate disturbance rejection capability at the time scales of both rotor angle and system frequency dynamics. This calls for flexibility to be exploited on both the generation and demand sides, compensating volatility and ensuring stability at the two separate time scales. This article proposes a hierarchical power flow control architecture that involves both transmission and distribution networks as well as individual buildings to enhance both small-signal rotor angle stability and frequency stability of the transmission network. The proposed architecture consists of a transmission-level optimizer enhancing system damping ratios, a distribution-level controller following transmission commands and providing frequency support, and a building-level scheduler accounting for quality of service and following the distribution-level targets. We validate the feasibility and performance of the whole control architecture through real-time hardware-in-loop tests involving real-world transmission and distribution network models along with real devices at the Stone Edge Farm Microgrid.

Submitted 12 August, 2021; originally announced August 2021.

Comments: To appear in IEEE Transactions on Control of Network Systems

Journal ref: IEEE Transactions on Control of Network Systems 8, 1046 (2021)

arXiv:2108.04836 [pdf, other]

doi 10.1109/TSG.2021.3084470

Practical Challenges in Real-time Demand Response

Authors: Chao Duan, Guna Bharati, Pratyush Chakraborty, Bo Chen, Takashi Nishikawa, Adilson E. Motter

Abstract: We report on a real-time demand response experiment with 100 controllable devices. The experiment reveals several key challenges in the deployment of a real-time demand response program, including time delays, uncertainties, characterization errors, multiple timescales, and nonlinearity, which have been largely ignored in previous studies. To resolve these practical issues, we develop and implement a two-level multi-loop control structure integrating feed-forward proportional-integral controllers and optimization solvers in closed loops, which eliminates steady-state errors and improves the dynamical performance of the overall building response. The proposed methods are validated by Hardware-in-the-Loop (HiL) tests.

Submitted 10 August, 2021; originally announced August 2021.

Comments: To appear in IEEE Transactions on Smart Grid

Journal ref: IEEE Transactions on Smart Grid 12, 4573 (2021)
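
The claim in the abstract above that proportional-integral control eliminates steady-state error can be illustrated with a toy simulation. This is only a sketch under strong assumptions: a hypothetical first-order plant with a constant disturbance, forward-Euler integration, and arbitrary gains. It does not model the paper's two-level structure, its feed-forward path, or its optimization solvers.

```python
def simulate(kp, ki, setpoint=1.0, disturbance=0.5,
             tau=1.0, dt=0.01, steps=5000):
    """Simulate tau * dy/dt = -y + u + disturbance under PI control
    and return the output after steps * dt seconds."""
    y = 0.0
    integral = 0.0
    for _ in range(steps):
        error = setpoint - y
        integral += error * dt               # accumulated error (I term)
        u = kp * error + ki * integral       # PI control law
        y += dt * (-y + u + disturbance) / tau
    return y

# Proportional-only control settles away from the setpoint...
y_p_only = simulate(kp=2.0, ki=0.0)
# ...while adding integral action drives the residual error to zero.
y_pi = simulate(kp=2.0, ki=1.0)
```

With these illustrative gains, the proportional-only loop settles where the control effort balances the plant and disturbance, short of the setpoint, whereas the integral term keeps adjusting the input until the error vanishes.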

arXiv:2103.07935 [pdf]

Scale-aware Neural Network for Semantic Segmentation of Multi-resolution Remote Sensing Images

Authors: Libo Wang, Ce Zhang, Rui Li, Chenxi Duan, Xiaoliang Meng, Peter M. Atkinson

Abstract: Assigning geospatial objects with specific categories at the pixel level is a fundamental task in remote sensing image analysis. Along with rapid development in sensor technologies, remotely sensed images can be captured at multiple spatial resolutions (MSR) with information content manifested at different scales. Extracting information from these MSR images represents huge opportunities for enhanced feature representation and characterisation. However, MSR images suffer from two critical issues: 1) increased scale variation of geo-objects and 2) loss of detailed information at coarse spatial resolutions. To bridge these gaps, in this paper, we propose a novel scale-aware neural network (SaNet) for semantic segmentation of MSR remotely sensed imagery. SaNet deploys a densely connected feature network (DCFFM) module to capture high-quality multi-scale context, such that the scale variation is handled properly and the quality of segmentation is increased for both large and small objects. A spatial feature recalibration (SFRM) module is further incorporated into the network to learn intact semantic content with enhanced spatial relationships, where the negative effects of information loss are removed. The combination of DCFFM and SFRM allows SaNet to learn scale-aware feature representation, which outperforms the existing multi-scale feature representation. Extensive experiments on three semantic segmentation datasets demonstrated the effectiveness of the proposed SaNet in cross-resolution segmentation.

Submitted 4 November, 2021; v1 submitted 14 March, 2021; originally announced March 2021.

arXiv:2012.10898 [pdf]

Multi-Head Linear Attention Generative Adversarial Network for Thin Cloud Removal

Authors: Chenxi Duan, Rui Li

Abstract: In remote sensing images, the existence of the thin cloud is an inevitable and ubiquitous phenomenon that crucially reduces the quality of imagery and limits the scenarios of application. Therefore, thin cloud removal is an indispensable procedure to enhance the utilization of remote sensing images. Generally, even though contaminated by thin clouds, the pixels still retain more or less surface information. Hence, different from thick cloud removal, thin cloud removal algorithms normally concentrate on inhibiting the cloud influence rather than substituting the cloud-contaminated pixels. Meanwhile, considering the surface features obscured by the cloud are usually similar to adjacent areas, the dependency between each pixel of the input is useful to reconstruct contaminated areas. In this paper, to make full use of the dependencies between pixels of the image, we propose a Multi-Head Linear Attention Generative Adversarial Network (MLA-GAN) for Thin Cloud Removal. The MLA-GAN is based on the encoding-decoding framework consisting of multiple attention-based layers and deconvolutional layers. Compared with six deep learning-based thin cloud removal benchmarks, the experimental results on the RICE1 and RICE2 datasets demonstrate that the proposed framework MLA-GAN has dominant advantages in thin cloud removal.

Submitted 20 December, 2020; originally announced December 2020.

arXiv:2009.02130 [pdf]

doi 10.1109/TGRS.2021.3093977

Multi-Attention-Network for Semantic Segmentation of Fine Resolution Remote Sensing Images

Authors: Rui Li, Shunyi Zheng, Chenxi Duan, Ce Zhang, Jianlin Su, P. M. Atkinson

Abstract: Semantic segmentation of remote sensing images plays an important role in a wide range of applications including land resource management, biosphere monitoring and urban planning. Although the accuracy of semantic segmentation in remote sensing images has been increased significantly by deep convolutional neural networks, several limitations exist in standard models. First, for encoder-decoder architectures such as U-Net, the utilization of multi-scale features causes the underuse of information, where low-level features and high-level features are concatenated directly without any refinement. Second, long-range dependencies of feature maps are insufficiently explored, resulting in sub-optimal feature representations associated with each semantic class. Third, even though the dot-product attention mechanism has been introduced and utilized in semantic segmentation to model long-range dependencies, the large time and space demands of attention impede the actual usage of attention in application scenarios with large-scale input. This paper proposes a Multi-Attention-Network (MANet) to address these issues by extracting contextual dependencies through multiple efficient attention modules. A novel attention mechanism of kernel attention with linear complexity is proposed to alleviate the large computational demand in attention. Based on kernel attention and channel attention, we integrate local feature maps extracted by ResNeXt-101 with their corresponding global dependencies and reweight interdependent channel maps adaptively. Numerical experiments on three large-scale fine resolution remote sensing images captured by different satellite sensors demonstrate the superior performance of the proposed MANet, outperforming the DeepLab V3+, PSPNet, FastFCN, DANet, OCRNet, and other benchmark approaches.

Submitted 23 November, 2020; v1 submitted 3 September, 2020; originally announced September 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2007.14902
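
The abstract above names kernel attention with linear complexity but does not spell out the kernel. A minimal sketch of the general idea, assuming a softplus feature map (an illustrative choice that this listing does not confirm) and hypothetical function names: because the similarity is a plain dot product of mapped features, the keys and values can be summarised once and reused for every query, instead of forming all N x N pairwise weights.

```python
import math


def softplus(x):
    # Positive feature map; assumed here for illustration only.
    return math.log1p(math.exp(x))


def feature_map(rows):
    return [[softplus(x) for x in row] for row in rows]


def quadratic_attention(q, k, v):
    """Reference form: builds every pairwise weight phi(q_i) . phi(k_j)."""
    fq, fk = feature_map(q), feature_map(k)
    out = []
    for qi in fq:
        weights = [sum(a * b for a, b in zip(qi, kj)) for kj in fk]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out


def linear_attention(q, k, v):
    """Same result, but keys and values are summarised once (cost linear in N)."""
    fq, fk = feature_map(q), feature_map(k)
    d, dv = len(fk[0]), len(v[0])
    # kv[a][c] = sum_j phi(k_j)[a] * v_j[c]; a d x dv summary of all keys/values.
    kv = [[sum(kj[a] * vj[c] for kj, vj in zip(fk, v)) for c in range(dv)]
          for a in range(d)]
    # k_sum[a] = sum_j phi(k_j)[a]; used for the normalising denominator.
    k_sum = [sum(kj[a] for kj in fk) for a in range(d)]
    out = []
    for qi in fq:
        denom = sum(qi[a] * k_sum[a] for a in range(d))
        out.append([sum(qi[a] * kv[a][c] for a in range(d)) / denom
                    for c in range(dv)])
    return out
```

Both functions return the same values up to floating-point rounding; only the order of the multiplications differs, which is what removes the quadratic term.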

arXiv:2008.04529 [pdf]

doi 10.3390/rs12203446

Thick Cloud Removal of Remote Sensing Images Using Temporal Smoothness and Sparsity-Regularized Tensor Optimization

Authors: Chenxi Duan, Jun Pan, Rui Li

Abstract: In remote sensing images, the presence of thick cloud accompanying cloud shadow is a high probability event, which can affect the quality of subsequent processing and limit the scenarios of application. Hence, removing the thick cloud and cloud shadow as well as recovering the cloud-contaminated pixels is indispensable to make good use of remote sensing images. In this paper, a novel thick cloud removal method for remote sensing images based on temporal smoothness and sparsity-regularized tensor optimization (TSSTO) is proposed. The basic idea of TSSTO is that the thick cloud and cloud shadow are not only sparse but also smooth along the horizontal and vertical directions in images, while the clean images are smooth along the temporal direction between images. Therefore, the sparsity norm is used to boost the sparsity of the cloud and cloud shadow, and unidirectional total variation (UTV) regularizers are applied to ensure the unidirectional smoothness. This paper utilizes the alternating direction method of multipliers to solve the presented model and generate the cloud and cloud shadow element as well as the clean element. The cloud and cloud shadow element is purified to get the cloud area and cloud shadow area. Then, the clean area of the original cloud-contaminated images is substituted into the corresponding area of the clean element. Finally, the reference image is selected to reconstruct details of the cloud area and cloud shadow area using the information cloning method.
A series of experiments are conducted both on simulated and real cloud-contaminated images from different sensors and with different resolutions, and the results demonstrate the potential of the proposed TSSTO method for removing cloud and cloud shadow from both qualitative and quantitative viewpoints.

Submitted 1 September, 2020; v1 submitted 11 August, 2020; originally announced August 2020.
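
The two kinds of regularizer named in the TSSTO abstract can be illustrated with a small sketch. It assumes the common definitions (an L1 norm for sparsity, and the sum of absolute forward differences along one axis for unidirectional total variation) and a hypothetical tensor layout of `[time][row][column]`; the paper's exact formulation and weights are not given in this listing.

```python
def l1_norm(tensor):
    """Sparsity term: sum of absolute values over the whole tensor."""
    return sum(abs(x) for image in tensor for row in image for x in row)


def unidirectional_tv(tensor, axis):
    """Sum of absolute forward differences along a single axis.

    axis 0 = temporal, axis 1 = vertical, axis 2 = horizontal
    (layout assumed for this sketch).
    """
    n_t, n_h, n_w = len(tensor), len(tensor[0]), len(tensor[0][0])
    step = [0, 0, 0]
    step[axis] = 1
    total = 0
    for t in range(n_t - step[0]):
        for i in range(n_h - step[1]):
            for j in range(n_w - step[2]):
                neighbour = tensor[t + step[0]][i + step[1]][j + step[2]]
                total += abs(neighbour - tensor[t][i][j])
    return total
```

Under these assumptions, a clean component that does not change over time has zero temporal UTV, while a cloud component is penalized through its L1 norm and its horizontal and vertical UTV.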

arXiv:2008.00168 [pdf]

doi 10.1080/10095020.2021.2017237

Land Cover Classification from Remote Sensing Images Based on Multi-Scale Fully Convolutional Network

Authors: Rui Li, Shunyi Zheng, Chenxi Duan, Ce Zhang

Abstract: In this paper, a Multi-Scale Fully Convolutional Network (MSFCN) with multi-scale convolutional kernel is proposed to exploit discriminative representations from two-dimensional (2D) satellite images.

Submitted 21 October, 2020; v1 submitted 1 August, 2020; originally announced August 2020.

arXiv:2007.13083 [pdf]

doi 10.1109/LGRS.2021.3052886

MACU-Net for Semantic Segmentation of Fine-Resolution Remotely Sensed Images

Authors: Rui Li, Chenxi Duan, Shunyi Zheng, Ce Zhang, Peter M. Atkinson

Abstract: Semantic segmentation of remotely sensed images plays an important role in land resource management, yield estimation, and economic assessment. U-Net, a deep encoder-decoder architecture, has been used frequently for image segmentation with high accuracy. In this Letter, we incorporate multi-scale features generated by different layers of U-Net and design a multi-scale skip connected and asymmetric-convolution-based U-Net (MACU-Net), for segmentation using fine-resolution remotely sensed images. Our design has the following advantages: (1) The multi-scale skip connections combine and realign semantic features contained in both low-level and high-level feature maps; (2) the asymmetric convolution block strengthens the feature representation and feature extraction capability of a standard convolution layer. Experiments conducted on two remotely sensed datasets captured by different satellite sensors demonstrate that the proposed MACU-Net outperforms the U-Net, U-NetPPL, U-Net 3+, amongst other benchmark approaches. Code is available at https://github.com/lironui/MACU-Net.

Submitted 4 May, 2022; v1 submitted 26 July, 2020; originally announced July 2020.

arXiv:2004.08112

LiteDenseNet: A Lightweight Network for Hyperspectral Image Classification

Authors: Rui Li, Chenxi Duan

Abstract: Hyperspectral Image (HSI) classification based on deep learning has been an attractive area in recent years. However, as data-driven algorithms, deep learning methods usually require substantial computational resources and high-quality labelled datasets, while high-performance computing and data annotation are expensive. In this paper, to reduce dependence on massive calculation and labelled samples, we propose a lightweight network architecture (LiteDenseNet) based on DenseNet for Hyperspectral Image Classification. Inspired by GoogLeNet and PeleeNet, we design a 3D two-way dense layer to capture the local and global features of the input. As convolution is a computationally intensive operation, we introduce group convolution to decrease calculation cost and parameter size further. Thus, the number of parameters and the computational cost are noticeably lower than those of comparable deep learning methods, which means LiteDenseNet has a simpler architecture and higher efficiency. A series of quantitative experiments on 6 widely used hyperspectral datasets show that the proposed LiteDenseNet obtains state-of-the-art performance, even when labelled samples are severely lacking.

Submitted 26 April, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

Comments: The random split among training, test, and validation is not acceptable in cube-based methods, which may lead to test data leakage
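
The LiteDenseNet abstract credits group convolution with reducing parameter size. A short sketch of the standard weight-count arithmetic for a 3D convolution shows why: each output channel only sees `c_in / groups` input channels, so the weight count shrinks by a factor of `groups`. The helper name and the channel sizes are hypothetical and are not taken from the paper.

```python
def conv3d_params(c_in, c_out, kernel, groups=1, bias=False):
    """Number of learnable parameters in a (grouped) 3D convolution."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    k_d, k_h, k_w = kernel
    # Each output channel connects to c_in // groups input channels.
    weights = (c_in // groups) * c_out * k_d * k_h * k_w
    return weights + (c_out if bias else 0)


# Illustrative sizes only: 32 -> 64 channels with a 3x3x3 kernel.
dense = conv3d_params(32, 64, (3, 3, 3))
grouped = conv3d_params(32, 64, (3, 3, 3), groups=4)
```

With these example sizes the grouped layer has one quarter of the weights of the dense layer.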

arXiv:1904.04427 [pdf, other]

3D Point Cloud Denoising via Deep Neural Network based Local Surface Estimation

Authors: Chaojing Duan, Siheng Chen, Jelena Kovacevic

Abstract: We present a neural-network-based architecture for 3D point cloud denoising called neural projection denoising (NPD). In our previous work, we proposed a two-stage denoising algorithm, which first estimates reference planes and then projects noisy points onto the estimated reference planes. Since the estimated reference planes are inevitably noisy, multi-projection is applied to stabilize the denoising performance. The NPD algorithm uses a neural network to estimate reference planes for points in noisy point clouds. With more accurate estimations of reference planes, we are able to achieve better denoising performance with only a one-time projection. To the best of our knowledge, NPD is the first work to denoise 3D point clouds with deep learning techniques. To conduct the experiments, we sample 40000 point clouds from the 3D data in ShapeNet to train a network and sample 350 point clouds from the 3D data in ModelNet10 to test. Experimental results show that our algorithm can estimate normal vectors of points in noisy point clouds. Compared to five competitive methods, the proposed algorithm achieves better denoising performance and produces much smaller variances.

Submitted 8 April, 2019; originally announced April 2019.
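
The projection step mentioned in the NPD abstract is ordinary point-to-plane projection. A minimal sketch, assuming the reference plane is described by a point on the plane and a normal vector (in NPD these would come from the network; here they are plain inputs, and the function name is hypothetical):

```python
import math


def project_to_plane(point, plane_point, normal):
    """Orthogonally project a 3D point onto the plane through plane_point."""
    length = math.sqrt(sum(n * n for n in normal))
    if length == 0:
        raise ValueError("normal vector must be non-zero")
    unit = [n / length for n in normal]
    # Signed distance from the point to the plane along the unit normal.
    distance = sum((p - c) * u for p, c, u in zip(point, plane_point, unit))
    # Move the point back along the normal by that distance.
    return [p - distance * u for p, u in zip(point, unit)]
```

The quality of the result depends entirely on the plane estimate, which is why the abstract ties better plane estimation to needing only a single projection.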

Showing 1–20 of 20 results for author: Duan, C