Search | arXiv e-print repository

arXiv:2406.14069 [pdf, other]

Towards Multi-modality Fusion and Prototype-based Feature Refinement for Clinically Significant Prostate Cancer Classification in Transrectal Ultrasound

Authors: Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zou, Jianhua Zhou, Yi Wang

Abstract: Prostate cancer is a highly prevalent cancer and ranks as the second leading cause of cancer-related deaths in men globally. Recently, the utilization of multi-modality transrectal ultrasound (TRUS) has gained significant traction as a valuable technique for guiding prostate biopsies. In this study, we propose a novel learning framework for clinically significant prostate cancer (csPCa) classification using multi-modality TRUS. The proposed framework employs two separate 3D ResNet-50 networks to extract distinctive features from B-mode and shear wave elastography (SWE). Additionally, an attention module is incorporated to effectively refine B-mode features and aggregate the extracted features from both modalities. Furthermore, we utilize a few-shot segmentation task to enhance the capacity of the classification encoder. Due to the limited availability of csPCa masks, a prototype correction module is employed to extract representative prototypes of csPCa. The performance of the framework is assessed on a large-scale dataset consisting of 512 TRUS videos with biopsy-proved prostate cancer. The results demonstrate the strong capability in accurately identifying csPCa, achieving an area under the curve (AUC) of 0.86. Moreover, the framework generates visual class activation mapping (CAM), which can serve as valuable assistance for localizing csPCa. These CAM images may offer valuable guidance during TRUS-guided targeted biopsies, enhancing the efficacy of the biopsy procedure. The code is available at https://github.com/2313595986/SmileCode.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.05982 [pdf]

Artificial Intelligence for Neuro MRI Acquisition: A Review

Authors: Hongjia Yang, Guanhua Wang, Ziyu Li, Haoxiang Li, Jialan Zheng, Yuxin Hu, Xiaozhi Cao, Congyu Liao, Huihui Ye, Qiyuan Tian

Abstract: Magnetic resonance imaging (MRI) has significantly benefited from the resurgence of artificial intelligence (AI). By leveraging AI's capabilities in large-scale optimization and pattern recognition, innovative methods are transforming the MRI acquisition workflow, including planning, sequence design, and correction of acquisition artifacts. These emerging algorithms demonstrate substantial potential in enhancing the efficiency and throughput of acquisition steps. This review discusses several pivotal AI-based methods in neuro MRI acquisition, focusing on their technological advances, impact on clinical practice, and potential risks.

Submitted 9 June, 2024; originally announced June 2024.

Comments: Submitted to MAGMA for review

arXiv:2404.16412 [pdf, ps, other]

Distributed Matrix Pencil Formulations for Prescribed-Time Leader-Following Consensus of MASs with Unknown Sensor Sensitivity

Authors: Hefu Ye, Changyun Wen, Yongduan Song

Abstract: In this paper, we address the problem of prescribed-time leader-following consensus of heterogeneous multi-agent systems (MASs) in the presence of unknown sensor sensitivity. Under a connected undirected topology, we propose a time-varying dual observer/controller design framework that makes use of regular local and inaccurate feedback to achieve consensus tracking within a prescribed time. In particular, the developed analysis framework is applicable to MASs equipped with sensors of different sensitivities. One of the design innovations involves constructing a distributed matrix pencil formulation based on worst-case sensors, yielding control parameters with sufficient robustness yet relatively low conservatism. Another novelty is the construction of the control gains, which consists of the product of a proportional coefficient obtained from the matrix pencil formulation and a classic time-varying function that grows to infinity or a novel bounded time-varying function. Furthermore, it is possible to extend the prescribed-time distributed protocol to infinite time domain by introducing the bounded time-varying gain technique without sacrificing the ultimate control accuracy, and the corresponding technical proof is comprehensive. The effectiveness of the method is demonstrated through a group of 5 single-link robot manipulators.

Submitted 25 April, 2024; originally announced April 2024.

Comments: 10 pages, 1 figure

arXiv:2402.08987 [pdf, other]

Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer

Authors: Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zhou, Jianhua Zhou, Yi Wang

Abstract: Prostate cancer is the most common noncutaneous cancer in the world. Recently, multi-modality transrectal ultrasound (TRUS) has increasingly become an effective tool for the guidance of prostate biopsies. With the aim of effectively identifying prostate cancer, we propose a framework for the classification of clinically significant prostate cancer (csPCa) from multi-modality TRUS videos. The framework utilizes two 3D ResNet-50 models to extract features from B-mode images and shear wave elastography images, respectively. An adaptive spatial fusion module is introduced to aggregate two modalities' features. An orthogonal regularized loss is further used to mitigate feature redundancy. The proposed framework is evaluated on an in-house dataset containing 512 TRUS videos, and achieves favorable performance in identifying csPCa with an area under curve (AUC) of 0.84. Furthermore, the visualized class activation mapping (CAM) images generated from the proposed framework may provide valuable guidance for the localization of csPCa, thus facilitating the TRUS-guided targeted biopsy. Our code is publicly available at https://github.com/2313595986/ProstateTRUS.

Submitted 17 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

arXiv:2401.07515 [pdf, other]

On Purely Data-Driven Massive MIMO Detectors

Authors: Hao Ye, Le Liang

Abstract: To enhance the performance of massive multi-input multi-output (MIMO) detection using deep learning, prior research primarily adopts a model-driven methodology, integrating deep neural networks (DNNs) with traditional iterative detectors. Despite these efforts, achieving a purely data-driven detector has remained elusive, primarily due to the inherent complexities arising from the problem's high dimensionality. This paper introduces ChannelNet, a simple yet effective purely data-driven massive MIMO detector. ChannelNet embeds the channel matrix into the network as linear layers rather than viewing it as input, enabling scalability to massive MIMO scenarios. ChannelNet is computationally efficient and has a computational complexity of $\mathcal{O}(N_t N_r)$, where $N_t$ and $N_r$ represent the numbers of transmit and receive antennas, respectively. Despite the low computation complexity, ChannelNet demonstrates robust empirical performance, matching or surpassing state-of-the-art detectors in various scenarios. In addition, theoretical insights establish ChannelNet as a universal approximator in probability for any continuous permutation-equivariant functions. ChannelNet demonstrates that designing deep learning based massive MIMO detectors can be purely data-driven and free from the constraints posed by the conventional iterative frameworks as well as the channel and noise distribution models.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2311.06498 [pdf, other]

Semantic Communication for Cooperative Perception based on Importance Map

Authors: Yucheng Sheng, Hao Ye, Le Liang, Shi Jin, Geoffrey Ye Li

Abstract: Cooperative perception, which has a broader perception field than single-vehicle perception, has played an increasingly important role in autonomous driving to conduct 3D object detection. Through vehicle-to-vehicle (V2V) communication technology, various connected automated vehicles (CAVs) can share their sensory information (LiDAR point clouds) for cooperative perception. We employ an importance map to extract significant semantic information and propose a novel cooperative perception semantic communication scheme with intermediate fusion. Meanwhile, our proposed architecture can be extended to the challenging time-varying multipath fading channel. To alleviate the distortion caused by the time-varying multipath fading, we adopt explicit orthogonal frequency-division multiplexing (OFDM) blocks combined with channel estimation and channel equalization. Simulation results demonstrate that our proposed model outperforms the traditional separate source-channel coding over various channel models. Moreover, a robustness study indicates that only part of semantic information is key to cooperative perception. Although our proposed model has only been trained over one specific channel, it has the ability to learn robust coded representations of semantic information that remain resilient to various channel models, demonstrating its generality and robustness.

Submitted 11 November, 2023; originally announced November 2023.

Comments: 13 pages, 22 figures; journal; submitted for possible publication

arXiv:2309.16372 [pdf, other]

Aperture Diffraction for Compact Snapshot Spectral Imaging

Authors: Tao Lv, Hao Ye, Quan Yuan, Zhan Shi, Yibo Wang, Shuming Wang, Xun Cao

Abstract: We demonstrate a compact, cost-effective snapshot spectral imaging system named Aperture Diffraction Imaging Spectrometer (ADIS), which consists only of an imaging lens with an ultra-thin orthogonal aperture mask and a mosaic filter sensor, requiring no additional physical footprint compared to common RGB cameras. Then we introduce a new optical design in which each point in the object space is multiplexed to discrete encoding locations on the mosaic filter sensor by diffraction-based spatial-spectral projection engineering generated from the orthogonal mask. The orthogonal projection is uniformly accepted to obtain a weakly calibration-dependent data form to enhance modulation robustness. Meanwhile, the Cascade Shift-Shuffle Spectral Transformer (CSST) with strong perception of the diffraction degeneration is designed to solve a sparsity-constrained inverse problem, realizing the volume reconstruction from 2D measurements with a large amount of aliasing. Our system is evaluated by elaborating the imaging optical theory and reconstruction algorithm and by demonstrating the experimental imaging under a single exposure. Ultimately, we achieve the sub-super-pixel spatial resolution and high spectral resolution imaging. The code will be available at: https://github.com/Krito-ex/CSST.

Submitted 27 September, 2023; originally announced September 2023.

Comments: accepted by International Conference on Computer Vision (ICCV) 2023

arXiv:2308.10547 [pdf, other]

Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold

Authors: Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor W. Tsang, Yong Liu

Abstract: The conjugate gradient method is a crucial first-order optimization method that generally converges faster than the steepest descent method, and its computational cost is much lower than that of second-order methods. However, while various types of conjugate gradient methods have been studied in Euclidean spaces and on Riemannian manifolds, there is little study for those in distributed scenarios. This paper proposes a decentralized Riemannian conjugate gradient descent (DRCGD) method that aims at minimizing a global function over the Stiefel manifold. The optimization problem is distributed among a network of agents, where each agent is associated with a local function, and the communication between agents occurs over an undirected connected graph. Since the Stiefel manifold is a non-convex set, a global function is represented as a finite sum of possibly non-convex (but smooth) local functions. The proposed method is free from expensive Riemannian geometric operations such as retractions, exponential maps, and vector transports, thereby reducing the computational complexity required by each agent. To the best of our knowledge, DRCGD is the first decentralized Riemannian conjugate gradient algorithm to achieve global convergence over the Stiefel manifold.

Submitted 12 March, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

Journal ref: International Conference on Learning Representations, 2024

arXiv:2305.14781 [pdf, other]

Accelerated Nonconvex ADMM with Self-Adaptive Penalty for Rank-Constrained Model Identification

Authors: Qingyuan Liu, Zhengchao Huang, Hao Ye, Dexian Huang, Chao Shang

Abstract: The alternating direction method of multipliers (ADMM) has been widely adopted in low-rank approximation and low-order model identification tasks; however, the performance of nonconvex ADMM is highly reliant on the choice of penalty parameter. To accelerate ADMM for solving rank-constrained identification problems, this paper proposes a new self-adaptive strategy for automatic penalty update. Guided by first-order analysis of the increment of the augmented Lagrangian, the self-adaptive penalty updating enables effective and balanced minimization of both primal and dual residuals and thus ensures a stable convergence. Moreover, improved efficiency can be obtained within the Anderson acceleration scheme. Numerical examples show that the proposed strategy significantly accelerates the convergence of nonconvex ADMM while alleviating the critical reliance on tedious tuning of penalty parameters.

Submitted 8 September, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: 7 pages, 5 figures. Accepted by 62nd IEEE Conference on Decision and Control (CDC 2023)

arXiv:2305.10651 [pdf, other]

Accelerated MR Fingerprinting with Low-Rank and Generative Subspace Modeling

Authors: Hengfa Lu, Huihui Ye, Lawrence L. Wald, Bo Zhao

Abstract: Magnetic Resonance (MR) Fingerprinting is an emerging multi-parametric quantitative MR imaging technique, for which image reconstruction methods utilizing low-rank and subspace constraints have achieved state-of-the-art performance. However, this class of methods often suffers from an ill-conditioned model-fitting issue, which degrades the performance as the data acquisition lengths become short and/or the signal-to-noise ratio becomes low. To address this problem, we present a new image reconstruction method for MR Fingerprinting, integrating low-rank and subspace modeling with a deep generative prior. Specifically, the proposed method captures the strong spatiotemporal correlation of contrast-weighted time-series images in MR Fingerprinting via a low-rank factorization. Further, it utilizes an untrained convolutional generative neural network to represent the spatial subspace of the low-rank model, while estimating the temporal subspace of the model from simulated magnetization evolutions generated based on spin physics. Here the architecture of the generative neural network serves as an effective regularizer for the ill-conditioned inverse problem without additional spatial training data that are often expensive to acquire. The proposed formulation results in a non-convex optimization problem, for which we develop an algorithm based on variable splitting and alternating direction method of multipliers. We evaluate the performance of the proposed method with numerical simulations and in vivo experiments and demonstrate that the proposed method outperforms the state-of-the-art low-rank and subspace reconstruction.

Submitted 24 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

arXiv:2303.09780 [pdf, other]

doi 10.1016/j.isci.2024.109766

Mpox-AISM: AI-Mediated Super Monitoring for Mpox and Like-Mpox

Authors: Yubiao Yue, Minghua Jiang, Xinyue Zhang, Jialong Xu, Huacong Ye, Fan Zhang, Zhenzhang Li, Yang Li

Abstract: Swift and accurate diagnosis for earlier-stage monkeypox (mpox) patients is crucial to avoiding its spread. However, the similarities between common skin disorders and mpox and the need for professional diagnosis unavoidably impaired the diagnosis of earlier-stage mpox patients and contributed to mpox outbreak. To address the challenge, we proposed "Super Monitoring", a real-time visualization technique employing artificial intelligence (AI) and Internet technology to diagnose earlier-stage mpox cheaply, conveniently, and quickly. Concretely, AI-mediated "Super Monitoring" (mpox-AISM) integrates deep learning models, data augmentation, self-supervised learning, and cloud services. According to publicly accessible datasets, mpox-AISM's Precision, Recall, Specificity, and F1-score in diagnosing mpox reach 99.3%, 94.1%, 99.9%, and 96.6%, respectively, and it achieves 94.51% accuracy in diagnosing mpox, six like-mpox skin disorders, and normal skin. With the Internet and communication terminal, mpox-AISM has the potential to perform real-time and accurate diagnosis for earlier-stage mpox in real-world scenarios, thereby preventing mpox outbreak.

Submitted 15 June, 2024; v1 submitted 17 March, 2023; originally announced March 2023.

Journal ref: iScience, 27(5) 2024

arXiv:2303.02559 [pdf, other]

Securing Biomedical Images from Unauthorized Training with Anti-Learning Perturbation

Authors: Yixin Liu, Haohui Ye, Kai Zhang, Lichao Sun

Abstract: The volume of open-source biomedical data has been essential to the development of various spheres of the healthcare community since more `free' data can provide individual researchers more chances to contribute. However, institutions often hesitate to share their data with the public due to the risk of data exploitation by unauthorized third parties for another commercial usage (e.g., training AI models). This phenomenon might hinder the development of the whole healthcare research community. To address this concern, we propose a novel approach termed `unlearnable biomedical image' for protecting biomedical data by injecting imperceptible but delusive noises into the data, making them unexploitable for AI models. We formulate the problem as a bi-level optimization and propose three kinds of anti-learning perturbation generation approaches to solve the problem. Our method is an important step toward encouraging more institutions to contribute their data for the long-term development of the research community.

Submitted 4 March, 2023; originally announced March 2023.

Comments: This paper is accepted as a poster for NDSS 2023

arXiv:2210.13415 [pdf]

Deep Learning Approach for Dynamic Sampling for Multichannel Mass Spectrometry Imaging

Authors: David Helminiak, Hang Hu, Julia Laskin, Dong Hye Ye

Abstract: Mass Spectrometry Imaging (MSI), using traditional rectilinear scanning, takes hours to days for high spatial resolution acquisitions. Given that most pixels within a sample's field of view are often neither relevant to underlying biological structures nor chemically informative, MSI presents as a prime candidate for integration with sparse and dynamic sampling algorithms. During a scan, stochastic models determine which locations probabilistically contain information critical to the generation of low-error reconstructions. Decreasing the number of required physical measurements thereby minimizes overall acquisition times. A Deep Learning Approach for Dynamic Sampling (DLADS), utilizing a Convolutional Neural Network (CNN) and encapsulating molecular mass intensity distributions within a third dimension, demonstrates a simulated 70% throughput improvement for Nanospray Desorption Electrospray Ionization (nano-DESI) MSI tissues. Evaluations are conducted between DLADS and a Supervised Learning Approach for Dynamic Sampling, with Least-Squares regression (SLADS-LS) and a Multi-Layer Perceptron (MLP) network (SLADS-Net). When compared with SLADS-LS, limited to a single m/z channel, as well as multichannel SLADS-LS and SLADS-Net, DLADS respectively improves regression performance by 36.7%, 7.0%, and 6.2%, resulting in gains to reconstruction quality of 6.0%, 2.1%, and 3.4% for acquisition of targeted m/z.

Submitted 24 October, 2022; originally announced October 2022.

arXiv:2210.12715 [pdf, ps, other]

Adaptive Control with Global Exponential Stability for Parameter-Varying Nonlinear Systems under Unknown Control Gains

Authors: Hefu Ye, Haijia Wu, Kai Zhao, Yongduan Song

Abstract: It is nontrivial to achieve exponential stability even for time-invariant nonlinear systems with matched uncertainties and persistent excitation (PE) condition. In this paper, without the need for PE condition, we address the problem of global exponential stabilization of strict-feedback systems with mismatched uncertainties and unknown yet time-varying control gains. The resultant control, embedded with time-varying feedback gains, is capable of ensuring global exponential stability of parametric-strict-feedback systems in the absence of persistence of excitation. By using the enhanced Nussbaum function, the previous results are extended to more general nonlinear systems where the sign and magnitude of the time-varying control gain are unknown. In particular, the argument of the Nussbaum function is guaranteed to be always positive with the aid of nonlinear damping design, which is critical to perform a straightforward technical analysis of the boundedness of the Nussbaum function. Finally, the global exponential stability of parameter-varying strict-feedback systems, the boundedness of the control input and the update rate, and the asymptotic constancy of the parameter estimate are established. Numerical simulations are carried out to verify the effectiveness and benefits of the proposed methods.

Submitted 23 October, 2022; originally announced October 2022.

arXiv:2210.12712 [pdf, ps, other]

Prescribed-Time Control and Its Latest Developments

Authors: Hefu Ye, Yongduan Song, Frank L. Lewis

Abstract: Prescribed-time (PT) control, originated from \textit{Song et al.}, has gained increasing attention in the control community. The salient feature of PT control lies in its ability to achieve system stability within a finite settling time user-assignable in advance irrespective of initial conditions. It is such a unique feature that has enticed many follow-up studies on this technically important area, motivating numerous research advancements. In this article, we provide a comprehensive survey on the recent developments in PT control. Through a concise introduction to the concept of PT control, and a unique taxonomy covering: 1) from robust PT control to adaptive PT control; 2) from PT control for single-input-single-output (SISO) systems to multi-input-multi-output (MIMO) systems; and 3) from PT control for single systems to multi-agent systems, we present an accessible review of this interesting topic. We highlight key techniques, fundamental assumptions adopted in various developments as well as some new design ideas. We also discuss several possible future research directions towards PT control.

Submitted 23 October, 2022; originally announced October 2022.

arXiv:2210.12706 [pdf, ps, other]

Robust Adaptive Prescribed-Time Control for Parameter-Varying Nonlinear Systems

Authors: Hefu Ye, Yongduan Song

Abstract: It is an interesting open problem to achieve adaptive prescribed-time control for strict-feedback systems with unknown and fast or even abrupt time-varying parameters. In this paper we present a solution with the aid of several design and analysis innovations. First, by using a spatiotemporal transformation, we convert the original system operational over finite time interval into one operational over infinite time interval, allowing for Lyapunov asymptotic design and recasting prescribed-time stabilization on finite time domain into asymptotic stabilization on infinite time domain. Second, to deal with time-varying parameters with unknown variation boundaries, we use congelation of variables method and establish three separate adaptive laws for parameter estimation (two for the unknown parameters in the feedback path and one for the unknown parameter in the input path), in doing so we utilize two tuning functions to eliminate over-parametrization. Third, to achieve asymptotic convergence for the transformed system, we make use of nonlinear damping design and non-regressor-based design to cope with time-varying perturbations, and finally, we derive the prescribed-time control scheme from the asymptotic controller via inverse temporal-scale transformation. The boundedness of all closed-loop signals and control input is proved rigorously through Lyapunov analysis, squeeze theorem, and two novel lemmas built upon the method of variation of constants. Numerical simulation verifies the effectiveness of the proposed method.

Submitted 23 October, 2022; originally announced October 2022.

arXiv:2208.04017 [pdf, other]

Stain-Adaptive Self-Supervised Learning for Histopathology Image Analysis

Authors: Hai-Li Ye, Da-Han Wang

Abstract: It is commonly recognized that color variations caused by differences in stains are a critical issue for histopathology image analysis. Existing methods adopt color matching, stain separation, stain transfer or the combination of them to alleviate the stain variation problem. In this paper, we propose a novel Stain-Adaptive Self-Supervised Learning (SASSL) method for histopathology image analysis. Our SASSL integrates a domain-adversarial training module into the SSL framework to learn distinctive features that are robust to both various transformations and stain variations. The proposed SASSL is regarded as a general method for domain-invariant feature extraction which can be flexibly combined with arbitrary downstream histopathology image analysis modules (e.g. nuclei/tissue segmentation) by fine-tuning the features for specific downstream tasks. We conducted experiments on publicly available pathological image analysis datasets including the PANDA, BreastPathQ, and CAMELYON16 datasets, achieving the state-of-the-art performance. Experimental results demonstrate that the proposed method can robustly improve the feature extraction ability of the model, and achieve stable performance improvement in downstream tasks.

Submitted 8 August, 2022; originally announced August 2022.

Comments: 16 pages, 8 figures, 7 tables, 10 equations

arXiv:2203.06824 [pdf, other]

Low-dose CT reconstruction by self-supervised learning in the projection domain

Authors: Long Zhou, Xiaozhuang Wang, Min Hou, ** Li, Chunlong Fu, Yanjun Ren, Tingting Shao, Xi Hu, Jihong Sun, Hongwei Ye

Abstract: With the aim of minimizing excessive X-ray radiation administration to patients, low-dose computed tomography (LDCT) has become a distinct trend in radiology. However, while lowering the radiation dose reduces the risk to the patient, it also increases noise and artifacts, compromising image quality and clinical diagnosis. In most supervised learning methods, paired CT images are required, but such images are unlikely to be available in the clinic. We present a self-supervised learning model (Noise2Projection) that fully exploits the raw projection images to reduce noise and improve the quality of reconstructed LDCT images. Unlike existing self-supervised algorithms, the proposed method only requires noisy CT projection images and reduces noise by exploiting the correlation between nearby projection images. We trained and tested the model using clinical data, and the quantitative and qualitative results suggest that our model can effectively reduce LDCT image noise while also drastically removing artifacts in LDCT images.

Submitted 13 March, 2022; originally announced March 2022.

arXiv:2202.06320 [pdf, ps, other]

Adaptive Control with Guaranteed Transient Behavior and Zero Steady-State Error for Systems with Time-Varying Parameters

Authors: Hefu Ye, Yongduan Song

Abstract: It is nontrivial to achieve global zero-error regulation for uncertain nonlinear systems. The underlying problem becomes even more challenging if mismatched uncertainties and unknown time-varying control gain are involved, yet certain performance specifications are also pursued. In this work, we present an adaptive control method, which, without the persistent excitation (PE) condition, is able to ensure global zero-error regulation with guaranteed output performance for parametric strict-feedback systems involving fast time-varying parameters in the feedback path and input path. The development of our control scheme benefits from generalized t-dependent and x-dependent functions, a novel coordinate transformation and the "congelation of variables" method. Both theoretical analysis and numerical simulation verify the effectiveness and benefits of the proposed method.

Submitted 13 February, 2022; originally announced February 2022.

Comments: 9 pages, 6 figures

arXiv:2201.02940 [pdf, ps, other]

Backstepping Design Embedded With Time-Varying Command Filters

Authors: Hefu Ye, Yongduan Song

Abstract: If properly embedded with a command filter, the implementation of backstepping design could be dramatically simplified. In this paper, we introduce a command filter with time-varying gain and integrate it with backstepping design, resulting in a new set of backstepping control algorithms with low complexity even for high-order strict-feedback systems. Furthermore, with the aid of a "softening" sign function based compensator, zero-error output tracking is ensured while at the same time maintaining prescribed transient performance. Numerical simulation is carried out to verify the effectiveness and benefits of the proposed method.

Submitted 9 January, 2022; originally announced January 2022.

arXiv:2201.02939 [pdf, ps, other]

Prescribed-time Control for Linear Systems in Canonical Form Via Nonlinear Feedback

Authors: Hefu Ye, Yongduan Song

Abstract: For systems in canonical form with nonvanishing uncertainties/disturbances, this work presents an approach to full state regulation within prescribed time irrespective of initial conditions. By introducing a smooth hyperbolic-tangent-like function, a nonlinear and time-varying state feedback control scheme is constructed, which is further extended to address output feedback based prescribed-time regulation by invoking the prescribed-time observer, all of which are applicable over the entire operational time zone. As an alternative to full state regulation within a user-assignable time interval, the proposed method analytically bridges the divide between linear and nonlinear feedback based prescribed-time control, and is able to achieve asymptotic stability, exponential stability and prescribed-time stability with a unified control structure.

Submitted 9 January, 2022; originally announced January 2022.

arXiv:2112.03815 [pdf]

Accurate parameter estimation using scan-specific unsupervised deep learning for relaxometry and MR fingerprinting

Authors: Mengze Gao, Huihui Ye, Tae Hyung Kim, Zi**g Zhang, Seohee So, Berkin Bilgic

Abstract: We propose an unsupervised convolutional neural network (CNN) for relaxation parameter estimation. This network incorporates signal relaxation and Bloch simulations while taking advantage of residual learning and spatial relations across neighboring voxels. Quantification accuracy and robustness to noise are shown to be significantly improved compared to standard parameter estimation methods in numerical simulations and in vivo data for multi-echo T2 and T2* mapping. The combination of the proposed network with subspace modeling and MR fingerprinting (MRF) from highly undersampled data permits high quality T1 and T2 mapping.

Submitted 12 December, 2021; v1 submitted 7 December, 2021; originally announced December 2021.

Comments: 7 pages, 5 figures, submitted to International Society for Magnetic Resonance in Medicine 2022

arXiv:2110.15568 [pdf, other]

Unsupervised PET Reconstruction from a Bayesian Perspective

Authors: Chenyu Shen, Wenjun Xia, Hongwei Ye, Mingzheng Hou, Hu Chen, Yan Liu, Jiliu Zhou, Yi Zhang

Abstract: Positron emission tomography (PET) reconstruction has become an ill-posed inverse problem due to low-count projection data, and a robust algorithm is urgently required to improve imaging quality. Recently, the deep image prior (DIP) has drawn much attention and has been successfully applied in several image restoration tasks, such as denoising and inpainting, since it does not need any labels (reference image). However, overfitting is a vital defect of this framework. Hence, many methods have been proposed to mitigate this problem, and DeepRED is a typical representative that combines DIP and regularization by denoising (RED). In this article, we leverage DeepRED from a Bayesian perspective to reconstruct PET images from a single corrupted sinogram without any supervised or auxiliary information. In contrast to the conventional denoisers customarily used in RED, a DnCNN-like denoiser, which can add an adaptive constraint to DIP and facilitate the computation of derivation, is employed. Moreover, to further enhance the regularization, Gaussian noise is injected into the gradient updates, deriving a Markov chain Monte Carlo (MCMC) sampler. Experimental studies on brain and whole-body datasets demonstrate that our proposed method can achieve better performance in terms of qualitative and quantitative results compared to several classic and state-of-the-art methods.

Submitted 29 October, 2021; originally announced October 2021.

arXiv:2110.10965 [pdf, other]

2020 CATARACTS Semantic Segmentation Challenge

Authors: Imanol Luengo, Maria Grammatikopoulou, Rahim Mohammadi, Chris Walsh, Chinedu Innocent Nwoye, Deepak Alapatt, Nicolas Padoy, Zhen-Liang Ni, Chen-Chen Fan, Gui-Bin Bian, Zeng-Guang Hou, Heon** Ha, Jiacheng Wang, Haojie Wang, Dong Guo, Lu Wang, Guotai Wang, Mobarakol Islam, Bharat Giddwani, Ren Hongliang, Theodoros Pissas, Claudio Ravasio, Martin Huber, Jeremy Birch, Joan M. Nunez Do Rio, et al. (15 additional authors not shown)

Abstract: Surgical scene segmentation is essential for anatomy and instrument localization which can be further used to assess tissue-instrument interactions during a surgical procedure. In 2017, the Challenge on Automatic Tool Annotation for cataRACT Surgery (CATARACTS) released 50 cataract surgery videos accompanied by instrument usage annotations. These annotations included frame-level instrument presence information. In 2020, we released pixel-wise semantic annotations for anatomy and instruments for 4670 images sampled from 25 videos of the CATARACTS training set. The 2020 CATARACTS Semantic Segmentation Challenge, which was a sub-challenge of the 2020 MICCAI Endoscopic Vision (EndoVis) Challenge, presented three sub-tasks to assess participating solutions on anatomical structure and instrument segmentation. Their performance was assessed on a hidden test set of 531 images from 10 videos of the CATARACTS test set.

Submitted 24 February, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

arXiv:2108.12587 [pdf]

BUDA-SAGE with self-supervised denoising enables fast, distortion-free, high-resolution T2, T2*, para- and dia-magnetic susceptibility mapping

Authors: Zi**g Zhang, Long Wang, Jae** Cho, Congyu Liao, Hyeong-Geol Shin, Xiaozhi Cao, Jongho Lee, **min Xu, Tao Zhang, Huihui Ye, Kawin Setsompop, Huafeng Liu, Berkin Bilgic

Abstract: To rapidly obtain high resolution T2, T2* and quantitative susceptibility mapping (QSM) source separation maps with whole-brain coverage and high geometric fidelity. We propose Blip Up-Down Acquisition for Spin And Gradient Echo imaging (BUDA-SAGE), an efficient echo-planar imaging (EPI) sequence for quantitative mapping. The acquisition includes multiple T2*-, T2'- and T2-weighted contrasts. We alternate the phase-encoding polarities across the interleaved shots in this multi-shot navigator-free acquisition. A field map estimated from interim reconstructions was incorporated into the joint multi-shot EPI reconstruction with a structured low rank constraint to eliminate geometric distortion. A self-supervised MR-Self2Self (MR-S2S) neural network (NN) was utilized to perform denoising after BUDA reconstruction to boost SNR. Employing Slider encoding allowed us to reach 1 mm isotropic resolution by performing super-resolution reconstruction on BUDA-SAGE volumes acquired with 2 mm slice thickness. Quantitative T2 and T2* maps were obtained using Bloch dictionary matching on the reconstructed echoes. QSM was estimated using nonlinear dipole inversion (NDI) on the gradient echoes. Starting from the estimated R2 and R2* maps, R2' information was derived and used in source separation QSM reconstruction, which provided additional para- and dia-magnetic susceptibility maps.
In vivo results demonstrate the ability of BUDA-SAGE to provide whole-brain, distortion-free, high-resolution multi-contrast images and quantitative T2 and T2* maps, as well as yielding para- and dia-magnetic susceptibility maps. Derived quantitative maps showed comparable values to conventional mapping methods in phantom and in vivo measurements. BUDA-SAGE acquisition with self-supervised denoising and Slider encoding enabled rapid, distortion-free, whole-brain T2, T2* mapping at 1 mm^3 isotropic resolution in 90 seconds.

Submitted 9 September, 2021; v1 submitted 28 August, 2021; originally announced August 2021.

arXiv:2107.11650 [pdf, other]

Accelerated MRI Reconstruction with Separable and Enhanced Low-Rank Hankel Regularization

Authors: Xinlin Zhang, Hengfa Lu, Di Guo, Zongying Lai, Huihui Ye, Xi Peng, Bo Zhao, Xiaobo Qu

Abstract: The combination of sparse sampling and low-rank structured matrix reconstruction has shown promising performance, enabling a significant reduction of the magnetic resonance imaging data acquisition time. However, the low-rank structured approaches demand considerable memory consumption and are time-consuming due to a noticeable number of matrix operations performed on the huge-size block Hankel-like matrix. In this work, we propose a novel framework that utilizes the low-rank property while achieving faster reconstructions and promising results. The framework allows us to enforce the low-rankness of Hankel matrices constructed from 1D vectors instead of 2D matrices from 1D vectors and thus avoid the construction of a huge block Hankel matrix for 2D k-space matrices. Moreover, under this framework, we can easily incorporate other information, such as the smooth phase of the image and the low-rankness in the parameter dimension, to further improve the image quality. We built and validated two models for parallel and parameter magnetic resonance imaging experiments, respectively. Our retrospective in-vivo results indicate that the proposed approaches enable faster reconstructions than the state-of-the-art approaches, e.g., about 8x faster than STDLRSPIRiT, and faithful removal of undersampling artifacts.

Submitted 24 July, 2021; originally announced July 2021.

Comments: 17 pages, 17 figures

arXiv:2104.09798 [pdf, other]

CoDR: Computation and Data Reuse Aware CNN Accelerator

Authors: Alireza Khadem, Haojie Ye, Trevor Mudge

Abstract: Computation and Data Reuse is critical for resource-limited Convolutional Neural Network (CNN) accelerators. This paper presents Universal Computation Reuse to exploit weight sparsity, repetition, and similarity simultaneously in a convolutional layer. Moreover, CoDR decreases the cost of weight memory access with a customized Run-Length Encoding scheme, and reduces the number of memory accesses to the intermediate results by introducing an input and output stationary dataflow. Compared to two recent compressed CNN accelerators with the same area of 2.85 mm^2, CoDR decreases SRAM access by 5.08x and 7.99x, and consumes 3.76x and 6.84x less energy.

Submitted 20 April, 2021; originally announced April 2021.
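
The CoDR abstract above mentions a customized Run-Length Encoding scheme for weight storage but does not describe it. As a rough illustration of the general idea only, the following is a minimal sketch of plain run-length encoding over a sparse, repetitive weight sequence. The function names and the sample weights are hypothetical, and this is not the paper's customized scheme.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse each run of equal values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into the original sequence."""
    decoded: List[int] = []
    for value, length in pairs:
        decoded.extend([value] * length)
    return decoded


# Hypothetical quantized weights with zero runs (sparsity) and repeats.
weights = [0, 0, 0, 5, 5, 0, 0, 3]
encoded = rle_encode(weights)
restored = rle_decode(encoded)
```

Long runs of zeros or repeated values shrink to a single pair each, which is why run-length style encodings suit sparse and repetitive weights; a hardware-oriented variant would additionally fix the bit widths of the value and length fields.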

arXiv:2011.04994 [pdf, other]

AIM 2020 Challenge on Learned Image Signal Processing Pipeline

Authors: Andrey Ignatov, Radu Timofte, Zhilu Zhang, Ming Liu, Haolin Wang, Wangmeng Zuo, Jiawei Zhang, Ruimao Zhang, Zhanglin Peng, Sijie Ren, Linhui Dai, Xiaohong Liu, Chengqi Li, Jun Chen, Yuichi Ito, Bhavya Vasudeva, Puneesh Deora, Umapada Pal, Zhenyu Guo, Yu Zhu, Tian Liang, Chenghua Li, Cong Leng, Zhihong Pan, Baopu Li, et al. (14 additional authors not shown)

Abstract: This paper reviews the second AIM learned ISP challenge and provides the description of the proposed solutions and results. The participating teams were solving a real-world RAW-to-RGB mapping problem, where the goal was to map the original low-quality RAW images captured by the Huawei P20 device to the same photos obtained with the Canon 5D DSLR camera. The considered task embraced a number of complex computer vision subtasks, such as image demosaicing, denoising, white balancing, color and contrast correction, demoireing, etc. The target metric used in this challenge combined fidelity scores (PSNR and SSIM) with solutions' perceptual results measured in a user study. The proposed solutions significantly improved the baseline results, defining the state-of-the-art for practical image signal processing pipeline modeling.

Submitted 10 November, 2020; originally announced November 2020.

Comments: Published in ECCV 2020 Workshops (Advances in Image Manipulation), https://data.vision.ee.ethz.ch/cvl/aim20/

arXiv:2011.02679 [pdf, ps, other]

A Multi-resolution Model for Histopathology Image Classification and Localization with Multiple Instance Learning

Authors: Jiayun Li, Wenyuan Li, Anthony Sisk, Huihui Ye, W. Dean Wallace, William Speier, Corey W. Arnold

Abstract: Histopathological images provide rich information for disease diagnosis. Large numbers of histopathological images have been digitized into high resolution whole slide images, opening opportunities in developing computational image analysis tools to reduce pathologists' workload and potentially improve inter- and intra-observer agreement. Most previous work on whole slide image analysis has focused on classification or segmentation of small pre-selected regions-of-interest, which requires fine-grained annotation and is non-trivial to extend for large-scale whole slide analysis. In this paper, we propose a multi-resolution multiple instance learning model that leverages saliency maps to detect suspicious regions for fine-grained grade prediction. Instead of relying on expensive region- or pixel-level annotations, our model can be trained end-to-end with only slide-level labels. The model is developed on a large-scale prostate biopsy dataset containing 20,229 slides from 830 patients. The model achieved 92.7% accuracy, 81.8% Cohen's Kappa for benign, low grade (i.e. Grade group 1) and high grade (i.e. Grade group >= 2) prediction, an area under the receiver operating characteristic curve (AUROC) of 98.2% and an average precision (AP) of 97.4% for differentiating malignant and benign slides. The model obtained an AUROC of 99.4% and an AP of 99.8% for cancer detection on an external dataset.

Submitted 5 November, 2020; originally announced November 2020.

Comments: 9 pages, 6 figures

arXiv:2008.02993 [pdf, ps, other]

Joint Uplink-and-Downlink Optimization of 3D UAV Swarm Deployment for Wireless-Powered NB-IoT Networks

Authors: Han-Ting Ye, Xin Kang, **gon Joung, Ying-Chang Liang

Abstract: This paper investigates a full-duplex orthogonal-frequency-division multiple access (OFDMA) based multiple unmanned aerial vehicles (UAVs)-enabled wireless-powered Internet-of-Things (IoT) network. In this paper, a swarm of UAVs is first deployed in three dimensions (3D) to simultaneously charge all devices, i.e., a downlink (DL) charging period, and then flies to new locations within this area to collect information from scheduled devices in several epochs via OFDMA due to the potentially limited number of channels available in Narrow Band IoT, i.e., an uplink (UL) communication period. To maximize the UL throughput of IoT devices, we jointly optimize the UL-and-DL 3D deployment of the UAV swarm, including the device-UAV association, the scheduling order, and the UL-DL time allocation. In particular, the DL energy harvesting (EH) threshold of devices and the UL signal decoding threshold of UAVs are taken into consideration when studying the problem. Besides, both line-of-sight (LoS) and non-line-of-sight (NLoS) channel models are studied depending on the position of sensors and UAVs. The influence of the potentially limited channels issue in NB-IoT is also considered by studying the IoT scheduling policy. Two scheduling policies, a near-first (NF) policy and a far-first (FF) policy, are studied. It is shown that the NF scheme outperforms the FF scheme in terms of sum throughput maximization, whereas the FF scheme outperforms the NF scheme in terms of system fairness.

Submitted 7 August, 2020; originally announced August 2020.
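
The abstract above names a near-first (NF) and a far-first (FF) scheduling policy without giving their definitions. The following is a minimal sketch under the assumption that each policy simply orders devices by their 3D Euclidean distance to a single UAV. The function name, device identifiers and coordinates are hypothetical, and the paper's joint optimization of association, scheduling order and time allocation is not modeled here.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def schedule(devices: Dict[str, Point], uav: Point, policy: str = "NF") -> List[str]:
    """Return device ids ordered by 3D distance to the UAV.

    Assumed semantics: "NF" serves the nearest device first,
    "FF" serves the farthest device first.
    """
    if policy not in ("NF", "FF"):
        raise ValueError("policy must be 'NF' or 'FF'")
    return sorted(
        devices,
        key=lambda dev_id: math.dist(devices[dev_id], uav),
        reverse=(policy == "FF"),
    )


# Hypothetical ground devices and a UAV hovering 10 m above the origin.
devices = {"a": (0.0, 0.0, 0.0), "b": (30.0, 40.0, 0.0), "c": (3.0, 4.0, 0.0)}
uav = (0.0, 0.0, 10.0)
nf_order = schedule(devices, uav, "NF")
ff_order = schedule(devices, uav, "FF")
```

Under this assumed reading, NF favors devices with shorter links, which is consistent with the reported throughput advantage, while FF gives distant devices earlier access, which is consistent with the reported fairness advantage.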

arXiv:2005.05265 [pdf, ps, other]

Federated Learning and Wireless Communications

Authors: Zhi** Qin, Geoffrey Ye Li, Hao Ye

Abstract: Federated learning is becoming increasingly attractive in the areas of wireless communications and machine learning due to its powerful functions and potential applications. In contrast to other machine learning tools that require no communication resources, federated learning exploits communications between the central server and the distributed local clients to train and optimize a machine learning model. Therefore, how to efficiently assign limited communication resources to train a federated learning model becomes critical to performance optimization. On the other hand, federated learning, as a brand new tool, can potentially enhance the intelligence of wireless networks. In this article, we provide a comprehensive overview of the relationship between federated learning and wireless communications, including the basic principle of federated learning, efficient communications for training a federated learning model, and federated learning for intelligent wireless applications. We also identify some future research challenges and directions at the end of this article.

Submitted 12 May, 2020; v1 submitted 11 May, 2020; originally announced May 2020.

arXiv:2004.07576 [pdf, other]

Deep Learning based Denoise Network for CSI Feedback in FDD Massive MIMO Systems

Authors: Hongyuan Ye, Feifei Gao, **g Qian, Hao Wang, Geoffrey Ye Li

Abstract: Channel state information (CSI) feedback is critical for frequency division duplex (FDD) massive multi-input multi-output (MIMO) systems. Most conventional algorithms are based on compressive sensing (CS) and are highly dependent on the level of channel sparsity. To address the issue, a recent approach adopts deep learning (DL) to compress CSI into a codeword with low dimensionality, which has shown much better performance than the CS algorithms when the feedback link is perfect. In practical scenarios, however, there exist various interference and non-linear effects. In this article, we design a DL-based denoise network, called DNNet, to improve the performance of channel feedback. Numerical results show that the DL-based feedback algorithm with the proposed DNNet has superior performance over the existing algorithms, especially at low signal-to-noise ratio (SNR).

Submitted 16 April, 2020; originally announced April 2020.

arXiv:2004.06689 [pdf]

Weakly Supervised Deep Learning for COVID-19 Infection Detection and Classification from CT Images

Authors: Shao** Hu, Yuan Gao, Zhangming Niu, Yinghui Jiang, Lao Li, Xianglu Xiao, Minhao Wang, Evandro Fei Fang, Wade Menpes-Smith, Jun Xia, Hui Ye, Guang Yang

Abstract: An outbreak of a novel coronavirus disease (i.e., COVID-19) has been recorded in Wuhan, China since late December 2019, which subsequently became pandemic around the world. Although COVID-19 is an acutely treated disease, it can also be fatal with a risk of fatality of 4.03% in China and the highest of 13.04% in Algeria and 12.67% in Italy (as of 8th April 2020). The onset of serious illness may result in death as a consequence of substantial alveolar damage and progressive respiratory failure. Although laboratory testing, e.g., using reverse transcription polymerase chain reaction (RT-PCR), is the gold standard for clinical diagnosis, the tests may produce false negatives. Moreover, under the pandemic situation, shortage of RT-PCR testing resources may also delay the following clinical decision and treatment. Under such circumstances, chest CT imaging has become a valuable tool for both diagnosis and prognosis of COVID-19 patients. In this study, we propose a weakly supervised deep learning strategy for detecting and classifying COVID-19 infection from CT images. The proposed method can minimise the requirements of manual labelling of CT images but still be able to obtain accurate infection detection and distinguish COVID-19 from non-COVID-19 cases. Based on the promising results obtained qualitatively and quantitatively, we can envisage a wide deployment of our developed technique in large-scale clinical studies.

Submitted 14 April, 2020; originally announced April 2020.

Comments: 21 pages, 7 figures

arXiv:2002.00011 [pdf, other]

Age-Conditioned Synthesis of Pediatric Computed Tomography with Auxiliary Classifier Generative Adversarial Networks

Authors: Chi Nok Enoch Kan, Najibakram Maheenaboobacker, Dong Hye Ye

Abstract: Deep learning is a popular and powerful tool in computed tomography (CT) image processing such as organ segmentation, but its requirement of large training datasets remains a challenge. Even though there is a large anatomical variability for children during their growth, the training datasets for pediatric CT scans are especially hard to obtain due to risks of radiation to children. In this paper, we propose a method to conditionally synthesize realistic pediatric CT images using a new auxiliary classifier generative adversarial network (ACGAN) architecture by taking age information into account. The proposed network generated age-conditioned high-resolution CT images to enrich pediatric training datasets.

Submitted 31 January, 2020; originally announced February 2020.

Comments: Accepted for publication at IEEE International Symposium on Biomedical Imaging (ISBI) 2020

Journal ref: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI 2020)

arXiv:1910.09748 [pdf]

doi 10.1167/tvst.9.2.29

Assessment of Generative Adversarial Networks Model for Synthetic Optical Coherence Tomography Images of Retinal Disorders

Authors: Ce Zheng, Xiaolin Xie, Kang Zhou, Bang Chen, Jili Chen, Haiyun Ye, Wen Li, Tong Qiao, Shenghua Gao, Jianlong Yang, Jiang Liu

Abstract: Purpose: To assess whether a generative adversarial network (GAN) could synthesize realistic optical coherence tomography (OCT) images that satisfactorily serve as the educational images for retinal specialists and the training datasets for the classification of various retinal disorders using deep learning (DL). Methods: The GANs architecture was adopted to synthesize high-resolution OCT images, training on a publicly available OCT dataset including urgent referrals (choroidal neovascularization and diabetic macular edema) and non-urgent referrals (normal and drusen). 400 real and synthetic OCT images were evaluated by 2 retinal specialists to assess image quality. We further trained 2 DL models on either real or synthetic datasets and compared the performance of urgent vs nonurgent referrals diagnosis tested on a local (1000 images from the public dataset) and clinical validation dataset (278 images from Shanghai Shibei Hospital). Results: The image quality of real vs synthetic OCT images was similar as assessed by 2 retinal specialists. The accuracy of discrimination as real vs synthetic OCT images was 59.50% for retinal specialist 1 and 53.67% for retinal specialist 2. For the local dataset, the DL model trained on real (DL_Model_R) and synthetic OCT images (DL_Model_S) had an area under the curve (AUC) of 0.99 and 0.98, respectively. For the clinical dataset, the AUC was 0.94 for DL_Model_R and 0.90 for DL_Model_S. Conclusions: The GAN-synthetic OCT images can be used by clinicians for educational purposes and developing DL algorithms.
Translational Relevance: The medical image synthesis based on GANs is promising in human and machine to fulfill clinical tasks.

Submitted 21 October, 2019; originally announced October 2019.

Comments: submitted

arXiv:1908.04685 [pdf, ps, other]

Learn to Compress CSI and Allocate Resources in Vehicular Networks

Authors: Liang Wang, Hao Ye, Le Liang, Geoffrey Ye Li

Abstract: Resource allocation has a direct and profound impact on the performance of vehicle-to-everything (V2X) networks. In this paper, we develop a hybrid architecture consisting of centralized decision making and distributed resource sharing (the C-Decision scheme) to maximize the long-term sum rate of all vehicles. To reduce the network signaling overhead, each vehicle uses a deep neural network to compress its observed information that is thereafter fed back to the centralized decision making unit. The centralized decision unit employs a deep Q-network to allocate resources and then sends the decision results to all vehicles. We further adopt a quantization layer for each vehicle that learns to quantize the continuous feedback. In addition, we devise a mechanism to balance the transmission of vehicle-to-vehicle (V2V) links and vehicle-to-infrastructure (V2I) links. To further facilitate distributed spectrum sharing, we also propose a distributed decision making and spectrum sharing architecture (the D-Decision scheme) for each V2V link. Through extensive simulation results, we demonstrate that the proposed C-Decision and D-Decision schemes can both achieve near-optimal performance and are robust to feedback interval variations, input noise, and feedback noise.

Submitted 11 August, 2019; originally announced August 2019.

Comments: arXiv admin note: text overlap with arXiv:1908.03447

arXiv:1908.03447 [pdf, ps, other]

Learn to Allocate Resources in Vehicular Networks

Authors: Liang Wang, Hao Ye, Le Liang, Geoffrey Ye Li

Abstract: Resource allocation has a direct and profound impact on the performance of vehicle-to-everything (V2X) networks. Considering the dynamic nature of vehicular environments, it is appealing to devise a decentralized strategy to perform effective resource sharing. In this paper, we exploit deep learning to promote coordination among multiple vehicles and propose a hybrid architecture consisting of centralized decision making and distributed resource sharing to maximize the long-term sum rate of all vehicles. To reduce the network signaling overhead, each vehicle uses a deep neural network to compress its own observed information that is thereafter fed back to the centralized decision-making unit, which employs a deep Q-network to allocate resources and then sends the decision results to all vehicles. We further adopt a quantization layer for each vehicle that learns to quantize the continuous feedback. Extensive simulation results demonstrate that the proposed hybrid architecture can achieve near-optimal performance. Meanwhile, there exists an optimal number of continuous feedback and binary feedback, respectively. Besides, this architecture is robust to different feedback intervals, input noise, and feedback noise.

Submitted 30 July, 2019; originally announced August 2019.

arXiv:1906.02939 [pdf, other]

Key Ingredients of Self-Driving Cars

Authors: Rui Fan, Jianhao Jiao, Haoyang Ye, Yang Yu, Ioannis Pitas, Ming Liu

Abstract: Over the past decade, many research articles have been published in the area of autonomous driving. However, most of them focus only on a specific technological area, such as visual environment perception, vehicle control, etc. Furthermore, due to fast advances in the self-driving car technology, such articles become obsolete very fast. In this paper, we give a brief but comprehensive overview on key ingredients of autonomous cars (ACs), including driving automation levels, AC sensors, AC software, open source datasets, industry leaders, AC applications and existing challenges.

Submitted 10 August, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

Comments: 5 pages, 2 figures, EUSIPCO 2019 Satellite Workshop: Signal Processing, Computer Vision and Deep Learning for Autonomous Systems

arXiv:1812.08367 [pdf, other]

2.5D Deep Learning for CT Image Reconstruction using a Multi-GPU implementation

Authors: Amirkoushyar Ziabari, Dong Hye Ye, Somesh Srivastava, Ken D. Sauer, Jean-Baptiste Thibault, Charles A. Bouman

Abstract: While Model Based Iterative Reconstruction (MBIR) of CT scans has been shown to have better image quality than Filtered Back Projection (FBP), its use has been limited by its high computational cost. More recently, deep convolutional neural networks (CNN) have shown great promise in both denoising and reconstruction applications. In this research, we propose a fast reconstruction algorithm, which we call Deep Learning MBIR (DL-MBIR), for approximating MBIR using a deep residual neural network. The DL-MBIR method is trained to produce reconstructions that approximate true MBIR images using a 16 layer residual convolutional neural network implemented on multiple GPUs using Google Tensorflow. In addition, we propose 2D, 2.5D and 3D variations on the DL-MBIR method and show that the 2.5D method achieves similar quality to the fully 3D method, but with reduced computational cost.

Submitted 20 December, 2018; originally announced December 2018.

Comments: IEEE Asilomar conference on signals systems and computers, 2018

arXiv:1812.08364 [pdf, other]

Model Based Iterative Reconstruction With Spatially Adaptive Sinogram Weights for Wide-Cone Cardiac CT

Authors: Amirkoushyar Ziabari, Dong Hye Ye, Lin Fu, Somesh Srivastava, Ken D. Sauer, Jean-Baptist Thibault, Charles A. Bouman

Abstract: With the recent introduction of CT scanners with large cone angles, wide coverage detectors now provide a desirable scanning platform for cardiac CT that allows whole heart imaging in a single rotation. On these scanners, while half-scan data is strictly sufficient to produce images with the best temporal resolution, acquiring a full 360 degree rotation worth of data is beneficial for wide-cone image reconstruction at negligible additional radiation dose. Applying the Model-Based Iterative Reconstruction (MBIR) algorithm to the heart has been shown to yield significant enhancement in image quality for cardiac CT. But imaging the heart in large cone angle geometry leads to apparently conflicting data usage considerations. On the one hand, in addition to using the fastest available scanner rotation speed, a minimal complete data set of 180 degrees plus the fan angle is typically used to minimize both cardiac and respiratory motion. On the other hand, a full 360 degree acquisition helps better handle the challenges of missing frequencies and incomplete projections associated with wide-cone half-scan data acquisition. In this paper, we develop a Spatially Adaptive sinogram Weights MBIR algorithm (SAW-MBIR) that is designed to achieve the benefits of both half and full-scan reconstructions in order to maximize temporal resolution over the heart region while providing stable results over the whole volume covered with the wide-area detector.
Spatially-adaptive sinogram weights applied to each projection measurement in SAW-MBIR are designed to selectively perform backprojection from the full and half-scan portion of the sinogram based on both projection angle and reconstructed voxel location. We demonstrate with experimental results of SAW-MBIR applied to whole-heart cardiac CT clinical data that overall temporal resolution matches half-scan while full volume image quality is on par with full-scan MBIR.

Submitted 20 December, 2018; originally announced December 2018.

Comments: The 5th international Conference on image formation in X-ray Computed Tomography (Proceedings of CT Meeting). Compared to original publication, we slightly modified figure 4 for better clarity

arXiv:1812.08067 [pdf]

doi 10.1002/mrm.27812

Ultrashort Echo Time Magnetic Resonance Fingerprinting (UTE-MRF) for Simultaneous Quantification of Long and Ultrashort T2 Tissues

Authors: Qing Li, Xiaozhi Cao, Huihui Ye, Congyu Liao, Hongjian He, Jianhui Zhong

Abstract: Purpose: To demonstrate an ultrashort echo time magnetic resonance fingerprinting (UTE-MRF) method that can simultaneously quantify tissue relaxometries for muscle and bone in musculoskeletal systems and tissue components in brain and therefore can synthesize pseudo-CT images. Methods: A FISP-MRF sequence with half pulse excitation and half spoke radial acquisition was designed to sample fast T2 decay signals. A sinusoidal echo time (TE) pattern was applied to enhance MRF sensitivity for tissues with short and ultrashort T2 values. The performance of UTE-MRF was evaluated via simulations, phantoms, and in vivo experiments. Results: A minimal TE of 0.05 ms was achieved in UTE-MRF. Simulations indicated that extension of TE sampling increased T2 quantification accuracy in cortical bone and tendon, and had little impact on long T2 muscle quantifications. For a rubber phantom, an average T1/T2 of 162/1.07 ms from UTE-MRF compared well with gold standard T2 of 190 ms from IR-UTE and T2* of 1.03 ms from UTE sequence. For a long T2 agarose phantom, the linear regression slope between UTE-MRF and gold standard was 1.07 (R² = 0.991) for T1 and 1.04 (R² = 0.994) for T2. In vivo experiments showed the detection of cortical bone and Achilles tendon, where the averaged T2 was respectively 1.0 ms and 15 ms. Scalp images were in good agreement with CT.
Conclusion: UTE-MRF with sinusoidal TE variations shows its capability to produce pseudo-CT images and simultaneously output T1, T2, proton density, and B0 maps for tissues with long T2 and short/ultrashort T2 in the brain and musculoskeletal system.

Submitted 27 March, 2019; v1 submitted 19 December, 2018; originally announced December 2018.

Comments: 32 pages, 12 figures, 1 table

Journal ref: Magnetic Resonance in Medicine (2019)

arXiv:1811.09508 [pdf, ps, other]

doi 10.1109/TAP.2019.2899850

Monopulse beam synthesis using a sparse single-layer of weights

Authors: Semin Kwak, Joohwan Chun, Sung Hyuck Ye

Abstract: A conventional monopulse radar system uses three beams: a sum beam, an elevation difference beam, and an azimuth difference beam, which require different layers of weights to synthesize each beam independently. Since the multi-layer structure increases hardware complexity, many simplified structures based on a single layer of weights have been suggested. In this work, we introduce a new technique for finding disjoint and fully covering sets of weight vectors, each of which constitutes a sparse subarray, forming a single beam. Our algorithm decomposes the original non-convex optimization problem for finding disjoint weight vectors into a sequence of convex problems. We demonstrate the convergence of the algorithm and show that the interleaved array structure is able to meet difficult beam constraints.

Submitted 23 November, 2018; originally announced November 2018.

arXiv:1811.04761 [pdf, other]

Self-Refining Deep Symmetry Enhanced Network for Rain Removal

Authors: Hong Liu, Hanrong Ye, Xia Li, Wei Shi, Mengyuan Liu, Qianru Sun

Abstract: Rain removal aims to remove the rain streaks on rain images. The state-of-the-art methods are mostly based on Convolutional Neural Network (CNN). However, as CNN is not equivariant to object rotation, these methods are unsuitable for dealing with the tilted rain streaks. To tackle this problem, we propose Deep Symmetry Enhanced Network (DSEN) that is able to explicitly extract the rotation equivariant features from rain images. In addition, we design a self-refining mechanism to remove the accumulated rain streaks in a coarse-to-fine manner. This mechanism reuses DSEN with a novel information link which passes the gradient flow to the higher stages. Extensive experiments on both synthetic and real-world rain images show that our self-refining DSEN yields the top performance.

Submitted 6 September, 2020; v1 submitted 5 November, 2018; originally announced November 2018.

Comments: Accepted by ICIP 19. Corresponding author: Hanrong Ye

arXiv:1808.03735 [pdf, other]

Video Logo Retrieval based on local Features

Authors: Bochen Guan, Hanrong Ye, Hong Liu, William A. Sethares

Abstract: Estimation of the frequency and duration of logos in videos is important and challenging in the advertisement industry as a way of estimating the impact of ad purchases. Since logos occupy only a small area in the videos, the popular methods of image retrieval could fail. This paper develops an algorithm called Video Logo Retrieval (VLR), which is an image-to-video retrieval algorithm based on the spatial distribution of local image descriptors that measure the distance between the query image (the logo) and a collection of video images. VLR uses local features to overcome the weakness of global feature-based models such as convolutional neural networks (CNN). Meanwhile, VLR is flexible and does not require training after setting some hyper-parameters. The performance of VLR is evaluated on two challenging open benchmark tasks (SoccerNet and Stanford I2V), and compared with other state-of-the-art logo retrieval or detection algorithms. Overall, VLR shows significantly higher accuracy compared with the existing methods.

Submitted 18 May, 2020; v1 submitted 10 August, 2018; originally announced August 2018.

Comments: Accepted by ICIP 20. Contact author: Bochen Guan ([email protected])

arXiv:1807.02370 [pdf, other]

Deep Back Projection for Sparse-View CT Reconstruction

Authors: Dong Hye Ye, Gregery T. Buzzard, Max Ruby, Charles A. Bouman

Abstract: Filtered back projection (FBP) is a classical method for image reconstruction from sinogram CT data. FBP is computationally efficient but produces lower quality reconstructions than more sophisticated iterative methods, particularly when the number of views is lower than the number required by the Nyquist rate. In this paper, we use a deep convolutional neural network (CNN) to produce high-quality reconstructions directly from sinogram data. A primary novelty of our approach is that we first back project each view separately to form a stack of back projections and then feed this stack as input into the convolutional neural network. These single-view back projections convert the encoding of sinogram data into the appropriate spatial location, which can then be leveraged by the spatial invariance of the CNN to learn the reconstruction effectively. We demonstrate the benefit of our CNN based back projection on simulated sparse-view CT data over classical FBP.

Submitted 6 July, 2018; originally announced July 2018.

Comments: GlobalSIP 2018

arXiv:1802.00631 [pdf, other]

Satellite Image Scene Classification via ConvNet with Context Aggregation

Authors: Zhao Zhou, Yingbin Zheng, Hao Ye, Jian Pu, Gufei Sun

Abstract: Scene classification is a fundamental problem in understanding high-resolution remote sensing imagery. Recently, convolutional neural networks (ConvNets) have achieved remarkable performance in different tasks, and significant efforts have been made to develop various representations for satellite image scene classification. In this paper, we present a novel representation based on a ConvNet with context aggregation. The proposed two-pathway ResNet (ResNet-TP) architecture adopts the ResNet as backbone, and the two pathways allow the network to model both local details and regional context. The ResNet-TP based representation is generated by global average pooling on the last convolutional layers from both pathways. Experiments on two scene classification datasets, UCM Land Use and NWPU-RESISC45, show that the proposed mechanism achieves promising improvements over state-of-the-art methods.

Submitted 12 July, 2018; v1 submitted 2 February, 2018; originally announced February 2018.

Journal ref: Pacific-Rim Conference on Multimedia (PCM), 2018

arXiv:1704.08034 [pdf]

Optimal Decentralized Economical-sharing Scheme in Islanded AC Microgrids with Cascaded Inverters

Authors: Lang Li, Huawen Ye, Yao Sun, Zhangjie Liu, Hua Han, Mei Su, Josep M. Guerrero

Abstract: To address the economical dispatch problem without communications in islanded AC microgrids consisting of cascaded inverters, this paper proposes an optimal decentralized economical-sharing scheme. In the proposed scheme, an optimal sharing function of the current is applied to generate the reference voltages, and the frequency is used to drive all distributed generators (DGs) to operate synchronously in the microgrid. When the microgrid is in steady state, DGs share a single common frequency and current in terms of the proposed scheme. Thus the potential advantages of simplicity and decentralized manner are retained. The AC microgrid model has been developed through simulations and experiments to verify the effectiveness and performance of the proposed scheme.

Submitted 26 April, 2017; originally announced April 2017.

Comments: 16 pages, 14 figures

arXiv:1512.06050 [pdf, other]

Pricing the Ramping Reserve and Capacity Reserve in Real Time Markets

Authors: Hongxing Ye, Zuyi Li

Abstract: The increasing penetration of renewable energy in recent years has led to more uncertainties in power systems. In order to maintain system reliability and security, electricity market operators need to keep certain reserves in the Security-Constrained Economic Dispatch (SCED) problems. A new concept, deliverable generation ramping reserve, is proposed in this paper. The prices of generation ramping reserves and generation capacity reserves are derived in the Affine Adjustable Robust Optimization framework. With the help of these prices, the valuable reserves can be identified among the available reserves. These prices provide crucial information on the values of reserve resources, which are critical for the long-term flexibility investment. The market equilibrium based on these prices is analyzed. Simulations on a 3-bus system and the IEEE 118-bus system are performed to illustrate the concept of ramping reserve price and capacity reserve price. The impacts of the reserve credit on market participants are discussed.

Submitted 18 December, 2015; originally announced December 2015.

Comments: We presented related content in 2014 IEEE Power and Energy Society General Meeting

arXiv:1507.01540 [pdf, other]

doi 10.1109/TPWRS.2016.2595621

Uncertainty Marginal Price, Transmission Reserve, and Day-ahead Market Clearing with Robust Unit Commitment

Authors: Hongxing Ye, Yinyin Ge, Mohammad Shahidehpour, Zuyi Li

Abstract: The increasing penetration of renewable energy in recent years has led to more uncertainties in power systems. These uncertainties have to be accommodated by flexible resources (i.e. upward and downward generation reserves). In this paper, a novel concept, Uncertainty Marginal Price (UMP), is proposed to price both the uncertainty and reserve. At the same time, the energy is priced at Locational Marginal Price (LMP). A novel market clearing mechanism is proposed to credit the generation and reserve and to charge the load and uncertainty within the Robust Unit Commitment (RUC) in the Day-ahead market. We derive the UMPs and LMPs in the robust optimization framework. UMP helps allocate the cost of generation reserves to uncertainty sources. We prove that the proposed market clearing mechanism leads to partial market equilibrium. We find that transmission reserves must be kept explicitly in addition to generation reserves for uncertainty accommodation. We prove that transmission reserves for ramping delivery may lead to Financial Transmission Right (FTR) underfunding in existing markets. The FTR underfunding can be covered by congestion fund collected from uncertainty payment in the proposed market clearing mechanism. Simulations on a six-bus system and the IEEE 118-bus system are performed to illustrate the new concepts and the market clearing mechanism.

Submitted 1 August, 2016; v1 submitted 6 July, 2015; originally announced July 2015.

Comments: to appear in IEEE Transactions on Power Systems, IEEE link: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7524711. arXiv admin note: text overlap with arXiv:1507.01167

arXiv:1507.01167 [pdf, other]

Market Clearing for Uncertainty, Generation Reserve, and Transmission Reserve--Part II: Case Study

Authors: Hongxing Ye, Yinyin Ge, Mohammad Shahidehpour, Zuyi Li

Abstract: In Part II of this two-part paper, we analyze the marginal prices derived in Part I of this two-part paper within a robust optimization framework. The load and generation are priced at Locational Marginal Price (LMP) while the uncertainty and generation reserve are priced at Uncertainty Marginal Price (UMP). The Financial Transmission Right (FTR) underfunding is demonstrated when there is transmission reserve. A comparison between traditional reserve price and UMP is presented. We also discuss the incentives for market participants within the new market scheme.

Submitted 5 July, 2015; originally announced July 2015.

Showing 1–50 of 50 results for author: Ye, H