Search | arXiv e-print repository

arXiv:2406.19043 [pdf]

CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Yajing Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, Jing Qin, Xiahai Zhuang, Claudia Prieto, et al. (7 additional authors not shown)

Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover high-quality, clinically interpretable images from undersampled measurements. However, the lack of publicly available cardiac MRI k-space dataset in terms of both quantity and diversity has severely hindered substantial technological progress, particularly for data-driven artificial intelligence. Here, we provide a standardized, diverse, and high-quality CMRxRecon2024 dataset to facilitate the technical development, fair evaluation, and clinical transfer of cardiac MRI reconstruction approaches, towards promoting the universal frameworks that enable fast and robust reconstructions across different cardiac MRI protocols in clinical practice. To the best of our knowledge, the CMRxRecon2024 dataset is the largest and most diverse publicly available cardiac k-space dataset. It is acquired from 330 healthy volunteers, covering commonly used modalities, anatomical views, and acquisition trajectories in clinical cardiac MRI workflows. Besides, an open platform with tutorials, benchmarks, and data processing tools is provided to facilitate data usage, advanced method development, and fair performance evaluation.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 19 pages, 3 figures, 2 tables

arXiv:2406.12646 [pdf, other]

An Empirical Study on the Fairness of Foundation Models for Multi-Organ Image Segmentation

Authors: Qin Li, Yizhe Zhang, Yan Li, Jun Lyu, Meng Liu, Longyu Sun, Mengting Sun, Qirong Li, Wenyue Mao, Xinran Wu, Yajing Zhang, Yinghua Chu, Shuo Wang, Chengyan Wang

Abstract: The segmentation foundation model, e.g., Segment Anything Model (SAM), has attracted increasing interest in the medical image community. Early pioneering studies primarily concentrated on assessing and improving SAM's performance from the perspectives of overall accuracy and efficiency, yet little attention was given to the fairness considerations. This oversight raises questions about the potential for performance biases that could mirror those found in task-specific deep learning models like nnU-Net. In this paper, we explored the fairness dilemma concerning large segmentation foundation models. We prospectively curate a benchmark dataset of 3D MRI and CT scans of the organs including liver, kidney, spleen, lung and aorta from a total of 1056 healthy subjects with expert segmentations. Crucially, we document demographic details such as gender, age, and body mass index (BMI) for each subject to facilitate a nuanced fairness analysis. We test state-of-the-art foundation models for medical image segmentation, including the original SAM, medical SAM and SAT models, to evaluate segmentation efficacy across different demographic groups and identify disparities. Our comprehensive analysis, which accounts for various confounding factors, reveals significant fairness concerns within these foundational models. Moreover, our findings highlight not only disparities in overall segmentation metrics, such as the Dice Similarity Coefficient, but also significant variations in the spatial distribution of segmentation errors, offering empirical evidence of the nuanced challenges in ensuring fairness in medical image segmentation.

Submitted 18 June, 2024; originally announced June 2024.

Comments: Accepted to MICCAI-2024

arXiv:2405.02660 [pdf, other]

AFDM Channel Estimation in Multi-Scale Multi-Lag Channels

Authors: Rongyou Cao, Yuheng Zhong, Jiangbin Lyu, Deqing Wang, Liqun Fu

Abstract: Affine Frequency Division Multiplexing (AFDM) is a brand new chirp-based multi-carrier (MC) waveform for high mobility communications, with promising advantages over Orthogonal Frequency Division Multiplexing (OFDM) and other MC waveforms. Existing AFDM research focuses on wireless communication at high carrier frequency (CF), which typically considers only Doppler frequency shift (DFS) as a result of mobility, while ignoring the accompanied Doppler time scaling (DTS) on waveform. However, for underwater acoustic (UWA) communication at much lower CF and propagating at speed of sound, the DTS effect could not be ignored and poses significant challenges for channel estimation. This paper analyzes the channel frequency response (CFR) of AFDM under multi-scale multi-lag (MSML) channels, where each propagating path could have different delay and DFS/DTS. Based on the newly derived input-output formula and its characteristics, two new channel estimation methods are proposed, i.e., AFDM with iterative multi-index (AFDM-IMI) estimation under low to moderate DTS, and AFDM with orthogonal matching pursuit (AFDM-OMP) estimation under high DTS. Numerical results confirm the effectiveness of the proposed methods against the original AFDM channel estimation method. Moreover, the resulted AFDM system outperforms OFDM as well as Orthogonal Chirp Division Multiplexing (OCDM) in terms of channel estimation accuracy and bit error rate (BER), which is consistent with our theoretical analysis based on CFR overlap probability (COP), mutual incoherent property (MIP) and channel diversity gain under MSML channels.

Submitted 4 May, 2024; originally announced May 2024.

Comments: 6 pages, 6 figures. Investigate AFDM under underwater multi-scale multi-lag channels. Derive the new input-output formula with the impact of Doppler time scaling. Propose two new channel estimation methods to tackle different levels of Doppler factors. Perform diversity analysis based on CFR overlap probability (COP) and mutual incoherent property (MIP)

arXiv:2404.01082 [pdf, other]

The state-of-the-art in Cardiac MRI Reconstruction: Results of the CMRxRecon Challenge in MICCAI 2023

Authors: Jun Lyu, Chen Qin, Shuo Wang, Fanwen Wang, Yan Li, Zi Wang, Kunyuan Guo, Cheng Ouyang, Michael Tänzer, Meng Liu, Longyu Sun, Mengting Sun, Qin Li, Zhang Shi, Sha Hua, Hao Li, Zhensen Chen, Zhenlin Zhang, Bingyu Xin, Dimitris N. Metaxas, George Yiasemis, Jonas Teuwen, Liping Zhang, Weitian Chen, Yidong Zhao, et al. (25 additional authors not shown)

Abstract: Cardiac MRI, crucial for evaluating heart structure and function, faces limitations like slow imaging and motion artifacts. Undersampling reconstruction, especially data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly under-sampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation platforms hinders the development of data-driven reconstruction algorithms. To address this issue, we organized the Cardiac MRI Reconstruction Challenge (CMRxRecon) in 2023, in collaboration with the 26th International Conference on MICCAI. CMRxRecon presented an extensive k-space dataset comprising cine and mapping raw data, accompanied by detailed annotations of cardiac anatomical structures. With overwhelming participation, the challenge attracted more than 285 teams and over 600 participants. Among them, 22 teams successfully submitted Docker containers for the testing phase, with 7 teams submitting for both cine and mapping tasks. All teams use deep learning based approaches, indicating that deep learning has predominately become a promising solution for the problem. The first-place winner of both tasks utilizes the E2E-VarNet architecture as backbones. In contrast, U-Net is still the most popular backbone for both multi-coil and single-coil reconstructions. This paper provides a comprehensive overview of the challenge design, presents a summary of the submitted results, reviews the employed methods, and offers an in-depth discussion that aims to inspire future advancements in cardiac MRI reconstruction models. The summary emphasizes the effective strategies observed in Cardiac MRI reconstruction, including backbone architecture, loss function, pre-processing techniques, physical modeling, and model complexity, thereby providing valuable insights for further developments in this field.

Submitted 16 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

Comments: 25 pages, 17 figures

arXiv:2402.17780 [pdf, other]

Constraint Latent Space Matters: An Anti-anomalous Waveform Transformation Solution from Photoplethysmography to Arterial Blood Pressure

Authors: Cheng Bian, Xiaoyu Li, Qi Bi, Guangpu Zhu, Jiegeng Lyu, Weile Zhang, Yelei Li, Zi**g Zeng

Abstract: Arterial blood pressure (ABP) holds substantial promise for proactive cardiovascular health management. Notwithstanding its potential, the invasive nature of ABP measurements confines their utility primarily to clinical environments, limiting their applicability for continuous monitoring beyond medical facilities. The conversion of photoplethysmography (PPG) signals into ABP equivalents has garnered significant attention due to its potential in revolutionizing cardiovascular disease management. Recent strides in PPG-to-ABP prediction encompass the integration of generative and discriminative models. Despite these advances, the efficacy of these models is curtailed by the latent space shift predicament, stemming from alterations in PPG data distribution across disparate hardware and individuals, potentially leading to distorted ABP waveforms. To tackle this problem, we present an innovative solution named the Latent Space Constraint Transformer (LSCT), leveraging a quantized codebook to yield robust latent spaces by employing multiple discretizing bases. To facilitate improved reconstruction, the Correlation-boosted Attention Module (CAM) is introduced to systematically query pertinent bases on a global scale. Furthermore, to enhance expressive capacity, we propose the Multi-Spectrum Enhancement Knowledge (MSEK), which fosters local information flow within the channels of latent code and provides additional embedding for reconstruction. Through comprehensive experimentation on both publicly available datasets and a private downstream task dataset, the proposed approach demonstrates noteworthy performance enhancements compared to existing methods. Extensive ablation studies further substantiate the effectiveness of each introduced module.

Submitted 22 February, 2024; originally announced February 2024.

Comments: Accepted by AAAI-2024, main track

arXiv:2312.10475 [pdf, ps, other]

IRS-Aided Sectorized Base Station Design and 3D Coverage Performance Analysis

Authors: Xintong Chen, Jiangbin Lyu, Liqun Fu

Abstract: Intelligent reflecting surface (IRS) is regarded as a revolutionary paradigm that can reconfigure the wireless propagation environment for enhancing the desired signal and/or weakening the interference, and thus improving the quality of service (QoS) for communication systems. In this paper, we propose an IRS-aided sectorized BS design where the IRS is mounted in front of a transmitter (TX) and reflects/reconfigures signal towards the desired user equipment (UE). Unlike prior works that address link-level analysis/optimization of IRS-aided systems, we focus on the system-level three-dimensional (3D) coverage performance in both single-/multiple-cell scenarios. To this end, a distance/angle-dependent 3D channel model is considered for UEs in the 3D space, as well as the non-isotropic TX beam pattern and IRS element radiation pattern (ERP), both of which affect the average channel power as well as the multi-path fading statistics. Based on the above, a general formula of received signal power in our design is obtained, along with derived power scaling laws and upper/lower bounds on the mean signal/interference power under IRS passive beamforming or random scattering. Numerical results validate our analysis and demonstrate that our proposed design outperforms the benchmark schemes with fixed BS antenna patterns or active 3D beamforming. In particular, for aerial UEs that suffer from strong inter-cell interference, the IRS-aided BS design provides much better QoS in terms of the ergodic throughput performance compared with benchmarks, thanks to the IRS-inherent double pathloss effect that helps weaken the interference.

Submitted 16 December, 2023; originally announced December 2023.

Comments: Manuscript submitted to IEEE IWQoS 2023 on 12 Feb. 2023; accepted 13 April 2023; published 27 July 2023. An associated Chinese patent was applied on 9 Aug. 2022 and granted on 1 Sep. 2023, under No. ZL202210948626.X

arXiv:2312.07226 [pdf, other]

Super-Resolution on Rotationally Scanned Photoacoustic Microscopy Images Incorporating Scanning Prior

Authors: Kai Pan, Linyang Li, Li Lin, Pujin Cheng, Junyan Lyu, Lei Xi, Xiaoyin Tang

Abstract: Photoacoustic Microscopy (PAM) images integrating the advantages of optical contrast and acoustic resolution have been widely used in brain studies. However, there exists a trade-off between scanning speed and image resolution. Compared with traditional raster scanning, rotational scanning provides good opportunities for fast PAM imaging by optimizing the scanning mechanism. Recently, there is a trend to incorporate deep learning into the scanning process to further increase the scanning speed. Yet, most such attempts are performed for raster scanning while those for rotational scanning are relatively rare. In this study, we propose a novel and well-performing super-resolution framework for rotational scanning-based PAM imaging. To eliminate adjacent rows' displacements due to subject motion or high-frequency scanning distortion, we introduce a registration module across odd and even rows in the preprocessing and incorporate displacement degradation in the training. Besides, gradient-based patch selection is proposed to increase the probability of blood vessel patches being selected for training. A Transformer-based network with a global receptive field is applied for better performance. Experimental results on both synthetic and real datasets demonstrate the effectiveness and generalizability of our proposed framework for rotationally scanned PAM images' super-resolution, both quantitatively and qualitatively. Code is available at https://github.com/11710615/PAMSR.git.

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2307.04296 [pdf, other]

K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment

Authors: Guoyang Xie, Jinbao Wang, Yawen Huang, Jiayi Lyu, Feng Zheng, Yefeng Zheng, Yaochu Jin

Abstract: The problem of how to assess cross-modality medical image synthesis has been largely unexplored. The most used measures like PSNR and SSIM focus on analyzing the structural features but neglect the crucial lesion location and fundamental k-space speciality of medical images. To overcome this problem, we propose a new metric K-CROSS to spur progress on this challenging problem. Specifically, K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location, together with a tumor encoder for representing features, such as texture details and brightness intensities. To further reflect the frequency-specific information from the magnetic resonance imaging principles, both k-space features and vision features are obtained and employed in our comprehensive encoders with a frequency reconstruction penalty. The structure-shared encoders are designed and constrained with a similarity loss to capture the intrinsic common structural information for both modalities. As a consequence, the features learned from lesion regions, k-space, and anatomical structures are all captured, which serve as our quality evaluators. We evaluate the performance by constructing a large-scale cross-modality neuroimaging perceptual similarity (NIRPS) dataset with 6,000 radiologist judgments. Extensive experiments demonstrate that the proposed method outperforms other metrics, especially in comparison with the radiologists on NIRPS.

Submitted 9 February, 2024; v1 submitted 9 July, 2023; originally announced July 2023.

arXiv:2307.02334 [pdf, ps, other]

Dual Arbitrary Scale Super-Resolution for Multi-Contrast MRI

Authors: Jiamiao Zhang, Yichen Chi, Jun Lyu, Wenming Yang, Yapeng Tian

Abstract: Limited by imaging systems, the reconstruction of Magnetic Resonance Imaging (MRI) images from partial measurement is essential to medical imaging research. Benefiting from the diverse and complementary information of multi-contrast MR images in different imaging modalities, multi-contrast Super-Resolution (SR) reconstruction is promising to yield SR images with higher quality. In the medical scenario, to fully visualize the lesion, radiologists are accustomed to zooming the MR images at arbitrary scales rather than using a fixed scale, as used by most MRI SR methods. In addition, existing multi-contrast MRI SR methods often require a fixed resolution for the reference image, which makes acquiring reference images difficult and imposes limitations on arbitrary scale SR tasks. To address these issues, we proposed an implicit neural representations based dual-arbitrary multi-contrast MRI super-resolution method, called Dual-ArbNet. First, we decouple the resolution of the target and reference images by a feature encoder, enabling the network to input target and reference images at arbitrary scales. Then, an implicit fusion decoder fuses the multi-contrast features and uses an Implicit Decoding Function (IDF) to obtain the final MRI SR results. Furthermore, we introduce a curriculum learning strategy to train our network, which improves the generalization and performance of our Dual-ArbNet. Extensive experiments in two public MRI datasets demonstrate that our method outperforms state-of-the-art approaches under different scale factors and has great potential in clinical practice.

Submitted 10 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

Comments: Accepted by MICCAI2023

arXiv:2208.04360 [pdf, other]

SDWPF: A Dataset for Spatial Dynamic Wind Power Forecasting Challenge at KDD Cup 2022

Authors: Jingbo Zhou, Xinjiang Lu, Yixiong Xiao, Jiantao Su, Junfu Lyu, Yanjun Ma, Dejing Dou

Abstract: The variability of wind power supply can present substantial challenges to incorporating wind power into a grid system. Thus, Wind Power Forecasting (WPF) has been widely recognized as one of the most critical issues in wind power integration and operation. There has been an explosion of studies on wind power forecasting problems in the past decades. Nevertheless, how to well handle the WPF problem is still challenging, since high prediction accuracy is always demanded to ensure grid stability and security of supply. We present a unique Spatial Dynamic Wind Power Forecasting dataset: SDWPF, which includes the spatial distribution of wind turbines, as well as the dynamic context factors. Whereas, most of the existing datasets have only a small number of wind turbines without knowing the locations and context information of wind turbines at a fine-grained time scale. By contrast, SDWPF provides the wind power data of 134 wind turbines from a wind farm over half a year with their relative positions and internal statuses. We use this dataset to launch the Baidu KDD Cup 2022 to examine the limit of current WPF solutions. The dataset is released at https://aistudio.baidu.com/aistudio/competition/detail/152/0/datasets.

Submitted 8 August, 2022; originally announced August 2022.

arXiv:2207.13249 [pdf, other]

doi 10.1109/TMI.2022.3193146

AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation

Authors: Junyan Lyu, Yiqi Zhang, Yijin Huang, Li Lin, Pujin Cheng, Xiaoying Tang

Abstract: Convolutional neural networks have been widely applied to medical image segmentation and have achieved considerable performance. However, the performance may be significantly affected by the domain gap between training data (source domain) and testing data (target domain). To address this issue, we propose a data manipulation based domain generalization method, called Automated Augmentation for Domain Generalization (AADG). Our AADG framework can effectively sample data augmentation policies that generate novel domains and diversify the training set from an appropriate search space. Specifically, we introduce a novel proxy task maximizing the diversity among multiple augmented novel domains as measured by the Sinkhorn distance in a unit sphere space, making automated augmentation tractable. Adversarial training and deep reinforcement learning are employed to efficiently search the objectives. Quantitative and qualitative experiments on 11 publicly-accessible fundus image datasets (four for retinal vessel segmentation, four for optic disc and cup (OD/OC) segmentation and three for retinal lesion segmentation) are comprehensively performed. Two OCTA datasets for retinal vasculature segmentation are further involved to validate cross-modality generalization. Our proposed AADG exhibits state-of-the-art generalization performance and outperforms existing approaches by considerable margins on retinal vessel, OD/OC and lesion segmentation tasks. The learned policies are empirically validated to be model-agnostic and can transfer well to other models. The source code is available at https://github.com/CRazorback/AADG.

Submitted 26 July, 2022; originally announced July 2022.

Comments: Accepted by IEEE Transactions on Medical Imaging (TMI)

arXiv:2202.06997 [pdf, other]

Cross-Modality Neuroimage Synthesis: A Survey

Authors: Guoyang Xie, Yawen Huang, Jinbao Wang, Jiayi Lyu, Feng Zheng, Yefeng Zheng, Yaochu Jin

Abstract: Multi-modality imaging improves disease diagnosis and reveals distinct deviations in tissues with anatomical properties. The existence of completely aligned and paired multi-modality neuroimaging data has proved its effectiveness in brain research. However, collecting fully aligned and paired data is expensive or even impractical, since it faces many difficulties, including high cost, long acquisition time, image corruption, and privacy issues. An alternative solution is to explore unsupervised or weakly supervised learning methods to synthesize the absent neuroimaging data. In this paper, we provide a comprehensive review of cross-modality synthesis for neuroimages, from the perspectives of weakly supervised and unsupervised settings, loss functions, evaluation metrics, imaging modalities, datasets, and downstream applications based on synthesis. We begin by highlighting several opening challenges for cross-modality neuroimage synthesis. Then, we discuss representative architectures of cross-modality synthesis methods under different supervisions. This is followed by a stepwise in-depth analysis to evaluate how cross-modality neuroimage synthesis improves the performance of its downstream tasks. Finally, we summarize the existing research findings and point out future research directions. All resources are available at https://github.com/M-3LAB/awesome-multimodal-brain-image-systhesis

Submitted 21 September, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

arXiv:2112.13396 [pdf, ps, other]

Energy-Efficient Trajectory Design for UAV-Aided Maritime Data Collection in Wind

Authors: Yifan Zhang, Jiangbin Lyu, Liqun Fu

Abstract: Unmanned aerial vehicles (UAVs), especially fixed-wing ones that withstand strong winds, have great potential for oceanic exploration and research. This paper studies a UAV-aided maritime data collection system with a fixed-wing UAV dispatched to collect data from marine buoys. We aim to minimize the UAV's energy consumption in completing the task by jointly optimizing the communication time scheduling among the buoys and the UAV's flight trajectory subject to wind effect. The conventional successive convex approximation (SCA) method can provide efficient sub-optimal solutions for collecting small/moderate data volume, whereas the solution heavily relies on trajectory initialization and has not explicitly considered wind effect, while the computational/trajectory complexity both become prohibitive for the task with large data volume. To this end, we propose a new cyclical trajectory design framework with tailored initialization algorithm that can handle arbitrary data volume efficiently, as well as a hybrid offline-online (HO2) design that leverages convex stochastic programming (CSP) offline based on wind statistics, and refines the solution by adapting online to real-time wind velocity. Numerical results show that our optimized trajectory can better adapt to various setups with different target data volume and buoys' topology as well as various wind speed/direction/variance compared with benchmark schemes.

Submitted 26 December, 2021; originally announced December 2021.

Comments: 30 pages, 14 figures. Propose a new cyclical trajectory design framework with tailored initialization algorithm that can handle arbitrary data volume efficiently, as well as a hybrid offline-online (HO2) design that leverages convex stochastic programming (CSP) offline based on wind statistics, and refines the solution by adapting online to real-time wind velocity. arXiv admin note: substantial text overlap with arXiv:2006.01371

arXiv:2110.14160 [pdf, other]

Identifying the key components in ResNet-50 for diabetic retinopathy grading from fundus images: a systematic investigation

Authors: Yijin Huang, Li Lin, Pujin Cheng, Junyan Lyu, Roger Tam, Xiaoying Tang

Abstract: Although deep learning based diabetic retinopathy (DR) classification methods typically benefit from well-designed architectures of convolutional neural networks, the training setting also has a non-negligible impact on the prediction performance. The training setting includes various interdependent components, such as objective function, data sampling strategy and data augmentation approach. To identify the key components in a standard deep learning framework (ResNet-50) for DR grading, we systematically analyze the impact of several major components. Extensive experiments are conducted on a publicly-available dataset EyePACS. We demonstrate that (1) the DR grading framework is sensitive to input resolution, objective function, and composition of data augmentation, (2) using mean square error as the loss function can effectively improve the performance with respect to a task-specific evaluation metric, namely the quadratically-weighted Kappa, (3) utilizing eye pairs boosts the performance of DR grading and (4) using data resampling to address the problem of imbalanced data distribution in EyePACS hurts the performance. Based on these observations and an optimal combination of the investigated components, our framework, without any specialized network design, achieves the state-of-the-art result (0.8631 for Kappa) on the EyePACS test set (a total of 42670 fundus images) with only image-level labels. We also examine the proposed training practices on other fundus datasets and other network architectures to evaluate their generalizability.
Our codes and pre-trained model are available at https://github.com/Yi**Huang/pytorch-classification. △ Less

Submitted 17 October, 2022; v1 submitted 27 October, 2021; originally announced October 2021.
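
The abstract above evaluates DR grading with the quadratically-weighted Kappa and reports that a mean square error loss helps on that metric. The following is a minimal sketch of the metric itself, not the authors' code. The function names, the five-grade setup, and the round-and-clip mapping from regression outputs to grades are assumptions made for illustration.

```python
import numpy as np


def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Quadratically-weighted Cohen's kappa for integer grades 0..num_classes-1."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)

    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = np.zeros((num_classes, num_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        observed[t, p] += 1.0

    # Expected matrix if true and predicted grades were independent.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

    # Quadratic penalty: disagreement cost grows with squared grade distance.
    idx = np.arange(num_classes)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (num_classes - 1) ** 2

    return 1.0 - (weights * observed).sum() / (weights * expected).sum()


def grades_from_regression(outputs, num_classes):
    """Assumed mapping: round a scalar regression output to the nearest valid grade."""
    return np.clip(np.rint(outputs), 0, num_classes - 1).astype(int)


if __name__ == "__main__":
    true_grades = [0, 1, 2, 3, 4, 2]
    regression_outputs = np.array([0.2, 1.4, 1.9, 2.6, 4.3, 2.8])
    predicted = grades_from_regression(regression_outputs, num_classes=5)
    print(quadratic_weighted_kappa(true_grades, predicted, num_classes=5))
```

Because the penalty is quadratic in the grade distance, a loss that also penalizes squared distance (such as mean square error on the grade value) lines up with the metric more closely than a loss that treats all misclassifications equally.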

arXiv:2110.12685 [pdf]

Transient Synchronization Stability Analysis of Wind Farms with MMC-HVDC Integration Under Offshore AC Grid Fault

Authors: Yu Zhang, Chen Zhang, Renxin Yang, Jing Lyu, Li Liu, Xu Cai

Abstract: MMC-HVDC connected offshore wind farms (OWFs) can suffer short-circuit faults (SCFs), yet their transient stability is not well analysed. In this paper, the mechanism of the loss of synchronization (LOS) of this system is analysed considering the whole system state from the fault-on to the post-fault stage, and the influence of fault type and fault clearance is discussed as well. A stability index is proposed to quantify the transient synchronization stability (TSS) of the system, which can not only estimate whether the wind turbine generators (WTGs) are able to resynchronize with the offshore MMC after the fault is cleared, but also evaluate the performance of stability-improving methods. Finally, a scenario of six cases is tested on the PSCAD/EMTDC simulation platform, where the performances of four existing stability-improving methods are thoroughly compared via both numerical simulation and the proposed stability index.

Submitted 25 October, 2021; originally announced October 2021.

arXiv:2107.08274 [pdf, other]

Lesion-based Contrastive Learning for Diabetic Retinopathy Grading from Fundus Images

Authors: Yijin Huang, Li Lin, Pujin Cheng, Junyan Lyu, Xiaoying Tang

Abstract: Manually annotating medical images is extremely expensive, especially for large-scale datasets. Self-supervised contrastive learning has been explored to learn feature representations from unlabeled images. However, unlike natural images, the application of contrastive learning to medical images is relatively limited. In this work, we propose a self-supervised framework, namely lesion-based contrastive learning for automated diabetic retinopathy (DR) grading. Instead of taking entire images as the input in the common contrastive learning scheme, lesion patches are employed to encourage the feature extractor to learn representations that are highly discriminative for DR grading. We also investigate different data augmentation operations in defining our contrastive prediction task. Extensive experiments are conducted on the publicly-accessible dataset EyePACS, demonstrating that our proposed framework performs outstandingly on DR grading in terms of both linear evaluation and transfer capacity evaluation.

Submitted 17 July, 2021; originally announced July 2021.

Comments: 10 pages, 2 figures, MICCAI2021 early accepted

arXiv:2107.04823 [pdf, other]

BSDA-Net: A Boundary Shape and Distance Aware Joint Learning Framework for Segmenting and Classifying OCTA Images

Authors: Li Lin, Zhonghua Wang, Jiewei Wu, Yijin Huang, Junyan Lyu, Pujin Cheng, Jiong Wu, Xiaoying Tang

Abstract: Optical coherence tomography angiography (OCTA) is a novel non-invasive imaging technique that allows visualization of the vasculature and foveal avascular zone (FAZ) across retinal layers. Clinical research suggests that the morphology and contour irregularity of the FAZ are important biomarkers of various ocular pathologies. Therefore, precise segmentation of the FAZ is of great clinical interest. Also, there is no existing research reporting that FAZ features can improve the performance of deep diagnostic classification networks. In this paper, we propose a novel multi-level boundary shape and distance aware joint learning framework, named BSDA-Net, for FAZ segmentation and diagnostic classification from OCTA images. Two auxiliary branches, namely boundary heatmap regression and signed distance map reconstruction branches, are constructed in addition to the segmentation branch to improve the segmentation performance, resulting in more accurate FAZ contours and fewer outliers. Moreover, both low-level and high-level features from the aforementioned three branches, including shape, size, boundary, and signed directional distance map of the FAZ, are fused hierarchically with features from the diagnostic classifier. Through extensive experiments, the proposed BSDA-Net is found to yield state-of-the-art segmentation and classification results on the OCTA-500, OCTAGON, and FAZID datasets.

Submitted 13 July, 2021; v1 submitted 10 July, 2021; originally announced July 2021.

Comments: 12 pages, 4 figures, MICCAI2021 [Student Travel Award]

arXiv:2105.08163 [pdf, other]

Accelerating 3D MULTIPLEX MRI Reconstruction with Deep Learning

Authors: Eric Z. Chen, Yongquan Ye, Xiao Chen, Jingyuan Lyu, Zhongqi Zhang, Yichen Hu, Terrence Chen, Jian Xu, Shanhui Sun

Abstract: Multi-contrast MRI images provide complementary contrast information about the characteristics of anatomical structures and are commonly used in clinical practice. Recently, a multi-flip-angle (FA) and multi-echo GRE method (MULTIPLEX MRI) has been developed to simultaneously acquire multiple parametric images with just one single scan. However, it poses two challenges for MULTIPLEX to be used in the 3D high-resolution setting: a relatively long scan time and the huge amount of 3D multi-contrast data for reconstruction. Currently, no DL based method has been proposed for 3D MULTIPLEX data reconstruction. We propose a deep learning framework for undersampled 3D MRI data reconstruction and apply it to MULTIPLEX MRI. The proposed deep learning method shows good performance in image quality and reconstruction time.

Submitted 17 May, 2021; originally announced May 2021.

Comments: Presented at ISMRM 2021 as the digital poster

arXiv:2105.08157 [pdf, other]

Cardiac Functional Analysis with Cine MRI via Deep Learning Reconstruction

Authors: Eric Z. Chen, Xiao Chen, Jingyuan Lyu, Qi Liu, Zhongqi Zhang, Yu Ding, Shuheng Zhang, Terrence Chen, Jian Xu, Shanhui Sun

Abstract: Retrospectively gated cine (retro-cine) MRI is the clinical standard for cardiac functional analysis. Deep learning (DL) based methods have been proposed for the reconstruction of highly undersampled MRI data and show superior image quality and magnitude faster reconstruction time than CS-based methods. Nevertheless, it remains unclear whether DL reconstruction is suitable for cardiac function analysis. To address this question, in this study we evaluate and compare the cardiac functional values (EDV, ESV and EF for LV and RV, respectively) obtained from highly accelerated MRI acquisition using DL based reconstruction algorithm (DL-cine) with values from CS-cine and conventional retro-cine. To the best of our knowledge, this is the first work to evaluate the cine MRI with deep learning reconstruction for cardiac function analysis and compare it with other conventional methods. The cardiac functional values obtained from cine MRI with deep learning reconstruction are consistent with values from clinical standard retro-cine MRI.

Submitted 17 May, 2021; originally announced May 2021.

Comments: Presented at ISMRM 2021 as the digital poster

arXiv:2010.03740 [pdf, other]

Bone Feature Segmentation in Ultrasound Spine Image with Robustness to Speckle and Regular Occlusion Noise

Authors: Zixun Huang, Li-Wen Wang, Frank H. F. Leung, Sunetra Banerjee, De Yang, Timothy Lee, Juan Lyu, Sai Ho Ling, Yong-Ping Zheng

Abstract: 3D ultrasound imaging shows great promise for scoliosis diagnosis thanks to its low-cost, radiation-free and real-time characteristics. The key to assessing scoliosis by ultrasound imaging is to accurately segment the bone area and measure the scoliosis degree based on the symmetry of the bone features. The ultrasound images tend to contain many speckles and regular occlusion noise, which makes it difficult, tedious and time-consuming for experts to identify the bony features. In this paper, we propose a robust bone feature segmentation method based on the U-net structure for ultrasound spine Volume Projection Imaging (VPI) images. The proposed segmentation method introduces a total variance loss to reduce the sensitivity of the model to small-scale and regular occlusion noise. The proposed approach improves the Dice score by 2.3% and the AUC score by 1% compared with the U-net model and shows high robustness to speckle and regular occlusion noise.

Submitted 7 October, 2020; originally announced October 2020.

Comments: SMC2020

arXiv:2008.07438 [pdf, ps, other]

Analysis and Optimization for Large-Scale LoRa Networks: Throughput Fairness and Scalability

Authors: Jiangbin Lyu, Dan Yu, Liqun Fu

Abstract: LoRa networks are pivotally enabling Long Range connectivity to low-cost and power-constrained user equipments (UEs) in a wide area, whereas a critical issue is to effectively allocate wireless resources to support potentially massive UEs while resolving the prominent near-far fairness issue, which is challenging due to the lack of tractable analytical model and the practical requirement for low-complexity and low-overhead design. Leveraging on stochastic geometry, especially the Poisson rain model, we derive (semi-) closed-form formulas for the aggregate interference distribution, packet success probability and hence system throughput in both single-cell and multi-cell setups with frequency reuse, by accounting for channel fading, random UE distribution, partial packet overlapping, and/or multi-gateway packet reception. The analytical formulas require only average channel statistics and spatial UE distribution, which enable tractable network performance evaluation and incubate our proposed Iterative Balancing (IB) method that quickly yields high-level policies of joint spreading factor (SF) allocation, power control, and duty cycle adjustment for gauging the average max-min UE throughput or supported UE density with rate requirements. Numerical results validate the analytical formulas and the effectiveness of our proposed optimization scheme, which greatly alleviates the near-far fairness issue and reduces the spatial power consumption, while significantly improving the cell-edge throughput as well as the spatial (sum) throughput for the majority of UEs, by adapting to the UE/gateway densities.

Submitted 5 November, 2021; v1 submitted 17 August, 2020; originally announced August 2020.

Comments: To appear in IEEE IOT Journal. Stochastic geometry-based framework to model/analyze large-scale LoRa networks with channel fading/aggregate interference/packet overlapping/multi-GW reception. Jointly optimize SF/Tx-power/duty-cycle based on channel statistics and UE distribution. Achieve both fairness/power savings and improve cell-edge throughput and spatial (sum) throughput for majority of UEs. arXiv admin note: text overlap with arXiv:1904.12300

arXiv:2008.05044 [pdf, other]

Real-Time Cardiac Cine MRI with Residual Convolutional Recurrent Neural Network

Authors: Eric Z. Chen, Xiao Chen, Jingyuan Lyu, Yuan Zheng, Terrence Chen, Jian Xu, Shanhui Sun

Abstract: Real-time cardiac cine MRI does not require ECG gating in the data acquisition and is more useful for patients who cannot hold their breath or have abnormal heart rhythms. However, to achieve fast image acquisition, real-time cine commonly acquires highly undersampled data, which imposes a significant challenge for MRI image reconstruction. We propose a residual convolutional RNN for real-time cardiac cine reconstruction. To the best of our knowledge, this is the first work applying a deep learning approach to Cartesian real-time cardiac cine reconstruction. Based on the evaluation from radiologists, our deep learning model outperforms compressed sensing.

Submitted 20 August, 2020; v1 submitted 11 August, 2020; originally announced August 2020.

Comments: Presented at ISMRM 2020 as the digital poster

arXiv:2006.01371 [pdf, ps, other]

doi 10.1109/GLOBECOM42002.2020.9322588

Energy-Efficient Cyclical Trajectory Design for UAV-Aided Maritime Data Collection in Wind

Authors: Yifan Zhang, Jiangbin Lyu, Liqun Fu

Abstract: Unmanned aerial vehicles (UAVs), especially fixed-wing ones that withstand strong winds, have great potential for oceanic exploration and research. This paper studies a UAV-aided maritime data collection system with a fixed-wing UAV dispatched to collect data from marine buoys. We aim to minimize the UAV's energy consumption in completing the task by jointly optimizing the communication time scheduling among the buoys and the UAV's flight trajectory subject to wind effect, which is a non-convex problem and difficult to solve optimally. Existing techniques such as the successive convex approximation (SCA) method provide efficient sub-optimal solutions for collecting small/moderate data volume, whereas the solution heavily relies on the trajectory initialization and has not explicitly considered the wind effect, while the computational complexity and resulting trajectory complexity both become prohibitive for the task with large data volume. To this end, we propose a new cyclical trajectory design framework that can handle arbitrary data volume efficiently subject to wind effect. Specifically, the proposed UAV trajectory comprises multiple cyclical laps, each responsible for collecting only a subset of data and thereby significantly reducing the computational/trajectory complexity, which allows searching for better trajectory initialization that fits the buoys' topology and the wind. Numerical results show that the proposed cyclical scheme outperforms the benchmark one-flight-only scheme in general. Moreover, the optimized cyclical 8-shape trajectory can proactively exploit the wind and achieve lower energy consumption compared with the case without wind.

Submitted 4 November, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

Comments: Published in GLOBECOM2020. Investigated UAV-aided maritime data collection in wind, with joint trajectory and communications optimization for energy efficiency. Proposed new cyclical trajectory design that can handle arbitrary data volume with significantly reduced computational/trajectory complexity. Unveiled that the wind can be proactively utilized by our optimized trajectory

Journal ref: GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2020, pp. 1-6

arXiv:1904.01509 [pdf, other]

doi 10.1109/ICMEW.2019.0-104

FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation

Authors: Yanfu Yan, Ke Lu, Jian Xue, Pengcheng Gao, Jiayi Lyu

Abstract: Facial expression analysis based on machine learning requires a large amount of well-annotated data to reflect different changes in facial motion. Publicly available datasets truly help to accelerate research in this area by providing a benchmark resource, but all of these datasets, to the best of our knowledge, are limited to rough annotations for action units, including only their absence, presence, or a five-level intensity according to the Facial Action Coding System. To meet the need for videos labeled in great detail, we present a well-annotated dataset named FEAFA for Facial Expression Analysis and 3D Facial Animation. One hundred and twenty-two participants, including children, young adults and elderly people, were recorded in real-world conditions. In addition, 99,356 frames were manually labeled using the Expression Quantitative Tool developed by us to quantify 9 symmetrical FACS action units, 10 asymmetrical (unilateral) FACS action units, 2 symmetrical FACS action descriptors and 2 asymmetrical FACS action descriptors, and each action unit or action descriptor is well-annotated with a floating point number between 0 and 1. To provide a baseline for use in future research, a benchmark for the regression of action unit values based on Convolutional Neural Networks is presented. We also demonstrate the potential of our FEAFA dataset for 3D facial animation. Almost all state-of-the-art algorithms for facial animation are achieved based on 3D face reconstruction. We hence propose a novel method that drives virtual characters only based on action unit value regression of the 2D video frames of source actors.

Submitted 2 April, 2019; originally announced April 2019.

Comments: 9 pages, 7 figures

Journal ref: 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)

arXiv:1901.07887 [pdf, other]

Network-Connected UAV: 3D System Modeling and Coverage Performance Analysis

Authors: Jiangbin Lyu, Rui Zhang

Abstract: With growing popularity, unmanned aerial vehicles (UAVs) are pivotally extending conventional terrestrial Internet of Things (IoT) into the sky. To enable high-performance two-way communications of UAVs with their ground pilots/users, cellular network-connected UAV has drawn significant interest recently. Among others, an important issue is whether the existing cellular network, designed mainly for terrestrial users, is also able to effectively cover the new UAV users in the three-dimensional (3D) space for both uplink and downlink communications. Such 3D coverage analysis is challenging due to the unique air-ground channel characteristics, the resulting interference issue with terrestrial communication, and the non-uniform 3D antenna gain pattern of ground base station (GBS) in practice. Particularly, high-altitude UAV often possesses a high probability of line-of-sight (LoS) channels with a large number of GBSs, while their random binary (LoS/Non-LoS) channel states and (on/off) activities give rise to an exponentially large number of discrete UAV-GBS association/interference states, rendering coverage analysis more difficult. This paper presents a new 3D system model to incorporate UAV users and proposes an analytical framework to characterize their uplink/downlink 3D coverage performance. To tackle the above exponential complexity, we introduce a generalized Poisson multinomial (GPM) distribution to model the discrete interference states, and a novel lattice approximation (LA) technique to approximate the non-lattice GPM variable and obtain the interference distribution efficiently with high accuracy. The 3D coverage analysis is validated by extensive numerical results, which also show effects of key system parameters such as cell loading factor, GBS antenna downtilt, UAV altitude and antenna beamwidth.

Submitted 25 April, 2019; v1 submitted 19 January, 2019; originally announced January 2019.

Comments: Double-column, 13 pages, 10 figures, accepted for publication in IEEE Internet of Things Journal

arXiv:1901.00276 [pdf, other]

Multi-level CNN for lung nodule classification with Gaussian Process assisted hyperparameter optimization

Authors: Miao Zhang, Huiqi Li, Juan Lyu, Sai Ho Ling, Steven Su

Abstract: This paper investigates lung nodule classification by using deep neural networks (DNNs). Hyperparameter optimization in DNNs is a computationally expensive problem, where evaluating a hyperparameter configuration may take several hours or even days. Bayesian optimization has been recently introduced for automatically searching for optimal hyperparameter configurations of DNNs. It applies probabilistic surrogate models, such as Gaussian processes, to approximate the validation error function of hyperparameter configurations, and reduces the computational complexity to a large extent. However, most existing surrogate models adopt stationary covariance functions to measure the difference between hyperparameter points based on spatial distance without considering their spatial locations. This distance-based assumption, together with the condition of constant smoothness throughout the whole hyperparameter search space, clearly violates the property that points far away from optimal points usually get similarly poor performance even when they are far apart from each other. In this paper, a non-stationary kernel is proposed which allows the surrogate model to adapt to functions whose smoothness varies with the spatial location of inputs, and a multi-level convolutional neural network (ML-CNN) is built for lung nodule classification whose hyperparameter configuration is optimized by using the proposed non-stationary kernel based Gaussian surrogate model. Our algorithm searches the surrogate for the optimal setting via a hyperparameter importance based evolutionary strategy, and the experiments demonstrate that our algorithm outperforms manual tuning and well-established hyperparameter optimization methods such as Random search, Gaussian processes with stationary kernels, and the recently proposed Hyperparameter Optimization via RBF and Dynamic coordinate search.

Submitted 2 January, 2019; originally announced January 2019.

arXiv:1811.02784 [pdf, other]

Median Binary-Connect Method and a Binary Convolutional Neural Nework for Word Recognition

Authors: Spencer Sheen, Jiancheng Lyu

Abstract: We propose and study a new projection formula for training binary weight convolutional neural networks. The projection formula measures the error in approximating a full precision (32 bit) vector by a 1-bit vector in the l_1 norm instead of the standard l_2 norm. The l_1 projector is in closed analytical form and involves a median computation instead of an arithmetic average in the l_2 projector. Experiments on 10 keywords classification show that the l_1 (median) BinaryConnect (BC) method outperforms the regular BC, regardless of cold or warm start. The binary network trained by median BC and a recent blending technique reaches test accuracy 92.4%, which is 1.1% lower than the full-precision network accuracy 93.5%. On Android phone app, the trained binary network doubles the speed of full-precision network in spoken keywords recognition.

Submitted 7 November, 2018; originally announced November 2018.
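
The abstract above contrasts an l_1 projector based on a median with an l_2 projector based on an arithmetic average. The sketch below illustrates that contrast under a stated assumption: the binary approximation is taken to be a non-negative scale times a sign vector, a common formulation for binary-weight networks. The abstract does not spell out the exact formula, so the function name and this parameterization are illustrative, not the authors' implementation.

```python
import numpy as np


def binary_projection(w, norm="l1"):
    """Approximate w by scale * signs, with signs in {-1, +1} and scale >= 0.

    With signs = sign(w), the error is sum(| |w_i| - scale |) in the l_1 norm,
    which is minimized by the median of |w|. In the squared l_2 norm the
    minimizer is the mean of |w|.
    """
    w = np.asarray(w, dtype=float)
    signs = np.where(w >= 0, 1.0, -1.0)
    magnitudes = np.abs(w)
    scale = np.median(magnitudes) if norm == "l1" else np.mean(magnitudes)
    return scale, signs


if __name__ == "__main__":
    weights = np.array([0.1, -0.2, 3.0])  # one large outlier weight
    for norm in ("l1", "l2"):
        scale, signs = binary_projection(weights, norm)
        l1_error = np.abs(weights - scale * signs).sum()
        print(norm, scale, signs, l1_error)
```

In this toy vector the single large weight pulls the mean-based scale upward, while the median-based scale stays near the typical magnitude, which is the robustness property one would expect from an l_1 fit.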

arXiv:1706.09925 [pdf]

Harmonic State Space Modeling of a Three-Phase Modular Multilevel Converter

Authors: Jing Lyu, Marta Molinas, Xu Cai

Abstract: This paper presents the harmonic state space (HSS) modeling of a three-phase modular multilevel converter (MMC). MMC is a converter system with a typical multi-frequency response due to its significant harmonics in the arm currents, capacitor voltages, and control signals. These internal harmonic dynamics can have a great influence on the operation characteristics of MMC. However, the conventional modeling methods commonly used in two-level voltage-source converters (VSCs), where only the fundamental-frequency dynamic is considered, will lead to an inaccurate model that cannot accurately reflect the real dynamic characteristics of MMC. Therefore, the HSS modeling method, in which harmonics of state variables, inputs, and outputs are posed separately in a state-space form, is introduced in this paper to model the MMC in order to capture all the harmonics and the frequency couplings. The steady-state and small-signal dynamic HSS models of a three-phase MMC are developed, respectively. The validity of the developed HSS model of a three-phase MMC has been verified by the results from both the nonlinear time domain simulation model in MATLAB/Simulink and the laboratory prototype with 12 submodules per arm.

Submitted 29 June, 2017; originally announced June 2017.

arXiv:1705.01030 [pdf]

Impedance Analysis of Modular Multilevel Converter Based on Harmonic State-Space Modeling Method

Authors: Jing Lyu, Qiang Chen, Xu Cai, Marta Molinas

Abstract: The small-signal impedance modeling of modular multilevel converter (MMC) is the key for analyzing resonance and stability of MMC-based ac power electronics systems. MMC is a converter system with a typical multi-frequency response due to its significant steady-state harmonic components in the arm currents, capacitor voltages, and control signals. Therefore, traditional small-signal modeling methods for 2-level voltage-source converters (VSCs) cannot be directly applied to the MMC. In this paper, the harmonic state-space (HSS) modeling approach is introduced to characterize the harmonic coupling behavior of the MMC. On this basis, the small-signal impedance models of the MMC are developed according to the harmonic linearization principle, which can include all the steady-state harmonic effects of the state variables, leading to the accurate impedance models. Furthermore, in order to reveal the impact of the internal dynamics and closed-loop control on the small-signal impedance of the MMC, three cases are considered in this paper, i.e., open-loop control, ac voltage closed-loop control, and circulating current closed-loop control. Finally, the analytical impedance models are verified by both simulation and experimental results.

Submitted 2 May, 2017; originally announced May 2017.

Showing 1–29 of 29 results for author: Lyu, J