Search | arXiv e-print repository

FedLPPA: Learning Personalized Prompt and Aggregation for Federated Weakly-supervised Medical Image Segmentation

Authors: Li Lin, Yixiang Liu, Jiewei Wu, Pujin Cheng, Zhiyuan Cai, Kenneth K. Y. Wong, Xiaoying Tang

Abstract: Federated learning (FL) effectively mitigates the data silo challenge brought about by policies and privacy concerns, implicitly harnessing more data for deep model training. However, traditional centralized FL models grapple with diverse multi-center data, especially in the face of significant data heterogeneity, notably in medical contexts. In the realm of medical image segmentation, the growing imperative to curtail annotation costs has amplified the importance of weakly-supervised techniques which utilize sparse annotations such as points, scribbles, etc. A pragmatic FL paradigm should accommodate diverse annotation formats across different sites, a research topic that remains under-investigated. In this context, we propose a novel personalized FL framework with learnable prompt and aggregation (FedLPPA) to uniformly leverage heterogeneous weak supervision for medical image segmentation. In FedLPPA, a learnable universal knowledge prompt is maintained, complemented by multiple learnable personalized data distribution prompts and prompts representing the supervision sparsity. Integrated with sample features through a dual-attention mechanism, those prompts empower each local task decoder to adeptly adjust to both the local distribution and the supervision form. Concurrently, a dual-decoder strategy, predicated on prompt similarity, is introduced for enhancing the generation of pseudo-labels in weakly-supervised learning, alleviating overfitting and noise accumulation inherent to local data, while an adaptable aggregation method is employed to customize the task decoder on a parameter-wise basis. Extensive experiments on four distinct medical image segmentation tasks involving different modalities underscore the superiority of FedLPPA, with its efficacy closely paralleling that of fully supervised centralized training. Our code and data will be available.

Submitted 31 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

Comments: 12 pages, 10 figures

arXiv:2401.15704 [pdf, other]

Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege

Authors: Peng Huang, Yao Wei, Peng Cheng, Zhongjie Ba, Li Lu, Feng Lin, Yang Wang, Kui Ren

Abstract: The widespread use of smart devices raises people's concerns about being eavesdropped on. To enhance voice privacy, recent studies exploit the nonlinearity of microphones to jam audio recorders with inaudible ultrasound. However, existing solutions rely solely on energetic masking. Their simple-form noise leads to several problems, such as high energy requirements and being easily removed by speech enhancement techniques. Besides, most of these solutions do not support authorized recording, which restricts their usage scenarios. In this paper, we design an efficient yet robust system that can jam microphones while preserving authorized recording. Specifically, we propose a novel phoneme-based noise with the idea of informational masking, which can distract both machines and humans and is resistant to denoising techniques. Besides, we optimize the noise transmission strategy for broader coverage and implement a hardware prototype of our system. Experimental results show that our system can reduce the recognition accuracy of recordings to below 50% under all tested speech recognition systems, which is much better than existing solutions.

Submitted 28 January, 2024; originally announced January 2024.

Comments: 14 pages, 28 figures; submitted to IEEE TDSC

arXiv:2312.07226 [pdf, other]

Super-Resolution on Rotationally Scanned Photoacoustic Microscopy Images Incorporating Scanning Prior

Authors: Kai Pan, Linyang Li, Li Lin, Pujin Cheng, Junyan Lyu, Lei Xi, Xiaoyin Tang

Abstract: Photoacoustic Microscopy (PAM) images integrating the advantages of optical contrast and acoustic resolution have been widely used in brain studies. However, there exists a trade-off between scanning speed and image resolution. Compared with traditional raster scanning, rotational scanning provides good opportunities for fast PAM imaging by optimizing the scanning mechanism. Recently, there is a trend to incorporate deep learning into the scanning process to further increase the scanning speed. Yet, most such attempts are performed for raster scanning while those for rotational scanning are relatively rare. In this study, we propose a novel and well-performing super-resolution framework for rotational scanning-based PAM imaging. To eliminate adjacent rows' displacements due to subject motion or high-frequency scanning distortion, we introduce a registration module across odd and even rows in the preprocessing and incorporate displacement degradation in the training. Besides, gradient-based patch selection is proposed to increase the probability of blood vessel patches being selected for training. A Transformer-based network with a global receptive field is applied for better performance. Experimental results on both synthetic and real datasets demonstrate the effectiveness and generalizability of our proposed framework for rotationally scanned PAM images' super-resolution, both quantitatively and qualitatively. Code is available at https://github.com/11710615/PAMSR.git.

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2308.15736 [pdf, ps, other]

Vulnerability of Machine Learning Approaches Applied in IoT-based Smart Grid: A Review

Authors: Zhenyong Zhang, Mengxiang Liu, Mingyang Sun, Ruilong Deng, Peng Cheng, Dusit Niyato, Mo-Yuen Chow, Jiming Chen

Abstract: Machine learning (ML) sees an increasing prevalence of being used in the internet-of-things (IoT)-based smart grid. However, the trustworthiness of ML is a severe issue that must be addressed to accommodate the trend of ML-based smart grid applications (MLsgAPPs). The adversarial distortion injected into the power signal will greatly affect the system's normal control and operation. Therefore, it is imperative to conduct vulnerability assessment for MLsgAPPs applied in the context of safety-critical power systems. In this paper, we provide a comprehensive review of the recent progress in designing attack and defense methods for MLsgAPPs. Unlike the traditional survey about ML security, this is the first review work about the security of MLsgAPPs that focuses on the characteristics of power systems. We first highlight the specifics for constructing the adversarial attacks on MLsgAPPs. Then, the vulnerability of MLsgAPP is analyzed from both the aspects of the power system and ML model. Afterward, a comprehensive survey is conducted to review and compare existing studies about the adversarial attacks on MLsgAPPs in scenarios of generation, transmission, distribution, and consumption, and the countermeasures are reviewed according to the attacks that they defend against. Finally, the future research directions are discussed on the attacker's and defender's side, respectively. We also analyze the potential vulnerability of large language model-based (e.g., ChatGPT) power system applications. Overall, we encourage more researchers to contribute to investigating the adversarial issues of MLsgAPPs.

Submitted 24 December, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

arXiv:2305.11504 [pdf, other]

JOINEDTrans: Prior Guided Multi-task Transformer for Joint Optic Disc/Cup Segmentation and Fovea Detection

Authors: Huaqing He, Li Lin, Zhiyuan Cai, Pujin Cheng, Xiaoying Tang

Abstract: Deep learning-based image segmentation and detection models have largely improved the efficiency of analyzing retinal landmarks such as optic disc (OD), optic cup (OC), and fovea. However, factors including ophthalmic disease-related lesions and low image quality issues may severely complicate automatic OD/OC segmentation and fovea detection. Most existing works treat the identification of each landmark as a single task, and take into account no prior information. To address these issues, we propose a prior guided multi-task transformer framework for joint OD/OC segmentation and fovea detection, named JOINEDTrans. JOINEDTrans effectively combines various spatial features of the fundus images, relieving the structural distortions induced by lesions and other imaging issues. It contains a segmentation branch and a detection branch. Notably, we employ an encoder pretrained in a vessel segmentation task to effectively exploit the positional relationship among vessel, OD/OC, and fovea, successfully incorporating spatial prior into the proposed JOINEDTrans framework. There are a coarse stage and a fine stage in JOINEDTrans. In the coarse stage, OD/OC coarse segmentation and fovea heatmap localization are obtained through a joint segmentation and detection module. In the fine stage, we crop regions of interest for subsequent refinement and use predictions obtained in the coarse stage to provide additional information for better performance and faster convergence. Experimental results demonstrate that JOINEDTrans outperforms existing state-of-the-art methods on the publicly available GAMMA, REFUGE, and PALM fundus image datasets. We make our code available at https://github.com/HuaqingHe/JOINEDTrans

Submitted 19 May, 2023; originally announced May 2023.

Comments: 11 pages, 6 figures

arXiv:2305.05338 [pdf, other]

doi 10.1109/TSG.2024.3373008

Enhancing Cyber-Resiliency of DER-based SmartGrid: A Survey

Authors: Mengxiang Liu, Fei Teng, Zhenyong Zhang, Pudong Ge, Ruilong Deng, Mingyang Sun, Peng Cheng, Jiming Chen

Abstract: The rapid development of information and communications technology has enabled the use of digital-controlled and software-driven distributed energy resources (DERs) to improve the flexibility and efficiency of power supply, and support grid operations. However, this evolution also exposes geographically-dispersed DERs to cyber threats, including hardware and software vulnerabilities, communication issues, and personnel errors. Therefore, enhancing the cyber-resiliency of DER-based smart grid - the ability to survive successful cyber intrusions - is becoming increasingly vital and has garnered significant attention from both industry and academia. In this survey, we aim to provide a systematic and comprehensive review regarding the cyber-resiliency enhancement (CRE) of DER-based smart grid. Firstly, an integrated threat modeling method is tailored for the hierarchical DER-based smart grid with special emphasis on vulnerability identification and impact analysis. Then, the defense-in-depth strategies encompassing prevention, detection, mitigation, and recovery are comprehensively surveyed, systematically classified, and rigorously compared. A CRE framework is subsequently proposed to incorporate the five key resiliency enablers. Finally, challenges and future directions are discussed in detail. The overall aim of this survey is to demonstrate the development trend of CRE methods and motivate further efforts to improve the cyber-resiliency of DER-based smart grid.

Submitted 5 March, 2024; v1 submitted 9 May, 2023; originally announced May 2023.

Comments: Accepted by IEEE Transactions on Smart Grid

arXiv:2303.04603 [pdf, other]

Learning Enhancement From Degradation: A Diffusion Model For Fundus Image Enhancement

Authors: Puijin Cheng, Li Lin, Yijin Huang, Huaqing He, Wenhan Luo, Xiaoying Tang

Abstract: The quality of a fundus image can be compromised by numerous factors, many of which are challenging to be appropriately and mathematically modeled. In this paper, we introduce a novel diffusion model based framework, named Learning Enhancement from Degradation (LED), for enhancing fundus images. Specifically, we first adopt a data-driven degradation framework to learn degradation mappings from unpaired high-quality to low-quality images. We then apply a conditional diffusion model to learn the inverse enhancement process in a paired manner. The proposed LED is able to output enhancement results that maintain clinically important features with better clarity. Moreover, in the inference phase, LED can be easily and effectively integrated with any existing fundus image enhancement framework. We evaluate the proposed LED on several downstream tasks with respect to various clinically-relevant metrics, successfully demonstrating its superiority over existing state-of-the-art methods both quantitatively and qualitatively. The source code is available at https://github.com/QtacierP/LED.

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2212.10541 [pdf, other]

UNO-QA: An Unsupervised Anomaly-Aware Framework with Test-Time Clustering for OCTA Image Quality Assessment

Authors: Juntao Chen, Li Lin, Pujin Cheng, Yijin Huang, Xiaoying Tang

Abstract: Medical image quality assessment (MIQA) is a vital prerequisite in various medical image analysis applications. Most existing MIQA algorithms are fully supervised and require a large amount of annotated data. However, annotating medical images is time-consuming and labor-intensive. In this paper, we propose an unsupervised anomaly-aware framework with test-time clustering for optical coherence tomography angiography (OCTA) image quality assessment in a setting wherein only a set of high-quality samples are accessible in the training phase. Specifically, a feature-embedding-based low-quality representation module is proposed to quantify the quality of OCTA images and then to discriminate between outstanding quality and non-outstanding quality. Within the non-outstanding quality class, to further distinguish gradable images from ungradable ones, we perform dimension reduction and clustering of multi-scale image features extracted by the trained OCTA quality representation network. Extensive experiments are conducted on one publicly accessible dataset sOCTA-3*3-10k, with superiority of our proposed framework being successfully established.

Submitted 21 February, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

Comments: submitted to ISBI2023

arXiv:2212.05566 [pdf, other]

YoloCurvSeg: You Only Label One Noisy Skeleton for Vessel-style Curvilinear Structure Segmentation

Authors: Li Lin, Linkai Peng, Huaqing He, Pujin Cheng, Jiewei Wu, Kenneth K. Y. Wong, Xiaoying Tang

Abstract: Weakly-supervised learning (WSL) has been proposed to alleviate the conflict between data annotation cost and model performance through employing sparsely-grained (i.e., point-, box-, scribble-wise) supervision and has shown promising performance, particularly in the image segmentation field. However, it is still a very challenging task due to the limited supervision, especially when only a small number of labeled samples are available. Additionally, almost all existing WSL segmentation methods are designed for star-convex structures which are very different from curvilinear structures such as vessels and nerves. In this paper, we propose a novel sparsely annotated segmentation framework for curvilinear structures, named YoloCurvSeg. A very essential component of YoloCurvSeg is image synthesis. Specifically, a background generator delivers image backgrounds that closely match the real distributions through inpainting dilated skeletons. The extracted backgrounds are then combined with randomly emulated curves generated by a Space Colonization Algorithm-based foreground generator and through a multilayer patch-wise contrastive learning synthesizer. In this way, a synthetic dataset with both images and curve segmentation labels is obtained, at the cost of only one or a few noisy skeleton annotations. Finally, a segmenter is trained with the generated dataset and possibly an unlabeled dataset. The proposed YoloCurvSeg is evaluated on four publicly available datasets (OCTA500, CORN, DRIVE and CHASEDB1) and the results show that YoloCurvSeg outperforms state-of-the-art WSL segmentation methods by large margins. With only one noisy skeleton annotation (respectively 0.14%, 0.03%, 1.40%, and 0.65% of the full annotation), YoloCurvSeg achieves more than 97% of the fully-supervised performance on each dataset. Code and datasets will be released at https://github.com/llmir/YoloCurvSeg.

Submitted 18 August, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

Comments: 20 pages, 15 figures, MEDIA accepted

arXiv:2207.13249 [pdf, other]

doi 10.1109/TMI.2022.3193146

AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation

Authors: Junyan Lyu, Yiqi Zhang, Yijin Huang, Li Lin, Pujin Cheng, Xiaoying Tang

Abstract: Convolutional neural networks have been widely applied to medical image segmentation and have achieved considerable performance. However, the performance may be significantly affected by the domain gap between training data (source domain) and testing data (target domain). To address this issue, we propose a data manipulation based domain generalization method, called Automated Augmentation for Domain Generalization (AADG). Our AADG framework can effectively sample data augmentation policies that generate novel domains and diversify the training set from an appropriate search space. Specifically, we introduce a novel proxy task maximizing the diversity among multiple augmented novel domains as measured by the Sinkhorn distance in a unit sphere space, making automated augmentation tractable. Adversarial training and deep reinforcement learning are employed to efficiently search the objectives. Quantitative and qualitative experiments on 11 publicly-accessible fundus image datasets (four for retinal vessel segmentation, four for optic disc and cup (OD/OC) segmentation and three for retinal lesion segmentation) are comprehensively performed. Two OCTA datasets for retinal vasculature segmentation are further involved to validate cross-modality generalization. Our proposed AADG exhibits state-of-the-art generalization performance and outperforms existing approaches by considerable margins on retinal vessel, OD/OC and lesion segmentation tasks. The learned policies are empirically validated to be model-agnostic and can transfer well to other models. The source code is available at https://github.com/CRazorback/AADG.

Submitted 26 July, 2022; originally announced July 2022.

Comments: Accepted by IEEE Transactions on Medical Imaging (TMI)

arXiv:2204.07360 [pdf, other]

Spatio-Temporal-Frequency Graph Attention Convolutional Network for Aircraft Recognition Based on Heterogeneous Radar Network

Authors: Han Meng, Yuexing Peng, Wenbo Wang, Peng Cheng, Yonghui Li, Wei Xiang

Abstract: This paper proposes a knowledge-and-data-driven graph neural network-based collaboration learning model for reliable aircraft recognition in a heterogeneous radar network. The aircraft recognizability analysis shows that: (1) the semantic feature of an aircraft is motion patterns driven by the kinetic characteristics, and (2) the grammatical features contained in the radar cross-section (RCS) signals present spatial-temporal-frequency (STF) diversity decided by both the electromagnetic radiation shape and motion pattern of the aircraft. Then an STF graph attention convolutional network (STFGACN) is developed to distill semantic features from the RCS signals received by the heterogeneous radar network. Extensive experimental results verify that the STFGACN outperforms the baseline methods in terms of detection accuracy, and ablation experiments further show that expanding the information dimension brings considerable benefits for robust performance in the low signal-to-noise ratio region.

Submitted 15 April, 2022; originally announced April 2022.

Comments: 11 pages, 17 figures

arXiv:2203.06920 [pdf, other]

DS3-Net: Difficulty-perceived Common-to-T1ce Semi-Supervised Multimodal MRI Synthesis Network

Authors: Ziqi Huang, Li Lin, Pujin Cheng, Kai Pan, Xiaoying Tang

Abstract: Contrast-enhanced T1 (T1ce) is one of the most essential magnetic resonance imaging (MRI) modalities for diagnosing and analyzing brain tumors, especially gliomas. In clinical practice, common MRI modalities such as T1, T2, and fluid attenuation inversion recovery are relatively easy to access while T1ce is more challenging considering the additional cost and potential risk of allergies to the contrast agent. Therefore, it is of great clinical necessity to develop a method to synthesize T1ce from other common modalities. Current paired image translation methods typically have the issue of requiring a large amount of paired data and do not focus on specific regions of interest, e.g., the tumor region, in the synthesization process. To address these issues, we propose a Difficulty-perceived common-to-T1ce Semi-Supervised multimodal MRI Synthesis network (DS3-Net), involving both paired and unpaired data together with dual-level knowledge distillation. DS3-Net predicts a difficulty map to progressively promote the synthesis task. Specifically, a pixelwise constraint and a patchwise contrastive constraint are guided by the predicted difficulty map. Through extensive experiments on the publicly available BraTS2020 dataset, DS3-Net outperforms its supervised counterpart in each respect. Furthermore, with only 5% paired data, the proposed DS3-Net achieves competitive performance with state-of-the-art image translation methods utilizing 100% paired data, delivering an average SSIM of 0.8947 and an average PSNR of 23.60.

Submitted 14 March, 2022; originally announced March 2022.

Comments: 10 pages, 2 figures

arXiv:2203.04586 [pdf, other]

Multi-modal Brain Tumor Segmentation via Missing Modality Synthesis and Modality-level Attention Fusion

Authors: Ziqi Huang, Li Lin, Pujin Cheng, Linkai Peng, Xiaoying Tang

Abstract: Multi-modal magnetic resonance (MR) imaging provides great potential for diagnosing and analyzing brain gliomas. In clinical scenarios, common MR sequences such as T1, T2 and FLAIR can be obtained simultaneously in a single scanning process. However, acquiring contrast enhanced modalities such as T1ce requires additional time, cost, and injection of contrast agent. As such, it is clinically meaningful to develop a method to synthesize unavailable modalities which can also be used as additional inputs to downstream tasks (e.g., brain tumor segmentation) for performance enhancing. In this work, we propose an end-to-end framework named Modality-Level Attention Fusion Network (MAF-Net), wherein we innovatively conduct patchwise contrastive learning for extracting multi-modal latent features and dynamically assigning attention weights to fuse different modalities. Through extensive experiments on BraTS2020, our proposed MAF-Net is found to yield superior T1ce synthesis performance (SSIM of 0.8879 and PSNR of 22.78) and accurate brain tumor segmentation (mean Dice scores of 67.9%, 41.8% and 88.0% on segmenting the tumor core, enhancing tumor and whole tumor).

Submitted 9 March, 2022; originally announced March 2022.

Comments: 6 pages, 5 figures, submitted to ICPR 2022

arXiv:2203.03631 [pdf, other]

Student Becomes Decathlon Master in Retinal Vessel Segmentation via Dual-teacher Multi-target Domain Adaptation

Authors: Linkai Peng, Li Lin, Pujin Cheng, Huaqing He, Xiaoying Tang

Abstract: Unsupervised domain adaptation has been proposed recently to tackle the so-called domain shift between training data and test data with different distributions. However, most of them only focus on single-target domain adaptation and cannot be applied to the scenario with multiple target domains. In this paper, we propose RVms, a novel unsupervised multi-target domain adaptation approach to segment retinal vessels (RVs) from multimodal and multicenter retinal images. RVms mainly consists of a style augmentation and transfer (SAT) module and a dual-teacher knowledge distillation (DTKD) module. SAT augments and clusters images into source-similar domains and source-dissimilar domains via Bezier and Fourier transformations. DTKD utilizes the augmented and transformed data to train two teachers, one for source-similar domains and the other for source-dissimilar domains. Afterwards, knowledge distillation is performed to iteratively distill different domain knowledge from teachers to a generic student. The local relative intensity transformation is employed to characterize RVs in a domain invariant manner and promote the generalizability of teachers and student models. Moreover, we construct a new multimodal and multicenter vascular segmentation dataset from existing publicly-available datasets, which can be used to benchmark various domain adaptation and domain generalization methods. Through extensive experiments, RVms is found to be very close to the target-trained Oracle in terms of segmenting the RVs, largely outperforming other state-of-the-art methods.

Submitted 11 October, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

Comments: To be published in MICCAI-MLMI 2022

arXiv:2203.00951 [pdf, other]

Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis

Authors: Pengyu Cheng, Zhenhua Ling

Abstract: In this paper, we propose a method of speaker adaption with intuitive prosodic features for statistical parametric speech synthesis. The intuitive prosodic features employed in this method include pitch, pitch range, speech rate and energy considering that they are directly related with the overall prosodic characteristics of different speakers. The intuitive prosodic features are extracted at utterance-level or speaker-level, and are further integrated into the existing speaker-encoding-based and speaker-embedding-based adaptation frameworks respectively. The acoustic models are sequence-to-sequence ones based on Tacotron2. Intuitive prosodic features are concatenated with text encoder outputs and speaker vectors for decoding acoustic features. Experimental results have demonstrated that our proposed methods can achieve better objective and subjective performance than the baseline methods without intuitive prosodic features. Besides, the proposed speaker adaption method with utterance-level prosodic features has achieved the best similarity of synthetic speech among all compared methods.

Submitted 2 March, 2022; originally announced March 2022.

Comments: Accepted by ICDSP2022

arXiv:2201.04812 [pdf, other]

Unsupervised Domain Adaptation for Cross-Modality Retinal Vessel Segmentation via Disentangling Representation Style Transfer and Collaborative Consistency Learning

Authors: Linkai Peng, Li Lin, Pujin Cheng, Ziqi Huang, Xiaoying Tang

Abstract: Various deep learning models have been developed to segment anatomical structures from medical images, but they typically have poor performance when tested on another target domain with different data distribution. Recently, unsupervised domain adaptation methods have been proposed to alleviate this so-called domain shift issue, but most of them are designed for scenarios with relatively small domain shifts and are likely to fail when encountering a large domain gap. In this paper, we propose DCDA, a novel cross-modality unsupervised domain adaptation framework for tasks with large domain shifts, e.g., segmenting retinal vessels from OCTA and OCT images. DCDA mainly consists of a disentangling representation style transfer (DRST) module and a collaborative consistency learning (CCL) module. DRST decomposes images into content components and style codes and performs style transfer and image reconstruction. CCL contains two segmentation models, one for source domain and the other for target domain. The two models use labeled data (together with the corresponding transferred images) for supervised learning and perform collaborative consistency learning on unlabeled data. Each model focuses on the corresponding single domain and aims to yield an expertized domain-specific segmentation model. Through extensive experiments on retinal vessel segmentation, our framework achieves Dice scores close to target-trained oracle both from OCTA to OCT and from OCT to OCTA, significantly outperforming other state-of-the-art methods.

Submitted 20 January, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

Comments: To be published in ISBI 2022
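
The abstract above reports its results as Dice scores. As a reading aid, the following is a minimal sketch of the standard Dice coefficient for two binary masks. It is not taken from the paper's code; the function name `dice_score`, the flat 0/1 list representation, and the convention of returning 1.0 when both masks are empty are illustrative assumptions.

```python
def dice_score(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values (an assumption made to keep
    the sketch dependency-free; real pipelines typically use arrays).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks are empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(round(dice_score(predicted, reference), 4))
```

A score of 1.0 means the predicted and reference vessel masks coincide, and 0.0 means they share no foreground pixels.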

arXiv:2110.14160 [pdf, other]

Identifying the key components in ResNet-50 for diabetic retinopathy grading from fundus images: a systematic investigation

Authors: Yijin Huang, Li Lin, Pujin Cheng, Junyan Lyu, Roger Tam, Xiaoying Tang

Abstract: Although deep learning based diabetic retinopathy (DR) classification methods typically benefit from well-designed architectures of convolutional neural networks, the training setting also has a non-negligible impact on the prediction performance. The training setting includes various interdependent components, such as objective function, data sampling strategy and data augmentation approach. To identify the key components in a standard deep learning framework (ResNet-50) for DR grading, we systematically analyze the impact of several major components. Extensive experiments are conducted on a publicly-available dataset EyePACS. We demonstrate that (1) the DR grading framework is sensitive to input resolution, objective function, and composition of data augmentation, (2) using mean square error as the loss function can effectively improve the performance with respect to a task-specific evaluation metric, namely the quadratically-weighted Kappa, (3) utilizing eye pairs boosts the performance of DR grading and (4) using data resampling to address the problem of imbalanced data distribution in EyePACS hurts the performance. Based on these observations and an optimal combination of the investigated components, our framework, without any specialized network design, achieves the state-of-the-art result (0.8631 for Kappa) on the EyePACS test set (a total of 42670 fundus images) with only image-level labels. We also examine the proposed training practices on other fundus datasets and other network architectures to evaluate their generalizability.
Our codes and pre-trained model are available at https://github.com/YijinHuang/pytorch-classification.

Submitted 17 October, 2022; v1 submitted 27 October, 2021; originally announced October 2021.
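
The evaluation metric named in the abstract above, the quadratically-weighted Kappa, can be sketched as follows. This is a generic textbook formulation rather than the authors' implementation. The function name, the integer grade encoding 0..num_classes-1, and the five-grade example are assumptions for illustration.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic disagreement weights.

    Grades are integers in 0..num_classes-1 (assumed encoding);
    num_classes must be at least 2.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    k, n = num_classes, len(y_true)

    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            numerator += weight * observed[i][j]
            denominator += weight * hist_true[i] * hist_pred[j] / n

    if denominator == 0.0:
        # All samples fall into one grade for both raters; treated as agreement.
        return 1.0
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    # Hypothetical five-grade example: one prediction is off by a single grade.
    print(quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 3], 5))
```

Because the weight grows with the squared distance between grades, predicting a grade far from the true one is penalized more heavily than a near miss. This ordinal structure is one plausible reason a regression-style loss such as mean square error aligns well with the metric.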

arXiv:2107.08274 [pdf, other]

Lesion-based Contrastive Learning for Diabetic Retinopathy Grading from Fundus Images

Authors: Yijin Huang, Li Lin, Pujin Cheng, Junyan Lyu, Xiaoying Tang

Abstract: Manually annotating medical images is extremely expensive, especially for large-scale datasets. Self-supervised contrastive learning has been explored to learn feature representations from unlabeled images. However, unlike natural images, the application of contrastive learning to medical images is relatively limited. In this work, we propose a self-supervised framework, namely lesion-based contrastive learning for automated diabetic retinopathy (DR) grading. Instead of taking entire images as the input in the common contrastive learning scheme, lesion patches are employed to encourage the feature extractor to learn representations that are highly discriminative for DR grading. We also investigate different data augmentation operations in defining our contrastive prediction task. Extensive experiments are conducted on the publicly-accessible dataset EyePACS, demonstrating that our proposed framework performs outstandingly on DR grading in terms of both linear evaluation and transfer capacity evaluation.

Submitted 17 July, 2021; originally announced July 2021.

Comments: 10 pages, 2 figures, MICCAI2021 early accepted

arXiv:2107.04823 [pdf, other]

BSDA-Net: A Boundary Shape and Distance Aware Joint Learning Framework for Segmenting and Classifying OCTA Images

Authors: Li Lin, Zhonghua Wang, Jiewei Wu, Yijin Huang, Junyan Lyu, Pujin Cheng, Jiong Wu, Xiaoying Tang

Abstract: Optical coherence tomography angiography (OCTA) is a novel non-invasive imaging technique that allows visualizations of vasculature and foveal avascular zone (FAZ) across retinal layers. Clinical research suggests that the morphology and contour irregularity of FAZ are important biomarkers of various ocular pathologies. Therefore, precise segmentation of FAZ is of great clinical interest. Also, there is no existing research reporting that FAZ features can improve the performance of deep diagnostic classification networks. In this paper, we propose a novel multi-level boundary shape and distance aware joint learning framework, named BSDA-Net, for FAZ segmentation and diagnostic classification from OCTA images. Two auxiliary branches, namely boundary heatmap regression and signed distance map reconstruction branches, are constructed in addition to the segmentation branch to improve the segmentation performance, resulting in more accurate FAZ contours and fewer outliers. Moreover, both low-level and high-level features from the aforementioned three branches, including shape, size, boundary, and signed directional distance map of FAZ, are fused hierarchically with features from the diagnostic classifier. Through extensive experiments, the proposed BSDA-Net is found to yield state-of-the-art segmentation and classification results on the OCTA-500, OCTAGON, and FAZID datasets.

Submitted 13 July, 2021; v1 submitted 10 July, 2021; originally announced July 2021.

Comments: 12 pages, 4 figures, MICCAI2021 [Student Travel Award]

arXiv:2106.12511 [pdf]

doi 10.1001/jamacardio.2021.6059

High-Throughput Precision Phenotyping of Left Ventricular Hypertrophy with Cardiovascular Deep Learning

Authors: Grant Duffy, Paul P Cheng, Neal Yuan, Bryan He, Alan C. Kwan, Matthew J. Shun-Shin, Kevin M. Alexander, Joseph Ebinger, Matthew P. Lungren, Florian Rader, David H. Liang, Ingela Schnittger, Euan A. Ashley, James Y. Zou, Jignesh Patel, Ronald Witteles, Susan Cheng, David Ouyang

Abstract: Left ventricular hypertrophy (LVH) results from chronic remodeling caused by a broad range of systemic and cardiovascular disease including hypertension, aortic stenosis, hypertrophic cardiomyopathy, and cardiac amyloidosis. Early detection and characterization of LVH can significantly impact patient care but is limited by under-recognition of hypertrophy, measurement error and variability, and difficulty differentiating etiologies of LVH. To overcome this challenge, we present EchoNet-LVH - a deep learning workflow that automatically quantifies ventricular hypertrophy with precision equal to human experts and predicts etiology of LVH. Trained on 28,201 echocardiogram videos, our model accurately measures intraventricular wall thickness (mean absolute error [MAE] 1.4mm, 95% CI 1.2-1.5mm), left ventricular diameter (MAE 2.4mm, 95% CI 2.2-2.6mm), and posterior wall thickness (MAE 1.2mm, 95% CI 1.1-1.3mm) and classifies cardiac amyloidosis (area under the curve of 0.83) and hypertrophic cardiomyopathy (AUC 0.98) from other etiologies of LVH. In external datasets from independent domestic and international healthcare systems, EchoNet-LVH accurately quantified ventricular parameters (R2 of 0.96 and 0.90 respectively) and detected cardiac amyloidosis (AUC 0.79) and hypertrophic cardiomyopathy (AUC 0.89) on the domestic external validation site. Leveraging measurements across multiple heart beats, our model can more accurately identify subtle changes in LV geometry and its causal etiologies.
Compared to human experts, EchoNet-LVH is fully automated, allowing for reproducible, precise measurements, and lays the foundation for precision diagnosis of cardiac hypertrophy. As a resource to promote further innovation, we also make publicly available a large dataset of 23,212 annotated echocardiogram videos.

Submitted 23 June, 2021; originally announced June 2021.

arXiv:2104.10737 [pdf, other]

Feedforward-Feedback wake redirection for wind farm control

Authors: Steffen Raach, Bart Doekemeijer, Sjoerd Boersma, Jan-Willem van Wingerden, Po Wen Cheng

Abstract: This work presents a combined feedforward-feedback wake redirection framework for wind farm control. The FLORIS wake model, a control-oriented steady-state wake model, is used to calculate optimal yaw angles for a given wind farm layout and atmospheric condition. The optimal yaw angles, which maximize the total power output, are applied to the wind farm. Further, the lidar-based closed-loop wake redirection concept is used to realize a local feedback on turbine level. The wake center is estimated from lidar measurements 3 D (three rotor diameters) downwind of the wind turbines. The dynamical feedback controllers support the feedforward controller, reject disturbances, and adapt to model uncertainties. Altogether, the total framework is presented and applied to a nine turbine wind farm test case. In a high fidelity simulation study the concept shows promising results and an increase in total energy production compared to the baseline case and the feedforward-only case.

Submitted 21 April, 2021; originally announced April 2021.

arXiv:2103.09420 [pdf, other]

Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning

Authors: Siyang Yuan, Pengyu Cheng, Ruiyi Zhang, Weituo Hao, Zhe Gan, Lawrence Carin

Abstract: Voice style transfer, also called voice conversion, seeks to modify one speaker's voice to generate speech as if it came from another (target) speaker. Previous works have made progress on voice conversion with parallel training data and pre-known speakers. However, zero-shot voice style transfer, which learns from non-parallel data and generates voices for previously unseen speakers, remains a challenging problem. We propose a novel zero-shot voice transfer method via disentangled representation learning. The proposed method first encodes speaker-related style and voice content of each input voice into separated low-dimensional embedding spaces, and then transfers to a new voice by combining the source content embedding and target style embedding through a decoder. With information-theoretic guidance, the style and content embedding spaces are representative and (ideally) independent of each other. On real-world VCTK datasets, our method outperforms other baselines and obtains state-of-the-art results in terms of transfer accuracy and voice naturalness for voice style transfer experiments under both many-to-many and zero-shot setups.

Submitted 16 March, 2021; originally announced March 2021.

Comments: To appear in ICLR 2021

arXiv:2007.13495 [pdf, other]

Deep Multi-Task Learning for Cooperative NOMA: System Design and Principles

Authors: Yuxin Lu, Peng Cheng, Zhuo Chen, Wai Ho Mow, Yonghui Li, Branka Vucetic

Abstract: Envisioned as a promising component of the future wireless Internet-of-Things (IoT) networks, the non-orthogonal multiple access (NOMA) technique can support massive connectivity with a significantly increased spectral efficiency. Cooperative NOMA is able to further improve the communication reliability of users under poor channel conditions. However, the conventional system design suffers from several inherent limitations and is not optimized from the bit error rate (BER) perspective. In this paper, we develop a novel deep cooperative NOMA scheme, drawing upon the recent advances in deep learning (DL). We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner. On this basis, we construct multiple loss functions to quantify the BER performance and propose a novel multi-task oriented two-stage training method to solve the end-to-end training problem in a self-supervised manner. The learning mechanism of each DNN module is then analyzed based on information theory, offering insights into the proposed DNN architecture and its corresponding training method. We also adapt the proposed scheme to handle the power allocation (PA) mismatch between training and inference and incorporate it with channel coding to combat signal deterioration. Simulation results verify its advantages over orthogonal multiple access (OMA) and the conventional cooperative NOMA scheme in various scenarios.

Submitted 27 July, 2020; originally announced July 2020.

arXiv:2003.00866 [pdf, other]

doi 10.1109/COMST.2020.3024783

Enabling AI in Future Wireless Networks: A Data Life Cycle Perspective

Authors: Dinh C. Nguyen, Peng Cheng, Ming Ding, David Lopez-Perez, Pubudu N. Pathirana, Jun Li, Aruna Seneviratne, Yonghui Li, H. Vincent Poor

Abstract: Recent years have seen rapid deployment of mobile computing and Internet of Things (IoT) networks, which can be mostly attributed to the increasing communication and sensing capabilities of wireless systems. Big data analysis, pervasive computing, and eventually artificial intelligence (AI) are envisaged to be deployed on top of the IoT and create a new world characterized by data-driven AI. In this context, a novel paradigm of merging AI and wireless communications, called Wireless AI, which pushes AI frontiers to the network edge, is widely regarded as a key enabler for future intelligent network evolution. To this end, we present a comprehensive survey of the latest studies in wireless AI from the data-driven perspective. Specifically, we first propose a novel Wireless AI architecture that covers five key data-driven AI themes in wireless networks, including Sensing AI, Network Device AI, Access AI, User Device AI and Data-provenance AI. Then, for each data-driven AI theme, we present an overview of the use of AI approaches to solve the emerging data-related problems and show how AI can empower wireless network functionalities. In particular, compared to other related survey papers, we provide an in-depth discussion of the Wireless AI applications in various data-driven domains wherein AI proves extremely useful for wireless network design and optimization. Finally, research challenges and future visions are also discussed to spur further research in this promising area.

Submitted 27 April, 2021; v1 submitted 24 February, 2020; originally announced March 2020.

Comments: Accepted at the IEEE Communications Surveys & Tutorials, 42 pages

arXiv:2001.01984 [pdf, other]

False Data Injection Attacks and the Distributed Countermeasure in DC Microgrids

Authors: Mengxiang Liu, Peng Cheng, Chengcheng Zhao, Ruilong Deng, Wenhai Wang, Jiming Chen

Abstract: In this paper, we consider a hierarchical control based DC microgrid (DCmG) equipped with unknown input observer (UIO) based detectors, where the potential false data injection (FDI) attacks and the distributed countermeasure are investigated. First, we find that the vulnerability of the UIO-based detector originates from the lack of knowledge of the true unknown inputs. Zero trace stealthy (ZTS) attacks can be launched by secretly faking the unknown inputs, under which the detection residual will not be altered, and the impact on the DCmG in terms of voltage balancing and current sharing is theoretically analyzed. Then, to mitigate the ZTS attack, we propose an automatic and timely countermeasure based on the average point of common coupling (PCC) voltage obtained from the dynamic average consensus (DAC) estimator. The integrity of the communicated data utilized in DAC estimators is guaranteed via UIO-based detectors, where the DAC parameters are perturbed in a fixed period to be concealed from attackers. Finally, the detection and mitigation performance of the proposed countermeasure is rigorously investigated, and extensive simulations are conducted in Simulink/PLECS to validate the theoretical results.

Submitted 28 July, 2021; v1 submitted 7 January, 2020; originally announced January 2020.

arXiv:2001.00207 [pdf, other]

Spectrum Intelligent Radio: Technology, Development, and Future Trends

Authors: Peng Cheng, Zhuo Chen, Ming Ding, Yonghui Li, Branka Vucetic, Dusit Niyato

Abstract: The advent of Industry 4.0 with massive connectivity places significant strains on the current spectrum resources, and challenges the industry and regulators to respond promptly with new disruptive spectrum management strategies. The current radio development, with certain elements of intelligence, is nowhere near showing an agile response to the complex radio environments. Following the line of intelligence, we propose to classify spectrum intelligent radio into three streams: classical signal processing, machine learning (ML), and contextual adaptation. We focus on the ML approach, and propose a new intelligent radio architecture with three hierarchical forms: perception, understanding, and reasoning. The proposed perception method achieves fully blind multi-level spectrum sensing. The understanding method accurately predicts the primary users' coverage across a large area, and the reasoning method performs a near-optimal idle channel selection. Opportunities, challenges, and future visions are also discussed for the realization of a fully intelligent radio.

Submitted 1 January, 2020; originally announced January 2020.

Comments: Accepted by IEEE Communications Magazine

Journal ref: IEEE Communications Magazine, 2020

arXiv:1912.11372 [pdf, ps, other]

Analysis of Moving Target Defense Against False Data Injection Attacks on Power Grid

Authors: Zhenyong Zhang, Ruilong Deng, David K. Y. Yau, Peng Cheng, Jiming Chen

Abstract: Recent studies have considered thwarting false data injection (FDI) attacks against state estimation in power grids by proactively perturbing branch susceptances. This approach is known as moving target defense (MTD). However, despite the deployment of MTD, it is still possible for the attacker to launch stealthy FDI attacks generated with former branch susceptances. In this paper, we prove that an MTD has the capability to thwart all FDI attacks constructed with former branch susceptances only if (i) the number of branches $l$ in the power system is not less than twice that of the system states $n$ (i.e., $l \geq 2n$, where $n + 1$ is the number of buses); (ii) the susceptances of more than $n$ branches, which cover all buses, are perturbed. Moreover, we prove that the state variable of a bus that is connected by only a single branch (whether or not it is perturbed) can always be modified by the attacker. Nevertheless, in order to reduce the attack opportunities of potential attackers, we first exploit the impact of the susceptance perturbation magnitude on the dimension of the stealthy attack space, in which the attack vector is constructed with former branch susceptances. Then, we propose that, by perturbing an appropriate set of branches, we can minimize the dimension of the stealthy attack space and maximize the number of covered buses. Besides, we consider the increasing operation cost caused by the activation of MTD. Finally, we conduct extensive simulations to illustrate our findings with IEEE standard test power systems.

Submitted 24 December, 2019; originally announced December 2019.
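
The two conditions stated in the abstract above, $l \geq 2n$ and more than $n$ perturbed branches covering all buses, are simple enough to check mechanically. The sketch below is only an illustration of that check and is not code from the paper. The function name, the edge-list representation of the grid, and the 0-based bus numbering are assumptions. Since the abstract states the conditions with "only if", passing the check does not by itself show that the defense succeeds.

```python
def meets_mtd_conditions(num_buses, branches, perturbed):
    """Check the two conditions quoted in the abstract.

    num_buses : number of buses, i.e. n + 1
    branches  : list of (bus_a, bus_b) pairs, buses numbered 0..num_buses-1 (assumed)
    perturbed : indices into `branches` whose susceptances are perturbed
    """
    n = num_buses - 1          # number of system states
    l = len(branches)          # number of branches
    perturbed = set(perturbed)

    # Condition (i): l >= 2n.
    enough_branches = l >= 2 * n

    # Condition (ii), first part: more than n branches are perturbed.
    enough_perturbed = len(perturbed) > n

    # Condition (ii), second part: the perturbed branches touch every bus.
    covered = set()
    for index in perturbed:
        bus_a, bus_b = branches[index]
        covered.update((bus_a, bus_b))
    covers_all_buses = covered == set(range(num_buses))

    return enough_branches and enough_perturbed and covers_all_buses


if __name__ == "__main__":
    # Hypothetical 3-bus system (n = 2) with a double circuit between buses 0 and 1.
    grid = [(0, 1), (1, 2), (0, 2), (0, 1)]
    print(meets_mtd_conditions(3, grid, perturbed=[0, 1, 2]))
```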

arXiv:1912.09859 [pdf, ps, other]

Lightweight and Unobtrusive Data Obfuscation at IoT Edge for Remote Inference

Authors: Dixing Xu, Mengyao Zheng, Linshan Jiang, Chaojie Gu, Rui Tan, Peng Cheng

Abstract: Executing deep neural networks for inference on the server-class or cloud backend based on data generated at the edge of Internet of Things is desirable due primarily to the limited compute power of edge devices and the need to protect the confidentiality of the inference neural networks. However, such a remote inference scheme incurs concerns regarding the privacy of the inference data transmitted by the edge devices to the curious backend. This paper presents a lightweight and unobtrusive approach to obfuscate the inference data at the edge devices. It is lightweight in that the edge device only needs to execute a small-scale neural network; it is unobtrusive in that the edge device does not need to indicate whether obfuscation is applied. Extensive evaluation by three case studies of free spoken digit recognition, handwritten digit recognition, and American sign language recognition shows that our approach effectively protects the confidentiality of the raw forms of the inference data while effectively preserving the backend's inference accuracy.

Submitted 25 March, 2020; v1 submitted 20 December, 2019; originally announced December 2019.

Comments: This paper has been accepted by IEEE Internet of Things Journal, Special Issue on Artificial Intelligence Powered Edge Computing for Internet of Things

arXiv:1908.01059 [pdf, other]

doi 10.1109/TSP.2020.3009007

Privacy-preserving Distributed Machine Learning via Local Randomization and ADMM Perturbation

Authors: Xin Wang, Hideaki Ishii, Linkang Du, Peng Cheng, Jiming Chen

Abstract: With the proliferation of training data, distributed machine learning (DML) is becoming more competent for large-scale learning tasks. However, privacy concerns have to be given priority in DML, since training data may contain sensitive information of users. In this paper, we propose a privacy-preserving ADMM-based DML framework with two novel features: First, we remove the assumption commonly made in the literature that the users trust the server collecting their data. Second, the framework provides heterogeneous privacy for users depending on the data's sensitivity levels and the servers' trust degrees. The challenging issue is to keep the accumulation of privacy losses over ADMM iterations minimal. In the proposed framework, a local randomization approach, which is differentially private, is adopted to provide users with a self-controlled privacy guarantee for the most sensitive information. Further, the ADMM algorithm is perturbed through a combined noise-adding method, which simultaneously preserves privacy for users' less sensitive information and strengthens the privacy protection of the most sensitive information. We provide detailed analyses on the performance of the trained model according to its generalization error. Finally, we conduct extensive experiments using real-world datasets to validate the theoretical results and evaluate the classification performance of the proposed framework.

Submitted 9 September, 2019; v1 submitted 30 July, 2019; originally announced August 2019.

arXiv:1907.09949 [pdf, other]

doi 10.1109/TSP.2019.2932866

A Learning-Based Two-Stage Spectrum Sharing Strategy with Multiple Primary Transmit Power Levels

Authors: Rui Zhang, Peng Cheng, Zhuo Chen, Yonghui Li, Branka Vucetic

Abstract: Multi-parameter cognition in a cognitive radio network (CRN) provides a more thorough understanding of the radio environments, and could potentially lead to far more intelligent and efficient spectrum usage for a secondary user. In this paper, we investigate the multi-parameter cognition problem for a CRN where the primary transmitter (PT) radiates multiple transmit power levels, and propose a learning-based two-stage spectrum sharing strategy. We first propose a data-driven/machine learning based multi-level spectrum sensing scheme, including the spectrum learning (Stage I) and prediction (the first part in Stage II). This fully blind sensing scheme does not require any prior knowledge of the PT power characteristics. Then, based on a novel normalized power level alignment metric, we propose two prediction-transmission structures, namely periodic and non-periodic, for spectrum access (the second part in Stage II), which enable the secondary transmitter (ST) to closely follow the PT power level variation. The periodic structure features a fixed prediction interval, while the non-periodic one dynamically determines the interval with a proposed reinforcement learning algorithm to further improve the alignment metric. Finally, we extend the prediction-transmission structure to an online scenario, where the number of PT power levels might change as a consequence of PT adapting to the environment fluctuation or quality of service variation. The simulation results demonstrate the effectiveness of the proposed strategy in various scenarios.

Submitted 21 July, 2019; originally announced July 2019.

Comments: 46 pages, 10 figures, accepted by IEEE Transactions on Signal Processing 2019

arXiv:1808.10852 [pdf, other]

doi 10.1109/TENCON.2018.8650546

Towards Asynchronous Motor Imagery-Based Brain-Computer Interfaces: a joint training scheme using deep learning

Authors: Patcharin Cheng, Phairot Autthasan, Boriwat Pijarana, Ekapol Chuangsuwanich, Theerawit Wilaiprasitporn

Abstract: In this paper, the deep learning (DL) approach is applied to a joint training scheme for asynchronous motor imagery-based Brain-Computer Interface (BCI). The proposed DL approach is a cascade of one-dimensional convolutional neural networks and fully-connected neural networks (CNN-FC). The focus is mainly on three types of brain responses: non-imagery EEG (background EEG), pure imagery EEG, and EEG during the transitional period between background EEG and pure imagery (transitional imagery). The study of transitional imagery signals should provide greater insight into real-world scenarios. It may be inferred that pure imagery and transitional EEG are high and low power EEG imagery, respectively. Moreover, the results from the CNN-FC are compared to the conventional approach for motor imagery-BCI, namely the common spatial pattern (CSP) for feature extraction and support vector machine (SVM) for classification (CSP-SVM). Under a joint training scheme, pure and transitional imagery are treated as the same class, while background EEG is another class. Ten-fold cross-validation is used to evaluate whether the joint training scheme significantly improves performance on the task of classifying pure and transitional imagery signals from background EEG. Using just a few electrode channels ($C_{z}$, $C_{3}$ and $C_{4}$), mean accuracy reaches 71.52% and 70.27% for CNN-FC and CSP-SVM, respectively. On the other hand, mean accuracy without the joint training scheme reaches only 62.68% and 52.41% for CNN-FC and CSP-SVM, respectively.

Submitted 31 August, 2018; originally announced August 2018.

Journal ref: TENCON 2018 - 2018 IEEE Region 10 Conference

arXiv:1806.09250 [pdf]

doi 10.1109/TNS.2019.2900480

Electronics of Time-of-flight Measurement for Back-n at CSNS

Authors: T. Yu, P. Cao, X. Y. Ji, L. K. Xie, X. R. Huang, Q. An, H. Y. Bai, J. Bao, Y. H. Chen, P. J. Cheng, Z. Q. Cui, R. R. Fan, C. Q. Feng, M. H. Gu, Z. J. Han, G. Z. He, Y. C. He, Y. F. He, H. X. Huang, W. L. Huang, X. L. Ji, H. Y. Jiang, W. Jiang, H. Y. Jing, L. Kang, et al. (46 additional authors not shown)

Abstract: Back-n is a white neutron experimental facility at China Spallation Neutron Source (CSNS). The time structure of the primary proton beam makes the TOF (time-of-flight) method well suited to neutron energy measurement. We implement the electronics of TOF measurement on the general-purpose readout electronics designed for all of the seven detectors in Back-n. The electronics is based on the PXIe (Peripheral Component Interconnect Express eXtensions for Instrumentation) platform, which is composed of FDM (Field Digitizer Modules), TCM (Trigger and Clock Module), and SCM (Signal Conditioning Module). The T0 signal, synchronous to the CSNS accelerator, represents the neutron emission from the target. It is the start time stamp. The trigger and clock module (TCM) receives, synchronizes and distributes the T0 signal to each FDM over the PXIe backplane bus. Meanwhile, detector signals are conditioned and then fed into the FDMs for waveform digitizing. The first sample point of the signal is the stop time stamp. From the start and stop time stamps and the time at which the signal crosses the threshold, the total TOF can be obtained. An FPGA-based (Field Programmable Gate Array) TDC is implemented on the TCM to accurately acquire the time interval between the asynchronous T0 signal and the global synchronous clock phase. There is also an FPGA-based TDC on the FDM to accurately acquire the time interval between T0 arriving at the FDM and the first sample point of the detector signal; the over-threshold time of the signal is obtained offline. This method for TOF measurement is efficient and needs no additional modules. Test results show that the accuracy of TOF is sub-nanosecond and can meet the requirement for Back-n at CSNS.

Submitted 24 June, 2018; originally announced June 2018.

Comments: 4 pages, 13 figures, 21st IEEE Real Time Conference

arXiv:1806.09249 [pdf]

T0 Fan-out for Back-n White Neutron Facility at CSNS

Authors: X. Y. Ji, P. Cao, T. Yu, L. K. Xie, X. R. Huang, Q. An, H. Y. Bai, J. Bao, Y. H. Chen, P. J. Cheng, Z. Q. Cui, R. R. Fan, C. Q. Feng, M. H. Gu, Z. J. Han, G. Z. He, Y. C. He, Y. F. He, H. X. Huang, W. L. Huang, X. L. Ji, H. Y. Jiang, W. Jiang, H. Y. Jing, L. Kang, et al. (46 additional authors not shown)

Abstract: The main physics goal for the Back-n white neutron facility at China Spallation Neutron Source (CSNS) is to measure nuclear data. The energy of neutrons is one of the most important parameters for measuring nuclear data. The time-of-flight (TOF) method is used to obtain the energy of neutrons. The time when proton bunches hit the thick tungsten target is considered as the start point of TOF. The T0 signal, generated from the CSNS accelerator, represents this start time. Besides, the T0 signal is also used as the gate control signal that triggers the readout electronics. The timing precision of T0 therefore directly affects the measurement precision of TOF and controls the running of the readout electronics. In this paper, the T0 fan-out for the Back-n white neutron facility at CSNS is proposed. The T0 signal travelling from the CSNS accelerator is fanned out to the two underground experiment stations over long cables. To guarantee the timing precision, the T0 signal is conditioned to have a good signal edge. Furthermore, signal pre-emphasis and equalization techniques are used to improve signal quality after T0 has been transmitted over long cables of about 100 m in length. Experiments show that the T0 fan-out works well: the T0 signal transmitted over 100 m retains a good time resolution, with a standard deviation of 25 ps. This fully meets the accuracy required for the TOF measurement.

Submitted 24 June, 2018; originally announced June 2018.

Comments: 3 pages, 6 figures, the 21st IEEE Real Time Conference

arXiv:1804.05618 [pdf, ps, other]

Optimal Scheduling of Multiple Sensors over Lossy and Bandwidth Limited Channels

Authors: Shuang Wu, Kemi Ding, Peng Cheng, Ling Shi

Abstract: This work considers the sensor scheduling for multiple dynamic processes. We consider $n$ linear dynamic processes, where the state of each process is measured by a sensor that transmits its local state estimate over a wireless channel to a remote estimator with certain communication costs. In each time step, only a portion of the sensors is allowed to transmit data to the remote estimator, and packets might be lost due to the unreliability of the wireless channels. Our goal is to find a scheduling policy which coordinates the sensors in a centralized manner to minimize the total expected estimation error of the remote estimator and the communication costs. We formulate the problem as a Markov decision process. We develop an algorithm to check whether there exists a deterministic stationary optimal policy. We show the optimality of monotone policies, which saves computational effort in finding an optimal policy and facilitates practical implementation. Nevertheless, obtaining an exact optimal policy still suffers from the curse of dimensionality when the number of processes is large. We further provide an index-based heuristic to avoid brute-force computation. Numerical examples are presented to illustrate our theoretical results.

Submitted 9 January, 2020; v1 submitted 16 April, 2018; originally announced April 2018.

Comments: Correct version

arXiv:1609.06381 [pdf, ps, other]

Consensus-based Privacy-preserving Data Aggregation

Authors: Jianping He, Lin Cai, Peng Cheng, Jianping Pan, Ling Shi

Abstract: Privacy-preserving data aggregation in ad hoc networks is a challenging problem, considering the distributed communication and control requirement, dynamic network topology, unreliable communication links, etc. Different from the widely used cryptographic approaches, in this paper, we address this challenging problem by exploiting the distributed consensus technique. We first propose a secure consensus-based data aggregation (SCDA) algorithm that guarantees an accurate sum aggregation while preserving the privacy of sensitive data. Then, we prove that the proposed algorithm converges accurately and provides $(ε, σ)$-data-privacy, and the mathematical relationship between $ε$ and $σ$ is provided. Extensive simulations have shown that the proposed algorithm has high accuracy and low complexity, and is robust against network dynamics.

Submitted 6 February, 2018; v1 submitted 20 September, 2016; originally announced September 2016.

Comments: 8 pages

arXiv:1609.06368 [pdf, other]

Privacy-preserving Average Consensus: Privacy Analysis and Optimal Algorithm Design

Authors: Jianping He, Lin Cai, Chengcheng Zhao, Peng Cheng, Xinping Guan

Abstract: Privacy-preserving average consensus aims to guarantee the privacy of initial states and asymptotic consensus on the exact average of the initial values. In existing work, it is achieved by adding and subtracting variance-decaying and zero-sum random noises to the consensus process. However, there is a lack of theoretical analysis to quantify the degree of the privacy protection. In this paper, we introduce the maximum disclosure probability that the other nodes can infer one node's initial state within a given small interval to quantify the privacy. We develop a novel privacy definition, named $(ε, δ)$-data-privacy, to depict the relationship between maximum disclosure probability and estimation accuracy. Then, we prove that the general privacy-preserving average consensus (GPAC) provides $(ε, δ)$-data-privacy, and provide the closed-form expression of the relationship between $ε$ and $δ$. Meanwhile, it is shown that the added noise with uniform distribution is optimal in terms of achieving the highest $(ε, δ)$-data-privacy. We also prove that when all information used in the consensus process is available, the privacy will be compromised. Finally, an optimal privacy-preserving average consensus (OPAC) algorithm is proposed to achieve the highest $(ε, δ)$-data-privacy and avoid the privacy compromise. Simulations are conducted to verify the results.

Submitted 9 February, 2017; v1 submitted 20 September, 2016; originally announced September 2016.

Comments: 10 pages

Showing 1–36 of 36 results for author: Cheng, P