-
A Plug-and-Play Untrained Neural Network for Full Waveform Inversion in Reconstructing Sound Speed Images of Ultrasound Computed Tomography
Authors:
Weicheng Yan,
Qiude Zhang,
Yun Wu,
Zhaohui Liu,
Liang Zhou,
Mingyue Ding,
Ming Yuchi,
Wu Qiu
Abstract:
Ultrasound computed tomography (USCT), as an emerging technology, can provide multiple quantitative parametric images of human tissue, such as sound speed and attenuation images, distinguishing it from conventional B-mode (reflection) ultrasound imaging. Full waveform inversion (FWI) is acknowledged as a technique with the greatest potential for reconstructing high-resolution sound speed images in USCT. However, traditional FWI for sound speed image reconstruction suffers from high sensitivity to the initial model caused by its strong non-convex nonlinearity, resulting in poor performance when ultrasound signals are at high frequencies. This limitation significantly restricts the application of FWI in the USCT imaging field. In this paper, we propose an untrained neural network (UNN) that can be integrated into the traditional iteration-based FWI framework as an implicit regularization prior. This integration allows for seamless deployment as a plug-and-play module within existing FWI algorithms or their variants. Notably, the proposed UNN method can be trained in an unsupervised fashion, a vital aspect in medical imaging where ground truth data is often unavailable. Evaluations of the numerical simulation and phantom experiment of the breast demonstrate that the proposed UNN improves the robustness of image reconstruction, reduces image artifacts, and achieves great image contrast. To the best of our knowledge, this study represents the first attempt to propose an implicit UNN for FWI in reconstructing sound speed images for USCT.
Submitted 13 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Practical Explicit-time Stabilization of a Proportional Control System
Authors:
Wen Yan,
Tao Zhao
Abstract:
Proportional control can be realized directly through the amplification of analog signals, and its parameters are easy to tune in digital signal control. However, it is difficult to preset an upper bound on the settling time under proportional control. To address this problem, a novel practical explicit-time control method is proposed. Under a bounded initial condition, this method makes the system error converge to a predefined neighborhood of zero within an explicit time. More specifically, the initial condition set and the conditionally stable set are obtained from a practical explicit-time stabilization theorem. Based on that, a proportional feedback control is designed to achieve practical conditional fixed-time stability.
Submitted 8 June, 2024;
originally announced June 2024.
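
To illustrate the idea in the entry above (proportional feedback that brings the error into a predefined neighborhood of zero within a preset time, given a bounded initial condition), here is a minimal sketch. It is not the authors' theorem or controller: it assumes a toy single-integrator plant x' = u with u = -k x, for which x(t) = x0 exp(-k t). The names `gain_for_explicit_time` and `simulate` are hypothetical.

```python
import math


def gain_for_explicit_time(radius: float, eps: float, settle_time: float) -> float:
    """Smallest gain k such that every |x0| <= radius gives |x(T)| <= eps.

    Assumption: toy plant x' = u with u = -k * x, so x(t) = x0 * exp(-k * t).
    Solving radius * exp(-k * T) = eps for k gives k = ln(radius / eps) / T.
    """
    if not (0 < eps < radius) or settle_time <= 0:
        raise ValueError("need 0 < eps < radius and settle_time > 0")
    return math.log(radius / eps) / settle_time


def simulate(x0: float, k: float, t_end: float, dt: float = 1e-4) -> float:
    """Forward-Euler simulation of x' = -k * x; returns the state at t_end."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * (-k * x)
    return x


# Worst-case initial error 10, target neighborhood 0.1, preset time 2 s.
k = gain_for_explicit_time(radius=10.0, eps=0.1, settle_time=2.0)
x_final = simulate(x0=10.0, k=k, t_end=2.0)
```

In this toy setting the gain depends only on the ratio of the initial-condition bound to the target neighborhood and on the preset time, which is why a bounded initial condition is needed to fix the settling time in advance.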
-
Semi-weakly-supervised neural network training for medical image registration
Authors:
Yiwen Li,
Yunguan Fu,
Iani J. M. B. Gayo,
Qianye Yang,
Zhe Min,
Shaheer U. Saeed,
Wen Yan,
Yipei Wang,
J. Alison Noble,
Mark Emberton,
Matthew J. Clarkson,
Dean C. Barratt,
Victor A. Prisacariu,
Yipeng Hu
Abstract:
For training registration networks, weak supervision from segmented corresponding regions-of-interest (ROIs) has been proven effective for (a) supplementing unsupervised methods, and (b) being used independently in registration tasks in which unsupervised losses are unavailable or ineffective. This correspondence-informing supervision entails an annotation cost that requires significant specialised effort. This paper describes a semi-weakly-supervised registration pipeline that improves the model performance, when only a small corresponding-ROI-labelled dataset is available, by exploiting unlabelled image pairs. We examine two types of augmentation methods, perturbation of network weights and image resampling, such that consistency-based unsupervised losses can be applied on unlabelled data. The novel WarpDDF and RegCut approaches are proposed to allow commutative perturbation between an image pair and the predicted spatial transformation (i.e. respective input and output of registration networks), distinct from existing perturbation methods for classification or segmentation. Experiments using 589 male pelvic MR images, labelled with eight anatomical ROIs, show the improvement in registration performance and the ablated contributions from the individual strategies. Furthermore, this study attempts to construct one of the first computational atlases for pelvic structures, enabled by registering inter-subject MRs, and quantifies the significant differences due to the proposed semi-weak supervision with a discussion on the potential clinical use of example atlas-derived statistics.
Submitted 16 February, 2024;
originally announced February 2024.
-
On the Robustness of Deep Learning-aided Symbol Detectors to Varying Conditions and Imperfect Channel Knowledge
Authors:
Chin-Hung Chen,
Boris Karanov,
Wim van Houtum,
Wu Yan,
Alex Young,
Alex Alvarado
Abstract:
Recently, a data-driven Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm tailored to channels with intersymbol interference has been introduced. This so-called BCJRNet algorithm utilizes neural networks to calculate channel likelihoods. BCJRNet has demonstrated resilience against inaccurate channel tap estimations when applied to a time-invariant channel with ideal exponential decay profiles. However, its generalization capabilities for practically-relevant time-varying channels, where the receiver can only access incorrect channel parameters, remain largely unexplored. The primary contribution of this paper is to expand upon the results from existing literature to encompass a variety of imperfect channel knowledge cases that appear in real-world transmissions. Our findings demonstrate that BCJRNet significantly outperforms the conventional BCJR algorithm for stationary transmission scenarios when learning from noisy channel data and with imperfect channel decay profiles. However, this advantage is shown to diminish when the operating channel is also rapidly time-varying. Our results also show the importance of memory assumptions for conventional BCJR and BCJRNet. An underestimation of the memory largely degrades the performance of both BCJR and BCJRNet, especially in a slow-decaying channel. To mimic a situation closer to a practical scenario, we also combined channel tap uncertainty with imperfect channel memory knowledge. Somewhat surprisingly, our results revealed improved performance when employing the conventional BCJR with an underestimated memory assumption. BCJRNet, on the other hand, showed a consistent performance improvement as the level of accurate memory knowledge increased.
Submitted 23 January, 2024;
originally announced January 2024.
-
Attentional Graph Neural Networks for Robust Massive Network Localization
Authors:
Wenzhong Yan,
Juntao Wang,
Feng Yin,
Yang Tian,
Abdelhak M. Zoubir
Abstract:
In recent years, Graph neural networks (GNNs) have emerged as a prominent tool for classification tasks in machine learning. However, their application in regression tasks remains underexplored. To tap the potential of GNNs in regression, this paper integrates GNNs with attention mechanism, a technique that revolutionized sequential learning tasks with its adaptability and robustness, to tackle a challenging nonlinear regression problem: network localization. We first introduce a novel network localization method based on graph convolutional network (GCN), which exhibits exceptional precision even under severe non-line-of-sight (NLOS) conditions, thereby diminishing the need for laborious offline calibration or NLOS identification. We further propose an attentional graph neural network (AGNN) model, aimed at improving the limited flexibility and mitigating the high sensitivity to the hyperparameter of the GCN-based method. The AGNN comprises two crucial modules, each designed with distinct attention architectures to address specific issues associated with the GCN-based method, rendering it more practical in real-world scenarios. Experimental results substantiate the efficacy of our proposed GCN-based method and AGNN model, as well as the enhancements of AGNN model. Additionally, we delve into the performance improvements of AGNN model by analyzing it from the perspectives of dynamic attention and computational complexity.
Submitted 14 February, 2024; v1 submitted 28 November, 2023;
originally announced November 2023.
-
Unsupervised convolutional neural network fusion approach for change detection in remote sensing images
Authors:
Weidong Yan,
Pei Yan,
Li Cao
Abstract:
With the rapid development of deep learning, a variety of change detection methods based on deep learning have emerged in recent years. However, these methods usually require a large number of training samples to train the network model, so it is very expensive. In this paper, we introduce a completely unsupervised shallow convolutional neural network (USCNN) fusion approach for change detection. Firstly, the bi-temporal images are transformed into different feature spaces by using convolution kernels of different sizes to extract multi-scale information of the images. Secondly, the output features of bi-temporal images at the same convolution kernels are subtracted to obtain the corresponding difference images, and the difference feature images at the same scale are fused into one feature image by using 1 * 1 convolution layer. Finally, the output features of different scales are concatenated and a 1 * 1 convolution layer is used to fuse the multi-scale information of the image. The model parameters are obtained by a redesigned sparse function. Our model has three features: the entire training process is conducted in an unsupervised manner, the network architecture is shallow, and the objective function is sparse. Thus, it can be seen as a kind of lightweight network model. Experimental results on four real remote sensing datasets indicate the feasibility and effectiveness of the proposed approach.
Submitted 6 November, 2023;
originally announced November 2023.
-
Wideband Beamforming for STAR-RIS-assisted THz Communications with Three-Side Beam Split
Authors:
Wencai Yan,
Wanming Hao,
Gangcan Sun,
Chongwen Huang,
Qingqing Wu
Abstract:
In this paper, we consider simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-assisted THz communications with three-side beam split. In addition to the beam split at the base station (BS), we analyze the double-side beam split at the STAR-RIS for the first time. To relieve the double-side beam split effect, we propose a time delayer (TD)-based fully-connected structure at the STAR-RIS. As a further advance, a sub-connected structure with low hardware complexity and low power consumption is developed, where multiple STAR-RIS elements share one TD. Meanwhile, considering the practical scenario, we investigate a multi-STAR-RIS and multi-user communication system, and a sum rate maximization problem is formulated by jointly optimizing the hybrid analog/digital beamforming and time delays at the BS as well as the double-layer phase-shift coefficients, time delays and amplitude coefficients at the STAR-RISs. Based on this, we first allocate users for each STAR-RIS, and then derive the analog beamforming and time delays at the BS, and the double-layer phase-shift coefficients and time delays at each STAR-RIS. Next, we develop an alternating optimization algorithm to calculate the digital beamforming at the BS and amplitude coefficients at the STAR-RISs. Finally, the numerical results verify the effectiveness of the proposed schemes.
Submitted 21 October, 2023;
originally announced October 2023.
-
Beamforming Design for the Distributed RISs-aided THz Communications with Double-Layer True Time Delays
Authors:
Gangcan Sun,
Wencai Yan,
Wanming Hao,
Chongwen Huang,
Chau Yuen
Abstract:
In this paper, we investigate the reconfigurable intelligent surface (RIS)-aided terahertz (THz) communication system with the sparse radio frequency chains antenna structure at the base station (BS). To overcome the beam split of the BS, different from the conventional single-layer true-time-delay (TTD) scheme, we propose a double-layer TTD scheme that can effectively reduce the number of large-range delay devices, which involve additional insertion loss and amplification circuitry. Next, we analyze the system performance under the proposed double-layer TTD scheme. To relieve the beam split of the RIS, we consider multiple distributed RISs to replace an ultra-large size RIS. Based on this, we formulate an achievable rate maximization problem for the distributed RISs-aided THz communications via jointly optimizing the hybrid analog/digital beamforming, time delays of the double-layer TTD network and reflection coefficients of RISs. Considering the practical hardware limitation, the finite-resolution phase shift, time delay and reflection phase are constrained. To solve the formulated problem, we first design an analog beamforming scheme including optimizing phase shift and time delay based on the RISs' locations. Then, an alternating optimization algorithm is proposed to obtain the digital beamforming and reflection coefficients based on the minimum mean square error and coordinate update techniques. Finally, simulation results show the effectiveness of the proposed scheme.
Submitted 21 October, 2023;
originally announced October 2023.
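
The beam split effect that the entry above sets out to overcome can be illustrated with a short numerical sketch. It is not the authors' double-layer TTD scheme: it assumes a plain uniform linear array with half-wavelength spacing at the carrier frequency fc, and compares frequency-independent phase shifters with ideal per-element true time delays. The function name `array_gain` and all parameter values are hypothetical.

```python
import cmath
import math


def array_gain(n_elems: int, sin_theta0: float, f_ratio: float, use_ttd: bool) -> float:
    """Normalized array gain toward the target direction at f = f_ratio * fc.

    Assumption: uniform linear array, element spacing c / (2 * fc), beam
    designed for direction theta0 (given as sin_theta0) at the carrier fc.
    """
    total = 0j
    for n in range(n_elems):
        # Propagation phase seen by element n at frequency f.
        channel = cmath.exp(-1j * math.pi * f_ratio * n * sin_theta0)
        if use_ttd:
            # Ideal true time delay: compensating phase scales with frequency.
            weight = cmath.exp(1j * math.pi * f_ratio * n * sin_theta0)
        else:
            # Phase shifter: compensating phase is fixed at its fc value.
            weight = cmath.exp(1j * math.pi * n * sin_theta0)
        total += weight * channel
    return abs(total) / n_elems


# 256 elements, beam steered to 30 degrees (sin = 0.5), subcarrier 10% above fc.
gain_ps_center = array_gain(256, 0.5, 1.0, use_ttd=False)
gain_ps_edge = array_gain(256, 0.5, 1.1, use_ttd=False)
gain_ttd_edge = array_gain(256, 0.5, 1.1, use_ttd=True)
```

Under these assumptions the phase-shifter beam keeps full gain only at fc and loses most of it at the band edge, while ideal time delays keep the beam aligned at every frequency. That loss is the motivation for TTD-based structures.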
-
Twofold Structured Features-Based Siamese Network for Infrared Target Tracking
Authors:
Wei-Jie Yan,
Yun-Kai Xu,
Qian Chen,
Xiao-Fang Kong,
Guo-Hua Gu,
A-Jun Shao,
Min-Jie Wan
Abstract:
Infrared target tracking has become a critical technology in the field of computer vision and has many applications, such as motion analysis, pedestrian surveillance, intelligent detection, and so forth. Unfortunately, due to the lack of color, texture and other detailed information, tracking drift often occurs when the tracker encounters infrared targets that vary in size or shape. To address this issue, we present a twofold structured features-based Siamese network for infrared target tracking. First, in order to improve the discriminative capacity for infrared targets, a novel feature fusion network is proposed to fuse both shallow spatial information and deep semantic information into the extracted features in a comprehensive manner. Then, a multi-template update module based on a template update mechanism is designed to effectively deal with interference from target appearance changes, which are prone to cause early tracking failures. Finally, both qualitative and quantitative experiments are carried out on the VOT-TIR 2016 dataset, which demonstrate that our method achieves a balance between promising tracking performance and real-time tracking speed compared with other state-of-the-art trackers.
Submitted 26 June, 2024; v1 submitted 31 August, 2023;
originally announced August 2023.
-
Combiner and HyperCombiner Networks: Rules to Combine Multimodality MR Images for Prostate Cancer Localisation
Authors:
Wen Yan,
Bernard Chiu,
Ziyi Shen,
Qianye Yang,
Tom Syer,
Zhe Min,
Shonit Punwani,
Mark Emberton,
David Atkinson,
Dean C. Barratt,
Yipeng Hu
Abstract:
One of the distinct characteristics in radiologists' reading of multiparametric prostate MR scans, using reporting systems such as PI-RADS v2.1, is to score individual types of MR modalities, T2-weighted, diffusion-weighted, and dynamic contrast-enhanced, and then combine these image-modality-specific scores using standardised decision rules to predict the likelihood of clinically significant cancer. This work aims to demonstrate that it is feasible for low-dimensional parametric models to model such decision rules in the proposed Combiner networks, without compromising the accuracy of predicting radiologic labels: First, it is shown that either a linear mixture model or a nonlinear stacking model is sufficient to model PI-RADS decision rules for localising prostate cancer. Second, parameters of these (generalised) linear models are proposed as hyperparameters, to weigh multiple networks that independently represent individual image modalities in the Combiner network training, as opposed to end-to-end modality ensemble. A HyperCombiner network is developed to train a single image segmentation network that can be conditioned on these hyperparameters during inference, for much improved efficiency. Experimental results based on data from 850 patients, for the application of automating radiologist labelling multi-parametric MR, compare the proposed combiner networks with other commonly-adopted end-to-end networks. Using the added advantages of obtaining and interpreting the modality combining rules, in terms of the linear weights or odds-ratios on individual image modalities, three clinical applications are presented for prostate cancer segmentation, including modality availability assessment, importance quantification and rule discovery.
Submitted 20 January, 2024; v1 submitted 17 July, 2023;
originally announced July 2023.
-
EVD Surgical Guidance with Retro-Reflective Tool Tracking and Spatial Reconstruction using Head-Mounted Augmented Reality Device
Authors:
Haowei Li,
Wenqing Yan,
Du Liu,
Long Qian,
Yuxing Yang,
Yihao Liu,
Zhe Zhao,
Hui Ding,
Guangzhi Wang
Abstract:
Augmented Reality (AR) has been used to facilitate surgical guidance during External Ventricular Drain (EVD) surgery, reducing the risks of misplacement in manual operations. During this procedure, the key challenge is accurately estimating the spatial relationship between pre-operative images and actual patient anatomy in an AR environment. This research proposes a novel framework utilizing Time of Flight (ToF) depth sensors integrated in commercially available AR Head Mounted Devices (HMD) for precise EVD surgical guidance. As previous studies have shown that ToF sensors exhibit depth errors, we first assessed their properties on AR-HMDs. Subsequently, a depth error model and patient-specific parameter identification method are introduced for accurate surface information. A tracking pipeline combining retro-reflective markers and point clouds is then proposed for accurate head tracking. The head surface is reconstructed using depth data for spatial registration, avoiding fixing tracking targets rigidly on the patient's skull. First, a depth value error of $7.580\pm 1.488 mm$ was revealed on human skin, indicating the significance of depth correction. Our results showed that the error was reduced by over $85\%$ using the proposed depth correction method on head phantoms made of different materials. Meanwhile, the head surface reconstructed with corrected depth data achieved sub-millimetre accuracy. An experiment on a sheep head revealed $0.79 mm$ reconstruction error. Furthermore, a user study was conducted to assess performance in simulated EVD surgery, where five surgeons performed nine k-wire injections on a head phantom with virtual guidance. Results of this study revealed $2.09 \pm 0.16 mm$ translational accuracy and $2.97\pm 0.91$ degree orientational accuracy.
Submitted 3 July, 2023; v1 submitted 27 June, 2023;
originally announced June 2023.
-
Bi-parametric prostate MR image synthesis using pathology and sequence-conditioned stable diffusion
Authors:
Shaheer U. Saeed,
Tom Syer,
Wen Yan,
Qianye Yang,
Mark Emberton,
Shonit Punwani,
Matthew J. Clarkson,
Dean C. Barratt,
Yipeng Hu
Abstract:
We propose an image synthesis mechanism for multi-sequence prostate MR images conditioned on text, to control lesion presence and sequence, as well as to generate paired bi-parametric images conditioned on images, e.g. for generating diffusion-weighted MR from T2-weighted MR for paired data, which are two challenging tasks in pathological image synthesis. Our proposed mechanism utilises and builds upon the recent stable diffusion model by proposing image-based conditioning for paired data generation. We validate our method using 2D image slices from real suspected prostate cancer patients. The realism of the synthesised images is validated by means of a blind expert evaluation for identifying real versus fake images, where a radiologist with 4 years' experience reading urological MR only achieves 59.4% accuracy across all tested sequences (where chance is 50%). For the first time, we evaluate the realism of the generated pathology by blind expert identification of the presence of suspected lesions, where we find that the clinician performs similarly for both real and synthesised images, with a 2.9 percentage point difference in lesion identification accuracy between real and synthesised images, demonstrating potential for radiological training purposes. Furthermore, we also show that a machine learning model, trained for lesion identification, shows better performance (76.2% vs 70.4%, statistically significant improvement) when trained with real data augmented by synthesised data as opposed to training with only real images, demonstrating usefulness for model training.
Submitted 3 March, 2023;
originally announced March 2023.
-
A new Speech Feature Fusion method with cross gate parallel CNN for Speaker Recognition
Authors:
Jiacheng Zhang,
Wenyi Yan,
Ye Zhang
Abstract:
In this paper, a new speech feature fusion method is proposed for speaker recognition on the basis of the cross gate parallel convolutional neural network (CG-PCNN). The Mel filter bank features (MFBFs) of different frequency resolutions can be extracted from each speech frame of a speaker's speech by several Mel filter banks, where the numbers of the triangular filters in the Mel filter banks are different. Because the frequency resolutions of these MFBFs differ, the MFBFs carry complementary information. The CG-PCNN is utilized to extract the deep features from these MFBFs, and it applies a cross gate mechanism to capture this complementary information for improving the performance of the speaker recognition system. Then, the fusion feature can be obtained by concatenating these deep features for speaker recognition. The experimental results show that the speaker recognition system with the proposed speech feature fusion method is effective, and marginally outperforms the existing state-of-the-art systems.
Submitted 23 November, 2022;
originally announced November 2022.
-
Feedforward Control in the Presence of Input Nonlinearities: A Learning-based Approach
Authors:
Jilles van Hulst,
Maurice Poot,
Dragan Kostić,
Kai Wa Yan,
Jim Portegies,
Tom Oomen
Abstract:
Advanced feedforward control methods enable mechatronic systems to perform varying motion tasks with extreme accuracy and throughput. The aim of this paper is to develop a data-driven feedforward controller that addresses input nonlinearities, which are common in typical applications such as semiconductor back-end equipment. The developed method consists of parametric inverse-model feedforward that is optimized for tracking error reduction by exploiting ideas from iterative learning control. Results on a simulated set-up indicate improved performance over existing identification methods for systems with nonlinearities at the input.
Submitted 23 September, 2022;
originally announced September 2022.
-
Prototypical few-shot segmentation for cross-institution male pelvic structures with spatial registration
Authors:
Yiwen Li,
Yunguan Fu,
Iani Gayo,
Qianye Yang,
Zhe Min,
Shaheer Saeed,
Wen Yan,
Yipei Wang,
J. Alison Noble,
Mark Emberton,
Matthew J. Clarkson,
Henkjan Huisman,
Dean Barratt,
Victor Adrian Prisacariu,
Yipeng Hu
Abstract:
The prowess that makes few-shot learning desirable in medical image analysis is the efficient use of the support image data, which are labelled to classify or segment new classes, a task that otherwise requires substantially more training images and expert annotations. This work describes a fully 3D prototypical few-shot segmentation algorithm, such that the trained networks can be effectively adapted to clinically interesting structures that are absent in training, using only a few labelled images from a different institute. First, to compensate for the widely recognised spatial variability between institutions in episodic adaptation of novel classes, a novel spatial registration mechanism is integrated into prototypical learning, consisting of a segmentation head and a spatial alignment module. Second, to assist the training with observed imperfect alignment, a support mask conditioning module is proposed to further utilise the annotation available from the support images. Extensive experiments are presented in an application of segmenting eight anatomical structures important for interventional planning, using a data set of 589 pelvic T2-weighted MR images, acquired at seven institutes. The results demonstrate the efficacy of each of the 3D formulation, the spatial registration, and the support mask conditioning, all of which made positive contributions independently or collectively. Compared with the previously proposed 2D alternatives, the few-shot segmentation performance was improved with statistical significance, regardless of whether the support data come from the same or different institutes.
Submitted 25 August, 2023; v1 submitted 12 September, 2022;
originally announced September 2022.
-
Cross-Modality Image Registration using a Training-Time Privileged Third Modality
Authors:
Qianye Yang,
David Atkinson,
Yunguan Fu,
Tom Syer,
Wen Yan,
Shonit Punwani,
Matthew J. Clarkson,
Dean C. Barratt,
Tom Vercauteren,
Yipeng Hu
Abstract:
In this work, we consider the task of pairwise cross-modality image registration, which may benefit from exploiting additional images available only at training time from an additional modality that is different to those being registered. As an example, we focus on aligning intra-subject multiparametric Magnetic Resonance (mpMR) images, between T2-weighted (T2w) scans and diffusion-weighted scans with high b-value (DWI$_{high-b}$). For the application of localising tumours in mpMR images, diffusion scans with zero b-value (DWI$_{b=0}$) are considered easier to register to T2w due to the availability of corresponding features. We propose a learning from privileged modality algorithm, using a training-only imaging modality DWI$_{b=0}$, to support the challenging multi-modality registration problems. We present experimental results based on 369 sets of 3D multiparametric MRI images from 356 prostate cancer patients and report, with statistical significance, a lowered median target registration error of 4.34 mm, when registering the holdout DWI$_{high-b}$ and T2w image pairs, compared with that of 7.96 mm before registration. Results also show that the proposed learning-based registration networks enabled efficient registration with comparable or better accuracy, compared with a classical iterative algorithm and other tested learning-based methods with/without the additional modality. These compared algorithms also failed to produce any significantly improved alignment between DWI$_{high-b}$ and T2w in this challenging application.
Submitted 26 July, 2022;
originally announced July 2022.
-
Beamforming Analysis and Design for Wideband THz Reconfigurable Intelligent Surface Communications
Authors:
Wencai Yan,
Wanming Hao,
Chongwen Huang,
Gangcan Sun,
Osamu Muta,
Haris Gacanin,
Chau Yuen
Abstract:
Reconfigurable intelligent surface (RIS)-aided terahertz (THz) communications have been regarded as a promising candidate for future 6G networks because of their ultra-wide bandwidth and ultra-low power consumption. However, there exists the beam split problem, especially when the base station (BS) or RIS is equipped with large-scale antennas, which may lead to serious array gain loss. Therefore, in this paper, we investigate the beam split and beamforming design problems in THz RIS communications. Specifically, we first analyze the beam split effect caused by different RIS sizes, shapes and deployments. On this basis, we apply the fully connected time delayer phase shifter hybrid beamforming architecture at the BS and deploy distributed RISs to cooperatively mitigate the beam split effect. We aim to maximize the achievable sum rate by jointly optimizing the hybrid analog/digital beamforming, time delays at the BS and reflection coefficients at the RISs. To solve the formulated problem, we first design the analog beamforming and time delays based on the physical directions of the different RISs, and then transform it into an optimization problem that jointly optimizes the digital beamforming and reflection coefficients. Next, we propose an alternating iterative optimization algorithm to deal with it. Specifically, for given reflection coefficients, we propose an iterative algorithm based on the minimum mean square error technique to obtain the digital beamforming. Afterwards, we apply LDR and MCQT methods to transform the original problem to a QCQP, which can be solved by the ADMM technique to obtain the reflection coefficients. Finally, the digital beamforming and reflection coefficients are obtained via repeating the above processes until convergence. Simulation results verify that the proposed scheme can effectively alleviate the beam split effect and improve the system capacity.
Submitted 23 June, 2023; v1 submitted 25 July, 2022;
originally announced July 2022.
-
Optimization simulation of reflow welding based on prediction of regional center temperature field
Authors:
Yuan Sui,
Fan-yang Bu,
Zi-long Shao,
Wei Yan
Abstract:
Before reflow soldering of integrated electronic products, the numerical simulation of the temperature control curve of the reflow furnace is crucial for selecting proper parameters and improving the overall efficiency of the reflow soldering process and product quality. According to the heat conduction law and the specific heat capacity formula, the first-order ordinary differential equation of the central temperature curve of the welding area with respect to the temperature distribution function in the furnace on the conveyor belt displacement is obtained. For the gap with small temperature difference, the sigmoid function is used to obtain a smooth interval temperature transition curve; for the gap with large temperature difference, the linear combination of exponential function and primary function is used to approach the actual concave function, so as to obtain the complete temperature distribution function in the furnace. The welding parameters are obtained by solving the ordinary differential equation, and a set of optimal process parameters consistent with the process boundary is obtained by calculating the mean square error between the predicted temperature field and the real temperature distribution. At the same time, a set of reflow optimization strategies is designed for speed interval prediction strategy, minimum parameter interval prediction strategy, and the most symmetrical parameter interval prediction of solder paste melting reflow area. The simulation results show that the temperature field prediction results obtained by this method are highly consistent with the actual sensor data, and have strong correlation. This method can help to select appropriate process parameters, optimize the production process, reduce equipment commissioning practice and optimize the solder joint quality of production products.
Submitted 21 June, 2022;
originally announced June 2022.
-
The impact of using voxel-level segmentation metrics on evaluating multifocal prostate cancer localisation
Authors:
Wen Yan,
Qianye Yang,
Tom Syer,
Zhe Min,
Shonit Punwani,
Mark Emberton,
Dean C. Barratt,
Bernard Chiu,
Yipeng Hu
Abstract:
Dice similarity coefficient (DSC) and Hausdorff distance (HD) are widely used for evaluating medical image segmentation. They have also been criticised, when reported alone, for their unclear or even misleading clinical interpretation. DSCs may also differ substantially from HDs, due to boundary smoothness or multiple regions of interest (ROIs) within a subject. More importantly, either metric can also have a nonlinear, non-monotonic relationship with outcomes based on Type 1 and 2 errors, designed for specific clinical decisions that use the resulting segmentation. Cases causing disagreement between these metrics are not difficult to postulate. This work first proposes a new asymmetric detection metric, adapting those used in object detection, for planning prostate cancer procedures. The lesion-level metric is then compared with the voxel-level DSC and HD, whereas a 3D UNet is used for segmenting lesions from multiparametric MR (mpMR) images. Based on experimental results we report pairwise agreement and correlation 1) between DSC and HD, and 2) between voxel-level DSC and recall-controlled precision at lesion-level, with Cohen's [0.49, 0.61] and Pearson's [0.66, 0.76] (p-values<0.001) at varying cut-offs. However, the differences in false-positives and false-negatives, between the actual errors and the perceived counterparts if DSC is used, can be as high as 152 and 154, respectively, out of the 357 test set lesions. We therefore carefully conclude that, despite the significant correlations, voxel-level metrics such as DSC can misrepresent lesion-level detection accuracy for evaluating localisation of multifocal prostate cancer and should be interpreted with caution.
Submitted 30 March, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
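The disagreement between voxel-level and lesion-level evaluation described in this abstract can be illustrated with a toy case. The sketch below is not the authors' code or their proposed metric: it uses the standard definitions DSC = 2|A∩B| / (|A| + |B|) and the symmetric Hausdorff distance, with masks represented as sets of 2D voxel coordinates. The function names, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from math import dist

def dice(a, b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty voxel sets."""
    def directed(p, q):
        # Largest distance from any point of p to its nearest point in q.
        return max(min(dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))

# Hypothetical ground truth: one 4-voxel lesion plus one 1-voxel lesion.
large_lesion = {(0, 0), (0, 1), (1, 0), (1, 1)}
small_lesion = {(10, 10)}
truth = large_lesion | small_lesion

# Hypothetical prediction: the large lesion is found, the small one is missed.
pred = set(large_lesion)

print(f"DSC = {dice(truth, pred):.3f}")
print(f"HD  = {hausdorff(truth, pred):.3f}")
```

In this toy case the voxel-level DSC is 8/9 while one of the two lesions is missed entirely, so lesion-level recall is only 0.5; the Hausdorff distance is dominated by the missed lesion.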
-
Image quality assessment by overlapping task-specific and task-agnostic measures: application to prostate multiparametric MR images for cancer segmentation
Authors:
Shaheer U. Saeed,
Wen Yan,
Yunguan Fu,
Francesco Giganti,
Qianye Yang,
Zachary M. C. Baum,
Mirabela Rusu,
Richard E. Fan,
Geoffrey A. Sonn,
Mark Emberton,
Dean C. Barratt,
Yipeng Hu
Abstract:
Image quality assessment (IQA) in medical imaging can be used to ensure that downstream clinical tasks can be reliably performed. Quantifying the impact of an image on the specific target tasks, also named as task amenability, is needed. A task-specific IQA has recently been proposed to learn an image-amenability-predicting controller simultaneously with a target task predictor. This allows for the trained IQA controller to measure the impact an image has on the target task performance, when this task is performed using the predictor, e.g. segmentation and classification neural networks in modern clinical applications. In this work, we propose an extension to this task-specific IQA approach, by adding a task-agnostic IQA based on auto-encoding as the target task. Analysing the intersection between low-quality images, deemed by both the task-specific and task-agnostic IQA, may help to differentiate the underpinning factors that caused the poor target task performance. For example, common imaging artefacts may not adversely affect the target task, which would lead to a low task-agnostic quality and a high task-specific quality, whilst individual cases considered clinically challenging, which cannot be improved by better imaging equipment or protocols, are likely to result in a high task-agnostic quality but a low task-specific quality. We first describe a flexible reward shaping strategy which allows for the adjustment of weighting between task-agnostic and task-specific quality scoring. Furthermore, we evaluate the proposed algorithm using a clinically challenging target task of prostate tumour segmentation on multiparametric magnetic resonance (mpMR) images, from 850 patients. The proposed reward shaping strategy, with appropriately weighted task-specific and task-agnostic qualities, successfully identified samples that need re-acquisition due to a defective imaging process.
Submitted 20 February, 2022;
originally announced February 2022.
-
Few-shot image segmentation for cross-institution male pelvic organs using registration-assisted prototypical learning
Authors:
Yiwen Li,
Yunguan Fu,
Qianye Yang,
Zhe Min,
Wen Yan,
Henkjan Huisman,
Dean Barratt,
Victor Adrian Prisacariu,
Yipeng Hu
Abstract:
The ability to adapt medical image segmentation networks for a novel class such as an unseen anatomical or pathological structure, when only a few labelled examples of this class are available from local healthcare providers, is sought-after. This potentially addresses two widely recognised limitations in deploying modern deep learning models to clinical practice, expertise-and-labour-intensive labelling and cross-institution generalisation. This work presents the first 3D few-shot interclass segmentation network for medical images, using a labelled multi-institution dataset from prostate cancer patients with eight regions of interest. We propose an image alignment module registering the predicted segmentation of both query and support data, in a standard prototypical learning algorithm, to a reference atlas space. The built-in registration mechanism can effectively utilise the prior knowledge of consistent anatomy between subjects, regardless of whether they are from the same institution or not. Experimental results demonstrated that the proposed registration-assisted prototypical learning significantly improved segmentation accuracy (p-values<0.01) on query data from a holdout institution, with varying availability of support data from multiple institutions. We also report the additional benefits of the proposed 3D networks with 75% fewer parameters and an arguably simpler implementation, compared with existing 2D few-shot approaches that segment 2D slices of volumetric medical images.
Submitted 17 January, 2022;
originally announced January 2022.
-
Frequency Reflection Modulation for Reconfigurable Intelligent Surface Aided OFDM Systems
Authors:
Wenjing Yan,
Xiaojun Yuan,
Xuanyu Cao
Abstract:
Reconfigurable intelligent surface (RIS) based reflection modulation has been considered as a promising information delivery mechanism, and has the potential to realize passive information transfer of a RIS without consuming any additional radio frequency chain and time/frequency/energy resources. The existing on-off reflection modulation (ORM) schemes are based on manipulating the "on/off" states of RIS elements, which may lead to the degradation of RIS reflection efficiency. This paper proposes a frequency reflection modulation (FRM) method for RIS-aided OFDM systems. The FRM-OFDM scheme modulates the frequency of the incident electromagnetic waves, and the RIS information is embedded in the frequency-hopping states of RIS elements. Unlike the ORM-OFDM scheme, the FRM-OFDM scheme can achieve higher reflection efficiency, since the latter does not turn off any reflection element in reflection modulation. We propose a block coordinate descent (BCD) algorithm to maximize the user achievable rate for the FRM-OFDM system by jointly optimizing the phase shift of the RIS and the power allocation at the transmitter. Further, we design a bilinear message passing (BMP) algorithm for the bilinear recovery of both the user symbols and the RIS data. Numerical simulations have verified the efficiency of the designed BCD algorithm for system optimization and the BMP algorithm for signal detection, as well as the superiority of the proposed FRM-OFDM scheme over the ORM-OFDM scheme.
Submitted 1 March, 2022; v1 submitted 19 April, 2021;
originally announced June 2021.
-
Controlling False Positive/Negative Rates for Deep-Learning-Based Prostate Cancer Detection on Multiparametric MR images
Authors:
Zhe Min,
Fernando J. Bianco,
Qianye Yang,
Rachael Rodell,
Wen Yan,
Dean Barratt,
Yipeng Hu
Abstract:
Prostate cancer (PCa) is one of the leading causes of death for men worldwide. Multi-parametric magnetic resonance (mpMR) imaging has emerged as a non-invasive diagnostic tool for detecting and localising prostate tumours by specialised radiologists. These radiological examinations, for example, for differentiating malignant lesions from benign prostatic hyperplasia in transition zones and for defining the boundaries of clinically significant cancer, remain challenging and highly skill-and-experience-dependent. We first investigate experimental results in developing object detection neural networks that are trained to predict the radiological assessment, using these high-variance labels. We further argue that such a computer-assisted diagnosis (CAD) system needs to have the ability to control the false-positive rate (FPR) or false-negative rate (FNR), in order to be usefully deployed in a clinical workflow, informing clinical decisions without further human intervention. This work proposes a novel PCa detection network that incorporates a lesion-level cost-sensitive loss and an additional slice-level loss based on a lesion-to-slice mapping function, to manage the lesion- and slice-level costs, respectively. Our experiments based on 290 clinical patients conclude that 1) the lesion-level FNR was effectively reduced from 0.19 to 0.10 and the lesion-level FPR was reduced from 1.03 to 0.66 by changing the lesion-level cost; 2) the slice-level FNR was reduced from 0.19 to 0.00 by taking into account the slice-level cost; 3) both lesion-level and slice-level FNRs were reduced with lower FP/FPR by changing the lesion-level or slice-level costs, compared with post-training threshold adjustment using networks without the proposed cost-aware training.
Submitted 4 June, 2021;
originally announced June 2021.
-
A Computational Design and Evaluation Tool for 3D Structures with Planar Surfaces
Authors:
Chang Liu,
Wenzhong Yan,
Pehuen Moure,
Cody Fan,
Ankur Mehta
Abstract:
Three-dimensional (3D) structures composed of planar surfaces can be built out of accessible materials using easier fabrication techniques with shorter fabrication time. To better design 3D structures with planar surfaces, realistic models are required to understand and evaluate mechanical behaviors. Existing design tools are either effort-consuming (e.g. finite element analysis) or bounded by assumptions (e.g. numerical solutions). In this project, we have built a computational design tool that is (1) capable of rapidly and inexpensively evaluating planar surfaces in 3D structures, with sufficient computational efficiency and accuracy; (2) applicable to complex boundary conditions and loading conditions, both isotropic materials and orthotropic materials; and (3) suitable for rapid accommodation when design parameters need to be adjusted. We demonstrate the efficiency and necessity of this design tool by evaluating a glass table as well as a wood bookcase, and iteratively designing an origami gripper to satisfy performance requirements. This design tool gives non-expert users as well as engineers a simple and effective modus operandi in structural design.
Submitted 2 March, 2021;
originally announced March 2021.
-
Graph Neural Network for Large-Scale Network Localization
Authors:
Wenzhong Yan,
Di Jin,
Zhidi Lin,
Feng Yin
Abstract:
Graph neural networks (GNNs) are popular to use for classifying structured data in the context of machine learning. But surprisingly, they are rarely applied to regression problems. In this work, we adopt GNN for a classic but challenging nonlinear regression problem, namely the network localization. Our main findings are in order. First, GNN is potentially the best solution to large-scale network localization in terms of accuracy, robustness and computational time. Second, proper thresholding of the communication range is essential to its superior performance. Simulation results corroborate that the proposed GNN based method outperforms all state-of-the-art benchmarks by far. Such inspiring results are theoretically justified in terms of data aggregation, non-line-of-sight (NLOS) noise removal and low-pass filtering effect, all affected by the threshold for neighbor selection. Code is available at https://github.com/Yanzongzi/GNN-For-localization.
Submitted 15 February, 2021; v1 submitted 22 October, 2020;
originally announced October 2020.
-
Low-complexity Point Cloud Filtering for LiDAR by PCA-based Dimension Reduction
Authors:
Yao Duan,
Chuanchuan Yang,
Hao Chen,
Weizhen Yan,
Hongbin Li
Abstract:
Signals emitted by LiDAR sensors are often negatively influenced during transmission by rain, fog, dust, atmospheric particles, scattering of light and other factors, causing noise in point cloud images. To address this problem, this paper develops a new noise reduction method to filter LiDAR point clouds, i.e. an adaptive clustering method based on principal component analysis (PCA). Different from the traditional filtering methods that directly process three-dimensional (3D) point cloud data, the proposed method uses dimension reduction to generate two-dimensional (2D) data by extracting the first principal component and the second principal component of the original data with little information attrition. In the 2D space spanned by the two principal components, the generated 2D data are clustered for noise reduction before being restored into 3D. Through dimension reduction and the clustering of the generated 2D data, this method achieves low computational complexity, effectively removing noise while retaining details of environmental features. Compared with traditional filtering algorithms, the proposed method has higher precision and recall. Experimental results show an F-score as high as 0.92 with complexity reduced by 50% compared with the traditional density-based clustering method.
Submitted 28 July, 2020;
originally announced July 2020.
-
Improving Workflow Integration with xPath: Design and Evaluation of a Human-AI Diagnosis System in Pathology
Authors:
Hongyan Gu,
Yuan Liang,
Yifan Xu,
Christopher Kazu Williams,
Shino Magaki,
Negar Khanlou,
Harry Vinters,
Zesheng Chen,
Shuo Ni,
Chunxu Yang,
Wenzhong Yan,
Xinhai Robert Zhang,
Yang Li,
Mohammad Haeri,
Xiang 'Anthony' Chen
Abstract:
Recent developments in AI have provided assisting tools to support pathologists' diagnoses. However, it remains challenging to incorporate such tools into pathologists' practice; one main concern is AI's insufficient workflow integration with medical decisions. We observed pathologists' examination and discovered that the main hindering factor to integrate AI is its incompatibility with pathologists' workflow. To bridge the gap between pathologists and AI, we developed a human-AI collaborative diagnosis tool -- xPath -- that shares a similar examination process to that of pathologists, which can improve AI's integration into their routine examination. The viability of xPath is confirmed by a technical evaluation and work sessions with twelve medical professionals in pathology. This work identifies and addresses the challenge of incorporating AI models into pathology, which can offer first-hand knowledge about how HCI researchers can work with medical professionals side-by-side to bring technological advances to medical tasks towards practical applications.
Submitted 7 December, 2022; v1 submitted 22 June, 2020;
originally announced June 2020.
-
Reconfigurable-Intelligent-Surface Empowered Wireless Communications: Challenges and Opportunities
Authors:
Xiaojun Yuan,
Ying-Jun Angela Zhang,
Yuanming Shi,
Wenjing Yan,
Hang Liu
Abstract:
Reconfigurable intelligent surfaces (RISs) are regarded as a promising emerging hardware technology to improve the spectrum and energy efficiency of wireless networks by artificially reconfiguring the propagation environment of electromagnetic waves. Due to the unique advantages in enhancing wireless channel capacity, RISs have recently become a hot research topic. In this article, we focus on three fundamental physical-layer challenges for the incorporation of RISs into wireless networks, namely, channel state information acquisition, passive information transfer, and low-complexity robust system design. We summarize the state-of-the-art solutions and explore potential research directions. Furthermore, we discuss other promising research directions of RISs, including edge intelligence and physical-layer security.
Submitted 16 August, 2020; v1 submitted 2 January, 2020;
originally announced January 2020.
-
Audio-based automatic mating success prediction of giant pandas
Authors:
WeiRan Yan,
MaoLin Tang,
Qijun Zhao,
Peng Chen,
Dunwu Qi,
Rong Hou,
Zhihe Zhang
Abstract:
Giant pandas, stereotyped as silent animals, make significantly more vocal sounds during breeding season, suggesting that sounds are essential for coordinating their reproduction and expression of mating preference. Previous biological studies have also proven that giant panda sounds are correlated with mating results and reproduction. This paper makes the first attempt to devise an automatic method for predicting mating success of giant pandas based on their vocal sounds. Given an audio sequence of mating giant pandas recorded during breeding encounters, we first crop out the segments with vocal sounds of giant pandas, and normalize their magnitude and length. We then extract acoustic features from the audio segments and feed the features into a deep neural network, which classifies the mating into success or failure. The proposed deep neural network employs convolution layers followed by bidirectional gated recurrent units to extract vocal features, and applies an attention mechanism to force the network to focus on the most relevant features. Evaluation experiments on a data set collected during the past nine years obtain promising results, proving the potential of audio-based automatic mating success prediction methods in assisting giant panda reproduction.
Submitted 3 June, 2020; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Passive Beamforming and Information Transfer Design for Reconfigurable Intelligent Surfaces Aided Multiuser MIMO Systems
Authors:
Wenjing Yan,
Xiaojun Yuan,
Zhen-Qing He,
Xiaoyan Kuai
Abstract:
This paper investigates the passive beamforming and information transfer (PBIT) technique for the multiuser multiple-input multiple-output (MIMO) systems with the aid of a reconfigurable intelligent surface (RIS), where the RIS enhances the primary communication via passive beamforming and at the same time delivers additional information by the spatial modulation (which adjusts the on-off states of the reflecting elements). For the passive beamforming design, we propose to maximize the sum channel capacity of the RIS-aided multiuser MIMO channel and formulate the problem as a two-step stochastic program. A sample average approximation (SAA) based iterative algorithm is developed for the efficient passive beamforming design of the considered scheme. To strike a balance between complexity and performance, we then propose a simplified beamforming algorithm by approximating the stochastic program as a deterministic alternating optimization problem. For the receiver design, the signal detection at the receiver is a bilinear estimation problem since the RIS information is multiplicatively modulated onto the reflected signals of the reflecting elements. To solve this bilinear estimation problem, we develop a turbo message passing (TMP) algorithm in which the factor graph associated with the problem is divided into two modules: one for the estimation of the user signals and the other for the estimation of the RIS's on-off states. The two modules are executed iteratively to yield a near-optimal low-complexity solution. Furthermore, we extend the design of the multiuser MIMO PBIT scheme from single-RIS to multi-RIS, by leveraging the similarity between the single-RIS and multi-RIS system models. Extensive simulation results are provided to demonstrate the advantages of our passive beamforming and receiver designs.
Submitted 28 December, 2019; v1 submitted 21 December, 2019;
originally announced December 2019.
-
The Domain Shift Problem of Medical Image Segmentation and Vendor-Adaptation by Unet-GAN
Authors:
Wenjun Yan,
Yuanyuan Wang,
Shengjia Gu,
Lu Huang,
Fuhua Yan,
Liming Xia,
Qian Tao
Abstract:
Convolutional neural networks (CNNs), in particular the U-Net, are a powerful method for medical image segmentation. To date, the U-Net has demonstrated state-of-the-art performance in many complex medical image segmentation tasks, especially when the training and testing data share the same distribution (i.e. come from the same source domain). However, in clinical practice, medical images are acquired from different vendors and centers. The performance of a U-Net trained from a particular source domain, when transferred to a different target domain (e.g. different vendor, acquisition parameter), can drop unexpectedly. Collecting a large amount of annotation from each new domain to retrain the U-Net is expensive, tedious, and practically impossible. In this work, we proposed a generic framework to address this problem, consisting of (1) an unpaired generative adversarial network (GAN) for vendor-adaptation, and (2) a U-Net for object segmentation. In the proposed Unet-GAN architecture, the GAN learns from the U-Net at the feature level that is segmentation-specific. We used cardiac cine MRI as the example, with three major vendors (Philips, Siemens, and GE) as three domains, while the methodology can be extended to medical image segmentation in general. The proposed method showed significant improvement of the segmentation results across vendors. The proposed Unet-GAN provides an annotation-free solution to the cross-vendor medical image segmentation problem, potentially extending a trained deep learning model to multi-center and multi-vendor use in real clinical scenarios.
Submitted 30 October, 2019;
originally announced October 2019.
-
Double-Sparsity Learning Based Channel-and-Signal Estimation in Massive MIMO with Generalized Spatial Modulation
Authors:
Xiaoyan Kuai,
Xiaojun Yuan,
Wenjing Yan,
Hang Liu,
Ying Jun Zhang
Abstract:
In this paper, we study joint antenna activity detection, channel estimation, and multiuser detection for massive multiple-input multiple-output (MIMO) systems with generalized spatial modulation (GSM). We first establish a double-sparsity massive MIMO model by considering the channel sparsity of the massive MIMO channel and the signal sparsity of GSM. Based on the double-sparsity model, we formulate a blind detection problem. To solve the blind detection problem, we develop a message-passing based blind channel-and-signal estimation (BCSE) algorithm. The BCSE algorithm basically follows the affine sparse matrix factorization technique, but with critical modifications to handle the double-sparsity property of the model. We show that the BCSE algorithm significantly outperforms the existing blind and training-based algorithms, and is able to closely approach the genie bounds (with either known channel or known signal). In the BCSE algorithm, short pilots are employed to remove the phase and permutation ambiguities after sparse matrix factorization. To utilize the short pilots more efficiently, we further develop the semi-blind channel-and-signal estimation (SBCSE) algorithm to incorporate the estimation of the phase and permutation ambiguities into the iterative message-passing process. We show that the SBCSE algorithm substantially outperforms the counterpart algorithms, including the BCSE algorithm, in the short-pilot regime.
Submitted 24 October, 2019;
originally announced October 2019.
-
Convolutional Neural Networks for Space-Time Block Coding Recognition
Authors:
Wenjun Yan,
Qing Ling,
Limin Zhang
Abstract:
We apply the latest advances in machine learning with deep neural networks to the tasks of radio modulation recognition, channel coding recognition, and spectrum monitoring. This paper first proposes an identification algorithm for space-time block coding of a signal. The features that distinguish spatial multiplexing signals from Alamouti signals are extracted by adapting convolutional neural networks after preprocessing the received sequence. Unlike other algorithms, this method requires no prior information about channel coefficients and noise power, and consequently is well-suited for noncooperative contexts. Results show that the proposed algorithm performs well even at a low signal-to-noise ratio.
Submitted 12 February, 2020; v1 submitted 18 October, 2019;
originally announced October 2019.
-
Deep AutoEncoder-based Lossy Geometry Compression for Point Clouds
Authors:
Wei Yan,
Yiting Shao,
Shan Liu,
Thomas H Li,
Zhu Li,
Ge Li
Abstract:
The point cloud is a fundamental 3D representation that is widely used in real-world applications such as autonomous driving. As a newly developed media format characterized by complexity and irregularity, the point cloud creates a need for compression algorithms that are more flexible than existing codecs. Recently, autoencoders (AEs) have shown their effectiveness in many visual analysis tasks as well as image compression, which inspires us to employ them in point cloud compression. In this paper, we propose a general autoencoder-based architecture for lossy geometry point cloud compression. To the best of our knowledge, it is the first autoencoder-based geometry compression codec that directly takes point clouds as input rather than voxel grids or collections of images. Compared with handcrafted codecs, this approach adapts much more quickly to previously unseen media contents and media formats, while achieving competitive performance. Our architecture consists of a PointNet-based encoder, a uniform quantizer, an entropy estimation block and a nonlinear synthesis transformation module. In lossy geometry compression of point clouds, results show that the proposed method outperforms the test model for categories 1 and 3 (TMC13) published by the MPEG-3DG group at its 125th meeting, and on average a 73.15% BD-rate gain is achieved.
Submitted 17 April, 2019;
originally announced May 2019.
-
Passive Beamforming and Information Transfer via Large Intelligent Surface
Authors:
Wenjing Yan,
Xiaoyan Kuai,
Xiaojun Yuan
Abstract:
The large intelligent surface (LIS) has emerged as a promising new solution to improve the energy and spectrum efficiency of wireless networks. An LIS, composed of a large number of low-cost and energy-efficient reconfigurable passive reflecting elements, enhances wireless communications by reflecting impinging electromagnetic waves. In this paper, we propose a novel passive beamforming and information transfer (PBIT) technique, in which the LIS simultaneously enhances the primary communication and sends information to the receiver. We develop a passive beamforming method to improve the average receive signal-to-noise ratio (SNR). We also establish a two-step approach at the receiver to retrieve the information from both the transmitter and the LIS. Numerical results show that the proposed PBIT system, especially with the optimized passive beamforming, significantly outperforms the system without LIS enhancement. Furthermore, a tradeoff between the passive-beamforming gain and the information rate of the LIS is demonstrated.
Submitted 7 August, 2019; v1 submitted 4 May, 2019;
originally announced May 2019.
-
Fully Distributed DC Optimal Power Flow Based on Distributed Economic Dispatch and Distributed State Estimation
Authors:
Qiao Li,
David Wenzhong Gao,
Lin Cheng,
Fang Zhang,
Weihang Yan
Abstract:
Optimal power flow (OPF) is an important technique for power systems to achieve optimal operation while satisfying multiple constraints. Traditional OPF methods are mostly centralized and are executed in a centralized control center. This paper introduces a fully Distributed DC Optimal Power Flow (DDCOPF) method for future power systems, which will have more and more distributed generators. The proposed method is based on the Distributed Economic Dispatch (DED) method and the Distributed State Estimation (DSE) method. In the proposed scheme, the DED method is used to achieve the optimal power dispatch with the lowest cost, and the DSE method provides power flow information of the power system to the proposed DDCOPF algorithm. In the proposed method, the Auto-Regressive (AR) model is used to predict the load variation so that the proposed algorithm can prevent overflow. In addition, a method called the constraint algorithm is developed to correct the results of DED with the proposed correction algorithm and penalty term so that the constraints for the power system will not be violated. Unlike existing research, the proposed method is completely distributed without the need for any centralized facility.
Submitted 4 March, 2019;
originally announced March 2019.