Search | arXiv e-print repository

arXiv:2406.19833 [pdf, other]

LightStereo: Channel Boost Is All Your Need for Efficient 2D Cost Aggregation

Authors: Xianda Guo, Chenming Zhang, Dujun Nie, Wenzhao Zheng, Youmin Zhang, Long Chen

Abstract: We present LightStereo, a cutting-edge stereo-matching network crafted to accelerate the matching process. Departing from conventional methodologies that rely on aggregating computationally intensive 4D costs, LightStereo adopts the 3D cost volume as a lightweight alternative. While similar approaches have been explored previously, our breakthrough lies in enhancing performance through a dedicated focus on the channel dimension of the 3D cost volume, where the distribution of matching costs is encapsulated. Our exhaustive exploration has yielded plenty of strategies to amplify the capacity of the pivotal dimension, ensuring both precision and efficiency. We compare the proposed LightStereo with existing state-of-the-art methods across various benchmarks, which demonstrate its superior performance in speed, accuracy, and resource utilization. LightStereo achieves a competitive EPE metric in the SceneFlow datasets while demanding a minimum of only 22 GFLOPs, with an inference time of just 17 ms. Our comprehensive analysis reveals the effect of 2D cost aggregation for stereo matching, paving the way for real-world applications of efficient stereo systems. Code will be available at https://github.com/XiandaGuo/OpenStereo.

Submitted 28 June, 2024; originally announced June 2024.

Comments: Code will be available at https://github.com/XiandaGuo/OpenStereo
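
To make the "3D cost volume" in the LightStereo abstract concrete, here is a minimal sketch of one common way to build such a volume: a channel-averaged correlation between left-image features and right-image features shifted by each candidate disparity. The abstract does not say how LightStereo constructs its volume, so the correlation rule, the function name `correlation_cost_volume`, and the nested-list layout are all assumptions made for illustration. The point is only that the disparity axis ends up playing the role of a channel dimension that ordinary 2D convolutions could aggregate.

```python
from typing import List

# Feature maps are nested lists indexed as [channel][row][col] (illustrative layout).
Feature = List[List[List[float]]]


def correlation_cost_volume(left: Feature, right: Feature, max_disp: int) -> Feature:
    """Build a 3D volume indexed as [disparity][row][col].

    Each entry is the channel-averaged dot product between a left feature at
    column x and the right feature at column x - d. Larger values mean a
    better match (this is a similarity score, not a cost to minimise).
    """
    channels, height, width = len(left), len(left[0]), len(left[0][0])
    volume = [[[0.0] * width for _ in range(height)] for _ in range(max_disp)]
    for d in range(max_disp):
        for y in range(height):
            for x in range(d, width):  # columns with x < d have no partner and stay 0
                total = sum(left[c][y][x] * right[c][y][x - d] for c in range(channels))
                volume[d][y][x] = total / channels
    return volume


# A single-channel, single-row example: the bright pixel sits one column
# further left in the right image, so disparity 1 gives the strongest response.
left = [[[0.0, 0.0, 1.0, 0.0]]]
right = [[[0.0, 1.0, 0.0, 0.0]]]
volume = correlation_cost_volume(left, right, max_disp=2)
best = max(range(2), key=lambda d: volume[d][0][2])
```

The result has shape (disparities, height, width), which is why it can be treated like an ordinary multi-channel image rather than a 4D tensor.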

arXiv:2306.03622 [pdf, other]

FaaSwap: SLO-Aware, GPU-Efficient Serverless Inference via Model Swapping

Authors: Minchen Yu, Ao Wang, Dong Chen, Haoxuan Yu, Xiaonan Luo, Zhuohao Li, Wei Wang, Ruichuan Chen, Dapeng Nie, Haoran Yang

Abstract: Serverless computing has become increasingly popular for machine learning inference. However, current serverless platforms lack efficient support for GPUs, limiting their ability to deliver low-latency inference. In this paper, we propose FaaSwap, a GPU-efficient serverless inference platform. FaaSwap employs a holistic approach to system and algorithm design. It maintains models in main memory and dynamically swaps them onto GPUs upon request arrivals (i.e., late binding), thereby enabling a large number of inference functions to efficiently share a node's GPUs. FaaSwap uses various techniques, including asynchronous API redirection, GPU runtime sharing, pipelined model execution, and efficient GPU memory management, to achieve the optimal performance. We also develop an interference-aware request scheduling algorithm that allows FaaSwap to meet the latency SLOs for individual inference functions. We have implemented FaaSwap as a prototype on a leading commercial serverless platform. Experimental evaluations demonstrate that, with model swapping, FaaSwap can concurrently serve hundreds of functions on a single worker node with 4 V100 GPUs, while achieving inference performance comparable to native execution (where each function runs on a dedicated GPU). When deployed on a 6-node production testbed, FaaSwap meets the latency SLOs for over 1k functions, the maximum that the testbed can handle concurrently.

Submitted 8 February, 2024; v1 submitted 6 June, 2023; originally announced June 2023.
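
The late-binding idea in the FaaSwap abstract (models stay in main memory and are swapped onto a GPU only when a request arrives) can be illustrated with a toy cache. This is a sketch under stated assumptions, not FaaSwap's design: the class `SwapCache`, the least-recently-used eviction rule, and the megabyte sizes are all hypothetical, and the abstract does not describe which eviction or scheduling policy the system actually uses.

```python
from collections import OrderedDict
from typing import Dict, List


class SwapCache:
    """Toy late-binding cache: every model lives in host memory, GPU space is scarce."""

    def __init__(self, host_models: Dict[str, int], gpu_capacity_mb: int) -> None:
        self.host_models = host_models        # model name -> size in MB (hypothetical)
        self.gpu_capacity_mb = gpu_capacity_mb
        self.resident = OrderedDict()         # models on the GPU, least recently used first
        self.evicted: List[str] = []          # eviction log, for inspection only

    def serve(self, name: str) -> bool:
        """Bind `name` to the GPU on request; return True if it was already resident."""
        if name in self.resident:
            self.resident.move_to_end(name)   # mark as most recently used
            return True
        size = self.host_models[name]
        if size > self.gpu_capacity_mb:
            raise ValueError(f"{name} does not fit on the GPU")
        # Assumed policy: evict least recently used models until the new one fits.
        while sum(self.resident.values()) + size > self.gpu_capacity_mb:
            victim, _ = self.resident.popitem(last=False)
            self.evicted.append(victim)
        self.resident[name] = size            # "swap in" from host memory
        return False


cache = SwapCache({"a": 600, "b": 500, "c": 400}, gpu_capacity_mb=1000)
hits = [cache.serve(name) for name in ["a", "a", "b", "c"]]
```

Here the second request for `a` is a hit, the request for `b` forces `a` out, and `c` fits alongside `b`. A real system would also have to account for swap latency and interference between co-located functions, which this sketch ignores.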

arXiv:2304.11409 [pdf, other]

The Devil is in the Upsampling: Architectural Decisions Made Simpler for Denoising with Deep Image Prior

Authors: Yilin Liu, Jiang Li, Yunkui Pang, Dong Nie, Pew-thian Yap

Abstract: Deep Image Prior (DIP) shows that some network architectures naturally bias towards smooth images and resist noises, a phenomenon known as spectral bias. Image denoising is an immediate application of this property. Although DIP has removed the requirement of large training sets, it still presents two practical challenges for denoising: architectural design and noise-fitting, which are often intertwined. Existing methods mostly handcraft or search for the architecture from a large design space, due to the lack of understanding on how the architectural choice corresponds to the image. In this study, we analyze from a frequency perspective to demonstrate that the unlearnt upsampling is the main driving force behind the denoising phenomenon in DIP. This finding then leads to strategies for estimating a suitable architecture for every image without a laborious search. Extensive experiments show that the estimated architectures denoise and preserve the textural details better than current methods with up to 95% fewer parameters. The under-parameterized nature also makes them especially robust to a higher level of noise.

Submitted 26 August, 2023; v1 submitted 22 April, 2023; originally announced April 2023.

Comments: Accepted to ICCV 2023

arXiv:2204.04797 [pdf, other]

doi 10.1109/TKDE.2023.3310909

Multi-Label Clinical Time-Series Generation via Conditional GAN

Authors: Chang Lu, Chandan K. Reddy, ** Wang, Dong Nie, Yue Ning

Abstract: In recent years, deep learning has been successfully adopted in a wide range of applications related to electronic health records (EHRs) such as representation learning and clinical event prediction. However, due to privacy constraints, limited access to EHR becomes a bottleneck for deep learning research. To mitigate these concerns, generative adversarial networks (GANs) have been successfully used for generating EHR data. However, there are still challenges in high-quality EHR generation, including generating time-series EHR data and imbalanced uncommon diseases. In this work, we propose a Multi-label Time-series GAN (MTGAN) to generate EHR and simultaneously improve the quality of uncommon disease generation. The generator of MTGAN uses a gated recurrent unit (GRU) with a smooth conditional matrix to generate sequences and uncommon diseases. The critic gives scores using Wasserstein distance to recognize real samples from synthetic samples by considering both data and temporal features. We also propose a training strategy to calculate temporal features for real data and stabilize GAN training. Furthermore, we design multiple statistical metrics and prediction tasks to evaluate the generated data. Experimental results demonstrate the quality of the synthetic data and the effectiveness of MTGAN in generating realistic sequential EHR data, especially for uncommon diseases.

Submitted 31 August, 2023; v1 submitted 10 April, 2022; originally announced April 2022.

Comments: ©2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

arXiv:2112.01147 [pdf, other]

CO2Sum:Contrastive Learning for Factual-Consistent Abstractive Summarization

Authors: Wei Liu, Huanqin Wu, Wenjing Mu, Zhen Li, Tao Chen, Dan Nie

Abstract: Generating factual-consistent summaries is a challenging task for abstractive summarization. Previous works mainly encode factual information or perform post-correct/rank after decoding. In this paper, we provide a factual-consistent solution from the perspective of contrastive learning, which is a natural extension of previous works. We propose CO2Sum (Contrastive for Consistency), a contrastive learning scheme that can be easily applied on sequence-to-sequence models for factual-consistent abstractive summarization, proving that the model can be fact-aware without modifying the architecture. CO2Sum applies contrastive learning on the encoder, which can help the model be aware of the factual information contained in the input article, or performs contrastive learning on the decoder, which makes the model to generate factual-correct output summary. What's more, these two schemes are orthogonal and can be combined to further improve faithfulness. Comprehensive experiments on public benchmarks demonstrate that CO2Sum improves the faithfulness on large pre-trained language models and reaches competitive results compared to other strong factual-consistent summarization baselines.

Submitted 9 January, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

Comments: 9 pages

arXiv:2111.06400 [pdf, other]

Fast T2w/FLAIR MRI Acquisition by Optimal Sampling of Information Complementary to Pre-acquired T1w MRI

Authors: Junwei Yang, Xiao-Xin Li, Feihong Liu, Dong Nie, Pietro Lio, Haikun Qi, Dinggang Shen

Abstract: Recent studies on T1-assisted MRI reconstruction for under-sampled images of other modalities have demonstrated the potential of further accelerating MRI acquisition of other modalities. Most of the state-of-the-art approaches have achieved improvement through the development of network architectures for fixed under-sampling patterns, without fully exploiting the complementary information between modalities. Although existing under-sampling pattern learning algorithms can be simply modified to allow the fully-sampled T1-weighted MR image to assist the pattern learning, no significant improvement on the reconstruction task can be achieved. To this end, we propose an iterative framework to optimize the under-sampling pattern for MRI acquisition of another modality that can complement the fully-sampled T1-weighted MR image at different under-sampling factors, while jointly optimizing the T1-assisted MRI reconstruction model. Specifically, our proposed method exploits the difference of latent information between the two modalities for determining the sampling patterns that can maximize the assistance power of T1-weighted MR image in improving the MRI reconstruction. We have demonstrated superior performance of our learned under-sampling patterns on a public dataset, compared to commonly used under-sampling patterns and state-of-the-art methods that can jointly optimize both the reconstruction network and the under-sampling pattern, up to 8-fold under-sampling factor.

Submitted 10 November, 2021; originally announced November 2021.

arXiv:2108.12126 [pdf, other]

Automated Generation of Accurate & Fluent Medical X-ray Reports

Authors: Hoang T. N. Nguyen, Dong Nie, Taivanbat Badamdorj, Yujie Liu, Yingying Zhu, Jason Truong, Li Cheng

Abstract: Our paper focuses on automating the generation of medical reports from chest X-ray image inputs, a critical yet time-consuming task for radiologists. Unlike existing medical report generation efforts that tend to produce human-readable reports, we aim to generate medical reports that are both fluent and clinically accurate. This is achieved by our fully differentiable and end-to-end paradigm containing three complementary modules: taking the chest X-ray images and clinical history document of patients as inputs, our classification module produces an internal checklist of disease-related topics, referred to as enriched disease embedding; the embedding representation is then passed to our transformer-based generator, giving rise to the medical reports; meanwhile, our generator also produces the weighted embedding representation, which is fed to our interpreter to ensure consistency with respect to disease-related topics. Our approach achieved promising results on commonly-used metrics concerning language fluency and clinical accuracy. Moreover, noticeable performance gains are consistently observed when additional input information is available, such as the clinical document and extra scans of different views.

Submitted 27 August, 2021; originally announced August 2021.

Comments: Accepted in EMNLP

arXiv:2106.04847 [pdf, other]

UniKeyphrase: A Unified Extraction and Generation Framework for Keyphrase Prediction

Authors: Huanqin Wu, Wei Liu, Lei Li, Dan Nie, Tao Chen, Feng Zhang, Di Wang

Abstract: Keyphrase Prediction (KP) task aims at predicting several keyphrases that can summarize the main idea of the given document. Mainstream KP methods can be categorized into purely generative approaches and integrated models with extraction and generation. However, these methods either ignore the diversity among keyphrases or only weakly capture the relation across tasks implicitly. In this paper, we propose UniKeyphrase, a novel end-to-end learning framework that jointly learns to extract and generate keyphrases. In UniKeyphrase, stacked relation layer and bag-of-words constraint are proposed to fully exploit the latent semantic relation between extraction and generation in the view of model structure and training process, respectively. Experiments on KP benchmarks demonstrate that our joint approach outperforms mainstream methods by a large margin.

Submitted 31 August, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

Comments: 11 pages, 6 figures, 6 tables, published in ACL 2021 findings

arXiv:2009.00328 [pdf, other]

Secrecy Outage Analysis of Two-Hop Decode-and-Forward Mixed RF/UWOC Systems

Authors: Yi Lou, Ruofan Sun, Julian Cheng, Donghu Nie, Gang Qiao

Abstract: We analyze the secrecy performance of a two-hop mixed radio frequency (RF)/underwater wireless optical communication (UWOC) system using a decode-and-forward (DF) relay. All RF and UWOC links are modeled by the $α-μ$ and exponential-generalized Gamma distributions, respectively. We first derive the expressions of the secrecy outage probability (SOP) in exact closed-form, which are subsequently used to derive asymptotic expressions at high SNR that only includes simple functions for further insight. Moreover, based on the asymptotic expression, we can determine the optimal transmit power for a wide variety of RF and UWOC channel conditions. All analyses are validated using Monte Carlo simulation.

Submitted 1 September, 2020; originally announced September 2020.
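
The workflow this abstract describes, deriving a closed-form secrecy outage probability (SOP) and validating it by Monte Carlo simulation, can be illustrated on a much simpler channel. The sketch below is not the paper's model: it assumes a single-hop link with exponentially distributed SNRs (Rayleigh fading) in place of the two-hop $α-μ$ / exponential-generalized Gamma setting, and the function names and parameter values are hypothetical. It uses the standard definition of an outage event, namely that the secrecy capacity max(0, log2(1 + γ_D) − log2(1 + γ_E)) falls below a target rate.

```python
import math
import random


def sop_monte_carlo(mean_snr_d: float, mean_snr_e: float, rate: float,
                    trials: int = 200_000, seed: int = 0) -> float:
    """Estimate P(secrecy capacity < rate) by sampling exponential SNRs."""
    rng = random.Random(seed)
    outages = 0
    for _ in range(trials):
        snr_d = rng.expovariate(1.0 / mean_snr_d)  # legitimate receiver
        snr_e = rng.expovariate(1.0 / mean_snr_e)  # eavesdropper
        capacity = max(0.0, math.log2(1.0 + snr_d) - math.log2(1.0 + snr_e))
        if capacity < rate:
            outages += 1
    return outages / trials


def sop_closed_form(mean_snr_d: float, mean_snr_e: float, rate: float) -> float:
    """Exact SOP for the same simplified single-hop Rayleigh model."""
    theta = 2.0 ** rate
    no_outage = (mean_snr_d / (mean_snr_d + theta * mean_snr_e)) \
        * math.exp(-(theta - 1.0) / mean_snr_d)
    return 1.0 - no_outage


estimate = sop_monte_carlo(mean_snr_d=10.0, mean_snr_e=1.0, rate=1.0)
exact = sop_closed_form(mean_snr_d=10.0, mean_snr_e=1.0, rate=1.0)
```

With these illustrative parameters the closed form evaluates to roughly 0.246, and the simulated estimate should agree with it to within sampling error. The same comparison pattern applies to more elaborate channel models, where only the sampling step and the analytical expression change.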

arXiv:2008.02868 [pdf, other]

Performance of Underwater Wireless Optical Communications in Presents of Cascaded Mixture Exponential-Generalized Gamma Turbulence

Authors: Yi Lou, Julian Cheng, Donghu Nie, Gang Qiao

Abstract: Underwater wireless optical communication is one of the critical technologies for buoy-based high-speed cross-sea surface communication, where the communication nodes are vertically deployed. Due to the vertically inhomogeneous nature of the underwater environment, seawater is usually vertically divided into multiple layers with different parameters that reflect the real environment. In this work, we consider a generalized UWOC channel model that contains $N$ layers. To capture the effects of air bubbles and temperature gradients on channel statistics, we model each layer by a mixture Exponential-Generalized Gamma (EGG) distribution. We derive the PDF and CDF of the end-to-end SNR in exact closed-form. Then, unified BER and outage expressions using OOK and BPSK are also derived. The performance and behavior of common vertical underwater optical communication scenarios are thoroughly analyzed through the appropriate selection of parameters. All the derived expressions are verified via Monte Carlo simulations.

Submitted 6 August, 2020; originally announced August 2020.

Comments: 12 pages, 4 figures

arXiv:2005.10439 [pdf, other]

HF-UNet: Learning Hierarchically Inter-Task Relevance in Multi-Task U-Net for Accurate Prostate Segmentation

Authors: Kelei He, Chunfeng Lian, Bing Zhang, Xin Zhang, Xiaohuan Cao, Dong Nie, Yang Gao, Junfeng Zhang, Dinggang Shen

Abstract: Accurate segmentation of the prostate is a key step in external beam radiation therapy treatments. In this paper, we tackle the challenging task of prostate segmentation in CT images by a two-stage network with 1) the first stage to fast localize, and 2) the second stage to accurately segment the prostate. To precisely segment the prostate in the second stage, we formulate prostate segmentation into a multi-task learning framework, which includes a main task to segment the prostate, and an auxiliary task to delineate the prostate boundary. Here, the second task is applied to provide additional guidance of unclear prostate boundary in CT images. Besides, the conventional multi-task deep networks typically share most of the parameters (i.e., feature representations) across all tasks, which may limit their data fitting ability, as the specificities of different tasks are inevitably ignored. By contrast, we solve them by a hierarchically-fused U-Net structure, namely HF-UNet. The HF-UNet has two complementary branches for two tasks, with the novel proposed attention-based task consistency learning block to communicate at each level between the two decoding branches. Therefore, HF-UNet endows the ability to learn hierarchically the shared representations for different tasks, and preserve the specificities of learned representations for different tasks simultaneously. We did extensive evaluations of the proposed method on a large planning CT image dataset, including images acquired from 339 patients. The experimental results show HF-UNet outperforms the conventional multi-task network architectures and the state-of-the-art methods.

Submitted 23 May, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

arXiv:2002.00092 [pdf, other]

Hybrid Graph Neural Networks for Crowd Counting

Authors: Ao Luo, Fan Yang, Xin Li, Dong Nie, Zhicheng Jiao, Shangchen Zhou, Hong Cheng

Abstract: Crowd counting is an important yet challenging task due to the large scale and density variation. Recent investigations have shown that distilling rich relations among multi-scale features and exploiting useful information from the auxiliary task, i.e., localization, are vital for this task. Nevertheless, how to comprehensively leverage these relations within a unified network architecture is still a challenging problem. In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn) which targets to relieve the problem by interweaving the multi-scale features for crowd density as well as its auxiliary task (localization) together and performing joint reasoning over a graph. Specifically, HyGnn integrates a hybrid graph to jointly represent the task-specific feature maps of different scales as nodes, and two types of relations as edges: (i) multi-scale relations for capturing the feature dependencies across scales and (ii) mutual beneficial relations building bridges for the cooperation between counting and localization. Thus, through message passing, HyGnn can distill rich relations between the nodes to obtain more powerful representations, leading to robust and accurate results. Our HyGnn performs significantly well on four challenging datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF_QNRF, outperforming the state-of-the-art approaches by a large margin.

Submitted 31 January, 2020; originally announced February 2020.

Comments: To appear in AAAI 2020

arXiv:1907.03297 [pdf, other]

Dual Adversarial Learning with Attention Mechanism for Fine-grained Medical Image Synthesis

Authors: Dong Nie, Lei Xiang, Qian Wang, Dinggang Shen

Abstract: Medical imaging plays a critical role in various clinical applications. However, due to multiple considerations such as cost and risk, the acquisition of certain image modalities could be limited. To address this issue, many cross-modality medical image synthesis methods have been proposed. However, the current methods cannot well model the hard-to-synthesis regions (e.g., tumor or lesion regions). To address this issue, we propose a simple but effective strategy, that is, we propose a dual-discriminator (dual-D) adversarial learning system, in which, a global-D is used to make an overall evaluation for the synthetic image, and a local-D is proposed to densely evaluate the local regions of the synthetic image. More importantly, we build an adversarial attention mechanism which targets at better modeling hard-to-synthesize regions (e.g., tumor or lesion regions) based on the local-D. Experimental results show the robustness and accuracy of our method in synthesizing fine-grained target images from the corresponding source images. In particular, we evaluate our method on two datasets, i.e., to address the tasks of generating T2 MRI from T1 MRI for the brain tumor images and generating MRI from CT. Our method outperforms the state-of-the-art methods under comparison in all datasets and tasks. And the proposed difficult-region-aware attention mechanism is also proved to be able to help generate more realistic images, especially for the hard-to-synthesize regions.

Submitted 7 July, 2019; originally announced July 2019.

arXiv:1906.04306 [pdf, other]

Semantic-guided Encoder Feature Learning for Blurry Boundary Delineation

Authors: Dong Nie, Dinggang Shen

Abstract: Encoder-decoder architectures are widely adopted for medical image segmentation tasks. With the lateral skip connection, the models can obtain and fuse both semantic and resolution information in deep layers to achieve more accurate segmentation performance. However, in many applications (e.g., blurry boundary images), these models often cannot precisely locate complex boundaries and segment tiny isolated parts. To solve this challenging problem, we firstly analyze why simple skip connections are not enough to help accurately locate indistinct boundaries and argue that it is due to the fuzzy information in the skip connection provided in the encoder layers. Then we propose a semantic-guided encoder feature learning strategy to learn both high resolution and rich semantic encoder features so that we can more accurately locate the blurry boundaries, which can also enhance the network by selectively learning discriminative features. Besides, we further propose a soft contour constraint mechanism to model the blurry boundary detection. Experimental results on real clinical datasets show that our proposed method can achieve state-of-the-art segmentation accuracy, especially for the blurry regions. Further analysis also indicates that our proposed network components indeed contribute to the improvement of performance. Experiments on additional datasets validate the generalization ability of our proposed method.

Submitted 10 June, 2019; originally announced June 2019.

arXiv:1905.08720 [pdf]

doi 10.1109/TIP.2020.3003735

Task Decomposition and Synchronization for Semantic Biomedical Image Segmentation

Authors: Xuhua Ren, Lichi Zhang, Sahar Ahmad, Dong Nie, Fan Yang, Lei Xiang, Qian Wang, Dinggang Shen

Abstract: Semantic segmentation is essentially important to biomedical image analysis. Many recent works mainly focus on integrating the Fully Convolutional Network (FCN) architecture with sophisticated convolution implementation and deep supervision. In this paper, we propose to decompose the single segmentation task into three subsequent sub-tasks, including (1) pixel-wise image segmentation, (2) prediction of the class labels of the objects within the image, and (3) classification of the scene the image belonging to. While these three sub-tasks are trained to optimize their individual loss functions of different perceptual levels, we propose to let them interact by the task-task context ensemble. Moreover, we propose a novel sync-regularization to penalize the deviation between the outputs of the pixel-wise segmentation and the class prediction tasks. These effective regularizations help FCN utilize context information comprehensively and attain accurate semantic segmentation, even though the number of the images for training may be limited in many biomedical applications. We have successfully applied our framework to three diverse 2D/3D medical image datasets, including Robotic Scene Segmentation Challenge 18 (ROBOT18), Brain Tumor Segmentation Challenge 18 (BRATS18), and Retinal Fundus Glaucoma Challenge (REFUGE18). We have achieved top-tier performance in all three challenges.

Submitted 22 June, 2019; v1 submitted 21 May, 2019; originally announced May 2019.

Comments: IEEE Transactions on Medical Imaging

arXiv:1811.02629 [pdf, other]

Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

Authors: Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, Marcel Prastawa, Esther Alberts, Jana Lipkova, John Freymann, Justin Kirby, Michel Bilello, Hassan Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Benedikt Wiestler, Rivka Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, et al. (402 additional authors not shown)

Abstract: Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.

Submitted 23 April, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

Comments: The International Multimodal Brain Tumor Segmentation (BraTS) Challenge

arXiv:1709.02073 [pdf]

Deep Embedding Convolutional Neural Network for Synthesizing CT Image from T1-Weighted MR Image

Authors: Lei Xiang, Qian Wang, Xiyao Jin, Dong Nie, Yu Qiao, Dinggang Shen

Abstract: Recently, more and more attention is drawn to the field of medical image synthesis across modalities. Among them, the synthesis of computed tomography (CT) image from T1-weighted magnetic resonance (MR) image is of great importance, although the mapping between them is highly complex due to large gaps of appearances of the two modalities. In this work, we aim to tackle this MR-to-CT synthesis by a novel deep embedding convolutional neural network (DECNN). Specifically, we generate the feature maps from MR images, and then transform these feature maps forward through convolutional layers in the network. We can further compute a tentative CT synthesis from the midway of the flow of feature maps, and then embed this tentative CT synthesis back to the feature maps. This embedding operation results in better feature maps, which are further transformed forward in DECNN. After repeating this embedding procedure for several times in the network, we can eventually synthesize a final CT image in the end of the DECNN. We have validated our proposed method on both brain and prostate datasets, by also comparing with the state-of-the-art methods. Experimental results suggest that our DECNN (with repeated embedding operations) demonstrates its superior performances, in terms of both the perceptive quality of the synthesized CT image and the run-time cost for synthesizing a CT image.

Submitted 8 November, 2017; v1 submitted 7 September, 2017; originally announced September 2017.

arXiv:1612.05362 [pdf, other]

Medical Image Synthesis with Context-Aware Generative Adversarial Networks

Authors: Dong Nie, Roger Trullo, Caroline Petitjean, Su Ruan, Dinggang Shen

Abstract: Computed tomography (CT) is critical for various clinical applications, e.g., radiotherapy treatment planning and also PET attenuation correction. However, CT exposes radiation during acquisition, which may cause side effects to patients. Compared to CT, magnetic resonance imaging (MRI) is much safer and does not involve any radiations. Therefore, recently, researchers are greatly motivated to estimate CT image from its corresponding MR image of the same subject for the case of radiotherapy planning. In this paper, we propose a data-driven approach to address this challenging problem. Specifically, we train a fully convolutional network to generate CT given an MR image. To better model the nonlinear relationship from MRI to CT and to produce more realistic images, we propose to use the adversarial training strategy and an image gradient difference loss function. We further apply AutoContext Model to implement a context-aware generative adversarial network. Experimental results show that our method is accurate and robust for predicting CT images from MRI images, and also outperforms three state-of-the-art methods under comparison.

Submitted 15 December, 2016; originally announced December 2016.

arXiv:1612.05306 [pdf, ps, other]

Beampattern-Based Tracking for Millimeter Wave Communication Systems

Authors: Kang Gao, Mingming Cai, Ding Nie, Bertrand Hochwald, J. Nicholas Laneman, Huang Huang, Kunpeng Liu

Abstract: We present a tracking algorithm to maintain the communication link between a base station (BS) and a mobile station (MS) in a millimeter wave (mmWave) communication system, where antenna arrays are used for beamforming in both the BS and MS. Downlink transmission is considered, and the tracking is performed at the MS as it moves relative to the BS. Specifically, we consider the case that the MS rotates quickly due to hand movement. The algorithm estimates the angle of arrival (AoA) by using variations in the radiation pattern of the beam as a function of this angle. Numerical results show that the algorithm achieves accurate beam alignment when the MS rotates in a wide range of angular speeds. For example, the algorithm can support angular speeds up to 800 degrees per second when tracking updates are available every 10 ms.

Submitted 15 December, 2016; originally announced December 2016.

Comments: 6 pages, to be published in Proc. IEEE GLOBECOM 2016, Washington, D.C., USA

arXiv:1609.03160 [pdf, ps, other]

Effect of Wideband Beam Squint on Codebook Design in Phased-Array Wireless Systems

Authors: Mingming Cai, Kang Gao, Ding Nie, Bertrand Hochwald, J. Nicholas Laneman, Huang Huang, Kunpeng Liu

Abstract: Analog beamforming with phased arrays is a promising technique for 5G wireless communication at millimeter wave frequencies. Using a discrete codebook consisting of multiple analog beams, each beam focuses on a certain range of angles of arrival or departure and corresponds to a set of fixed phase shifts across frequency due to practical hardware considerations. However, for sufficiently large bandwidth, the gain provided by the phased array is actually frequency dependent, which is an effect called beam squint, and this effect occurs even if the radiation pattern of the antenna elements is frequency independent. This paper examines the nature of beam squint for a uniform linear array (ULA) and analyzes its impact on codebook design as a function of the number of antennas and system bandwidth normalized by the carrier frequency. The criterion for codebook design is to guarantee that each beam's minimum gain for a range of angles and for all frequencies in the wideband system exceeds a target threshold, for example 3 dB below the array's maximum gain. Analysis and numerical examples suggest that a denser codebook is required to compensate for beam squint. For example, 54% more beams are needed compared to a codebook design that ignores beam squint for a ULA with 32 antennas operating at a carrier frequency of 73 GHz and bandwidth of 2.5 GHz. Furthermore, beam squint with this design criterion limits the bandwidth or the number of antennas of the array if the other one is fixed.

Submitted 22 September, 2016; v1 submitted 11 September, 2016; originally announced September 2016.

Comments: 6 pages, to be published in Proc. IEEE GLOBECOM 2016, Washington, D.C., USA

arXiv:1511.03518 [pdf, ps, other]

doi 10.1016/j.physa.2016.06.027

Diffusion-like recommendation with enhanced similarity of objects

Authors: Ya-Hui An, Qiang Dong, Chong-Jing Sun, Da-Cheng Nie, Yan Fu

Abstract: In last decades, diversity and accuracy have been regarded as two important measures in evaluating a recommendation model. However, a clear concern is that a model focusing excessively on one measure will put the other one at risk, thus it is not easy to greatly improve diversity and accuracy simultaneously. In this paper, we propose to enhance the Resource-Allocation (RA) similarity in resource transfer equations of diffusion-like models, by giving a tunable exponent to the RA similarity, and traversing the value of the exponent to achieve the optimal recommendation results. In this way, we can increase the recommendation scores (allocated resource) of many unpopular objects. Experiments on three benchmark data sets, MovieLens, Netflix, and RateYourMusic show that the modified models can yield remarkable performance improvement compared with the original ones.

Submitted 11 October, 2018; v1 submitted 11 November, 2015; originally announced November 2015.

Journal ref: Physica A: Statistical Mechanics and its Applications 461 (2016) 708-715

arXiv:1509.02152 [pdf, other]

doi 10.1109/TAP.2016.2645778

doi 10.1109/TAP.2016.2645786

Bandwidth Analysis of Multiport Radio-Frequency Systems

Authors: Ding Nie, Bertrand M. Hochwald

Abstract: When multiple radio-frequency sources are connected to multiple loads through a passive multiport matching network, perfect power transfer to the loads across all frequencies is generally impossible. In this two-part paper, we provide analyses of bandwidth over which power transfer is possible. Our principal tools include broadband multiport matching upper bounds, presented herein, on the integral over all frequency of the logarithm of a suitably defined power loss ratio. In general, the larger the integral, the larger the bandwidth over which power transfer can be accomplished. We apply these bounds in several ways: We show how the number of sources and loads, and the coupling between loads, affect achievable bandwidth. We analyze the bandwidth of networks constrained to have certain architectures. We characterize systems whose bandwidths scale as the ratio between the numbers of loads and sources. The first part of the paper presents the bounds and uses them to analyze loads whose frequency responses can be represented by analytical circuit models. The second part analyzes the bandwidth of realistic loads whose frequency responses are available numerically. We provide applications to wireless transmitters where the loads are antennas being driven by amplifiers. The derivations of the bounds are also included.

Submitted 15 March, 2017; v1 submitted 7 September, 2015; originally announced September 2015.

Comments: Published by IEEE Transactions on Antennas and Propagation

Journal ref: IEEE Trans. Ant. Prop., vol. 65, no. 3, pp. 1081--1107, Mar. 2017

arXiv:1408.5240 [pdf, other]

doi 10.1016/j.physa.2014.10.021

Whether Information Network Supplements Friendship Network

Authors: Lili Miao, Qian-Ming Zhang, Da-Chen Nie, Shi-Min Cai

Abstract: Homophily is a significant mechanism for link prediction in complex network, of which principle describes that people with similar profiles or experiences tend to tie with each other. In a multi-relationship network, friendship among people has been utilized to reinforce similarity of taste for recommendation system whose basic idea is similar to homophily, yet how the taste inversely affects friendship prediction is little discussed. This paper contributes to address the issue by analyzing two benchmark datasets both including user's behavioral information of taste and friendship based on the principle of homophily. It can be found that the creation of friendship tightly associates with personal taste. Especially, the behavioral information of taste involving with popular objects is much more effective to improve the performance of friendship prediction. However, this result seems to be contradictory to the finding in [Q.M. Zhang, et al., PLoS ONE 8(2013)e62624] that the behavior information of taste involving with popular objects is redundant in recommendation system. We thus discuss this inconformity to comprehensively understand the correlation between them.

Submitted 22 August, 2014; originally announced August 2014.

Comments: 8 pages, 5 figures

Journal ref: Physica A 419, 301 (2015)

arXiv:1403.7595 [pdf, ps, other]

doi 10.1371/journal.pone.0101675

Information Filtering on Coupled Social Networks

Authors: Da-Cheng Nie, Zi-Ke Zhang, Jun-lin Zhou, Yan Fu, Kui Zhang

Abstract: In this paper, based on the coupled social networks (CSN), we propose a hybrid algorithm to nonlinearly integrate both social and behavior information of online users. Filtering algorithm based on the coupled social networks, which considers the effects of both social influence and personalized preference. Experimental results on two real datasets, Epinions and Friendfeed, show that hybrid pattern can not only provide more accurate recommendations, but also can enlarge the recommendation coverage while adopting global metric. Further empirical analyses demonstrate that the mutual reinforcement and rich-club phenomenon can also be found in coupled social networks where the identical individuals occupy the core position of the online system. This work may shed some light on the in-depth understanding structure and function of coupled social networks.

Submitted 29 March, 2014; originally announced March 2014.

arXiv:1402.5774 [pdf, ps, other]

Information Filtering via Balanced Diffusion on Bipartite Networks

Authors: Da-Cheng Nie, Ya-Hui An, Qiang Dong, Yan Fu, Tao Zhou

Abstract: Recent decade has witnessed the increasing popularity of recommender systems, which help users acquire relevant commodities and services from overwhelming resources on Internet. Some simple physical diffusion processes have been used to design effective recommendation algorithms for user-object bipartite networks, typically mass diffusion (MD) and heat conduction (HC) algorithms which have different advantages respectively on accuracy and diversity. In this paper, we investigate the effect of weight assignment in the hybrid of MD and HC, and find that a new hybrid algorithm of MD and HC with balanced weights will achieve the optimal recommendation results, we name it balanced diffusion (BD) algorithm. Numerical experiments on three benchmark data sets, MovieLens, Netflix and RateYourMusic (RYM), show that the performance of BD algorithm outperforms the existing diffusion-based methods on the three important recommendation metrics, accuracy, diversity and novelty. Specifically, it can not only provide accurately recommendation results, but also yield higher diversity and novelty in recommendations by accurately recommending unpopular objects.

Submitted 24 February, 2014; originally announced February 2014.

Comments: 13 pages, 6 figures

arXiv:1402.4621 [pdf]

Cyber Behavior of Microblog Users: Onlies Versus Others

Authors: Dong Nie, Bibo Hao, Zheng Yan, Tingshao Zhu

Abstract: Much research has been conducted to investigate personality and daily behavior of these only children ('Onlies') due to the Chinese one-child-per-family policy, and report the singleton generation to be more selfish. As Microblog becomes increasingly popular recently in China, we studied cyber behavior of Onlies and children with siblings ('Others') on Sina Microblog ('Weibo'), a leading Microblog service provider in China. Participants were 1792 Weibo users. Their recorded data on Weibo were downloaded to assess their cyber behaviors. The general results show that (1) Onlies have a smaller social circle; (2) Onlies are more significantly active on social platform.

Submitted 19 February, 2014; originally announced February 2014.

Comments: 24 pages

Showing 1–26 of 26 results for author: Nie, D