Capacity Credit Evaluation of Generalized Energy Storage Considering Endogenous Uncertainty

Authors: Ning Qi, Pierre Pinson, Mads R. Almassalkhi, Yingrui Zhuang, Yifan Su, Feng Liu

Abstract: Generalized energy storage (GES), encompassing both physical and virtual energy storage, can provide remarkable but uncertain adequacy flexibility. When assessing GES's contribution to resource adequacy, the literature typically considers exogenous uncertainties (e.g., failures and stochastic response) but overlooks endogenous uncertainties, such as self-scheduling in liberal markets and decision-dependent uncertainty (DDU). In this regard, this paper proposes a novel capacity credit evaluation framework to accurately quantify GES's contribution to resource adequacy, where a sequential coordinated dispatch method is proposed to capture realistic GES operations by coordinating self-scheduling in the day-ahead energy market and real-time adequacy-oriented dispatch in the capacity market. To incorporate DDU of GES (i.e., responsiveness affected by dispatch decisions and prices in capacity market), we present a chance-constrained optimization approach and tractable solution methodologies for real-time dispatch. We propose a practical adequacy assessment method to quantify the impact of DDU on capacity credit by evaluating the consequence of ignoring DDU. Additionally, a novel capacity credit index called equivalent storage capacity substitution is introduced to quantify the equivalent deterministic storage capacity of the uncertain virtual energy storage. Simulations show that the proposed method yields reliable and accurate capacity credit values by accounting for self-scheduling of GES and managing the risk from DDU. Finally, key impact factors of GES's capacity credit are thoroughly discussed, offering valuable insights for the decision-making of capacity market operators.

Submitted 11 June, 2024; originally announced June 2024.

Comments: This is a manuscript submitted to IEEE Transactions on Power Systems

arXiv:2406.05954 [pdf, other]

Aligning Large Language Models with Representation Editing: A Control Perspective

Authors: Lingkai Kong, Haorui Wang, Wenhao Mu, Yuanqi Du, Yuchen Zhuang, Yifei Zhou, Yue Song, Rongzhi Zhang, Kai Wang, Chao Zhang

Abstract: Aligning large language models (LLMs) with human objectives is crucial for real-world applications. However, fine-tuning LLMs for alignment often suffers from unstable training and requires substantial computing resources. Test-time alignment techniques, such as prompting and guided decoding, do not modify the underlying model, and their performance remains dependent on the original model's capabilities. To address these challenges, we propose aligning LLMs through representation editing. The core of our method is to view a pre-trained autoregressive LLM as a discrete-time stochastic dynamical system. To achieve alignment for specific objectives, we introduce external control signals into the state space of this language dynamical system. We train a value function directly on the hidden states according to the Bellman equation, enabling gradient-based optimization to obtain the optimal control signals at test time. Our experiments demonstrate that our method outperforms existing test-time alignment techniques while requiring significantly fewer resources compared to fine-tuning methods.

Submitted 11 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

Comments: fix typos

arXiv:2405.05944 [pdf, other]

MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI

Authors: Yan Zhuang, Tejas Sudharshan Mathai, Pritam Mukherjee, Brandon Khoury, Boah Kim, Benjamin Hou, Nusrat Rabbee, Abhinav Suri, Ronald M. Summers

Abstract: Background: Segmentation of organs and structures in abdominal MRI is useful for many clinical applications, such as disease diagnosis and radiotherapy. Current approaches have focused on delineating a limited set of abdominal structures (13 types). To date, there is no publicly available abdominal MRI dataset with voxel-level annotations of multiple organs and structures. Consequently, a segmentation tool for multi-structure segmentation is also unavailable. Methods: We curated a T1-weighted abdominal MRI dataset consisting of 195 patients who underwent imaging at the National Institutes of Health (NIH) Clinical Center. The dataset comprises axial pre-contrast T1, arterial, venous, and delayed phases for each patient, amounting to a total of 780 series (69,248 2D slices). Each series contains voxel-level annotations of 62 abdominal organs and structures. A 3D nnUNet model, dubbed MRISegmentator-Abdomen (MRISegmentator for short), was trained on this dataset, and evaluation was conducted on an internal test set and two large external datasets: AMOS22 and Duke Liver. The predicted segmentations were compared against the ground truth using the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD). Findings: MRISegmentator achieved an average DSC of 0.861±0.170 and an NSD of 0.924±0.163 on the internal test set. On the AMOS22 dataset, MRISegmentator attained an average DSC of 0.829±0.133 and an NSD of 0.908±0.067. For the Duke Liver dataset, an average DSC of 0.933±0.015 and an NSD of 0.929±0.021 were obtained. Interpretation: The proposed MRISegmentator provides automatic, accurate, and robust segmentations of 62 organs and structures in T1-weighted abdominal MRI sequences. The tool has the potential to accelerate research on various clinical topics, such as abnormality detection, radiotherapy, and disease classification, among others.

Submitted 24 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: We made the segmentation model publicly available
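
Note on the metric: the Dice Similarity Coefficient (DSC) reported in the abstract above measures the overlap between a predicted mask A and a reference mask B as 2|A∩B| / (|A| + |B|). The following is a minimal sketch of that formula, not the authors' evaluation code. The function name `dice_coefficient`, the flat 0/1 list representation of a mask, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (an assumed representation; real masks are 3D voxel arrays).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")

    # Count voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(round(dice_coefficient(predicted, reference), 4))
```

In a multi-organ setting such as the one described, a score like this would typically be computed per structure and then averaged, although the exact averaging used in the paper is not stated in the abstract.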

arXiv:2404.18105 [pdf, other]

Tightly-Coupled VLP/INS Integrated Navigation by Inclination Estimation and Blockage Handling

Authors: Xiao Sun, Yuan Zhuang, Xiansheng Yang, Jianzhu Huai, Tianming Huang, Daquan Feng

Abstract: Visible Light Positioning (VLP) has emerged as a promising technology capable of delivering indoor localization with high accuracy. In VLP systems that use Photodiodes (PDs) as light receivers, the Received Signal Strength (RSS) is affected by the incidence angle of light, making the inclination of PDs a critical parameter in the positioning model. Currently, most studies assume the inclination to be constant, limiting the applications and positioning accuracy. Additionally, light blockages may severely interfere with the RSS measurements, but the literature has not explored blockage detection in real-world experiments. To address these problems, we propose a tightly coupled VLP/INS (Inertial Navigation System) integrated navigation system that uses graph optimization to account for varying PD inclinations and VLP blockages. We also discuss the possibility of simultaneously estimating the robot's pose and the locations of some unknown LEDs. Simulations and two groups of real-world experiments demonstrate the efficiency of our approach, achieving an average positioning accuracy of 10 cm during movement and inclination accuracy within 1 degree despite inclination changes and blockages.

Submitted 28 April, 2024; originally announced April 2024.

arXiv:2404.16905 [pdf, other]

Samsung Research China-Beijing at SemEval-2024 Task 3: A multi-stage framework for Emotion-Cause Pair Extraction in Conversations

Authors: Shen Zhang, Haojie Zhang, Jing Zhang, Xudong Zhang, Yimeng Zhuang, Jinting Wu

Abstract: In human-computer interaction, it is crucial for agents to respond to humans by understanding their emotions. Unraveling the causes of emotions is more challenging. A new task named Multimodal Emotion-Cause Pair Extraction in Conversations is responsible for recognizing emotion and identifying causal expressions. In this study, we propose a multi-stage framework to generate emotion and extract the emotion causal pairs given the target emotion. In the first stage, Llama-2-based InstructERC is utilized to extract the emotion category of each utterance in a conversation. After emotion recognition, a two-stream attention model is employed to extract the emotion causal pairs given the target emotion for subtask 2, while MuTEC is employed to extract the causal span for subtask 1. Our approach achieved first place in both subtasks of the competition.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2312.06453 [pdf, other]

Semantic Image Synthesis for Abdominal CT

Authors: Yan Zhuang, Benjamin Hou, Tejas Sudharshan Mathai, Pritam Mukherjee, Boah Kim, Ronald M. Summers

Abstract: As a newly emerging and promising type of generative model, diffusion models have proven to outperform Generative Adversarial Networks (GANs) in multiple tasks, including image synthesis. In this work, we explore semantic image synthesis for abdominal CT using conditional diffusion models, which can be used for downstream applications such as data augmentation. We systematically evaluated the performance of three diffusion models, compared them with other state-of-the-art GAN-based approaches, and studied the different conditioning scenarios for the semantic mask. Experimental results demonstrated that diffusion models were able to synthesize abdominal CT images with better quality. Additionally, encoding the mask and the input separately is more effective than naïve concatenation.

Submitted 11 December, 2023; originally announced December 2023.

Comments: This paper has been accepted at Deep Generative Models workshop at MICCAI 2023

arXiv:2306.09014 [pdf, other]

Geometric Wide-Angle Camera Calibration: A Review and Comparative Study

Authors: Jianzhu Huai, Yuan Zhuang, Yuxin Shao, Grzegorz Jozkow, Binliang Wang, Yijia He, Alper Yilmaz

Abstract: Wide-angle cameras are widely used in photogrammetry and autonomous systems which rely on the accurate metric measurements derived from images. To find the geometric relationship between incoming rays and image pixels, geometric camera calibration (GCC) has been actively developed. Aiming to provide practical calibration guidelines, this work surveys the existing GCC tools and evaluates the representative ones for wide-angle cameras. The survey covers camera models, calibration targets, and algorithms used in these tools, highlighting their properties and the trends in GCC development. The evaluation compares six target-based GCC tools, namely, BabelCalib, Basalt, Camodocal, Kalibr, the MATLAB calibrator, and the OpenCV-based ROS calibrator, with simulated and real data for wide-angle cameras described by four parametric projection models. These tests reveal the strengths and weaknesses of these camera models, as well as the repeatability of these GCC tools. In view of the survey and evaluation, future research directions of wide-angle GCC are also discussed.

Submitted 27 March, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: 18 pages, 12 figures

arXiv:2211.14986 [pdf]

An Unpaired Cross-modality Segmentation Framework Using Data Augmentation and Hybrid Convolutional Networks for Segmenting Vestibular Schwannoma and Cochlea

Authors: Yuzhou Zhuang, Hong Liu, Enmin Song, Coskun Cetinkaya, Chih-Cheng Hung

Abstract: The crossMoDA challenge aims to automatically segment the vestibular schwannoma (VS) tumor and cochlea regions of unlabeled high-resolution T2 scans by leveraging labeled contrast-enhanced T1 scans. The 2022 edition extends the segmentation task by including multi-institutional scans. In this work, we propose an unpaired cross-modality segmentation framework using data augmentation and hybrid convolutional networks. Considering heterogeneous distributions and various image sizes for multi-institutional scans, we apply min-max normalization to scale the intensities of all scans between -1 and 1, and use voxel size resampling and center cropping to obtain fixed-size sub-volumes for training. We adopt two data augmentation methods for effectively learning the semantic information and generating realistic target domain scans: generative and online data augmentation. For generative data augmentation, we use CUT and CycleGAN to generate two groups of realistic T2 volumes with different details and appearances for supervised segmentation training. For online data augmentation, we design a random tumor signal reducing method for simulating the heterogeneity of VS tumor signals. Furthermore, we utilize an advanced hybrid convolutional network with multi-dimensional convolutions to adaptively learn sparse inter-slice information and dense intra-slice information for accurate volumetric segmentation of VS tumor and cochlea regions in anisotropic scans. On the crossMoDA2022 validation dataset, our method produces promising results and achieves mean DSC values of 72.47% and 76.48% and ASSD values of 3.42 mm and 0.53 mm for VS tumor and cochlea regions, respectively.

Submitted 27 November, 2022; originally announced November 2022.

Comments: Accepted by BrainLes MICCAI proceedings
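
Note on the preprocessing: the abstract above mentions min-max normalization of intensities to the range -1 to 1 and center cropping to a fixed size. The following is a minimal one-dimensional sketch of those two steps, not the authors' pipeline, which operates on 3D volumes. The function names `min_max_scale` and `center_crop`, the list-based representation, and the handling of constant input (mapping every value to the midpoint of the target range) are assumptions made for illustration.

```python
def min_max_scale(values, low=-1.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Assumed convention: a constant signal maps to the middle of the range.
        midpoint = (low + high) / 2.0
        return [midpoint for _ in values]
    scale = (high - low) / (vmax - vmin)
    return [low + (v - vmin) * scale for v in values]


def center_crop(values, size):
    """Return the central window of length size from a 1D sequence."""
    if size > len(values):
        raise ValueError("crop size exceeds input length")
    start = (len(values) - size) // 2
    return list(values[start:start + size])


if __name__ == "__main__":
    intensities = [0, 5, 10]
    print(min_max_scale(intensities))          # [-1.0, 0.0, 1.0]
    print(center_crop([0, 1, 2, 3, 4, 5], 4))  # [1, 2, 3, 4]
```

For a real scan, the same rescaling would be applied over all voxels of a volume and the crop taken along each spatial axis; whether the minimum and maximum are computed per scan or per dataset is not stated in the abstract.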

arXiv:2209.07937 [pdf, other]

DPFNet: A Dual-branch Dilated Network with Phase-aware Fourier Convolution for Low-light Image Enhancement

Authors: Yunliang Zhuang, Zhuoran Zheng, Chen Lyu

Abstract: Low-light image enhancement is a classical computer vision problem aiming to recover normal-exposure images from low-light images. However, convolutional neural networks commonly used in this field are good at sampling low-frequency local structural features in the spatial domain, which leads to unclear texture details of the reconstructed images. To alleviate this problem, we propose a novel module using the Fourier coefficients, which can recover high-quality texture details under the constraint of semantics in the frequency phase and supplement the spatial domain. In addition, we design a simple and efficient module for the image spatial domain using dilated convolutions with different receptive fields to alleviate the loss of detail caused by frequent downsampling. We integrate the above parts into an end-to-end dual branch network and design a novel loss committee and an adaptive fusion module to guide the network to flexibly combine spatial and frequency domain features to generate more pleasing visual effects. Finally, we evaluate the proposed network on public benchmarks. Extensive experimental results show that our method outperforms many existing state-of-the-art ones, showing outstanding performance and potential.

Submitted 16 September, 2022; originally announced September 2022.

arXiv:2207.04211 [pdf, other]

BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval

Authors: Wenqiao Zhang, Jiannan Guo, Mengze Li, Haochen Shi, Shengyu Zhang, Juncheng Li, Siliang Tang, Yueting Zhuang

Abstract: Content-Based Image Retrieval (CIR) aims to search for a target image by concurrently comprehending the composition of an example image and a complementary text, which potentially impacts a wide variety of real-world applications, such as internet search and fashion retrieval. In this scenario, the input image serves as an intuitive context and background for the search, while the corresponding language expressly requests new traits on how specific characteristics of the query image should be modified in order to get the intended target image. This task is challenging since it necessitates learning and understanding the composite image-text representation by incorporating cross-granular semantic updates. In this paper, we tackle this task by a novel Bottom-up crOss-modal Semantic compoSition (BOSS) with Hybrid Counterfactual Training framework, which sheds new light on the CIR task by studying it from two previously overlooked perspectives: implicitly bottom-up composition of visiolinguistic representation and explicitly fine-grained correspondence of query-target construction. On the one hand, we leverage the implicit interaction and composition of cross-modal embeddings from the bottom local characteristics to the top global semantics, preserving and transforming the visual representation conditioned on language semantics in several continuous steps for effective target image search. On the other hand, we devise a hybrid counterfactual training strategy that can reduce the model's ambiguity for similar queries.

Submitted 9 July, 2022; originally announced July 2022.

arXiv:2205.00348 [pdf, other]

End-to-End Signal Classification in Signed Cumulative Distribution Transform Space

Authors: Abu Hasnat Mohammad Rubaiyat, Shiying Li, Xuwang Yin, Mohammad Shifat E Rabbi, Yan Zhuang, Gustavo K. Rohde

Abstract: This paper presents a new end-to-end signal classification method using the signed cumulative distribution transform (SCDT). We adopt a transport-based generative model to define the classification problem. We then make use of mathematical properties of the SCDT to render the problem easier in the transform domain, and solve for the class of an unknown sample using a nearest local subspace (NLS) search algorithm in the SCDT domain. Experiments show that the proposed method provides high-accuracy classification results while being data efficient, robust to out-of-distribution samples, and competitive in terms of computational complexity with respect to deep learning end-to-end classification methods. The Python implementation of the proposed method is integrated as a part of the software package PyTransKit (https://github.com/rohdelab/PyTransKit).

Submitted 23 July, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

arXiv:2203.15347 [pdf, other]

Harmonizing Pathological and Normal Pixels for Pseudo-healthy Synthesis

Authors: Yunlong Zhang, Xin Lin, Yihong Zhuang, Liyan Sun, Yue Huang, Xinghao Ding, Guisheng Wang, Lin Yang, Yizhou Yu

Abstract: Synthesizing a subject-specific pathology-free image from a pathological image is valuable for algorithm development and clinical practice. In recent years, several approaches based on the Generative Adversarial Network (GAN) have achieved promising results in pseudo-healthy synthesis. However, the discriminator (i.e., a classifier) in the GAN cannot accurately identify lesions, which hampers the generation of high-quality pseudo-healthy images. To address this problem, we present a new type of discriminator, the segmentor, to accurately locate the lesions and improve the visual quality of pseudo-healthy images. Then, we apply the generated images to medical image enhancement and utilize the enhanced results to cope with the low contrast problem existing in medical image segmentation. Furthermore, a reliable metric is proposed by utilizing two attributes of label noise to measure the health of synthetic images. Comprehensive experiments on the T2 modality of BraTS demonstrate that the proposed method substantially outperforms the state-of-the-art methods. The method achieves better performance than the existing methods with only 30% of the training data. The effectiveness of the proposed method is also demonstrated on the LiTS and the T1 modality of BraTS. The code and the pre-trained model of this study are publicly available at https://github.com/Au3C2/Generator-Versus-Segmentor.

Submitted 29 March, 2022; originally announced March 2022.

arXiv:2203.13991 [pdf]

Risk Assessment with Generic Energy Storage under Exogenous and Endogenous Uncertainty

Authors: Ning Qi, Lin Cheng, Yuxiang Wan, Yingrui Zhuang, Zeyu Liu

Abstract: Current risk assessment ignores the stochastic nature of energy storage availability itself and thus leads to potential risk during operation. This paper proposes the redefinition of generic energy storage (GES) that is allowed to offer probabilistic reserve. A data-driven unified model with exogenous and endogenous uncertainty (EXU & EDU) description is presented for four typical types of GES. Moreover, risk indices are proposed to assess the impact of overlooking the EXU and EDU of GES. Comparative results between EXU and EDU are illustrated in a distribution system with day-ahead chance-constrained optimization (CCO), and more severe risks are observed for the latter, which indicates that the system operator (SO) should adopt novel strategies for EDU.

Submitted 26 March, 2022; originally announced March 2022.

Comments: PES GM2022-Exogenous and Endogenous Uncertainty

arXiv:2203.03140 [pdf, other]

An Improved Automatic Modulation Classification Scheme Based on Adaptive Fusion Network

Authors: Hao Shi, Qi Peng, Yiqi Zhuang

Abstract: Due to the over-fitting problem caused by imbalanced samples, there is still room to improve the performance of data-driven automatic modulation classification (AMC) in noisy scenarios. By fully considering the signal characteristics, an AMC scheme based on adaptive fusion network (AFNet) is proposed in this work. The AFNet can extract and aggregate multi-scale spatial features of in-phase and quadrature (I/Q) signals intelligently, thus improving the feature representation capability. Moreover, a novel confidence weighted loss function is proposed to address the imbalance issue and it is implemented by a two-stage learning scheme. Through the two-stage learning, AFNet can focus on high-confidence samples with more valid information and extract effective representations, so as to improve the overall classification performance. In the simulations, the proposed scheme reaches an average accuracy of 62.66% on a wide range of SNRs, which outperforms other AMC models. The effects of the loss function on classification accuracy are further studied.

Submitted 7 March, 2022; originally announced March 2022.

Comments: 5 pages, 6 figures, Accepted to IEEE VTC 2022-Spring

arXiv:2203.02953 [pdf]

Point Spread Function Estimation of Defocus

Authors: Renzhi He, Yan Zhuang, Boya Fu, Fei Liu

Abstract: The point spread function (PSF) plays a crucial role in many computational imaging applications, such as shape from focus/defocus, depth estimation, and fluorescence microscopy. However, the mathematical model of the defocus process is still unclear. In this work, we develop an alternative method to estimate the precise mathematical model of the point spread function to describe the defocus process. We first derive the mathematical algorithm for the PSF, which is used to generate the simulated focused images for different focus depths. Then we compute the loss function of the similarity between the simulated focused images and real focused images, where we design a novel and efficient metric based on the defocus histogram to evaluate the difference between the focused images. Minimizing the loss function yields the optimal parameters for the PSF. We also construct a hardware system consisting of a focusing system and a structured light system to acquire the all-in-focus image, the focused image with corresponding focus depth, and the depth map in the same view. The three types of images, as a dataset, are used to obtain the precise PSF. Our experiments on standard planes and actual objects show that the proposed algorithm can accurately describe the defocus process. The accuracy of our algorithm is further evaluated by comparing the actual focused images, the focused images generated by our algorithm, and the focused images generated by other methods. The results show that the loss of our algorithm is 40% less than that of the others on average.

Submitted 19 September, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

arXiv:2203.01589 [pdf, other]

Reconfigurable Intelligent Surface Assisted OFDM Relaying: Subcarrier Matching with Balanced SNR

Authors: Tong Zhang, Shuai Wang, Yufan Zhuang, Changsheng You, Miaowen Wen, Yik-Chung Wu

Abstract: This paper considers a reconfigurable intelligent surface (RIS) aided orthogonal frequency division multiplexing (OFDM) relaying system, and investigates the joint design of RIS passive beamforming and subcarrier matching under two cases, where Case-I ignores the source-RIS-destination signal, while Case-II explores this signal for rate enhancement. We formulate a mixed-integer nonlinear programming (MINLP) problem to maximize the sum achievable rate of all subcarriers by jointly optimizing the passive beamforming and subcarrier matching. To solve this problem, we first develop a branch-and-bound (BnB)-based alternating optimization algorithm for attaining a near-optimal solution. Then, a low-complexity difference-of-convex penalty-based algorithm and learning-to-optimize approach are also proposed. Finally, simulation results demonstrate that the RIS-assisted OFDM relaying system achieves a substantial achievable rate gain as compared to that without RIS since RIS recasts the subcarrier matching and balances the signal-to-noise ratio (SNR) among different subcarrier pairs.

Submitted 5 October, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

Comments: Accepted by IEEE Transactions on Vehicular Technology

arXiv:2201.08996 [pdf, other]

Linear Array Network for Low-light Image Enhancement

Authors: Keqi Wang, Ziteng Cui, Jieru Jia, Hao Xu, Ge Wu, Yin Zhuang, Lu Chen, Zhiguo Hu, Yuhua Qian

Abstract: Convolutional neural network (CNN) based methods have dominated low-light image enhancement tasks due to their outstanding performance. However, the convolution operation is based on a local sliding window mechanism, which makes it difficult to construct long-range dependencies in the feature maps. Meanwhile, self-attention based global relationship aggregation methods have been widely used in computer vision, but these methods struggle to handle high-resolution images because of their high computational complexity. To solve this problem, this paper proposes a Linear Array Self-attention (LASA) mechanism, which uses only two 2-D feature encodings to construct 3-D global weights and then refines feature maps generated by convolution layers. Based on LASA, the Linear Array Network (LAN) is proposed, which is superior to the existing state-of-the-art (SOTA) methods in both RGB and RAW based low-light enhancement tasks with a smaller number of parameters. The code is released at https://github.com/cuiziteng/LASA_enhancement.

Submitted 16 February, 2022; v1 submitted 22 January, 2022; originally announced January 2022.

arXiv:2201.08169 [pdf, other]

Secure Rate-Splitting for MIMO Broadcast Channel with Imperfect CSIT and a Jammer

Authors: Tong Zhang, Dongsheng Chen, Na Li, Yufan Zhuang, Bojie Lv, Rui Wang

Abstract: In this paper, we investigate secure rate-splitting for the two-user multiple-input multiple-output (MIMO) broadcast channel with imperfect channel state information at the transmitter (CSIT) and a multiple-antenna jammer, where each receiver has an equal number of antennas and the jammer has perfect channel state information (CSI). Specifically, we design a secure rate-splitting multiple-access strategy, where the security of split private and common messages is ensured by precoder design with joint nulling and aligning of the leakage information, regarding different antenna configurations. Moreover, we show that the sum-secure degrees-of-freedom (SDoF) achieved by secure rate-splitting is optimal and outperforms that of conventional zero-forcing. Therefore, we reveal the sum-SDoF of the two-user MIMO broadcast channel with imperfect CSIT and a jammer, and validate the superiority of rate-splitting for security purposes in this scenario, with emphasis on MIMO.

Submitted 10 July, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

Comments: 6 pages, 3 figures

arXiv:2201.00100 [pdf, other]

doi 10.1109/TIP.2021.3139232

Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images

Authors: Xiaoqiang Wang, Lei Zhu, Siliang Tang, Huazhu Fu, Ping Li, Fei Wu, Yi Yang, Yueting Zhuang

Abstract: Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images. However, RGB-D data is not easily acquired, which limits the development of RGB-D SOD techniques. To alleviate this issue, we present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection. We first devise a depth decoupling convolutional neural network (DDCNN), which contains a depth estimation branch and a saliency detection branch. The depth estimation branch is trained with RGB-D images and then used to estimate the pseudo depth maps for all unlabeled RGB images to form the paired data. The saliency detection branch is used to fuse the RGB feature and depth feature to predict the RGB-D saliency. Then, the whole DDCNN is assigned as the backbone in a teacher-student framework for semi-supervised learning. Moreover, we also introduce a consistency loss on the intermediate attention and saliency maps for the unlabeled data, as well as a supervised depth and saliency loss for labeled data. Experimental results on seven widely-used benchmark datasets demonstrate that our DDCNN outperforms state-of-the-art methods both quantitatively and qualitatively. We also demonstrate that our semi-supervised DS-Net can further improve the performance, even when using an RGB image with the pseudo depth map.

Submitted 31 December, 2021; originally announced January 2022.

Comments: Accepted by IEEE TIP

arXiv:2111.00666 [pdf, other]

Self-Verification in Image Denoising

Authors: Huangxing Lin, Yihong Zhuang, Delu Zeng, Yue Huang, Xinghao Ding, John Paisley

Abstract: We devise a new regularization, called self-verification, for image denoising. This regularization is formulated using a deep image prior learned by the network, rather than a traditional predefined prior. Specifically, we treat the output of the network as a "prior" that we denoise again after "re-noising". The comparison between the re-denoised image and its prior can be interpreted as a self-verification of the network's denoising ability. We demonstrate that self-verification encourages the network to capture the low-level image statistics needed to restore the image. Based on this self-verification regularization, we further show that the network can learn to denoise even if it has not seen any clean images. This learning strategy is self-supervised, and we refer to it as Self-Verification Image Denoising (SVID). SVID can be seen as a mixture of learning-based methods and traditional model-based denoising methods, in which regularization is adaptively formulated using the output of the network. We show the application of SVID to various denoising tasks using only observed corrupted data. It can achieve denoising performance close to that of supervised CNNs.

Submitted 31 October, 2021; originally announced November 2021.

arXiv:2110.05606 [pdf, other]

Nearest Subspace Search in The Signed Cumulative Distribution Transform Space for 1D Signal Classification

Authors: Abu Hasnat Mohammad Rubaiyat, Mohammad Shifat-E-Rabbi, Yan Zhuang, Shiying Li, Gustavo K. Rohde

Abstract: This paper presents a new method to classify 1D signals using the signed cumulative distribution transform (SCDT). The proposed method exploits certain linearization properties of the SCDT to render the problem easier to solve in the SCDT space. The method uses the nearest subspace search technique in the SCDT domain to provide a non-iterative, effective, and simple-to-implement classification algorithm. Experiments show that the proposed technique outperforms state-of-the-art neural networks using a very low number of training samples and is also robust to out-of-distribution examples on simulated data. We also demonstrate the efficacy of the proposed technique in real-world applications by applying it to an ECG classification problem. The Python code implementing the proposed classifier can be found in PyTransKit (https://github.com/rohdelab/PyTransKit).

Submitted 25 February, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

Comments: 7 pages, 3 figures, 1 table

arXiv:2109.06179 [pdf, other]

Unsupervised learning approaches to characterize heterogeneous samples using X-ray single particle imaging

Authors: Yulong Zhuang, Salah Awel, Anton Barty, Richard Bean, Johan Bielecki, Martin Bergemann, Benedikt J. Daurer, Tomas Ekeberg, Armando D. Estillore, Hans Fangohr, Klaus Giewekemeyer, Mark S. Hunter, Mikhail Karnevskiy, Richard A. Kirian, Henry Kirkwood, Yoonhee Kim, Jayanath Koliyadu, Holger Lange, Romain Letrun, Jannik Lübke, Abhishek Mall, Thomas Michelat, Andrew J. Morgan, Nils Roth, Amit K. Samanta, et al. (17 additional authors not shown)

Abstract: One of the outstanding analytical problems in X-ray single particle imaging (SPI) is the classification of structural heterogeneity, which is especially difficult given the low signal-to-noise ratios of individual patterns and that even identical objects can yield patterns that vary greatly when orientation is taken into consideration. We propose two methods which explicitly account for this orientation-induced variation and can robustly determine the structural landscape of a sample ensemble. The first, termed common-line principal component analysis (PCA), provides a rough classification which is essentially parameter-free and can be run automatically on any SPI dataset. The second method, utilizing variational auto-encoders (VAEs), can generate 3D structures of the objects at any point in the structural landscape. We implement both these methods in combination with the noise-tolerant expand-maximize-compress (EMC) algorithm and demonstrate their utility by applying them to an experimental dataset from gold nanoparticles with only a few thousand photons per pattern, recovering both discrete structural classes and continuous deformations. These developments diverge from previous approaches of extracting reproducible subsets of patterns from a dataset and open up the possibility to move beyond studying homogeneous sample sets and study open questions on topics such as nanocrystal growth and dynamics as well as phase transitions which have not been externally triggered.

Submitted 13 September, 2021; originally announced September 2021.

Comments: 29 pages, 9 figures

arXiv:2108.07200 [pdf, other]

Continuous-Time Spatiotemporal Calibration of a Rolling Shutter Camera-IMU System

Authors: Jianzhu Huai, Yuan Zhuang, Qicheng Yuan, Yukai Lin

Abstract: The rolling shutter (RS) mechanism is widely used by consumer-grade cameras, which are essential parts in smartphones and autonomous vehicles. The RS effect leads to image distortion upon relative motion between a camera and the scene. This effect needs to be considered in video stabilization, structure from motion, and vision-aided odometry, for which recent studies have improved earlier global shutter (GS) methods by accounting for the RS effect. However, it is still unclear how the RS affects spatiotemporal calibration of the camera in a sensor assembly, which is crucial to good performance in the aforementioned applications. This work takes the camera-IMU system as an example and looks into the RS effect on its spatiotemporal calibration. To this end, we develop a calibration method for an RS-camera-IMU system with continuous-time B-splines by using a calibration target. Unlike in calibrating GS cameras, every observation of a landmark on the target has a unique camera pose fitted by continuous-time B-splines. With simulated data generated from four sets of public calibration data, we show that RS can noticeably affect the extrinsic parameters, causing errors of about $1^\circ$ in orientation and 2 cm in translation with an RS setting as in common smartphone cameras. With real data collected by two industrial camera-IMU systems, we find that considering the RS effect gives more accurate and consistent spatiotemporal calibration. Moreover, our method also accurately calibrates the inter-line delay of the RS. The code for simulation and calibration is publicly available.

Submitted 16 August, 2021; originally announced August 2021.

Comments: 11 pages, 9 figures

arXiv:2107.03600 [pdf, other]

Reinforcement Learning based Negotiation-aware Motion Planning of Autonomous Vehicles

Authors: Zhitao Wang, Yuzheng Zhuang, Qiang Gu, Dong Chen, Hongbo Zhang, Wulong Liu

Abstract: Autonomous vehicles integrating onto roadways with human traffic participants must understand and adapt to the participants' intentions and driving styles by responding in predictable ways without explicit communication. This paper proposes a reinforcement learning (RL) based negotiation-aware motion planning framework, which adopts RL to adjust the driving style of the planner by dynamically and adaptively modifying the prediction horizon length of the motion planner in real time in response to changes in the environment, typically triggered by traffic participants switching intents with different driving styles. The framework models the interaction between the autonomous vehicle and other traffic participants as a Markov Decision Process. A temporal sequence of occupancy grid maps is taken as input to the RL module to embed implicit intention reasoning. Curriculum learning is employed to enhance the training efficiency and the robustness of the algorithm. We applied our method to narrow lane navigation in both simulation and the real world to demonstrate that the proposed method outperforms the common alternative due to its advantage in alleviating the social dilemma problem with proper negotiation skills.

Submitted 8 July, 2021; originally announced July 2021.

arXiv:2105.14671 [pdf]

Signal Acquisition of Luojia-1A Low Earth Orbit Navigation Augmentation System with Software Defined Receiver

Authors: Liang Chen, Xiangchen Lu, Nan Shen, Lei Wang, Yuan Zhuang, Ye Su, Deren Li, Ruizhi Chen

Abstract: Low earth orbit (LEO) satellite navigation signals can be used as opportunity signals in case of a global navigation satellite system (GNSS) outage, or as a means of enhancing traditional GNSS positioning algorithms. Whichever service mode is used, signal acquisition is the prerequisite for providing enhanced LEO navigation service. Compared with a medium orbit satellite, the transit time of a LEO satellite is shorter. Thus, it is of great significance to expand the successful acquisition time range of the LEO signal. Previous studies on LEO signal acquisition are based on simulation data. However, signal acquisition research based on real data is very important. In this work, two signal characteristics of LEO satellites, the power space density in free space and the Doppler shift, are studied individually. Unified symbol definitions of several integration algorithms based on the parallel search signal acquisition algorithm are given. To verify these algorithms for LEO signal acquisition, a software-defined receiver (SDR) is developed. The performance of these integration algorithms in expanding the successful acquisition time range is verified with real data collected from the Luojia-1A satellite. The experimental results show that the integration strategy can expand the successful acquisition time range, but the range does not expand indefinitely with the integration duration.

Submitted 30 May, 2021; originally announced May 2021.

arXiv:2102.08596 [pdf, other]

Consistent Right-Invariant Fixed-Lag Smoother with Application to Visual Inertial SLAM

Authors: Jianzhu Huai, Yukai Lin, Yuan Zhuang, Min Shi

Abstract: State estimation problems without absolute position measurements routinely arise in navigation of unmanned aerial vehicles, autonomous ground vehicles, etc., whose proper operation relies on accurate state estimates and reliable covariances. Unaware of absolute positions, these problems have immanent unobservable directions. Traditional causal estimators, however, usually gain spurious information on the unobservable directions, leading to over-confident covariance inconsistent with actual estimator errors. The consistency problem of fixed-lag smoothers (FLSs) has only been attacked by the first estimate Jacobian (FEJ) technique because of the complexity of analyzing their observability property. But the FEJ has several drawbacks hampering its wide adoption. To ensure the consistency of an FLS, this paper introduces the right invariant error formulation into the FLS framework. To our knowledge, we are the first to analyze the observability of an FLS with the right invariant error. Our main contributions are twofold. First, to bypass the complexity of analysis with the classic observability matrix, we show that observability analysis of FLSs can be done equivalently on the linearized system. Second, we prove that the inconsistency issue in the traditional FLS can be elegantly solved by the right invariant error formulation without artificially correcting Jacobians. By applying the proposed FLS to the monocular visual inertial simultaneous localization and mapping (SLAM) problem, we confirm that the method consistently estimates covariance similarly to a batch smoother in simulation and that our method achieves accuracy comparable to traditional FLSs on real data.

Submitted 21 March, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

Comments: 13 pages, 4 figures, AAAI 2021 Conference

arXiv:2011.14512 [pdf, other]

Adaptive noise imitation for image denoising

Authors: Huangxing Lin, Yihong Zhuang, Yue Huang, Xinghao Ding, Yizhou Yu, Xiaoqing Liu, John Paisley

Abstract: The effectiveness of existing denoising algorithms typically relies on accurate pre-defined noise statistics or plenty of paired data, which limits their practicality. In this work, we focus on denoising in the more common case where noise statistics and paired data are unavailable. Considering that denoising CNNs require supervision, we develop a new adaptive noise imitation (ADANI) algorithm that can synthesize noisy data from naturally noisy images. To produce realistic noise, a noise generator takes unpaired noisy/clean images as input, where the noisy image is a guide for noise generation. By imposing explicit constraints on the type, level and gradient of noise, the output noise of ADANI will be similar to the guided noise, while keeping the original clean background of the image. Coupling the noisy data output from ADANI with the corresponding ground-truth, a denoising CNN is then trained in a fully-supervised manner. Experiments show that the noisy data produced by ADANI are visually and statistically similar to real ones, so that the denoising CNN in our method is competitive with other networks trained with external paired data.

Submitted 29 November, 2020; originally announced November 2020.

arXiv:2007.13597 [pdf, other]

3D diffractive imaging of nanoparticle ensembles using an X-ray laser

Authors: Kartik Ayyer, P. Lourdu Xavier, Johan Bielecki, Zhou Shen, Benedikt J. Daurer, Amit K. Samanta, Salah Awel, Richard Bean, Anton Barty, Tomas Ekeberg, Armando D. Estillore, Klaus Giewekemeyer, Mark S. Hunter, Richard A. Kirian, Henry Kirkwood, Yoonhee Kim, Jayanath Koliyadu, Holger Lange, Romain Letruin, Jannik Lübke, Andrew J. Morgan, Nils Roth, Tokushi Sato, Marcin Sikorski, Florian Schulz, et al. (12 additional authors not shown)

Abstract: We report the 3D structure determination of gold nanoparticles (AuNPs) by X-ray single particle imaging (SPI). Around 10 million diffraction patterns from gold nanoparticles were measured in less than 100 hours of beam time, more than 100 times the amount of data in any single prior SPI experiment, using the new capabilities of the European X-ray free electron laser which allow measurements of 1500 frames per second. A classification and structural sorting method was developed to disentangle the heterogeneity of the particles and to obtain a resolution of better than 3 nm. With these new experimental and analytical developments, we have entered a new era for the SPI method and the path towards close-to-atomic resolution imaging of biomolecules is apparent.

Submitted 17 July, 2020; originally announced July 2020.

Comments: 25 pages, 5 main figures, 6 supplementary figures, 2 supplementary movies (link in document)

arXiv:2007.06727 [pdf, other]

Inertial Sensing Meets Artificial Intelligence: Opportunity or Challenge?

Authors: You Li, Ruizhi Chen, Xiaoji Niu, Yuan Zhuang, Zhouzheng Gao, Xin Hu, Naser El-Sheimy

Abstract: The inertial navigation system (INS) has been widely used to provide self-contained and continuous motion estimation in intelligent transportation systems. Recently, the emergence of chip-level inertial sensors has expanded the relevant applications from positioning, navigation, and mobile mapping to location-based services, unmanned systems, and transportation big data. Meanwhile, benefiting from the emergence of big data and the improvement of algorithms and computing power, artificial intelligence (AI) has become a consensus tool that has been successfully applied in various fields. This article reviews the research on using AI technology to enhance inertial sensing from various aspects, including sensor design and selection, calibration and error modeling, navigation and motion-sensing algorithms, multi-sensor information fusion, system evaluation, and practical application. Based on over 30 representative articles selected from nearly 300 related publications, this article summarizes the state of the art, advantages, and challenges of each aspect. Finally, it summarizes nine advantages and nine challenges of AI-enhanced inertial sensing and then points out future research directions.

Submitted 13 July, 2020; originally announced July 2020.

arXiv:2004.04688 [pdf, other]

doi 10.1109/JIOT.2018.2889303

Towards Robust Crowdsourcing-Based Localization: A Fingerprinting Accuracy Indicator Enhanced Wireless/Magnetic/Inertial Integration Approach

Authors: You Li, Zhe He, Zhouzheng Gao, Yuan Zhuang, Chuang Shi, Naser El-Sheimy

Abstract: The next-generation internet of things (IoT) systems have an increasing demand for intelligent localization which can scale with big data without human perception. Thus, traditional localization solutions without an accuracy metric will greatly limit vast applications. Crowdsourcing-based localization has been proven to be effective for mass-market location-based IoT applications. This paper proposes an enhanced crowdsourcing-based localization method by integrating inertial, wireless, and magnetic sensors. Both wireless and magnetic fingerprinting accuracy are predicted in real time through the introduction of fingerprinting accuracy indicators (FAI) from three levels (i.e., signal, geometry, and database). The advantages and limitations of these FAI factors and their performance in predicting location errors and outliers are investigated. Furthermore, the FAI-enhanced extended Kalman filter (EKF) is proposed, which improved the dead-reckoning (DR)/WiFi, DR/Magnetic, and DR/WiFi/Magnetic integrated localization accuracy by 30.2%, 19.4%, and 29.0%, and reduced the maximum location errors by 41.2%, 28.4%, and 44.2%, respectively. These outcomes confirm the effectiveness of the FAI-enhanced EKF in improving both the accuracy and reliability of multi-sensor integrated localization using crowdsourced data.

Submitted 9 April, 2020; originally announced April 2020.

arXiv:2004.04618 [pdf, other]

doi 10.1109/JIOT.2019.2957778

Deep Reinforcement Learning (DRL): Another Perspective for Unsupervised Wireless Localization

Authors: You Li, Xin Hu, Yuan Zhuang, Zhouzheng Gao, Peng Zhang, Naser El-Sheimy

Abstract: Location is key to spatializing internet-of-things (IoT) data. However, it is challenging to use low-cost IoT devices for robust unsupervised localization (i.e., localization without training data that have known location labels). Thus, this paper proposes a deep reinforcement learning (DRL) based unsupervised wireless-localization method. The main contributions are as follows. (1) This paper proposes an approach to model a continuous wireless-localization process as a Markov decision process (MDP) and process it within a DRL framework. (2) To alleviate the challenge of obtaining rewards when using unlabeled data (e.g., daily-life crowdsourced data), this paper presents a reward-setting mechanism, which extracts robust landmark data from unlabeled wireless received signal strengths (RSS). (3) To ease requirements for model re-training when using DRL for localization, this paper uses RSS measurements together with agent location to construct DRL inputs. The proposed method was tested by using field testing data from multiple Bluetooth 5 smart ear tags in a pasture. Meanwhile, the experimental verification process reflected the advantages and challenges of using DRL in wireless localization.

Submitted 9 April, 2020; originally announced April 2020.

arXiv:2004.03738 [pdf, other]

doi 10.1109/JIOT.2020.3019199

Location-Enabled IoT (LE-IoT): A Survey of Positioning Techniques, Error Sources, and Mitigation

Authors: You Li, Yuan Zhuang, Xin Hu, Zhouzheng Gao, Jia Hu, Long Chen, Zhe He, Ling Pei, Kejie Chen, Maosong Wang, Xiaoji Niu, Ruizhi Chen, John Thompson, Fadhel Ghannouchi, Naser El-Sheimy

Abstract: The Internet of Things (IoT) has started to empower the future of many industrial and mass-market applications. Localization techniques are becoming key to add location context to IoT data without human perception and intervention. Meanwhile, the newly-emerged Low-Power Wide-Area Network (LPWAN) technologies have advantages such as long-range, low power consumption, low cost, massive connections, and the capability for communication in both indoor and outdoor areas. These features make LPWAN signals strong candidates for mass-market localization applications. However, there are various error sources that have limited localization performance by using such IoT signals. This paper reviews the IoT localization system through the following sequence: IoT localization system review -- localization data sources -- localization algorithms -- localization error sources and mitigation -- localization performance evaluation. Compared to the related surveys, this paper has a more comprehensive and state-of-the-art review on IoT localization methods, an original review on IoT localization error sources and mitigation, an original review on IoT localization performance evaluation, and a more comprehensive review of IoT localization applications, opportunities, and challenges. Thus, this survey provides comprehensive guidance for peers who are interested in enabling localization ability in the existing IoT systems, using IoT systems for localization, or integrating IoT signals with the existing localization sensors.

Submitted 7 April, 2020; originally announced April 2020.

arXiv:2001.00269 [pdf]

doi 10.1109/TITS.2020.2984197

A Smart, Efficient, and Reliable Parking Surveillance System with Edge Artificial Intelligence on IoT Devices

Authors: Ruimin Ke, Yifan Zhuang, Ziyuan Pu, Yinhai Wang

Abstract: Cloud computing has been a main-stream computing service for years. Recently, with the rapid development in urbanization, massive video surveillance data are produced at an unprecedented speed. A traditional solution to deal with the big data would require a large amount of computing and storage resources. With the advances in Internet of things (IoT), artificial intelligence, and communication technologies, edge computing offers a new solution to the problem by processing the data partially or wholly on the edge of a surveillance system. In this study, we investigate the feasibility of using edge computing for smart parking surveillance tasks, which is a key component of Smart City. The system processing pipeline is carefully designed with the consideration of flexibility, online surveillance, data transmission, detection accuracy, and system reliability. It enables artificial intelligence at the edge by implementing an enhanced single shot multibox detector (SSD). A few more algorithms are developed on both the edge and the server targeting optimal system efficiency and accuracy. Thorough field tests were conducted in the Angle Lake parking garage for three months. The experimental results are promising that the final detection method achieves over 95% accuracy in real-world scenarios with high efficiency and reliability. The proposed smart parking surveillance system can be a solid foundation for future applications of intelligent transportation systems.

Submitted 1 April, 2020; v1 submitted 1 January, 2020; originally announced January 2020.

Journal ref: IEEE Transactions on Intelligent Transportation Systems, 2020

arXiv:1912.04016 [pdf, other]

Deep Neural Network for Fast and Accurate Single Image Super-Resolution via Channel-Attention-based Fusion of Orientation-aware Features

Authors: Du Chen, Zewei He, Yanpeng Cao, Jiangxin Yang, Yanlong Cao, Michael Ying Yang, Siliang Tang, Yueting Zhuang

Abstract: Recently, Convolutional Neural Networks (CNNs) have been successfully adopted to solve the ill-posed single image super-resolution (SISR) problem. A commonly used strategy to boost the performance of CNN-based SISR models is deploying very deep networks, which inevitably incurs many obvious drawbacks (e.g., a large number of network parameters, heavy computational loads, and difficult model training). In this paper, we aim to build more accurate and faster SISR models via developing better-performing feature extraction and fusion techniques. Firstly, we proposed a novel Orientation-Aware feature extraction and fusion Module (OAM), which contains a mixture of 1D and 2D convolutional kernels (i.e., 5 x 1, 1 x 5, and 3 x 3) for extracting orientation-aware features. Secondly, we adopt the channel attention mechanism as an effective technique to adaptively fuse features extracted in different directions and in hierarchically stacked convolutional stages. Based on these two important improvements, we present a compact but powerful CNN-based model for high-quality SISR via Channel Attention-based fusion of Orientation-Aware features (SISR-CA-OA). Extensive experimental results verify the superiority of the proposed SISR-CA-OA model, performing favorably against the state-of-the-art SISR models in terms of both restoration accuracy and computational efficiency. The source codes will be made publicly available.

Submitted 9 December, 2019; originally announced December 2019.

Comments: 12 pages, 11 figures

arXiv:1904.11419 [pdf]

Time Series Simulation by Conditional Generative Adversarial Net

Authors: Rao Fu, Jie Chen, Shutian Zeng, Yiping Zhuang, Agus Sudjianto

Abstract: Generative Adversarial Net (GAN) has been proven to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions can be both categorical and continuous variables containing different kinds of auxiliary information. Our simulation studies show that CGAN is able to learn different kinds of normal and heavy tail distributions, as well as dependent structures of different time series and it can further generate conditional predictive distributions consistent with the training data distributions. We also provide an in-depth discussion on the rationale of GAN and the neural network as hierarchical splines to draw a clear connection with the existing statistical method for distribution generation. In practice, CGAN has a wide range of applications in the market risk and counterparty risk analysis: it can be applied to learn the historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES) and predict the movement of the market risk factors. We present a real data analysis including a backtesting to demonstrate CGAN is able to outperform the Historic Simulation, a popular method in market risk analysis for the calculation of VaR. CGAN can also be applied in the economic time series modeling and forecasting, and an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN is given at the end of the paper.

Submitted 25 April, 2019; originally announced April 2019.

arXiv:1807.02010 [pdf, other]

DNA Computing for Combinational Logic

Authors: Chuan Zhang, Lulu Ge, Yuchen Zhuang, Ziyuan Shen, Zhiwei Zhong, Zaichen Zhang, Xiaohu You

Abstract: With the progressive scale-down of semiconductor's feature size, people are looking forward to More Moore and More than Moore. In order to offer a possible alternative implementation process, people are trying to figure out a feasible transfer from silicon to molecular computing. Such transfer lies on bio-based modules programming with computer-like logic, aiming at realizing the Turing machine. To accomplish this, the DNA-based combinational logic is inevitably the first step we have taken care of. This timely overview paper introduces combinational logic synthesized in DNA computing from both analog and digital perspectives separately. State-of-the-art research progress is summarized for interested readers to quickly understand DNA computing, initiate discussion on existing techniques, and inspire innovative solutions. We hope this paper can pave the way for the future DNA computing synthesis.

Submitted 5 July, 2018; originally announced July 2018.

Showing 1–36 of 36 results for author: Zhuang, Y