Search | arXiv e-print repository

FusionINN: Decomposable Image Fusion for Brain Tumor Monitoring

Authors: Nishant Kumar, Ziyan Tao, Jaikirat Singh, Yang Li, Peiwen Sun, Binghui Zhao, Stefan Gumhold

Abstract: Image fusion typically employs non-invertible neural networks to merge multiple source images into a single fused image. However, for clinical experts, solely relying on fused images may be insufficient for making diagnostic decisions, as the fusion mechanism blends features from source images, thereby making it difficult to interpret the underlying tumor pathology. We introduce FusionINN, a novel decomposable image fusion framework, capable of efficiently generating fused images and also decomposing them back to the source images. FusionINN is designed to be bijective by including a latent image alongside the fused image, while ensuring minimal transfer of information from the source images to the latent representation. To the best of our knowledge, we are the first to investigate the decomposability of fused images, which is particularly crucial for life-sensitive applications such as medical image fusion compared to other tasks like multi-focus or multi-exposure image fusion. Our extensive experimentation validates FusionINN over existing discriminative and generative fusion methods, both subjectively and objectively. Moreover, compared to a recent denoising diffusion-based fusion model, our approach offers faster and qualitatively better fusion results.

Submitted 10 June, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

Comments: Accepted at IJCAI Workshop 2024. Source code available at https://github.com/nish03/FusionINN
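
Note: the bijective "fused image plus latent image" idea in the abstract above can be illustrated with a generic additive coupling block, a standard building block of invertible networks. The sketch below is not the FusionINN architecture: the function names `mix`, `fuse`, and `decompose` are hypothetical, `mix` stands in for a learned sub-network, and nothing here enforces the paper's requirement that the latent image carry minimal source information. It only shows why a two-output mapping of this form can be inverted exactly.

```python
import math

def mix(values):
    """Hypothetical stand-in for a learned sub-network; any deterministic function works."""
    return [math.tanh(v) for v in values]

def fuse(src_a, src_b):
    """Forward pass of an additive coupling block: (src_a, src_b) -> (fused, latent)."""
    fused = [a + m for a, m in zip(src_a, mix(src_b))]
    latent = [b + m for b, m in zip(src_b, mix(fused))]
    return fused, latent

def decompose(fused, latent):
    """Inverse pass: undo the two coupling steps in reverse order."""
    src_b = [l - m for l, m in zip(latent, mix(fused))]
    src_a = [f - m for f, m in zip(fused, mix(src_b))]
    return src_a, src_b

# Toy "images" as flat pixel lists (assumed values, for illustration only).
image_a = [0.1, 0.5, 0.9]
image_b = [0.3, 0.2, 0.7]
fused, latent = fuse(image_a, image_b)
recovered_a, recovered_b = decompose(fused, latent)
```

The inverse needs no inverse of `mix` itself, which is why coupling blocks can wrap arbitrary sub-networks; recovery is exact up to floating-point rounding.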

arXiv:2403.05808 [pdf, other]

Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution

Authors: Junxiong Lin, Yan Wang, Zeng Tao, Boyang Wang, Qing Zhao, Haorang Wang, Xuan Tong, Xinji Mai, Yuxuan Lin, Wei Song, Jiawen Yu, Shaoqi Yan, Wenqiang Zhang

Abstract: Pre-trained diffusion models utilized for image generation encapsulate a substantial reservoir of a priori knowledge pertaining to intricate textures. Harnessing the potential of leveraging this a priori knowledge in the context of image super-resolution presents a compelling avenue. Nonetheless, prevailing diffusion-based methodologies presently overlook the constraints imposed by degradation information on the diffusion process. Furthermore, these methods fail to consider the spatial variability inherent in the estimated blur kernel, stemming from factors such as motion jitter and out-of-focus elements in open-environment scenarios. This oversight results in a notable deviation of the image super-resolution effect from fundamental realities. To address these concerns, we introduce a framework known as Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution (SSR). Within the SSR framework, we propose a Spatially Variant Kernel Refinement (SVKR) module. SVKR estimates a Depth-Informed Kernel, which takes the depth information into account and is spatially variant. Additionally, SVKR enhances the accuracy of depth information acquired from LR images, allowing for mutual enhancement between the depth map and blur kernel estimates. Finally, we introduce the Adaptive Multi-Modal Fusion (AMF) module to align the information from three modalities: low-resolution images, depth maps, and blur kernels. This alignment can constrain the diffusion model to generate more authentic SR results. Quantitative and qualitative experiments affirm the superiority of our approach, while ablation experiments corroborate the effectiveness of the modules we have proposed.

Submitted 9 March, 2024; originally announced March 2024.

arXiv:2310.09922 [pdf, ps, other]

Enhance Security of Time-Modulated Array-Enabled Directional Modulation by Introducing Symbol Ambiguity

Authors: Zhihao Tao, Zhaoyi Xu, Athina Petropulu

Abstract: In this paper, we investigate whether the time-modulated array (TMA)-enabled directional modulation (DM) communication system can be cracked, and the answer is yes. We first demonstrate that the scrambling of the data received at the eavesdropper can be defied by using grid search to successfully find the only and actual mixing matrix generated by TMA. Then, we propose introducing symbol ambiguity to TMA to defend against the grid search, and design two principles for the TMA mixing matrix, i.e., rank deficiency and non-uniqueness of the ON-OFF switching pattern, that can be used to construct the symbol ambiguity. Also, we present a feasible mechanism to implement these two principles. Our proposed principles and mechanism not only shed light on how to design a more secure TMA DM system theoretically in the future, but also have been validated to be effective by bit error rate measurements.

Submitted 15 October, 2023; originally announced October 2023.

arXiv:2310.09299 [pdf, other]

Digital Twin Assisted Deep Reinforcement Learning for Online Admission Control in Sliced Network

Authors: Zhenyu Tao, Wei Xu, Xiaohu You

Abstract: The proliferation of diverse wireless services in 5G and beyond has led to the emergence of network slicing technologies. Among these, admission control plays a crucial role in achieving service-oriented optimization goals through the selective acceptance of service requests. Although deep reinforcement learning (DRL) forms the foundation in many admission control approaches thanks to its effectiveness and flexibility, initial instability with excessive convergence delay of DRL models hinders their deployment in real-world networks. We propose a digital twin (DT) accelerated DRL solution to address this issue. Specifically, we first formulate the admission decision-making process as a semi-Markov decision process, which is subsequently simplified into an equivalent discrete-time Markov decision process to facilitate the implementation of DRL methods. A neural network-based DT is established with a customized output layer for queuing systems, trained through supervised learning, and then employed to assist the training phase of the DRL model. Extensive simulations show that the DT-accelerated DRL improves resource utilization by over 40% compared to the directly trained state-of-the-art dueling deep Q-learning model. This improvement is achieved while preserving the model's capability to optimize the long-term rewards of the admission process.

Submitted 21 November, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

Comments: 13 pages, 8 figures

arXiv:2310.08551 [pdf, ps, other]

How secure is the time-modulated array-enabled OFDM directional modulation?

Authors: Zhihao Tao, Zhaoyi Xu, Athina Petropulu

Abstract: Time-modulated arrays (TMA) transmitting orthogonal frequency division multiplexing (OFDM) waveforms achieve physical layer security by allowing the signal to reach the legitimate destination undistorted, while making the signal appear scrambled in all other directions. In this paper, we examine how secure the TMA OFDM system is, and show that it is possible for the eavesdropper to defy the scrambling. In particular, we show that, based on the scrambled signal, the eavesdropper can formulate a blind source separation problem and recover data symbols and TMA parameters via independent component analysis (ICA) techniques. We show how the scaling and permutation ambiguities arising in ICA can be resolved by exploiting the Toeplitz structure of the corresponding mixing matrix, and knowledge of data constellation, OFDM specifics, and the rules for choosing TMA parameters. We also introduce a novel TMA implementation to defend the scrambling against the eavesdropper.

Submitted 12 October, 2023; originally announced October 2023.

Comments: This work was already submitted to IEEE ICASSP 2024

arXiv:2304.00021 [pdf, other]

Rapid online solution of inverse heat transfer problem by ANN-based extended Kalman smoothing algorithm

Authors: Xinxin Zhang, Dike Li, Jianqin Zhu, Zhi Tao, Lu Qiu

Abstract: Digital twin is a modern technology for many advanced applications. To construct a digital twin of a thermal system, it is required to make online estimations of unknown time-varying boundary conditions from sensor-measured data, which requires solving inverse heat transfer problems (IHTPs). However, a fast and accurate solution is challenging since the measured data is normally contaminated with noise and the traditional method of solving IHTPs involves a significant amount of calculation. Therefore, in this work, a rapid yet robust inversion algorithm called the ANN-based extended Kalman smoothing algorithm is developed to realize the online prediction of the desired parameter based on the measurements. The fast prediction is realized by replacing the conventional CFD-based state transfer models in the extended Kalman smoothing algorithm with a pre-trained ANN. Then, a two-dimensional internal convective heat transfer problem was employed as the case study to test the algorithm. The results show that the proposed algorithm is a computationally light and robust approach for solving IHTPs. The proposed algorithm can achieve estimation of unknown boundary conditions with a dimensionless average error of 0.0580 under noisy temperature measurement with a standard deviation of 10 K, with a drastic reduction of computational cost compared to the conventional approach. Moreover, the effects of training data, sensor location, and future time step selection on the performance of prediction are investigated.

Submitted 31 March, 2023; originally announced April 2023.

arXiv:2302.06980 [pdf, other]

doi 10.1109/TCCN.2023.3324634

Deep Learning-Based Modeling of 5G Core Control Plane for 5G Network Digital Twin

Authors: Zhenyu Tao, Yongliang Guo, Guanghui He, Yongming Huang, Xiaohu You

Abstract: Digital twin serves as a crucial facilitator in the advancement and implementation of emerging technologies within 5G and beyond networks. However, the intricate structure and diverse functionalities of the existing 5G core network, especially the control plane, present challenges in constructing core network digital twins. In this paper, we propose two novel data-driven architectures for modeling the 5G control plane and implement corresponding deep learning models, namely 5GC-Seq2Seq and 5GC-former, based on the Vanilla Seq2Seq model and Transformer decoder respectively. We also present a solution enabling the interconversion of signaling messages and length-limited vectors to construct a dataset. The experiments are based on 5G core network signaling messages collected by the Spirent C50 network tester, encompassing various procedures such as registration, handover, and PDU sessions. The results show that 5GC-Seq2Seq achieves a 99.997% F1-score (a metric measuring the accuracy of positive samples) in single UE scenarios with a simple structure, but exhibits significantly reduced performance in handling concurrency. In contrast, 5GC-former surpasses 99.999% F1-score while maintaining robust performance under concurrent UE scenarios by constructing a more complex and highly parallel model. These findings validate that our method accurately replicates the principal functionalities of the 5G core network control plane.

Submitted 18 October, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

Journal ref: IEEE Transactions on Cognitive Communications and Networking
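
Note: the abstract above reports F1-scores and glosses the metric only briefly. For reference, F1 is the harmonic mean of precision and recall on the positive class. The sketch below uses the standard definition; the function name `f1_score` and the counts in the usage lines are illustrative assumptions and are not taken from the paper's experiments.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall, computed from raw counts."""
    if true_pos == 0:
        # No correct positive predictions: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# Made-up counts, for illustration only.
balanced = f1_score(true_pos=8, false_pos=2, false_neg=2)   # precision = recall = 0.8
skewed = f1_score(true_pos=8, false_pos=0, false_neg=8)     # precision = 1.0, recall = 0.5
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 by maximizing precision or recall alone.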

arXiv:2209.12075 [pdf, other]

S^2-Transformer for Mask-Aware Hyperspectral Image Reconstruction

Authors: Jiamian Wang, Kunpeng Li, Yulun Zhang, Xin Yuan, Zhiqiang Tao

Abstract: The technology of hyperspectral imaging (HSI) records the visual information upon long-range-distributed spectral wavelengths. A representative hyperspectral image acquisition procedure conducts a 3D-to-2D encoding by the coded aperture snapshot spectral imager (CASSI) and requires a software decoder for the 3D signal reconstruction. By observing this physical encoding procedure, two major challenges stand in the way of a high-fidelity reconstruction. (i) To obtain 2D measurements, CASSI dislocates multiple channels by disperser-titling and squeezes them onto the same spatial region, yielding an entangled data loss. (ii) The physical coded aperture leads to a masked data loss by selectively blocking the pixel-wise light exposure. To tackle these challenges, we propose a spatial-spectral (S^2-) Transformer network with a mask-aware learning strategy. First, we simultaneously leverage spatial and spectral attention modeling to disentangle the blended information in the 2D measurement along both dimensions. A series of Transformer structures are systematically designed to fully investigate the spatial and spectral informative properties of the hyperspectral data. Second, the masked pixels will induce higher prediction difficulty and should be treated differently from unmasked ones. Thereby, we adaptively prioritize the loss penalty attributing to the mask structure by inferring the pixel-wise reconstruction difficulty upon the mask-encoded prediction. We theoretically discuss the distinct convergence tendencies between masked/unmasked regions of the proposed learning strategy. Extensive experiments demonstrate that the proposed method achieves superior reconstruction performance. Additionally, we empirically elaborate the behaviour of spatial and spectral attentions under the proposed architecture, and comprehensively examine the impact of the mask-aware learning.

Submitted 14 December, 2022; v1 submitted 24 September, 2022; originally announced September 2022.

Comments: 11 pages, 16 figures, 6 tables, Code: https://github.com/Jiamian-Wang/S2-transformer-HSI

arXiv:2112.15362 [pdf, other]

Modeling Mask Uncertainty in Hyperspectral Image Reconstruction

Authors: Jiamian Wang, Yulun Zhang, Xin Yuan, Ziyi Meng, Zhiqiang Tao

Abstract: Recently, hyperspectral imaging (HSI) has attracted increasing research attention, especially for the ones based on a coded aperture snapshot spectral imaging (CASSI) system. Existing deep HSI reconstruction models are generally trained on paired data to retrieve original signals upon 2D compressed measurements given by a particular optical hardware mask in CASSI, during which the mask largely impacts the reconstruction performance and could work as a "model hyperparameter" governing on data augmentations. This mask-specific training style will lead to a hardware miscalibration issue, which sets up barriers to deploying deep HSI models among different hardware and noisy environments. To address this challenge, we introduce mask uncertainty for HSI with a complete variational Bayesian learning treatment and explicitly model it through a mask decomposition inspired by real hardware. Specifically, we propose a novel Graph-based Self-Tuning (GST) network to reason uncertainties adapting to varying spatial structures of masks among different hardware. Moreover, we develop a bilevel optimization framework to balance HSI reconstruction and uncertainty estimation, accounting for the hyperparameter property of masks. Extensive experimental results and model discussions validate the effectiveness (over 33/30 dB) of the proposed GST method under two miscalibration scenarios and demonstrate a highly competitive performance compared with the state-of-the-art well-calibrated methods. Our code and pre-trained model are available at https://github.com/Jiamian-Wang/mask_uncertainty_spectral_SCI

Submitted 23 September, 2022; v1 submitted 31 December, 2021; originally announced December 2021.

Comments: ECCV 2022 Oral Paper

arXiv:2108.07739 [pdf, other]

A Simple and Efficient Reconstruction Backbone for Snapshot Compressive Imaging

Authors: Jiamian Wang, Yulun Zhang, Xin Yuan, Yun Fu, Zhiqiang Tao

Abstract: The emerging technology of snapshot compressive imaging (SCI) enables capturing high dimensional (HD) data in an efficient way. It is generally implemented by two components: an optical encoder that compresses HD signals into a 2D measurement and an algorithm decoder that retrieves the HD data upon the hardware-encoded measurement. Over a broad range of SCI applications, hyperspectral imaging (HSI) and video compressive sensing have received significant research attention in recent years. Among existing SCI reconstruction algorithms, deep learning-based methods stand out for their promising performance and efficient inference. However, the deep reconstruction network may suffer from overlarge model size and highly-specialized network design, which inevitably lead to costly training time, high memory usage, and limited flexibility, thus discouraging the deployments of SCI systems in practical scenarios. In this paper, we tackle the above challenges by proposing a simple yet highly efficient reconstruction method, namely stacked residual network (SRN), by revisiting the residual learning strategy with nested structures and spatial-invariant property. The proposed SRN empowers high-fidelity data retrieval with fewer computation operations and negligible model size compared with existing networks, and also serves as a versatile backbone applicable for both hyperspectral and video data. Based on the proposed backbone, we first develop the channel attention enhanced SRN (CAE-SRN) to explore the spectral inter-dependencies for fine-grained spatial estimation in HSI. We then employ SRN as a deep denoiser and incorporate it into a generalized alternating projection (GAP) framework, resulting in GAP-SRN, to handle the video compressive sensing task. Experimental results demonstrate the state-of-the-art performance and high computational efficiency of the proposed SRN on two SCI applications.

Submitted 1 February, 2022; v1 submitted 17 August, 2021; originally announced August 2021.

Comments: 17 pages, 15 figures. Code and pre-trained models: https://github.com/Jiamian-Wang/HSI_baseline

arXiv:2007.11257 [pdf, other]

Deep-VFX: Deep Action Recognition Driven VFX for Short Video

Authors: Ao Luo, Ning Xie, Zhijia Tao, Feng Jiang

Abstract: Human motion is a key means of communicating information. Short-form mobile video, such as Tik Tok, is popular all over the world. Users would like to add more VFX to pursue creativity and personality. Many special effects are available on short-video platforms, giving users more ways to show off their personality. The common, traditional way is to create VFX templates. However, to synthesize a polished result, users must make tedious attempts to grasp the timing and rhythm of new templates, which is not easy to use, especially in a mobile app. This paper aims to make VFX synthesis motion-driven instead of relying on traditional template matching. We propose an AI method to improve this VFX synthesis. In detail, skeleton extraction is essential in this system in order to add special effects to the human body. We also propose a novel form of LSTM to infer the user's intention through action recognition. The experiment shows that our system makes generating VFX for short video easier and more efficient.

Submitted 22 July, 2020; originally announced July 2020.

Showing 1–11 of 11 results for author: Tao, Z