
Zero-Shot Image Denoising for High-Resolution Electron Microscopy

Authors: Xuanyu Tian, Zhuoya Dong, Xiyue Lin, Yue Gao, Hongjiang Wei, Yanhang Ma, **gyi Yu, Yuyao Zhang

Abstract: High-resolution electron microscopy (HREM) imaging technique is a powerful tool for directly visualizing a broad range of materials in real-space. However, it faces challenges in denoising due to ultra-low signal-to-noise ratio (SNR) and scarce data availability. In this work, we propose Noise2SR, a zero-shot self-supervised learning (ZS-SSL) denoising framework for HREM. Within our framework, we propose a super-resolution (SR) based self-supervised training strategy, incorporating the Random Sub-sampler module. The Random Sub-sampler is designed to generate approximate infinite noisy pairs from a single noisy image, serving as an effective data augmentation in zero-shot denoising. Noise2SR trains the network with paired noisy images of different resolutions, which is conducted via SR strategy. The SR-based training facilitates the network adopting more pixels for supervision, and the random sub-sampling helps compel the network to learn continuous signals enhancing the robustness. Meanwhile, we mitigate the uncertainty caused by random-sampling by adopting minimum mean squared error (MMSE) estimation for the denoised results. With the distinctive integration of training strategy and proposed designs, Noise2SR can achieve superior denoising performance using a single noisy HREM image. We evaluate the performance of Noise2SR in both simulated and real HREM denoising tasks. It outperforms state-of-the-art ZS-SSL methods and achieves comparable denoising performance with supervised methods. The success of Noise2SR suggests its potential for improving the SNR of images in material imaging domains.
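
The idea of drawing many low-resolution noisy views from one noisy image can be illustrated with a minimal sketch. It is not the authors' Random Sub-sampler: the function name `random_subsample`, the non-overlapping `scale` x `scale` cell layout, and the one-pixel-per-cell rule are all assumptions made for illustration, since the abstract does not specify the sampling scheme.

```python
import random


def random_subsample(image, scale=2, rng=None):
    """Draw one random pixel from every scale x scale cell of `image`.

    `image` is a list of equal-length rows. Returns the low-resolution
    sub-image and the (row, col) position each sampled pixel came from.
    Hypothetical helper: the cell-based rule is an assumption, not the
    scheme described in the paper.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    if height % scale or width % scale:
        raise ValueError("image dimensions must be divisible by scale")

    low_res, positions = [], []
    for top in range(0, height, scale):
        row_values, row_positions = [], []
        for left in range(0, width, scale):
            # Pick a random offset inside the current cell.
            r = top + rng.randrange(scale)
            c = left + rng.randrange(scale)
            row_values.append(image[r][c])
            row_positions.append((r, c))
        low_res.append(row_values)
        positions.append(row_positions)
    return low_res, positions


noisy = [[4 * i + j for j in range(4)] for i in range(4)]
low, picked = random_subsample(noisy, scale=2, rng=random.Random(0))
```

Each call with a different random state yields a different sub-image, which is how a single noisy input can supply many training pairs; the pixels left unsampled remain available as higher-resolution supervision.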

Submitted 20 June, 2024; originally announced June 2024.

Comments: 12 pages, 12 figures

arXiv:2406.00974 [pdf, other]

Large Language Model Assisted Optimal Bidding of BESS in FCAS Market: An AI-agent based Approach

Authors: Borui Zhang, Chaojie Li, Guo Chen, Zhaoyang Dong

Abstract: To incentivize flexible resources such as Battery Energy Storage Systems (BESSs) to offer Frequency Control Ancillary Services (FCAS), Australia's National Electricity Market (NEM) has implemented changes in recent years towards shorter-term bidding rules and faster service requirements. However, firstly, existing bidding optimization methods often overlook or oversimplify the key aspects of FCAS market procedures, resulting in an inaccurate depiction of the market bidding process. Thus, the BESS bidding problem is modeled based on the actual bidding records and the latest market specifications and then formulated as a deep reinforcement learning (DRL) problem. Secondly, the erratic decisions of the DRL agent caused by imperfectly predicted market information increase the risk of profit loss. Hence, a Conditional Value at Risk (CVaR)-based DRL algorithm is developed to enhance the risk resilience of bidding strategies. Thirdly, well-trained DRL models still face performance decline in uncommon scenarios during online operations. Therefore, a Large Language Models (LLMs)-assisted artificial intelligence (AI)-agent interactive decision-making framework is proposed to improve the strategy timeliness, reliability and interpretability in uncertain new scenarios, where conditional hybrid decision and self-reflection mechanisms are designed to address LLMs' hallucination challenge. The experiment results demonstrate that our proposed framework has higher bidding profitability compared to the baseline methods by effectively mitigating the profit loss caused by various uncertainties.
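
For readers unfamiliar with the risk measure named in this abstract, the following is a minimal sketch of an empirical CVaR estimate: the average of the worst `(1 - alpha)` fraction of sampled losses. The function name `empirical_cvar` and the tail-mean estimator are assumptions for illustration; the abstract does not say how CVaR enters the authors' DRL algorithm.

```python
import math


def empirical_cvar(losses, alpha=0.95):
    """Mean of the worst (1 - alpha) fraction of `losses`.

    Larger values are treated as worse outcomes. This is a plain
    sample-based estimator, not the paper's algorithm.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")

    ordered = sorted(losses, reverse=True)
    # Round before ceil so floating-point noise such as
    # 5.000000000000004 does not add an extra sample to the tail.
    tail_size = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:tail_size]
    return sum(tail) / tail_size


sample_losses = list(range(1, 101))
risk = empirical_cvar(sample_losses, alpha=0.95)
```

A risk-averse agent would penalize this tail average rather than the plain mean, which makes rare but severe losses count more heavily in the objective.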

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2405.06999 [pdf, other]

Large Language Model-aided Edge Learning in Distribution System State Estimation

Authors: Renyou Xie, Xin Yin, Chaojie Li, Nian Liu, Bo Zhao, Zhaoyang Dong

Abstract: Distribution system state estimation (DSSE) plays a crucial role in the real-time monitoring, control, and operation of distribution networks. Besides intensive computational requirements, conventional DSSE methods need high-quality measurements to obtain accurate states, whereas missing values often occur due to sensor failures or communication delays. To address these challenging issues, a forecast-then-estimate framework of edge learning is proposed for DSSE, leveraging large language models (LLMs) to forecast missing measurements and provide pseudo-measurements. Firstly, natural language-based prompts and measurement sequences are integrated by the proposed LLM to learn patterns from historical data and provide accurate forecasting results. Secondly, a convolutional layer-based neural network model is introduced to improve the robustness of state estimation under missing measurement. Thirdly, to alleviate the overfitting of the deep learning-based DSSE, it is reformulated as a multi-task learning framework containing shared and task-specific layers. The uncertainty weighting algorithm is applied to find the optimal weights to balance different tasks. The numerical simulation on the Simbench case is used to demonstrate the effectiveness of the proposed forecast-then-estimate framework.

Submitted 11 May, 2024; originally announced May 2024.

arXiv:2405.03952 [pdf, other]

HAFFormer: A Hierarchical Attention-Free Framework for Alzheimer's Disease Detection From Spontaneous Speech

Authors: Zhongren Dong, Zixing Zhang, Weixiang Xu, **g Han, Jianjun Ou, Björn W. Schuller

Abstract: Automatically detecting Alzheimer's Disease (AD) from spontaneous speech plays an important role in its early diagnosis. Recent approaches highly rely on the Transformer architectures due to its efficiency in modelling long-range context dependencies. However, the quadratic increase in computational complexity associated with self-attention and the length of audio poses a challenge when deploying such models on edge devices. In this context, we construct a novel framework, namely Hierarchical Attention-Free Transformer (HAFFormer), to better deal with long speech for AD detection. Specifically, we employ an attention-free module of Multi-Scale Depthwise Convolution to replace the self-attention and thus avoid the expensive computation, and a GELU-based Gated Linear Unit to replace the feedforward layer, aiming to automatically filter out the redundant information. Moreover, we design a hierarchical structure to force it to learn a variety of information grains, from the frame level to the dialogue level. By conducting extensive experiments on the ADReSS-M dataset, the introduced HAFFormer can achieve competitive results (82.6% accuracy) with other recent work, but with significant computational complexity and model size reduction compared to the standard Transformer. This shows the efficiency of HAFFormer in dealing with long audio for AD detection.

Submitted 6 May, 2024; originally announced May 2024.

Journal ref: Published at ICASSP 2024

arXiv:2404.10777 [pdf, other]

Divide-Conquer-and-Merge: Memory- and Time-Efficient Holographic Displays

Authors: Zhenxing Dong, Jidong Jia, Yan Li, Yuye Ling

Abstract: Recently, deep learning-based computer-generated holography (CGH) has demonstrated tremendous potential in three-dimensional (3D) displays and yielded impressive display quality. However, most existing deep learning-based CGH techniques can only generate holograms of 1080p resolution, which is far from the ultra-high resolution (16K+) required for practical virtual reality (VR) and augmented reality (AR) applications to support a wide field of view and large eye box. One of the major obstacles in current CGH frameworks lies in the limited memory available on consumer-grade GPUs which could not facilitate the generation of higher-definition holograms. To overcome the aforementioned challenge, we proposed a divide-conquer-and-merge strategy to address the memory and computational capacity scarcity in ultra-high-definition CGH generation. This algorithm empowers existing CGH frameworks to synthesize higher-definition holograms at a faster speed while maintaining high-fidelity image display quality. Both simulations and experiments were conducted to demonstrate the capabilities of the proposed framework. By integrating our strategy into HoloNet and CCNNs, we achieved significant reductions in GPU memory usage during the training period by 64.3% and 12.9%, respectively. Furthermore, we observed substantial speed improvements in hologram generation, with an acceleration of up to 3× and 2×, respectively. Particularly, we successfully trained and inferred 8K definition holograms on an NVIDIA GeForce RTX 3090 GPU for the first time in simulations. Furthermore, we conducted full-color optical experiments to verify the effectiveness of our method. We believe our strategy can provide a novel approach for memory- and time-efficient holographic displays.

Submitted 25 February, 2024; originally announced April 2024.

Comments: This paper has been accepted as conference paper in IEEE VR 2024

arXiv:2403.15029 [pdf]

On the Solution Uniqueness of Data-Driven Modeling of Flexible Loads

Authors: Shuai Lu, Jiayi Ding, Wei Gu, Junpeng Zhu, Yijun Xu, Zhaoyang Dong, Zezheng Sun

Abstract: This letter first explores the solution uniqueness of the data-driven modeling of price-responsive flexible loads (PFL). The PFL on the demand side is critical in modern power systems. An accurate PFL model is fundamental for system operations. Yet, whether the PFL model can be uniquely and correctly identified from operational data remains unclear. To address this, we analyze the structural and practical identifiability of the PFL model, deriving the condition for the solution uniqueness. Based on this, we point out the implications for selecting physical models of PFL to enhance the identification results. Numerical results validate this work.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.08200 [pdf, ps, other]

Prototyping and Experimental Results for Environment-Aware Millimeter Wave Beam Alignment via Channel Knowledge Map

Authors: Zhuoyin Dai, Di Wu, Zhenjun Dong, Kun Li, Dingyang Ding, Sihan Wang, Yong Zeng

Abstract: Channel knowledge map (CKM), which aims to directly reflect the intrinsic channel properties of the local wireless environment, is a novel technique for achieving environment-aware communication. In this paper, to alleviate the large training overhead in millimeter wave (mmWave) beam alignment, an environment-aware and training-free beam alignment prototype is established based on a typical CKM, termed beam index map (BIM). To this end, a general CKM construction method is first presented, and an indoor BIM is constructed offline to learn the candidate transmit and receive beam index pairs for each grid in the experimental area. Furthermore, based on the location information of the receiver (or the dynamic obstacles) from the ultra-wide band (UWB) positioning system, the established BIM is used to achieve training-free beam alignment by directly providing the beam indexes for the transmitter and receiver. Three typical scenarios are considered in the experiment, including quasi-static environment with line-of-sight (LoS) link, quasi-static environment without LoS link and dynamic environment. Besides, the receiver orientation measured from the gyroscope is also used to help CKM predict more accurate beam indexes. The experiment results show that compared with the benchmark location-based beam alignment strategy, the CKM-based beam alignment strategy can achieve much higher received power, which is close to that achieved by exhaustive beam search, but with significantly reduced training overhead.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.03314 [pdf, other]

Collision Avoidance Verification of Multiagent Systems with Learned Policies

Authors: Zihao Dong, Shayegan Omidshafiei, Michael Everett

Abstract: For many multiagent control problems, neural networks (NNs) have enabled promising new capabilities. However, many of these systems lack formal guarantees (e.g., collision avoidance, robustness), which prevents leveraging these advances in safety-critical settings. While there is recent work on formal verification of NN-controlled systems, most existing techniques cannot handle scenarios with more than one agent. To address this research gap, this paper presents a backward reachability-based approach for verifying the collision avoidance properties of Multi-Agent Neural Feedback Loops (MA-NFLs). Given the dynamics models and trained control policies of each agent, the proposed algorithm computes relative backprojection sets by (simultaneously) solving a series of Mixed Integer Linear Programs (MILPs) offline for each pair of agents. We account for state measurement uncertainties, making it well aligned with real-world scenarios. Using those results, the agents can quickly check for collision avoidance online by solving low-dimensional Linear Programs (LPs). We demonstrate the proposed algorithm can verify collision-free properties of a MA-NFL with agents trained to imitate a collision avoidance algorithm (Reciprocal Velocity Obstacles). We further demonstrate the computational scalability of the approach on systems with up to 10 agents.

Submitted 25 April, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

Comments: 6 pages, 6 figures

arXiv:2401.16446 [pdf]

Framework of Resilient Transmission Network Reconfiguration Considering Cyber-Attacks

Authors: Chao Yang, Gaoqi Liang, Steven R. Weller, Shaoyan Li, Junhua Zhao, Zhaoyang Dong

Abstract: Fast and reliable transmission network reconfiguration is critical in improving power grid resilience to cyber-attacks. If the network reconfiguration following cyber-attacks is imperfect, secondary incidents may delay or interrupt post-attack restoration of the power grid. This paper proposes a framework of resilient transmission network reconfiguration, taking into account the impacts of cyber-attacks in the network reconfiguration process. First, the mechanism of cyber-attack propagation is analyzed based on the characteristics of network reconfiguration. Second, systematic resilience indices are specially extracted in which the impact of cyber-attacks on network reconfiguration is quantified. These indices are defined in terms of the restoration characteristics of the transmission power system. Third, representative cyber-attack incidents motivate an optimization-based model of resilient transmission network reconfiguration, and an optimal reconstruction scheme is obtained. Finally, simulation results based on the IEEE 39-bus system verify the feasibility and effectiveness of the proposed framework in enhancing power grid resilience to cyber-attacks.

Submitted 28 January, 2024; originally announced January 2024.

arXiv:2312.16082 [pdf, ps, other]

The Quantum Kalman Decomposition: A Gramian Matrix Approach

Authors: Guofeng Zhang, **ghao Li, Zhiyuan Dong, Ian R. Petersen

Abstract: The Kalman canonical form for quantum linear systems was derived in \cite{ZGPG18}. The purpose of this paper is to present an alternative derivation by means of a Gramian matrix approach. Controllability and observability Gramian matrices are defined for linear quantum systems, which are used to characterize various subspaces. Based on these characterizations, real orthogonal and block symplectic coordinate transformation matrices are constructed to transform a given quantum linear system to the Kalman canonical form. An example is used to illustrate the main results.

Submitted 26 December, 2023; originally announced December 2023.

Comments: 22 pages, 2 figures, submitted for publication. Comments are welcome

arXiv:2312.12795 [pdf, ps, other]

doi 10.1109/TSG.2023.3326928

Joint Trading and Scheduling among Coupled Carbon-Electricity-Heat-Gas Industrial Clusters

Authors: Dafeng Zhu, Bo Yang, Yu Wu, Haoran Deng, Zhaoyang Dong, Kai Ma, ** Guan

Abstract: This paper presents a carbon-energy coupling management framework for an industrial park, where the carbon flow model accompanying multi-energy flows is adopted to track and suppress carbon emissions on the user side. To deal with the quadratic constraint of gas flows, a bound tightening algorithm for constraints relaxation is adopted. The synergies among the carbon capture, energy storage, power-to-gas further consume renewable energy and reduce carbon emissions. Aiming at carbon emissions disparities and supply-demand imbalances, this paper proposes a carbon trading ladder reward and punishment mechanism and an energy trading and scheduling method based on Lyapunov optimization and matching game to maximize the long-term benefits of each industrial cluster without knowing the prior information of random variables. Case studies show that our proposed trading method can reduce overall costs and carbon emissions while relieving energy pressure, which is important for Environmental, Social and Governance (ESG).

Submitted 20 December, 2023; originally announced December 2023.

Journal ref: IEEE Transactions on Smart Grid, 2023

arXiv:2312.06197 [pdf, other]

MART: Learning Hierarchical Music Audio Representations with Part-Whole Transformer

Authors: Dong Yao, Jieming Zhu, Jiahao Xun, Shengyu Zhang, Zhou Zhao, Liqun Deng, Wenqiao Zhang, Zhenhua Dong, Xin Jiang

Abstract: Recent research in self-supervised contrastive learning of music representations has demonstrated remarkable results across diverse downstream tasks. However, a prevailing trend in existing methods involves representing equally-sized music clips in either waveform or spectrogram formats, often overlooking the intrinsic part-whole hierarchies within music. In our quest to comprehend the bottom-up structure of music, we introduce MART, a hierarchical music representation learning approach that facilitates feature interactions among cropped music clips while considering their part-whole hierarchies. Specifically, we propose a hierarchical part-whole transformer to capture the structural relationships between music clips in a part-whole hierarchy. Furthermore, a hierarchical contrastive learning objective is crafted to align part-whole music representations at adjacent levels, progressively establishing a multi-hierarchy representation space. The effectiveness of our music representation learning from part-whole hierarchies has been empirically validated across multiple downstream tasks, including music classification and cover song identification.

Submitted 19 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: Short paper accepted by WWW 2024. This is revised and condensed based on the previous version titled "Music-PAW: Learning Music Representations via Hierarchical Part-whole Interaction and Contrast". For more experimental details and discussions, please refer to the original long paper at arXiv:2312.06197v1

arXiv:2311.13361 [pdf, other]

Applying Large Language Models to Power Systems: Potential Security Threats

Authors: Jiaqi Ruan, Gaoqi Liang, Huan Zhao, Guolong Liu, Xianzhuo Sun, **g Qiu, Zhao Xu, Fushuan Wen, Zhao Yang Dong

Abstract: Applying large language models (LLMs) to modern power systems presents a promising avenue for enhancing decision-making and operational efficiency. However, this action may also incur potential security threats, which have not been fully recognized so far. To this end, this article analyzes potential threats incurred by applying LLMs to power systems, emphasizing the need for urgent research and development of countermeasures.

Submitted 24 January, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

arXiv:2310.11690 [pdf, other]

doi 10.1016/j.rser.2023.113913

Deep learning based on Transformer architecture for power system short-term voltage stability assessment with class imbalance

Authors: Yang Li, Jiting Cao, Yan Xu, Lipeng Zhu, Zhao Yang Dong

Abstract: Most existing data-driven power system short-term voltage stability assessment (STVSA) approaches presume class-balanced input data. However, in practical applications, the occurrence of short-term voltage instability following a disturbance is minimal, leading to a significant class imbalance problem and a consequent decline in classifier performance. This work proposes a Transformer-based STVSA method to address this challenge. By utilizing the basic Transformer architecture, a stability assessment Transformer (StaaT) is developed as a classification model to reflect the correlation between the operational states of the system and the resulting stability outcomes. To combat the negative impact of imbalanced datasets, this work employs a conditional Wasserstein generative adversarial network with gradient penalty (CWGAN-GP) for synthetic data generation, aiding in the creation of a balanced, representative training set for the classifier. Semi-supervised clustering learning is implemented to enhance clustering quality, addressing the lack of a unified quantitative criterion for short-term voltage stability. Numerical tests on the IEEE 39-bus test system extensively demonstrate that the proposed method exhibits robust performance under class imbalances up to 100:1 and noisy environments, and maintains consistent effectiveness even with an increased penetration of renewable energy. Comparative results reveal that the CWGAN-GP generates more balanced datasets than traditional oversampling methods and that the StaaT outperforms other deep learning algorithms. This study presents a compelling solution for real-world STVSA applications that often face class imbalance and data noise challenges.

Submitted 17 October, 2023; originally announced October 2023.

Comments: Accepted by Renewable and Sustainable Energy Reviews

Journal ref: Renewable and Sustainable Energy Reviews 189 (2024) 113913

arXiv:2310.11044 [pdf, ps, other]

A Tutorial on Near-Field XL-MIMO Communications Towards 6G

Authors: Haiquan Lu, Yong Zeng, Changsheng You, Yu Han, Jiayi Zhang, Zhe Wang, Zhenjun Dong, Shi **, Cheng-Xiang Wang, Tao Jiang, Xiaohu You, Rui Zhang

Abstract: Extremely large-scale multiple-input multiple-output (XL-MIMO) is a promising technology for the sixth-generation (6G) mobile communication networks. By significantly boosting the antenna number or size to at least an order of magnitude beyond current massive MIMO systems, XL-MIMO is expected to unprecedentedly enhance the spectral efficiency and spatial resolution for wireless communication. The evolution from massive MIMO to XL-MIMO is not simply an increase in the array size, but faces new design challenges, in terms of near-field channel modelling, performance analysis, channel estimation, and practical implementation. In this article, we give a comprehensive tutorial overview on near-field XL-MIMO communications, aiming to provide useful guidance for tackling the above challenges. First, the basic near-field modelling for XL-MIMO is established, by considering the new characteristics of non-uniform spherical wave (NUSW) and spatial non-stationarity. Next, based on the near-field modelling, the performance analysis of XL-MIMO is presented, including the near-field signal-to-noise ratio (SNR) scaling laws, beam focusing pattern, achievable rate, and degrees-of-freedom (DoF). Furthermore, various XL-MIMO design issues such as near-field beam codebook, beam training, channel estimation, and delay alignment modulation (DAM) transmission are elaborated. Finally, we point out promising directions to inspire future research on near-field XL-MIMO communications.

Submitted 3 April, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: 42 pages

arXiv:2310.08418 [pdf, ps, other]

doi 10.1109/TSG.2024.3420743

Privacy-Preserved Aggregate Thermal Dynamic Model of Buildings

Authors: Zeyin Hou, Shuai Lu, Yijun Xu, Haifeng Qiu, Wei Gu, Zhaoyang Dong, Shixing Ding

Abstract: The thermal inertia of buildings brings considerable flexibility to the heating and cooling load, which is known to be a promising demand response resource. The aggregate model that can describe the thermal dynamics of the building cluster is an important interference for energy systems to exploit its intrinsic thermal inertia. However, the private information of users, such as the indoor temperature and heating/cooling power, needs to be collected in the parameter estimation procedure to obtain the aggregate model, causing severe privacy concerns. In light of this, we propose a novel privacy-preserved parameter estimation approach to infer the aggregate model for the thermal dynamics of the building cluster for the first time. Using it, the parameters of the aggregate thermal dynamic model (ATDM) can be obtained by the load aggregator without accessing the individual's privacy information. More specifically, this method not only exploits the block coordinate descent (BCD) method to resolve its non-convexity in the estimation but investigates the transformation-based encryption (TE) associated with its secure aggregation protocol (SAP) techniques to realize privacy-preserved computation. Its capability of preserving privacy is also theoretically proven. Finally, simulation results using real-world data demonstrate the accuracy and privacy-preserved performance of our proposed method.

Submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.06238 [pdf, other]

Tackling Data Bias in MUSIC-AVQA: Crafting a Balanced Dataset for Unbiased Question-Answering

Authors: Xiulong Liu, Zhikang Dong, Peng Zhang

Abstract: In recent years, there has been a growing emphasis on the intersection of audio, vision, and text modalities, driving forward the advancements in multimodal research. However, strong bias that exists in any modality can lead to the model neglecting the others. Consequently, the model's ability to effectively reason across these diverse modalities is compromised, impeding further advancement. In this paper, we meticulously review each question type from the original dataset, selecting those with pronounced answer biases. To counter these biases, we gather complementary videos and questions, ensuring that no answer has a markedly skewed distribution. In particular, for binary questions, we strive to ensure that both answers are almost uniformly spread within each question category. As a result, we construct a new dataset, named MUSIC-AVQA v2.0, which is more challenging and which we believe could better foster the progress of the AVQA task. Furthermore, we present a novel baseline model that delves deeper into the audio-visual-text interrelation. On MUSIC-AVQA v2.0, this model surpasses all the existing benchmarks, improving accuracy by 2% and setting a new state-of-the-art performance.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2309.11139 [pdf, other]

More complex encoder is not all you need

Authors: Weibin Yang, Longwei Xu, Pengwei Wang, Dehua Geng, Yusong Li, Mingyuan Xu, Zhiqi Dong

Abstract: U-Net and its variants have been widely used in medical image segmentation. However, most current U-Net variants confine their improvement strategies to building more complex encoders, while leaving the decoder unchanged or adopting a simple symmetric structure. These approaches overlook the true functionality of the decoder: receiving low-resolution feature maps from the encoder and restoring feature map resolution and lost information through upsampling. As a result, the decoder, especially its upsampling component, plays a crucial role in enhancing segmentation outcomes. However, in 3D medical image segmentation, the commonly used transposed convolution can result in visual artifacts. This issue stems from the absence of a direct relationship between adjacent pixels in the output feature map. Furthermore, a plain encoder already possesses sufficient feature extraction capability because the downsampling operation leads to the gradual expansion of the receptive field, but the loss of information during the downsampling process cannot be ignored. To address the gap in relevant research, we extend our focus beyond the encoder and introduce neU-Net (i.e., not complex encoder U-Net), which incorporates a novel Sub-pixel Convolution for upsampling to construct a powerful decoder. Additionally, we introduce a multi-scale wavelet inputs module on the encoder side to provide additional information. Our model design achieves excellent results, surpassing other state-of-the-art methods on both the Synapse and ACDC datasets.

Submitted 27 October, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

arXiv:2308.11289 [pdf, other]

Multi-User Modular XL-MIMO Communications: Near-Field Beam Focusing Pattern and User Grouping

Authors: Xinrui Li, Zhenjun Dong, Yong Zeng, Shi Jin, Rui Zhang

Abstract: In this paper, we investigate multi-user modular extremely large-scale multiple-input multiple-output (XL-MIMO) communication systems, where a modular extremely large-scale uniform linear array (XL-ULA) is deployed at the base station (BS) to serve multiple single-antenna users. By exploiting the unique modular array architecture and considering the potential near-field propagation, we develop sub-array based uniform spherical wave (USW) models for distinct versus common angles of arrival/departure (AoAs/AoDs) with respect to different sub-arrays/modules, respectively. Under such USW models, we analyze the beam focusing patterns at the near-field observation location by using near-field beamforming. The analysis reveals that compared to the conventional XL-MIMO with collocated antenna elements, modular XL-MIMO can provide better spatial resolution by benefiting from its larger array aperture. However, it also incurs undesired grating lobes due to the large inter-module separation. Moreover, it is found that for multi-user modular XL-MIMO communications, the achievable signal-to-interference-plus-noise ratio (SINR) for users may be degraded by the grating lobes of the beam focusing pattern. To address this issue, an efficient user grouping method is proposed for multi-user transmission scheduling, so that users located within the grating lobes of each other are not allocated to the same time-frequency resource block (RB) for their communications. Numerical results are presented to verify the effectiveness of the proposed user grouping method, as well as the superior performance of modular XL-MIMO over its collocated counterpart with densely distributed users.

Submitted 22 August, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

arXiv:2307.11130 [pdf, other]

Frequency-aware optical coherence tomography image super-resolution via conditional generative adversarial neural network

Authors: Xueshen Li, Zhenxing Dong, Hongshan Liu, Jennifer J. Kang-Mieler, Yuye Ling, Yu Gan

Abstract: Optical coherence tomography (OCT) has stimulated a wide range of medical image-based diagnosis and treatment in fields such as cardiology and ophthalmology. Such applications can be further facilitated by deep learning-based super-resolution technology, which improves the capability of resolving morphological structures. However, existing deep learning-based methods focus only on spatial distribution and disregard frequency fidelity in image reconstruction, leading to a frequency bias. To overcome this limitation, we propose a frequency-aware super-resolution framework that integrates three critical frequency-based modules (i.e., frequency transformation, frequency skip connection, and frequency alignment) and a frequency-based loss function into a conditional generative adversarial network (cGAN). We conducted a large-scale quantitative study from an existing coronary OCT dataset to demonstrate the superiority of our proposed framework over existing deep learning frameworks. In addition, we confirmed the generalizability of our framework by applying it to fish corneal images and rat retinal images, demonstrating its capability to super-resolve morphological details in eye imaging.

Submitted 20 July, 2023; originally announced July 2023.

Comments: 13 pages, 7 figures, submitted to Biomedical Optics Express special issue

arXiv:2307.09775 [pdf, other]

DisCover: Disentangled Music Representation Learning for Cover Song Identification

Authors: Jiahao Xun, Shengyu Zhang, Yanting Yang, Jieming Zhu, Liqun Deng, Zhou Zhao, Zhenhua Dong, Ruiqi Li, Lichao Zhang, Fei Wu

Abstract: In the field of music information retrieval (MIR), cover song identification (CSI) is a challenging task that aims to identify cover versions of a query song from a massive collection. Existing works still suffer from high intra-song variances and inter-song correlations, due to the entangled nature of version-specific and version-invariant factors in their modeling. In this work, we set the goal of disentangling version-specific and version-invariant factors, which could make it easier for the model to learn invariant music representations for unseen query songs. We analyze the CSI task in a disentanglement view with the causal graph technique, and identify the intra-version and inter-version effects biasing the invariant learning. To block these effects, we propose the disentangled music representation learning framework (DisCover) for CSI. DisCover consists of two critical components: (1) Knowledge-guided Disentanglement Module (KDM) and (2) Gradient-based Adversarial Disentanglement Module (GADM), which block intra-version and inter-version biased effects, respectively. KDM minimizes the mutual information between the learned representations and version-variant factors that are identified with prior domain knowledge. GADM identifies version-variant factors by simulating the representation transitions between intra-song versions, and exploits adversarial distillation for effect blocking. Extensive comparisons with best-performing methods and in-depth analysis demonstrate the effectiveness of DisCover and the necessity of disentanglement for CSI.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2307.00858 [pdf, ps, other]

Beyond the Snapshot: Brain Tokenized Graph Transformer for Longitudinal Brain Functional Connectome Embedding

Authors: Zijian Dong, Yilei Wu, Yu Xiao, Joanna Su Xian Chong, Yueming Jin, Juan Helen Zhou

Abstract: Under the framework of network-based neurodegeneration, brain functional connectome (FC)-based Graph Neural Networks (GNN) have emerged as a valuable tool for the diagnosis and prognosis of neurodegenerative diseases such as Alzheimer's disease (AD). However, these models are tailored for brain FC at a single time point instead of characterizing FC trajectory. Discerning how FC evolves with disease progression, particularly at the predementia stages such as cognitively normal individuals with amyloid deposition or individuals with mild cognitive impairment (MCI), is crucial for delineating disease spreading patterns and developing effective strategies to slow down or even halt disease advancement. In this work, we proposed the first interpretable framework for brain FC trajectory embedding with application to neurodegenerative disease diagnosis and prognosis, namely Brain Tokenized Graph Transformer (Brain TokenGT). It consists of two modules: 1) Graph Invariant and Variant Embedding (GIVE) for generation of node and spatio-temporal edge embeddings, which were tokenized for downstream processing; 2) Brain Informed Graph Transformer Readout (BIGTR), which augments previous tokens with trainable type identifiers and non-trainable node identifiers and feeds them into a standard transformer encoder to readout. We conducted extensive experiments on two public longitudinal fMRI datasets of the AD continuum for three tasks, including differentiating MCI from controls, predicting dementia conversion in MCI, and classification of amyloid positive or negative cognitively normal individuals. Based on brain FC trajectory, the proposed Brain TokenGT approach outperformed all the other benchmark models and at the same time provided excellent interpretability. The code is available at https://github.com/ZijianD/Brain-TokenGT.git

Submitted 12 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: MICCAI 2023

arXiv:2306.07712 [pdf]

Multiple-Step Quantized Triplet STDP Implemented with Memristive Synapse

Authors: Y. Liu, D. Wang, Z. Dong, W. Zhao

Abstract: As an extension of the pairwise spike-timing-dependent plasticity (STDP) learning rule, the triplet STDP is provided with greater capability in characterizing the synaptic changes in the biological neural cell. In this work, a novel mixed-signal circuit scheme, called multiple-step quantized triplet STDP, is designed to provide a precise and flexible implementation of coactivation triplet STDP learning rule in memristive synapse spiking neural network. The robustness of the circuit is greatly improved through the utilization of pulse-width encoded weight modulation signals. The circuit performance is studied through the simulations which are carried out in MATLAB Simulink & Simscape, and assessment is given by comparing the results of circuits with the algorithmic approaches.

Submitted 27 August, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: 5 pages, 9 figures

arXiv:2306.06379 [pdf]

Implementation of Multiple-Step Quantized STDP Based on Novel Memristive Synapses

Authors: Y. Liu, D. Wang, Z. Dong, H. Xie, W. Zhao

Abstract: Memristors have been widely studied as artificial synapses in neuromorphic circuits, due to their functional similarity with biological synapses, low operating power, and high integration density. In this work, a memristive synapse, composed of four memristors and two resistors, for SNN is designed and utilized for a neuron circuit implementing the robust spike-timing dependent plasticity learning. The synapse can be either excitatory or inhibitory by rationally arranging the resistors in the circuit. This is the first of its kind, enabling Hebbian and anti-Hebbian training without requiring additional processing of neural signals. Then, a neuron circuit is designed based on the proposed synapses. The robustness and compatibility of this neuron circuit are greatly enhanced by employing clock-based square-wave pulses to transmit spikes and modulate the synaptic weight. To study the performance of the proposed synapses and circuit, simulations based on behavior models are carried out in MATLAB Simulink and Simscape. Specifically, a memristor model with balanced flexibility, efficiency, convergence, and emulation performance is developed by including the nonlinear Joule effect. Using this memristor model in pattern learning, the influence of weak signal-induced weight variation on circuit performance can be rigorously assessed. This proposed circuit could give some inspiration for combining the analog memristive synapse and leaky integrate-and-fire neuron with digital control units, prompting their development as edge computing devices.

Submitted 27 August, 2023; v1 submitted 10 June, 2023; originally announced June 2023.

Comments: 10 pages, 20 figures

arXiv:2305.05408 [pdf, other]

Near-Field Beam Focusing Pattern and Grating Lobe Characterization for Modular XL-Array

Authors: Xinrui Li, Zhenjun Dong, Yong Zeng, Shi Jin, Rui Zhang

Abstract: In this paper, we investigate the near-field modelling and analyze the beam focusing pattern for modular extremely large-scale array (XL-array) communications. As modular XL-array is physically and electrically large in general, the accurate characterization of amplitude and phase variations across its array elements requires the non-uniform spherical wave (NUSW) model, which, however, is difficult for performance analysis and optimization. To address this issue, we first present two ways to simplify the NUSW model by exploiting the unique regular structure of modular XL-array, termed sub-array based uniform spherical wave (USW) models with different or common angles, respectively. Based on the developed models, the near-field beam focusing patterns of XL-array communications are derived. It is revealed that compared to the existing collocated XL-array with the same number of array elements, modular XL-array can significantly enhance the spatial resolution, but at the cost of generating undesired grating lobes. Fortunately, different from the conventional far-field uniform plane wave (UPW) model, the near-field USW model for modular XL-array exhibits a higher grating lobe suppression capability, thanks to the non-linear phase variations across the array elements. Finally, simulation results are provided to verify the near-field beam focusing pattern and grating lobe characteristics of modular XL-array.

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2301.12325 [pdf, ps, other]

Data-Driven Load-Current Sharing Control for Multi-Stack Fuel Cell System with Circulating Current Mitigation

Authors: Yiqiao Xu, Xiaoyu Guo, Zhen Dong, Zhengtao Ding, Alessandra Parisio

Abstract: The global trend toward renewable power generation has drawn great attention to hydrogen Fuel Cells (FCs), which have a wide variety of applications, from utility power stations to laptops. The Multi-stack Fuel Cell System (MFCS), which is an assembly of FC stacks, can be a remedy for obstacles in high-power applications. However, the output voltage of FC stacks varies dramatically under variable load conditions; hence, in order for MFCS to be efficiently operated and guarantee an appropriate load-current sharing among the FC stacks, advanced converter controllers for power conditioning need to be designed. An accurate circuit model is essential for controller design, which accounts for the fact that the parameters of some converter components may change due to aging and repetitive stress in long-term operations. Existing control frameworks and parametric and non-parametric system identification techniques do not consider the aforementioned challenges. Thus, this paper investigates the potential of a data-driven method that, without system identification, directly implements control on paralleled converters using raw data. Based on pre-collected input/output trajectories, a non-parametric representation of the overall circuit is produced for implementing predictive control. While approaching equal current sharing within the MFCS, the proposed method considers the minimization of load-following error and mitigation of circulating current between the converters. Simulation results verify the effectiveness of the proposed method.

Submitted 28 January, 2023; originally announced January 2023.

arXiv:2209.00778 [pdf, other]

doi 10.1109/TSG.2022.3204796

Detection of False Data Injection Attacks in Smart Grid: A Secure Federated Deep Learning Approach

Authors: Yang Li, Xinhao Wei, Yuanzheng Li, Zhaoyang Dong, Mohammad Shahidehpour

Abstract: As an important cyber-physical system (CPS), smart grid is highly vulnerable to cyber attacks. Amongst various types of attacks, false data injection attack (FDIA) proves to be one of the top-priority cyber-related issues and has received increasing attention in recent years. However, so far little attention has been paid to privacy preservation issues in the detection of FDIAs in smart grid. Inspired by federated learning, an FDIA detection method based on secure federated deep learning is proposed in this paper by combining Transformer, federated learning and Paillier cryptosystem. The Transformer, as a detector deployed in edge nodes, delves deep into the connection between individual electrical quantities by using its multi-head self-attention mechanism. By using a federated learning framework, our approach utilizes the data from all nodes to collaboratively train a detection model while preserving data privacy by keeping the data local during training. To improve the security of federated learning, a secure federated learning scheme is designed by combining the Paillier cryptosystem with federated learning. Through extensive experiments on the IEEE 14-bus and 118-bus test systems, the effectiveness and superiority of the proposed method are verified.

Submitted 5 September, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

Comments: Accepted by IEEE Transactions on Smart Grid

Journal ref: IEEE Transactions on Smart Grid 13 (2022) 4862-4872

arXiv:2208.13342 [pdf, other]

Convergent Economic Model Predictive Control through parameter-varying storage functions for dissipativity

Authors: Zihang Dong, David Angeli, Goran Strbac

Abstract: This paper presents a new concept of controlled dissipativity as an extension of the standard dissipativity property to systems with parameter-varying storage functions under the framework of economic model predictive control (EMPC). Based on this concept, two EMPC controllers, integrated with the dissipation inequality constraints rendering the storage function parameters as decision variables, are formulated and the associated recursive feasibility is ensured. Then, the asymptotic convergence to an optimal equilibrium in closed-loop, without requiring the standard dissipativity assumption, is enforced by trading it off with asymptotic performance. The upper bound of asymptotic average closed-loop performance is also evaluated. Finally, an illustrative example by using the EMPC controllers with terminal equilibrium or terminal region conditions is provided to show the effectiveness of our methods.

Submitted 21 February, 2023; v1 submitted 28 August, 2022; originally announced August 2022.

arXiv:2208.13003 [pdf]

doi 10.1002/mrm.29657

Latent Signal Models: Learning Compact Representations of Signal Evolution for Improved Time-Resolved, Multi-contrast MRI

Authors: Yamin Arefeen, Junshen Xu, Molin Zhang, Zijing Dong, Fuyixue Wang, Jacob White, Berkin Bilgic, Elfar Adalsteinsson

Abstract: Purpose: Training auto-encoders on simulated signal evolution and inserting the decoder into the forward model improves reconstructions through more compact, Bloch-equation-based representations of signal in comparison to linear subspaces. Methods: Building on model-based nonlinear and linear subspace techniques that enable reconstruction of signal dynamics, we train auto-encoders on dictionaries of simulated signal evolution to learn more compact, non-linear, latent representations. The proposed Latent Signal Model framework inserts the decoder portion of the auto-encoder into the forward model and directly reconstructs the latent representation. Latent Signal Models essentially serve as a proxy for fast and feasible differentiation through the Bloch-equations used to simulate signal. This work performs experiments in the context of T2-shuffling, gradient echo EPTI, and MPRAGE-shuffling. We compare how efficiently auto-encoders represent signal evolution in comparison to linear subspaces. Simulation and in-vivo experiments then evaluate if reducing degrees of freedom by inserting the decoder into the forward model improves reconstructions in comparison to subspace constraints. Results: An auto-encoder with one real latent variable represents FSE, EPTI, and MPRAGE signal evolution as well as linear subspaces characterized by four basis vectors. In simulated/in-vivo T2-shuffling and in-vivo EPTI experiments, the proposed framework achieves consistent quantitative NRMSE and qualitative improvement over linear approaches. From qualitative evaluation, the proposed approach yields images with reduced blurring and noise amplification in MPRAGE shuffling experiments. Conclusion: Directly solving for non-linear latent representations of signal evolution improves time-resolved MRI reconstructions through reduced degrees of freedom.

Submitted 27 August, 2022; originally announced August 2022.

arXiv:2208.03429 [pdf]

doi 10.1109/TBCAS.2023.3267614

High-level synthesis design of scalable ultrafast ultrasound beamformer with single FPGA

Authors: Zhengchang Kou, Qi You, Jihun Kim, Zhijie Dong, Matthew R. Lowerison, Nathiya V. Chandra Sekaran, Daniel A. Llano, Pengfei Song, Michael L. Oelze

Abstract: Ultrafast ultrasound imaging is essential for advanced ultrasound imaging techniques such as ultrasound localization microscopy (ULM) and functional ultrasound (fUS). Current ultrafast ultrasound imaging is challenged by the ultrahigh data bandwidth associated with the radio frequency (RF) signal, and by the latency of the computationally expensive beamforming process. As such, continuous ultrafast data acquisition and beamforming remain elusive with existing software beamformers based on CPUs or GPUs. To address these challenges, the proposed work introduces a novel method of implementing an ultrafast ultrasound beamformer specifically for ultrafast plane wave imaging (PWI) on a field programmable gate array (FPGA) by using high-level synthesis. A parallelized implementation of the beamformer on a single FPGA was proposed by 1) utilizing a delay compression technique to reduce the delay profile size, which enables both run-time pre-calculated delay profile loading from external memory and delay reuse, 2) vectorizing channel data fetching, which is enabled by delay reuse, and 3) using fixed summing networks to reduce consumption of logic resources. Our proposed method presents two unique advantages over current FPGA beamformers: 1) high scalability that allows fast adaptation to different FPGA resources and beamforming speed demands by using Xilinx High-Level Synthesis as the development tool, and 2) a compact form factor design, achieved by using a single FPGA to complete the beamforming instead of multiple FPGAs. With the proposed method, a sustainable average beamforming rate of 4.83 G samples/second in terms of input raw RF sample was achieved. The resulting image quality of the proposed beamformer was compared with the software beamformer on the Verasonics Vantage system for both phantom imaging and in vivo imaging of a mouse brain.

Submitted 13 April, 2023; v1 submitted 5 August, 2022; originally announced August 2022.

arXiv:2208.03028 [pdf, other]

Multimodal Brain Disease Classification with Functional Interaction Learning from Single fMRI Volume

Authors: Wei Dai, Ziyao Zhang, Lixia Tian, Shengyuan Yu, Shuhui Wang, Zhao Dong, Hairong Zheng

Abstract: In neuroimaging analysis, fMRI can well assess the function changes for brain diseases with no obvious structural lesions. To date, most deep-learning-based fMRI studies have employed functional connectivity (FC) as the basic feature for disease classification. However, FC is calculated on time series of predefined regions of interest and neglects detailed information contained in each voxel. Another drawback of using FC is the limited sample size for the training of deep models. The low representation ability of FC leads to poor performance in clinical practice, especially when dealing with multimodal medical data involving multiple types of visual signals and textual records for brain diseases. To overcome this bottleneck problem in the fMRI feature modality, we propose BrainFormer, an end-to-end functional interaction learning method for brain disease classification with single fMRI volume. Unlike traditional deep learning methods that construct convolution and transformers on FC, BrainFormer learns the functional interaction from fMRI signals, by modeling the local cues within each voxel with 3D convolutions and capturing the global correlations among distant regions with specially designed global attention mechanisms from shallow layers to deep layers. Meanwhile, BrainFormer can deal with multimodal medical data including fMRI volume, structural MRI, FC features and phenotypic data to achieve more comprehensive brain disease diagnosis. We evaluate BrainFormer on five independent multi-site datasets on autism, Alzheimer's disease, depression, attention deficit hyperactivity disorder and headache disorders. The results demonstrate its effectiveness and generalizability for multiple brain diseases diagnosis with multimodal features. BrainFormer may promote precision of neuroimaging-based diagnosis in clinical practice and motivate future studies on fMRI analysis.

Submitted 1 March, 2023; v1 submitted 5 August, 2022; originally announced August 2022.

arXiv:2205.04846 [pdf, other]

MNet: Rethinking 2D/3D Networks for Anisotropic Medical Image Segmentation

Authors: Zhangfu Dong, Yuting He, Xiaoming Qi, Yang Chen, Huazhong Shu, Jean-Louis Coatrieux, Guanyu Yang, Shuo Li

Abstract: The nature of thick-slice scanning causes severe inter-slice discontinuities of 3D medical images, and the vanilla 2D/3D convolutional neural networks (CNNs) fail to represent sparse inter-slice information and dense intra-slice information in a balanced way, leading to severe underfitting to inter-slice features (for vanilla 2D CNNs) and overfitting to noise from long-range slices (for vanilla 3D CNNs). In this work, a novel mesh network (MNet) is proposed to balance the spatial representation inter axes via learning. 1) Our MNet latently fuses plenty of representation processes by embedding multi-dimensional convolutions deeply into basic modules, making the selections of representation processes flexible, thus balancing representation for sparse inter-slice information and dense intra-slice information adaptively. 2) Our MNet latently fuses multi-dimensional features inside each basic module, simultaneously taking the advantages of 2D (high segmentation accuracy of the easily recognized regions in 2D view) and 3D (high smoothness of 3D organ contour) representations, thus obtaining more accurate modeling for target regions. Comprehensive experiments are performed on four public datasets (CT & MR); the results consistently demonstrate that the proposed MNet outperforms the other methods. The code and datasets are available at: https://github.com/zfdong-code/MNet

Submitted 10 May, 2022; originally announced May 2022.

Comments: Accepted by IJCAI 2022

arXiv:2205.04080 [pdf, ps, other]

Linear quantum systems: a tutorial

Authors: Guofeng Zhang, Zhiyuan Dong

Abstract: The purpose of this tutorial is to give a brief introduction to linear quantum control systems. The mathematical model of linear quantum control systems is presented first, then some fundamental control-theoretic notions such as stability, controllability and observability are given, which are closely related to several important concepts in quantum information science such as decoherence-free subsystems, quantum non-demolition variables, and back-action evasion measurements. After that, quantum Gaussian states are introduced; in particular, an information-theoretic uncertainty relation is presented which often gives a better bound for mixed Gaussian states than the well-known Heisenberg uncertainty relation. The quantum Kalman filter is presented for quantum linear systems, which is the quantum analogue of the Kalman filter for classical (namely, non-quantum-mechanical) linear systems. The quantum Kalman canonical decomposition for quantum linear systems is recorded, and its application is illustrated by means of a recent experiment. As single- and multi-photon states are useful resources in quantum information technology, the response of quantum linear systems to these types of input is presented. Finally, coherent feedback control of quantum linear systems is briefly introduced, and a recent experiment is used to demonstrate the effectiveness of quantum linear systems and networks theory.

Submitted 25 May, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

Comments: 55 pages, 4 figures, to appear in Annual Reviews in Control

MSC Class: 81Q93; 93B10; 81V80

arXiv:2204.11270 [pdf, other]

Optimization-Based Ramping Reserve Allocation of BESS for AGC Enhancement

Authors: Yiqiao Xu, Alessandra Parisio, Zhongguo Li, Zhen Dong, Zhengtao Ding

Abstract: This paper presents a novel scheme termed Optimization-based Ramping Reserve Allocation (ORRA) for addressing an ongoing challenge in Automatic Generation Control (AGC) enhancement, i.e., the optimal coordination of multiple Battery Energy Storage Systems (BESSs). While further exploiting the synergy between BESSs and slow ramping resources, the proposed scheme offers an insight into the energy-neutral operation, which is achieved by smoothly discontinuing the BESS participation along with the minimization of Area Injection Error (AIE), a variant of traditional Area Control Error (ACE). The first stage of ORRA is to incorporate Neural Networks (NNs) with the AIE in order to ensure a zero-mean of ramping reserves to be allocated among BESSs. These AIE signals are then used to formulate the optimal coordination of BESSs as an online optimization problem, which is therefore feedback-driven. Finally, a distributed optimization algorithm is developed to solve the formulated problem in real-time, achieving a sublinear dynamic regret that quantifies the cost difference to the trajectory computed by a centralized optimizer with perfect global information. Consistent with the geographical distribution of BESSs, the proposed ORRA is fully distributed such that the algorithm can be executed in parallel at all nodes. Simulations on a modified IEEE 14-bus system are performed to illustrate the effectiveness and important features of ORRA.

Submitted 3 May, 2022; v1 submitted 24 April, 2022; originally announced April 2022.

arXiv:2203.12561 [pdf, other]

Scatter Ptychography

Authors: Qian Huang, Zhipeng Dong, Yuzuru Takashima, Timothy J. Schulz, David J. Brady

Abstract: Coherent illumination reflected by a remote target may be secondarily scattered by intermediate objects or materials. Here we show that phase retrieval on remotely observed images of such scattered fields enables imaging of the illuminated object at resolution proportional to $\lambda R_s/A_s$, where $R_s$ is the range between the scatterer and the target and $A_s$ is the diameter of the observed scatter. This resolution may exceed the resolution of directly viewing the target by the factor $R_cA_s/R_sA_c$, where $R_c$ is the range between the observer and the target and $A_c$ is the observing aperture. Here we use this technique to demonstrate $\approx 32\times$ resolution improvement relative to direct imaging.

Submitted 23 March, 2022; originally announced March 2022.

arXiv:2203.12365 [pdf, other]

doi 10.1016/j.apenergy.2022.120353

Values of Coordinated Residential Space Heating in Demand Response Provision

Authors: Zihang Dong, Xi Zhang, Goran Strbac

Abstract: Demand-side response from space heating in residential buildings can potentially provide a huge amount of flexibility for the power system, particularly with deep electrification of the heat sector. In this context, this paper presents a novel distributed control strategy to coordinate space heating across numerous residential households with diversified thermal parameters. By employing an iterative algorithm under the game-theoretical framework, each household adjusts its own heating schedule through demand shift and thermal comfort compensation with the purpose of achieving individual cost savings, whereas the aggregate peak demand is effectively shaved on the system level. Additionally, an innovative thermal comfort model which considers both the temporal and spatial differences in customised thermal comfort requirements is proposed. Through a series of case studies, it is demonstrated that the proposed space heating coordination strategy can facilitate effective energy arbitrage for individual buildings, driving a 13.96% reduction in system operational cost and 28.22% peak shaving. Moreover, the superiority of the proposed approach in thermal comfort maintenance is numerically analysed based on the proposed thermal comfort quantification model.

Submitted 23 March, 2022; originally announced March 2022.

Journal ref: Applied Energy, 330, p.120353 (2023)

arXiv:2203.08659 [pdf, other]

doi 10.1109/TCNS.2022.3221472

On optimal coordinated dispatch for heterogeneous storage fleets with partial availability

Authors: David Angeli, Zihang Dong, Goran Strbac

Abstract: This paper addresses the problem of optimal scheduling of an aggregated power profile (during a coordinated discharging or charging operation) by means of a heterogeneous fleet of storage devices subject to availability constraints. Devices have heterogeneous initial levels of energy, power ratings and efficiency; moreover, the fleet operates without cross-charging of the units. An explicit feedback policy is proposed to compute a feasible schedule whenever one exists, along with scalable design procedures to achieve maximum time to failure or minimal unserved energy in the case of infeasible aggregated demand profiles. Finally, a time-domain characterization of the set of feasible demand profiles using aggregate constraints is proposed, suitable for optimization problems where the aggregate population behaviour is of interest.

Submitted 16 March, 2022; originally announced March 2022.

Comments: IEEE Transactions on Control of Network Systems (2022)

arXiv:2201.03040 [pdf, other]

Near-Field Spatial Correlation for Extremely Large-Scale Array Communications

Authors: Zhenjun Dong, Yong Zeng

Abstract: Extremely large-scale array (XL-array) communications correspond to systems whose antenna sizes are so large that the scatterers and/or users may no longer be located in the far-field region. By discarding the conventional far-field uniform plane wave (UPW) assumption, this letter studies the near-field spatial correlation of XL-array communications by taking into account the more generic non-uniform spherical wave (NUSW) characteristics. It is revealed that, unlike the far-field channel spatial correlation, which depends only on the power angular spectrum (PAS), the near-field spatial correlation depends on the scattered power distribution characterized not just by the arriving angles but also by the scatterers' distances, which is termed the power location spectrum (PLS). A novel integral expression is derived for the near-field spatial correlation in terms of the scatterers' location distribution, which includes the far-field spatial correlation as a special case. The result shows that, unlike in the far-field case, the near-field spatial correlation no longer exhibits spatial stationarity in general, since the correlation coefficient for each pair of antennas depends on their specific positions rather than on their relative distance only. To gain further insights, we propose a generalized one-ring model for scatterer distribution by allowing the ring center to be flexibly located rather than coinciding with the array center as in the conventional one-ring model. Numerical results are provided to show the necessity of the near-field spatial correlation modelling for XL-array communications.

Submitted 9 January, 2022; originally announced January 2022.

arXiv:2111.03301 [pdf, other]

Frequency-Aware Physics-Inspired Degradation Model for Real-World Image Super-Resolution

Authors: Zhenxing Dong, Hong Cao, Wang Shen, Yu Gan, Yuye Ling, Guangtao Zhai, Yikai Su

Abstract: Current learning-based single image super-resolution (SISR) algorithms underperform on real data due to the deviation in the assumed degradation process from that in the real-world scenario. Conventional degradation processes consider applying blur, noise, and downsampling (typically bicubic downsampling) on high-resolution (HR) images to synthesize low-resolution (LR) counterparts. However, few works on degradation modelling have taken the physical aspects of the optical imaging system into consideration. In this paper, we analyze the imaging system optically and exploit the characteristics of the real-world LR-HR pairs in the spatial frequency domain. We formulate a real-world physics-inspired degradation model by considering both optics and sensor degradation. The physical degradation of an imaging system is modelled as a low-pass filter, whose cut-off frequency is dictated by the object distance, the focal length of the lens, and the pixel size of the image sensor. In particular, we propose to use a convolutional neural network (CNN) to learn the cut-off frequency of the real-world degradation process. The learned network is then applied to synthesize LR images from unpaired HR images. The synthetic HR-LR image pairs are later used to train an SISR network. We evaluate the effectiveness and generalization capability of the proposed degradation model on real-world images captured by different imaging systems. Experimental results showcase that the SISR network trained by using our synthetic data performs favorably against the network using the traditional degradation model. Moreover, our results are comparable to those obtained by the same network trained by using real-world LR-HR pairs, which are challenging to obtain in real scenes.

Submitted 11 February, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

Comments: 22 pages, 12 figures

arXiv:2111.02395 [pdf, other]

Exact aggregate models for optimal management of heterogeneous fleets of storage devices

Authors: David Angeli, Zihang Dong, Goran Strbac

Abstract: Future power grids will entail large fleets of storage devices capable of scheduling their charging/discharging profiles so as to achieve lower peak demand and reduce energy bills, by shifting absorption times in sync with the availability of renewable energy sources. Optimal management of such fleets entails large-scale optimisation problems which are better dealt with in a hierarchical manner, by clustering together individual devices into fleets. Leveraging recent results characterizing the set of aggregate demand profiles of a heterogeneous fleet of charging (or, respectively, discharging) devices, we propose a way to achieve optimality, in a unit commitment problem, by adopting a simplified formulation with a number of constraints for the fleet that scales linearly in the number of time-slots considered and is independent of the size of the fleet. This is remarkable, as it shows that, under suitable conditions, a heterogeneous fleet of any size can effectively be treated as a single storage unit.

Submitted 3 November, 2021; originally announced November 2021.

arXiv:2110.14174 [pdf, other]

doi 10.1109/TAC.2022.3169582

On the Dynamics of the Tavis-Cummings Model

Authors: Zhiyuan Dong, Guofeng Zhang, Ai-Guo Wu, Re-Bing Wu

Abstract: The purpose of this paper is to present a comprehensive study of the Tavis-Cummings model from a system-theoretic perspective. A typical form of the Tavis-Cummings model is composed of an ensemble of non-interacting two-level systems (TLSs) that are collectively coupled to a common cavity resonator. The associated quantum linear passive system is proposed, whose canonical form reveals typical features of the Tavis-Cummings model, including $\sqrt{N}$-scaling, dark states, bright states, single-excitation superradiant and subradiant states. The passivity of this linear system is related to the vacuum Rabi mode splitting phenomenon in Tavis-Cummings systems. On the basis of the linear model, an analytic form is presented for the steady-state output state of the Tavis-Cummings model driven by a single-photon state. Master equations are used to study the excitation properties of the Tavis-Cummings model in the multi-excitation scenario. Finally, in terms of the transition matrix for a linear time-varying system, a computational framework is proposed for calculating the state of the Tavis-Cummings model, which is applicable to the multi-excitation case.

Submitted 9 May, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: 16 pages, 8 figures, IEEE Transactions on Automatic Control, to appear

Journal ref: IEEE Transactions on Automatic Control, 2022

arXiv:2109.13450 [pdf, ps, other]

Two-Stage Channel Estimation Approach for Cell-Free IoT With Massive Random Access

Authors: Xinhua Wang, Alexei Ashikhmin, Zhicheng Dong, Chao Zhai

Abstract: We investigate the activity detection and channel estimation issues for cell-free Internet of Things (IoT) networks with massive random access. In each time slot, only some of the devices are active and communicate with neighboring access points (APs) using non-orthogonal random pilot sequences. Unlike the centralized processing in cellular networks, the activity detection and channel estimation in cell-free IoT are more challenging due to the distributed and user-centric architecture. We propose a two-stage approach to detect the random activities of devices and estimate their channel states. In the first stage, the activity of each device is jointly detected by its adjacent APs based on the vector approximate message passing (Vector AMP) algorithm. In the second stage, each AP re-estimates the channel using the linear minimum mean square error (LMMSE) method based on the detected activities to improve the channel estimation accuracy. We derive closed-form expressions for the activity detection error probability and the mean-squared channel estimation errors for a typical device. Finally, we analyze the performance of the entire cell-free IoT network in terms of coverage probability. Simulation results validate the derived closed-form expressions and show that the cell-free IoT significantly outperforms the collocated massive MIMO and small-cell schemes in terms of coverage probability.

Submitted 27 September, 2021; originally announced September 2021.

arXiv:2108.05985 [pdf]

doi 10.1002/mrm.29194

Optimized multi-axis spiral projection MR fingerprinting with subspace reconstruction for rapid whole-brain high-isotropic-resolution quantitative imaging

Authors: Xiaozhi Cao, Congyu Liao, Siddharth Srinivasan Iyer, Zhixing Wang, Zihan Zhou, Erpeng Dai, Gilad Liberman, Zijing Dong, Ting Gong, Hongjian He, Jianhui Zhong, Berkin Bilgic, Kawin Setsompop

Abstract: Purpose: To improve image quality and accelerate the acquisition of 3D MRF. Methods: Building on the multi-axis spiral-projection MRF technique, a subspace reconstruction with locally low rank (LLR) constraint and a modified spiral-projection spatiotemporal encoding scheme termed tiny-golden-angle-shuffling (TGAS) were implemented for rapid whole-brain high-resolution quantitative mapping. The LLR regularization parameter and the number of subspace bases were tuned using retrospective in-vivo data and simulated examinations, respectively. B0 inhomogeneity correction using multi-frequency interpolation was incorporated into the subspace reconstruction to further improve the image quality by mitigating blurring caused by off-resonance effects. Results: The proposed MRF acquisition and reconstruction framework can provide high-quality 1-mm isotropic whole-brain quantitative maps in a total acquisition time of 1 minute 55 seconds, with higher-quality results than those obtained from the previous approach in 6 minutes. The comparison of quantitative results indicates that neither the subspace reconstruction nor the TGAS trajectory induces bias for T1 and T2 mapping. High-quality whole-brain MRF data were also obtained at 0.66-mm isotropic resolution in 4 minutes using the proposed technique, where the increased resolution was shown to improve visualization of subtle brain structures. Conclusion: The proposed TGAS-SPI-MRF with optimized spiral-projection trajectory and subspace reconstruction can enable high-resolution quantitative mapping with faster acquisition speed.

Submitted 12 August, 2021; originally announced August 2021.

Comments: 40 pages, 11 figures, 2 tables

Journal ref: Magnetic Resonance in Medicine, 2022

arXiv:2012.12114 [pdf, other]

Deep Deterministic Policy Gradient for Relay Selection and Power Allocation in Cooperative Communication Network

Authors: Yuanzhe Geng, Erwu Liu, Rui Wang, Yiming Liu, Jie Wang, Gang Shen, Zhao Dong

Abstract: Perfect channel state information (CSI) is usually required when considering relay selection and power allocation in cooperative communication. However, it is difficult to obtain accurate CSI in practical situations. In this letter, we study the outage probability minimization problem based on optimizing relay selection and transmission power. We propose a prioritized experience replay aided deep deterministic policy gradient learning framework, which can find an optimal solution by dealing with continuous action space, without any prior knowledge of CSI. Simulation results reveal that our approach outperforms reinforcement learning based methods in the existing literature, and improves the communication success rate by about 4%.

Submitted 14 March, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

arXiv:2006.08357 [pdf, other]

CoDeNet: Efficient Deployment of Input-Adaptive Object Detection on Embedded FPGAs

Authors: Zhen Dong, Dequan Wang, Qijing Huang, Yizhao Gao, Yaohui Cai, Tian Li, Bichen Wu, Kurt Keutzer, John Wawrzynek

Abstract: Deploying deep learning models on embedded systems has been challenging due to limited computing resources. The majority of existing work focuses on accelerating image classification, while other fundamental vision problems, such as object detection, have not been adequately addressed. Compared with image classification, detection problems are more sensitive to the spatial variance of objects, and therefore require specialized convolutions to aggregate spatial information. To address this need, recent work introduces dynamic deformable convolution to augment regular convolutions. However, this will lead to inefficient memory accesses of inputs with existing hardware. In this work, we harness the flexibility of FPGAs to develop a novel object detection pipeline with deformable convolutions. We show the speed-accuracy tradeoffs for a set of algorithm modifications including irregular-access versus limited-range and fixed-shape. We then Co-Design a Network CoDeNet with the modified deformable convolution and quantize it to 4-bit weights and 8-bit activations. With our high-efficiency implementation, our solution reaches 26.9 frames per second with a tiny model size of 0.76 MB while achieving 61.7 AP50 on the standard object detection dataset, Pascal VOC. With our higher accuracy implementation, our model gets to 67.1 AP50 on Pascal VOC with only 2.9 MB of parameters, 20.9x smaller but 10% more accurate than Tiny-YOLO.

Submitted 25 January, 2021; v1 submitted 12 June, 2020; originally announced June 2020.

Comments: GitHub repo: https://github.com/DequanWang/CoDeNet (arXiv:2002.08357 is the preliminary version of this paper)

Journal ref: FPGA 2021

arXiv:2004.04323 [pdf, ps, other]

doi 10.1109/TSG.2021.3066449

Efficient Robust Dispatch of Combined Heat and Power Systems

Authors: Yibao Jiang, Can Wan, Audun Botterud, Yonghua Song, Zhao Yang Dong

Abstract: Combined heat and power systems facilitate efficient interactions between individual energy sectors for higher renewable energy accommodation. However, the feasibility of operational strategies is difficult to guarantee due to the presence of substantial uncertainties pertinent to renewable energy and multi-energy loads. This paper proposes a novel efficient robust dispatch model of combined heat and power systems based on extensions of disturbance invariant sets. The approach has high computational efficiency and provides flexible and robust strategies with an adjustable level of conservativeness. In particular, the proposed robust dispatch method obtains operational strategies by solving a nominal uncertainty-free dispatch problem, whose complexity is identical to a deterministic problem. The robustness against uncertainties is enhanced by endowing the nominal dispatch model with properly tightened constraints considering time-variant uncertainty sets. Towards this end, a novel direct constraint tightening algorithm is developed based on the dual norm to calculate multi-period tightened constraints efficiently without linear programming iterations. Furthermore, the budget uncertainty set is newly combined with constraint tightening to flexibly adjust the conservativeness level of robust solutions. The effectiveness of the proposed robust method is demonstrated in simulation studies of a test system in terms of computational efficiency, decision robustness and cost optimality.

Submitted 8 April, 2020; originally announced April 2020.

Comments: 8 pages, 12 figures

arXiv:2004.03870 [pdf, other]

On the dynamics of a quantum coherent feedback network of cavity-mediated double quantum dot qubits

Authors: Zhiyuan Dong, Wei Cui, Guofeng Zhang

Abstract: The purpose of this paper is to present a comprehensive study of a coherent feedback network where the main component consists of two distant double quantum dot (DQD) qubits which are directly coupled to a cavity. This main component has recently been physically realized (van Woerkom, et al., Microwave photon-mediated interactions between semiconductor qubits, Physical Review X, 8(4):041018, 2018). The feedback loop is closed by cascading this main component with a beamsplitter. The dynamics of this coherent feedback network is studied from three perspectives. First, an analytic form of the output single-photon state of the network driven by a single-photon state is derived; in particular, it is observed that coherent feedback elongates considerably the interaction between the input single photon and the network. Second, excitation probabilities of DQD qubits are computed when the network is driven by a single-photon input state. Moreover, if the input is vacuum but one of the two DQD qubits is initialized in its excited state, the explicit expression of the state of the network is derived; in particular, it is shown that the output field and the two DQD qubits can form an entangled state if the transition frequencies of the two DQD qubits are equal. Finally, the exact form of the pulse shape is obtained by which the single-photon input can fully excite one of these two DQD qubits at any controllable time, which may be useful in the construction of $2$-qubit quantum gates.

Submitted 8 April, 2020; originally announced April 2020.

Comments: 31 pages, 11 figures; submitted for publication; comments are welcome!

arXiv:2002.08357

Algorithm-hardware Co-design for Deformable Convolution

Authors: Qijing Huang, Dequan Wang, Yizhao Gao, Yaohui Cai, Zhen Dong, Bichen Wu, Kurt Keutzer, John Wawrzynek

Abstract: FPGAs provide a flexible and efficient platform to accelerate rapidly changing algorithms for computer vision. The majority of existing work focuses on accelerating image classification, while other fundamental vision problems, including object detection and instance segmentation, have not been adequately addressed. Compared with image classification, detection problems are more sensitive to the spatial variance of objects and therefore require specialized convolutions to aggregate spatial information. To address this, recent work proposes dynamic deformable convolution to augment regular convolutions. Regular convolutions process a fixed grid of pixels across all the spatial locations in an image, while dynamic deformable convolutions may access arbitrary pixels in the image, and the access pattern is input-dependent and varies per spatial location. These properties lead to inefficient memory accesses of inputs with existing hardware. In this work, we first investigate the overhead of the deformable convolution on embedded FPGA SoCs, and then show the accuracy-latency tradeoffs for a set of algorithm modifications including full versus depthwise, fixed-shape, and limited-range. These modifications benefit the energy efficiency for embedded devices in general as they reduce the compute complexity. We then build an efficient object detection network with modified deformable convolutions and quantize the network using state-of-the-art quantization methods. We implement a unified hardware engine on FPGA to support all the operations in the network. Preliminary experiments show that little accuracy is compromised and speedup can be achieved with our co-design optimization for the deformable convolution.

Submitted 18 February, 2020; originally announced February 2020.

Journal ref: NeurIPS EMC2 2019

arXiv:2002.00403

Multiuser Scheduling for Minimizing Age of Information in Uplink MIMO Systems

Authors: He Chen, Qian Wang, Zheng Dong, Ning Zhang

Abstract: This paper studies the user scheduling problem in a multiuser multiple-input multiple-output (MIMO) status update system, in which multiple single-antenna devices aim to send their latest statuses to a multiple-antenna information-fusion access point (AP) via a shared wireless channel. The information freshness in the considered system is quantified by a recently proposed metric, termed age of information (AoI). Thanks to the extra spatial degrees of freedom brought about by the multiple antennas at the AP, multiple devices can be scheduled to transmit simultaneously in each time slot. We seek the optimal scheduling policy that minimizes the network-wide AoI by deciding which device or group of devices to schedule for transmission in each slot, given the instantaneous AoI values of all devices at the beginning of the slot. To that end, we formulate the multiuser scheduling problem as a Markov decision process (MDP). We attain the optimal policy by solving the formulated MDP problem and develop a low-complexity sub-optimal policy. Simulation results show that the proposed optimal and sub-optimal policies significantly outperform the state-of-the-art benchmark schemes.

Submitted 12 February, 2020; v1 submitted 2 February, 2020; originally announced February 2020.

arXiv:2001.10728

Design of Non-orthogonal and Noncoherent Massive MIMO for Scalable URLLC Beyond 5G

Authors: He Chen, Zheng Dong, Jian-Kang Zhang, Branka Vucetic

Abstract: This paper designs and optimizes a non-orthogonal and noncoherent massive multiple-input multiple-output (MIMO) framework towards enabling scalable ultra-reliable low-latency communications (sURLLC) in wireless systems beyond 5G. In this framework, the huge diversity gain associated with the large-scale antenna array in massive MIMO systems is leveraged to ensure ultrahigh reliability. To reduce the overhead and latency induced by the channel estimation process, we advocate the noncoherent communication technique, which does not need the knowledge of instantaneous channel state information (CSI) but only depends on the large-scale fading coefficients for information decoding. To boost the scalability of the system considered, we enable the non-orthogonal channel access of multiple users by devising a new differential modulation scheme to ensure that each transmitted signal matrix can be uniquely determined in the noise-free case and be reliably estimated in noisy cases when the antenna array size is scaled up. The key idea is to make the transmitted signals from multiple users superimpose properly over the air such that when the sum-signal is correctly detected, the signals sent by all users can be uniquely determined. To further improve the average error performance when the number of array antennas is large, we propose a max-min Kullback-Leibler (KL) divergence-based design by jointly optimizing the transmitted powers of all users and the sub-constellation assignment among them. Simulation results show that the proposed design significantly outperforms the existing max-min Euclidean distance-based counterpart in terms of error performance. Moreover, our proposed approach also has a better error performance than the conventional coherent zero-forcing (ZF) receiver with orthogonal channel training, particularly for cell-edge users.

Submitted 10 February, 2020; v1 submitted 29 January, 2020; originally announced January 2020.

Comments: arXiv admin note: text overlap with arXiv:1903.01642

Showing 1–50 of 54 results for author: Dong, Z