Showing 1–50 of 295 results for author: Xu, Z

Searching in archive eess.
  1. arXiv:2406.19043  [pdf]

    eess.IV cs.AI cs.CV cs.DB

    CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

    Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Ya**g Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, **g Qin, Xiahai Zhuang, Claudia Prieto, et al. (7 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 2 tables

  2. arXiv:2406.15716  [pdf, other]

    eess.IV cs.CV

    Predicting fluorescent labels in label-free microscopy images with pix2pix and adaptive loss in Light My Cells challenge

    Authors: Han Liu, Hao Li, Jiacheng Wang, Yubo Fan, Zhoubing Xu, Ipek Oguz

    Abstract: Fluorescence labeling is the standard approach to reveal cellular structures and other subcellular constituents for microscopy images. However, this invasive procedure may perturb or even kill the cells and the procedure itself is highly time-consuming and complex. Recently, in silico labeling has emerged as a promising alternative, aiming to use machine learning models to directly predict the flu…

    Submitted 21 June, 2024; originally announced June 2024.

  3. arXiv:2406.09272  [pdf, other]

    cs.CV cs.AI cs.SD eess.AS

    Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos

    Authors: Changan Chen, Puyuan Peng, Ami Baid, Zihui Xue, Wei-Ning Hsu, David Harwath, Kristen Grauman

    Abstract: Generating realistic audio for human interactions is important for many applications, such as creating sound effects for films or virtual reality games. Existing approaches implicitly assume total correspondence between the video and audio during training, yet many sounds happen off-screen and have weak to no correspondence with the visuals -- resulting in uncontrolled ambient sounds or hallucinat…

    Submitted 20 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Project page: https://vision.cs.utexas.edu/projects/action2sound

  4. arXiv:2406.02640  [pdf, other]

    eess.IV physics.med-ph physics.optics

    Ghost imaging-based Non-contact Heart Rate Detection

    Authors: Jianming Yu, Yuchen He, Bin Li, Hui Chen, Huaibin Zheng, Jianbin Liu, Zhuo Xu

    Abstract: Remote heart rate measurement is a research field of growing interest, usually using remote photoplethysmography (rPPG) to collect heart rate information from video data. However, in certain specific scenarios (such as low light conditions, intense lighting, and non-line-of-sight situations), traditional imaging methods fail to capture image information effectively, which may lead…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 4 pages, 6 figures

  5. arXiv:2405.05579  [pdf]

    cs.HC eess.SY

    Intelligent EC Rearview Mirror: Enhancing Driver Safety with Dynamic Glare Mitigation via Cloud Edge Collaboration

    Authors: Junyi Yang, Zefei Xu, Huayi Lai, Hongjian Chen, Sifan Kong, Yutong Wu, Huan Yang

    Abstract: Sudden glare from trailing vehicles significantly increases driving safety risks. Existing anti-glare technologies, such as electronic, manually-adjusted, and electrochromic rearview mirrors, are expensive and lack effective adaptability to different lighting conditions. To address these issues, our research introduces an intelligent rearview mirror system utilizing novel all-liquid electrochromic…

    Submitted 9 May, 2024; originally announced May 2024.

  6. arXiv:2405.04806  [pdf, other]

    eess.SY

    A leadless power transfer and wireless telemetry solutions for an endovascular electrocorticography

    Authors: Zhangyu Xu, Majid Khazaee, Nhan Duy Truong, Deniel Havenga, Armin Nikpour, Arman Ahnood, Omid Kavehei

    Abstract: Endovascular brain-computer interfaces (eBCIs) offer a minimally invasive way to connect the brain to external devices, merging neuroscience, engineering, and medical technology. Achieving wireless data and power transmission is crucial for the clinical viability of these implantable devices. Typically, solutions for endovascular electrocorticography (ECoG) include a sensing stent with multiple el…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 17 Pages, 12 figures

  7. arXiv:2404.05404  [pdf, other]

    eess.SY

    Contouring Error Bounded Control for Biaxial Switched Linear Systems

    Authors: Meng Yuan, Ye Wang, Chris Manzie, Zhezhuang Xu, Tianyou Chai

    Abstract: Biaxial motion control systems are used extensively in manufacturing and printing industries. To improve throughput and reduce machine cost, lightweight materials are being proposed in structural components but may result in higher flexibility in the machine links. This flexibility is often position dependent and compromises precision of the end effector of the machine. To address the need for imp…

    Submitted 8 April, 2024; originally announced April 2024.

  8. arXiv:2404.01082  [pdf, other]

    eess.IV

    The state-of-the-art in Cardiac MRI Reconstruction: Results of the CMRxRecon Challenge in MICCAI 2023

    Authors: Jun Lyu, Chen Qin, Shuo Wang, Fanwen Wang, Yan Li, Zi Wang, Kunyuan Guo, Cheng Ouyang, Michael Tänzer, Meng Liu, Longyu Sun, Mengting Sun, Qin Li, Zhang Shi, Sha Hua, Hao Li, Zhensen Chen, Zhenlin Zhang, Bingyu Xin, Dimitris N. Metaxas, George Yiasemis, Jonas Teuwen, Li** Zhang, Weitian Chen, Yidong Zhao, et al. (25 additional authors not shown)

    Abstract: Cardiac MRI, crucial for evaluating heart structure and function, faces limitations like slow imaging and motion artifacts. Undersampling reconstruction, especially data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly under-sampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation p…

    Submitted 16 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 25 pages, 17 figures

  9. arXiv:2403.16361  [pdf, other]

    eess.IV cs.CV

    RSTAR: Rotational Streak Artifact Reduction in 4D CBCT using Separable and Circular Convolutions

    Authors: Ziheng Deng, Hua Chen, Haibo Hu, Zhiyong Xu, Tianling Lyu, Yan Xi, Yang Chen, Jun Zhao

    Abstract: Four-dimensional cone-beam computed tomography (4D CBCT) provides respiration-resolved images and can be used for image-guided radiation therapy. However, the ability to reveal respiratory motion comes at the cost of image artifacts. As raw projection data are sorted into multiple respiratory phases, there is a limited number of cone-beam projections available for image reconstruction. Consequentl…

    Submitted 24 March, 2024; originally announced March 2024.

  10. arXiv:2403.15716  [pdf, other]

    cs.RO cs.AI eess.SY

    Distributed Robust Learning based Formation Control of Mobile Robots based on Bioinspired Neural Dynamics

    Authors: Zhe Xu, Tao Yan, Simon X. Yang, S. Andrew Gadsden, Mohammad Biglarbegian

    Abstract: This paper addresses the challenges of distributed formation control in multiple mobile robots, introducing a novel approach that enhances real-world practicability. We first introduce a distributed estimator using a variable structure and cascaded design technique, eliminating the need for derivative information to improve the real time performance. Then, a kinematic tracking control method is de…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: This paper is accepted by IEEE Transactions on Intelligent Vehicles

  11. arXiv:2403.13601  [pdf, other]

    eess.SY

    Lattice piecewise affine approximation of explicit model predictive control with application to satellite attitude control

    Authors: Zhengqi Xu, Jun Xu, Ai-Guo Wu, Shuning Wang

    Abstract: Satellite attitude control is a crucial part of aerospace technology, and model predictive control (MPC) is one of the most promising controllers in this area, but it becomes less effective if real-time online optimization cannot be achieved. Explicit MPC converts the online calculation into a table lookup process; however, the solution is difficult to obtain if the system dimension is high or the co…

    Submitted 20 March, 2024; originally announced March 2024.

  12. arXiv:2403.10012  [pdf, other]

    cs.CV cs.RO eess.IV physics.optics

    Real-World Computational Aberration Correction via Quantized Domain-Mixing Representation

    Authors: Qi Jiang, Zhonghua Yi, Shaohua Gao, Yao Gao, Xiaolong Qian, Hao Shi, Lei Sun, Zhijie Xu, Kailun Yang, Kaiwei Wang

    Abstract: Relying on paired synthetic data, existing learning-based Computational Aberration Correction (CAC) methods are confronted with the intricate and multifaceted synthetic-to-real domain gap, which leads to suboptimal performance in real-world applications. In this paper, in contrast to improving the simulation pipeline, we deliver a novel insight into real-world CAC from the perspective of Unsupervi…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Codes and datasets will be made publicly available at https://github.com/zju-jiangqi/QDMR

  13. arXiv:2403.09993  [pdf, other]

    cs.CV eess.IV

    TRG-Net: An Interpretable and Controllable Rain Generator

    Authors: Zhiqiang Pang, Hong Wang, Qi Xie, Deyu Meng, Zongben Xu

    Abstract: Exploring and modeling the rain generation mechanism is critical for augmenting paired data to ease training of rainy image processing models. For this task, this study proposes a novel deep learning based rain generator, which fully takes the physical generation mechanism underlying rains into consideration and well encodes the learning of the fundamental rain factors (i.e., shape, orientation, l…

    Submitted 29 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  14. arXiv:2403.04549  [pdf, other]

    cs.CV eess.IV

    Explainable Face Verification via Feature-Guided Gradient Backpropagation

    Authors: Yuhang Lu, Zewei Xu, Touradj Ebrahimi

    Abstract: Recent years have witnessed significant advancement in face recognition (FR) techniques, with their applications widely spread in people's lives and security-sensitive areas. There is a growing need for reliable interpretations of decisions of such systems. Existing studies relying on various mechanisms have investigated the usage of saliency maps as an explanation approach, but suffer from differ…

    Submitted 7 March, 2024; originally announced March 2024.

  15. arXiv:2402.19013  [pdf, other]

    eess.SY

    Ultraviolet Positioning via TDOA: Error Analysis and System Prototype

    Authors: Shihui Yu, Chubing Lv, Yueke Yang, Yuchen Pan, Lei Sun, Juliang Cao, Ruihang Yu, Chen Gong, Wenqi Wu, Zhengyuan Xu

    Abstract: This work performs the design, real-time hardware realization, and experimental evaluation of a positioning system by ultra-violet (UV) communication under photon-level signal detection. The positioning is based on the time-difference-of-arrival (TDOA) principle. Time division-based transmission of a synchronization sequence from three transmitters with known positions is applied. We investigate the pos…

    Submitted 14 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  16. arXiv:2402.06841  [pdf]

    eess.IV cs.CV

    Point cloud-based registration and image fusion between cardiac SPECT MPI and CTA

    Authors: Shaojie Tang, Penpen Miao, Xingyu Gao, Yu Zhong, Dantong Zhu, Haixing Wen, Zhihui Xu, Qiuyue Wei, Hong** Yao, Xin Huang, Rui Gao, Chen Zhao, Weihua Zhou

    Abstract: A method was proposed for the point cloud-based registration and image fusion between cardiac single photon emission computed tomography (SPECT) myocardial perfusion images (MPI) and cardiac computed tomography angiograms (CTA). Firstly, the left ventricle (LV) epicardial regions (LVERs) in SPECT and CTA images were segmented by using different U-Net neural networks trained to generate the point c…

    Submitted 9 February, 2024; originally announced February 2024.

  17. arXiv:2402.00744  [pdf, other]

    cs.SD cs.CL eess.AS

    BATON: Aligning Text-to-Audio Model with Human Preference Feedback

    Authors: Huan Liao, Haonan Han, Kai Yang, Tianjiao Du, Rui Yang, Zunnan Xu, Qinmei Xu, **gquan Liu, Jiasheng Lu, Xiu Li

    Abstract: With the development of AI-Generated Content (AIGC), text-to-audio models are gaining widespread attention. However, it is challenging for these models to generate audio aligned with human preference due to the inherent information density of natural language and limited model understanding ability. To alleviate this issue, we formulate BATON, a framework designed to enhance the alignment betw…

    Submitted 1 February, 2024; originally announced February 2024.

  18. arXiv:2402.00320  [pdf]

    eess.IV

    DARCS: Memory-Efficient Deep Compressed Sensing Reconstruction for Acceleration of 3D Whole-Heart Coronary MR Angiography

    Authors: Zhihao Xue, Fan Yang, Juan Gao, Zhuo Chen, Hao Peng, Chao Zou, Hang **, Chenxi Hu

    Abstract: Three-dimensional coronary magnetic resonance angiography (CMRA) demands reconstruction algorithms that can significantly suppress the artifacts from a heavily undersampled acquisition. While unrolling-based deep reconstruction methods have achieved state-of-the-art performance on 2D image reconstruction, their application to 3D reconstruction is hindered by the large amount of memory needed to tr…

    Submitted 2 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: 10 pages, 8 figures

  19. arXiv:2401.15913  [pdf, other]

    eess.IV cs.CV cs.LG physics.flu-dyn stat.AP

    Vision-Informed Flow Image Super-Resolution with Quaternion Spatial Modeling and Dynamic Flow Convolution

    Authors: Qinglong Cao, Zhengqin Xu, Chao Ma, Xiaokang Yang, Yuntian Chen

    Abstract: Flow image super-resolution (FISR) aims at recovering high-resolution turbulent velocity fields from low-resolution flow images. Existing FISR methods mainly process the flow images in natural image patterns, while the critical and distinct flow visual properties are rarely considered. This negligence would cause the significant domain gap between flow and natural images to severely hamper the acc…

    Submitted 29 January, 2024; originally announced January 2024.

  20. arXiv:2401.10269  [pdf, ps, other]

    cs.IT eess.SP stat.ME

    Robust Multi-Sensor Multi-Target Tracking Using Possibility Labeled Multi-Bernoulli Filter

    Authors: Han Cai, Chenbao Xue, Jeremie Houssineau, Zhirun Xue

    Abstract: With the increasing complexity of multiple target tracking scenes, a single sensor may not be able to effectively monitor a large number of targets. Therefore, it is imperative to extend the single-sensor technique to Multi-Sensor Multi-Target Tracking (MSMTT) for enhanced functionality. Typical MSMTT methods presume complete randomness of all uncertain components, and therefore effective solution…

    Submitted 4 January, 2024; originally announced January 2024.

  21. arXiv:2401.03476  [pdf, other]

    cs.MM cs.AI cs.HC cs.SD eess.AS

    Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

    Authors: Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu

    Abstract: Current talking avatars mostly generate co-speech gestures based on audio and text of the utterance, without considering the non-speaking motion of the speaker. Furthermore, previous works on co-speech gesture generation have designed network structures based on individual gesture datasets, which results in limited data volume, compromised generalizability, and restricted speaker movements. To tac…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: 6 pages, 3 figures, ICASSP 2024

  22. arXiv:2401.03150  [pdf, other]

    eess.IV

    O-PRESS: Boosting OCT axial resolution with Prior guidance, Recurrence, and Equivariant Self-Supervision

    Authors: Kaiyan Li, **gyuan Yang, Wenxuan Liang, Xingde Li, Chenxi Zhang, Lulu Chen, Chan Wu, Xiao Zhang, Zhiyan Xu, Yuelin Wang, Lihui Meng, Yue Zhang, Youxin Chen, S. Kevin Zhou

    Abstract: Optical coherence tomography (OCT) is a noninvasive technology that enables real-time imaging of tissue microanatomies. The axial resolution of OCT is intrinsically constrained by the spectral bandwidth of the employed light source while maintaining a fixed center wavelength for a specific application. Physically extending this bandwidth faces strong limitations and requires a substantial cost. We…

    Submitted 6 January, 2024; originally announced January 2024.

  23. arXiv:2401.03122  [pdf, other]

    cs.CV eess.IV

    SAR Despeckling via Regional Denoising Diffusion Probabilistic Model

    Authors: Xuran Hu, Ziqiang Xu, Zhihan Chen, Zhengpeng Feng, Mingzhe Zhu, LJubisa Stankovic

    Abstract: Speckle noise poses a significant challenge in maintaining the quality of synthetic aperture radar (SAR) images, so SAR despeckling techniques have drawn increasing attention. Despite the tremendous advancements of deep learning in fixed-scale SAR image despeckling, these methods still struggle to deal with large-scale SAR images. To address this problem, this paper introduces a novel despeckling…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 5 pages, 5 figures

    ACM Class: I.4.4

  24. arXiv:2312.15863  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY

    PDiT: Interleaving Perception and Decision-making Transformers for Deep Reinforcement Learning

    Authors: Hangyu Mao, Rui Zhao, Ziyue Li, Zhiwei Xu, Hao Chen, Yiqun Chen, Bin Zhang, Zhen Xiao, Junge Zhang, Jiang** Yin

    Abstract: Designing better deep networks and better reinforcement learning (RL) algorithms are both important for deep RL. This work studies the former. Specifically, the Perception and Decision-making Interleaving Transformer (PDiT) network is proposed, which cascades two Transformers in a very natural way: the perceiving one focuses on the environmental perception by processing the observation at t…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: Proc. of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024, full paper with oral presentation). Cover our preliminary study: arXiv:2212.14538

  25. arXiv:2312.15701  [pdf, other]

    eess.IV cs.CV cs.LG

    Rotation Equivariant Proximal Operator for Deep Unfolding Methods in Image Restoration

    Authors: Jiahong Fu, Qi Xie, Deyu Meng, Zongben Xu

    Abstract: The deep unfolding approach has attracted significant attention in computer vision tasks, which well connects conventional image processing modeling manners with more recent deep learning techniques. Specifically, by establishing a direct correspondence between algorithm operators at each implementation step and network modules within each layer, one can rationally construct an almost ``white box'…

    Submitted 25 December, 2023; originally announced December 2023.

  26. arXiv:2312.13611  [pdf, other]

    cs.LG cs.NI eess.SP

    Topology Learning for Heterogeneous Decentralized Federated Learning over Unreliable D2D Networks

    Authors: Zheshun Wu, Zenglin Xu, Dun Zeng, Junfan Li, Jie Liu

    Abstract: With the proliferation of intelligent mobile devices in wireless device-to-device (D2D) networks, decentralized federated learning (DFL) has attracted significant interest. Compared to centralized federated learning (CFL), DFL mitigates the risk of central server failures due to communication bottlenecks. However, DFL faces several challenges, such as the severe heterogeneity of data distributions…

    Submitted 10 March, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: To appear in IEEE Transactions on Vehicular Technology

  27. arXiv:2312.05528  [pdf, other]

    eess.IV cs.CV

    Exploring 3D U-Net Training Configurations and Post-Processing Strategies for the MICCAI 2023 Kidney and Tumor Segmentation Challenge

    Authors: Kwang-Hyun Uhm, Hyunjun Cho, Zhixin Xu, Seohoon Lim, Seung-Won Jung, Sung-Hoo Hong, Sung-Jea Ko

    Abstract: In 2023, it is estimated that 81,800 kidney cancer cases will be newly diagnosed, and 14,890 people will die from this cancer in the United States. Preoperative dynamic contrast-enhanced abdominal computed tomography (CT) is often used for detecting lesions. However, there exists inter-observer variability due to subtle differences in the imaging features of kidney and kidney tumors. In this paper…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: MICCAI 2023, KITS 2023 challenge 2nd place

  28. arXiv:2312.03376  [pdf, other]

    eess.SY

    Beacon-enabled TDMA Ultraviolet Communication Network System Design and Realization

    Authors: Yuchen Pan, Fei Long, ** Li, Haotian Shi, Jiazhao Shi, Hanlin Xiao, Chen Gong, Zhengyuan Xu

    Abstract: Non-line-of-sight (NLOS) ultraviolet (UV) scattering communication can serve as a good candidate for outdoor optical wireless communication (OWC) in the cases of non-perfect transmitter-receiver alignment and radio silence. We design and demonstrate an NLOS UV scattering communication network system in this paper, where a beacon-enabled time division multiple access (TDMA) scheme is adopted. In our…

    Submitted 15 April, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

  29. arXiv:2312.01573  [pdf]

    eess.IV cs.CV

    Survey on deep learning in multimodal medical imaging for cancer detection

    Authors: Yan Tian, Zhaocheng Xu, Yujun Ma, Wei** Ding, Ruili Wang, Zhihong Gao, Guohua Cheng, Linyang He, Xuran Zhao

    Abstract: The task of multimodal cancer detection is to determine the locations and categories of lesions by using different imaging techniques, which is one of the key research methods for cancer diagnosis. Recently, deep learning-based object detection has made significant developments due to its strength in semantic feature extraction and nonlinear function fitting. However, multimodal cancer detection r…

    Submitted 3 December, 2023; originally announced December 2023.

    Journal ref: Neural Computing and Applications. 2023 Nov 29:1-6

  30. arXiv:2311.18188  [pdf, other]

    eess.AS cs.LG

    Speech Understanding on Tiny Devices with A Learning Cache

    Authors: Afsara Benazir, Zhiming Xu, Felix Xiaozhu Lin

    Abstract: This paper addresses spoken language understanding (SLU) on microcontroller-like embedded devices, integrating on-device execution with cloud offloading in a novel fashion. We leverage temporal locality in the speech inputs to a device and reuse recent SLU inferences accordingly. Our idea is simple: let the device match incoming inputs against cached results, and only offload inputs not matched to…

    Submitted 8 May, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: accepted at MobiSys'24

  31. arXiv:2311.16572   

    eess.SY physics.ao-ph physics.soc-ph

    Adapting to climate change: Long-term impact of wind resource changes on China's power system resilience

    Authors: Jiaqi Ruan, Xiangrui Meng, Yifan Zhu, Gaoqi Liang, Xianzhuo Sun, Huayi Wu, Huijuan Xiao, Mengqian Lu, Pin Gao, Jiapeng Li, Wai-Kin Wong, Zhao Xu, Junhua Zhao

    Abstract: Modern society's reliance on power systems is at risk from the escalating effects of wind-related climate change. Yet, failure to identify the intricate relationship between wind-related climate risks and power systems could lead to serious short- and long-term issues, including partial or complete blackouts. Here, we develop a comprehensive framework to assess China's power system resilience acro…

    Submitted 24 January, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Not suitable for publication

  32. arXiv:2311.14925  [pdf, other]

    cs.CV eess.IV

    Coordinate-based Neural Network for Fourier Phase Retrieval

    Authors: Tingyou Li, Zixin Xu, Yong S. Chu, Xiao**g Huang, Jizhou Li

    Abstract: Fourier phase retrieval is essential for high-definition imaging of nanoscale structures across diverse fields, notably coherent diffraction imaging. This study presents the Single impliCit neurAl Network (SCAN), a tool built upon coordinate neural networks meticulously designed for enhanced phase retrieval performance. Remedying the drawbacks of conventional iterative methods which are easily tr…

    Submitted 8 January, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

  33. arXiv:2311.13361  [pdf, other]

    cs.AI cs.HC eess.SY

    Applying Large Language Models to Power Systems: Potential Security Threats

    Authors: Jiaqi Ruan, Gaoqi Liang, Huan Zhao, Guolong Liu, Xianzhuo Sun, **g Qiu, Zhao Xu, Fushuan Wen, Zhao Yang Dong

    Abstract: Applying large language models (LLMs) to modern power systems presents a promising avenue for enhancing decision-making and operational efficiency. However, this action may also incur potential security threats, which have not been fully recognized so far. To this end, this article analyzes potential threats incurred by applying LLMs to power systems, emphasizing the need for urgent research and d…

    Submitted 24 January, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

  34. arXiv:2311.04234  [pdf]

    eess.SP cs.CV cs.LG

    Leveraging sinusoidal representation networks to predict fMRI signals from EEG

    Authors: Yamin Li, Ange Lou, Ziyuan Xu, Shiyu Wang, Catie Chang

    Abstract: In modern neuroscience, functional magnetic resonance imaging (fMRI) has been a crucial and irreplaceable tool that provides a non-invasive window into the dynamics of whole-brain activity. Nevertheless, fMRI is limited by hemodynamic blurring as well as high cost, immobility, and incompatibility with metal implants. Electroencephalography (EEG) is complementary to fMRI and can directly record the…

    Submitted 24 January, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

  35. arXiv:2311.03653  [pdf, ps, other]

    cs.IT eess.SP

    On the Performance of LoRa Empowered Communication for Wireless Body Area Networks

    Authors: Minling Zhang, Guofa Cai, Zhi** Xu, Jiguang He, Markku Juntti

    Abstract: To remotely monitor the physiological status of the human body, long range (LoRa) communication has been considered as an eminently suitable candidate for wireless body area networks (WBANs). Typically, a Rayleigh-lognormal fading channel is encountered by the LoRa links of the WBAN. In this context, we characterize the performance of the LoRa system in WBAN scenarios with an emphasis on the physi…

    Submitted 6 November, 2023; originally announced November 2023.

  36. arXiv:2311.00483  [pdf, other]

    eess.IV cs.CV

    DEFN: Dual-Encoder Fourier Group Harmonics Network for Three-Dimensional Indistinct-Boundary Object Segmentation

    Authors: Xiaohua Jiang, Yihao Guo, Jian Huang, Yuting Wu, Meiyi Luo, Zhaoyang Xu, Qianni Zhang, Xingru Huang, Hong He, Shaowei Jiang, **g Ye, Mang Xiao

    Abstract: The precise spatial and quantitative delineation of indistinct-boundary medical objects is paramount for the accuracy of diagnostic protocols, efficacy of surgical interventions, and reliability of postoperative assessments. Despite their significance, the effective segmentation and instantaneous three-dimensional reconstruction are significantly impeded by the paucity of representative samples in…

    Submitted 19 June, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 36 pages, 16 figures, 7 tables

    MSC Class: 68; 92 ACM Class: I.4; J.3

  37. arXiv:2310.14172  [pdf, other]

    eess.IV cs.CV

    ASC: Appearance and Structure Consistency for Unsupervised Domain Adaptation in Fetal Brain MRI Segmentation

    Authors: Zihang Xu, Haifan Gong, Xiang Wan, Haofeng Li

    Abstract: Automatic tissue segmentation of fetal brain images is essential for the quantitative analysis of prenatal neurodevelopment. However, producing voxel-level annotations of fetal brain imaging is time-consuming and expensive. To reduce labeling costs, we propose a practical unsupervised domain adaptation (UDA) setting that adapts the segmentation labels of high-quality fetal brain atlases to unlabel…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: MICCAI 2023, released code: https://github.com/lhaof/ASC

  38. arXiv:2310.13210  [pdf, other]

    eess.SP

    Time-Modulated Intelligent Reflecting Surface for Waveform Security

    Authors: Zhaoyi Xu, Athina Petropulu

    Abstract: We consider an OFDM transmitter aided by an intelligent reflecting surface (IRS) and propose a novel approach to enhance waveform security by employing time modulation (TM) at the IRS side. By controlling the periodic TM pattern of the IRS elements, the system is designed to preserve communication information towards an authorized recipient and scramble the information towards all other directions…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: Submitted to ICASSP 2024, under review

  39. arXiv:2310.12570  [pdf, other]

    eess.IV cs.CV cs.GR cs.LG

    DA-TransUNet: Integrating Spatial and Channel Dual Attention with Transformer U-Net for Medical Image Segmentation

    Authors: Guanqun Sun, Yizhi Pan, Weikun Kong, Zichang Xu, Jianhua Ma, Teeradaj Racharak, Le-Minh Nguyen, Junyi Xin

    Abstract: Accurate medical image segmentation is critical for disease quantification and treatment evaluation. Traditional U-Net architectures and their transformer-integrated variants excel in automated segmentation tasks. However, they lack the ability to harness the intrinsic position and channel features of images. Existing models also struggle with parameter efficiency and computational complexity,…

    Submitted 14 November, 2023; v1 submitted 19 October, 2023; originally announced October 2023.

  40. arXiv:2310.09922  [pdf, ps, other]

    eess.SP

    Enhance Security of Time-Modulated Array-Enabled Directional Modulation by Introducing Symbol Ambiguity

    Authors: Zhihao Tao, Zhaoyi Xu, Athina Petropulu

    Abstract: In this paper, we investigate whether the time-modulated array (TMA)-enabled directional modulation (DM) communication system can be cracked, and the answer is YES! We first demonstrate that the scrambling of data received at the eavesdropper can be defied by using grid search to successfully find the only and actual mixing matrix generated by TMA. Then, we propose introducing symbol ambiguity to TMA to d…

    Submitted 15 October, 2023; originally announced October 2023.

  41. arXiv:2310.08551  [pdf, ps, other]

    eess.SP

    How secure is the time-modulated array-enabled OFDM directional modulation?

    Authors: Zhihao Tao, Zhaoyi Xu, Athina Petropulu

    Abstract: Time-modulated arrays (TMA) transmitting orthogonal frequency division multiplexing (OFDM) waveforms achieve physical layer security by allowing the signal to reach the legitimate destination undistorted, while making the signal appear scrambled in all other directions. In this paper, we examine how secure the TMA OFDM system is, and show that it is possible for the eavesdropper to defy the scramb…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: This work was already submitted to IEEE ICASSP 2024

  42. arXiv:2310.07730  [pdf, other]

    cs.CV eess.IV

    Domain-Controlled Prompt Learning

    Authors: Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang

    Abstract: Large pre-trained vision-language models, such as CLIP, have shown remarkable generalization capabilities across various tasks when appropriate text prompts are provided. However, adapting these models to specific domains, like remote sensing images (RSIs), medical images, etc., remains unexplored and challenging. Existing prompt learning methods often lack domain-awareness or domain-transfer mecha…

    Submitted 12 December, 2023; v1 submitted 29 September, 2023; originally announced October 2023.

  43. arXiv:2310.00900  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models

    Authors: Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu

    Abstract: Speech enhancement aims to improve speech signals in terms of quality and intelligibility, and speech editing refers to the process of editing the speech according to specific user needs. In this paper, we propose a Unified Speech Enhancement and Editing (uSee) model with conditional diffusion models to handle various tasks at the same time in a generative manner. Specifically, by p…

    Submitted 2 October, 2023; originally announced October 2023.

  44. arXiv:2309.16161  [pdf, other]

    eess.SY cs.AI cs.MA cs.RO math.OC

    Leveraging Untrustworthy Commands for Multi-Robot Coordination in Unpredictable Environments: A Bandit Submodular Maximization Approach

    Authors: Zirui Xu, Xiaofeng Lin, Vasileios Tzoumas

    Abstract: We study the problem of multi-agent coordination in unpredictable and partially-observable environments with untrustworthy external commands. The commands are actions suggested to the robots, and are untrustworthy in that their performance guarantees, if any, are unknown. Such commands may be generated by human operators or machine learning algorithms and, although untrustworthy, can often increas…

    Submitted 28 September, 2023; originally announced September 2023.

  45. arXiv:2309.14761  [pdf, other]

    eess.AS cs.SD

    Optimization Techniques for a Physical Model of Human Vocalisation

    Authors: Mateo Cámara, Zhiyuan Xu, Yisu Zong, José Luis Blanco, Joshua D. Reiss

    Abstract: We present a non-supervised approach to optimize and evaluate the synthesis of non-speech audio effects from a speech production model. We use the Pink Trombone synthesizer as a case study of a simplified production model of the vocal tract to target non-speech human audio signals, specifically yawns. We selected and optimized the control parameters of the synthesizer to minimize the difference between rea…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Accepted to DAFx 2023

  46. arXiv:2309.09028  [pdf, other]

    eess.AS cs.SD

    Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions

    Authors: Heming Wang, Meng Yu, Hao Zhang, Chunlei Zhang, Zhongweiyang Xu, Muqiao Yang, Yixuan Zhang, Dong Yu

    Abstract: Enhancing speech signal quality in adverse acoustic environments is a persistent challenge in speech processing. Existing deep learning based enhancement methods often struggle to effectively remove background noise and reverberation in real-world scenarios, hampering listening experiences. To address these challenges, we propose a novel approach that uses pre-trained generative methods to resynth…

    Submitted 16 September, 2023; originally announced September 2023.

    Comments: Paper in submission

  47. arXiv:2309.07432  [pdf, other]

    cs.SD eess.AS

    SpatialCodec: Neural Spatial Speech Coding

    Authors: Zhongweiyang Xu, Yong Xu, Vinay Kothapally, Heming Wang, Muqiao Yang, Dong Yu

    Abstract: In this work, we address the challenge of encoding speech captured by a microphone array using deep learning techniques with the aim of preserving and accurately reconstructing crucial spatial cues embedded in multi-channel recordings. We propose a neural spatial audio coding framework that achieves a high compression ratio, leveraging single-channel neural sub-band codec and SpatialCodec. Our app…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: Paper in Submission

  48. arXiv:2309.03486  [pdf, other]

    eess.AS cs.SD

    Simulating room transfer functions between transducers mounted on audio devices using a modified image source method

    Authors: Zeyu Xu, Adrian Herzog, Alexander Lodermeyer, Emanuël A. P. Habets, Albert G. Prinn

    Abstract: The image source method (ISM) is often used to simulate room acoustics due to its ease of use and computational efficiency. The standard ISM is limited to simulations of room impulse responses between point sources and omnidirectional receivers. In this work, the ISM is extended using spherical harmonic directivity coefficients to include acoustic diffraction effects due to source and receiver tra…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: The following article has been submitted to the Journal of the Acoustical Society of America (JASA). After it is published, it will be found at http://asa.scitation.org/journal/jas

  49. arXiv:2309.02743  [pdf, other]

    eess.AS cs.SD

    MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023

    Authors: Zhihang Xu, Shaofei Zhang, Xi Wang, Jiajun Zhang, Wenning Wei, Lei He, Sheng Zhao

    Abstract: In this paper, we present MuLanTTS, the Microsoft end-to-end neural text-to-speech (TTS) system designed for the Blizzard Challenge 2023. About 50 hours of audiobook corpus for French TTS as hub task and another 2 hours of speaker adaptation as spoke task are released to build synthesized voices for different test purposes including sentences, paragraphs, homographs, lists, etc. Building upon Deli…

    Submitted 11 September, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: 6 pages

  50. arXiv:2309.02432  [pdf, other]

    eess.AS cs.SD

    Employing Real Training Data for Deep Noise Suppression

    Authors: Ziyi Xu, Marvin Sach, Jan Pirklbauer, Tim Fingscheidt

    Abstract: Most deep noise suppression (DNS) models are trained with reference-based losses requiring access to clean speech. However, sometimes an additive microphone model is insufficient for real-world applications. Accordingly, ways to use real training data in supervised learning for DNS models promise to reduce a potential training/inference mismatch. Employing real data for DNS training requires eithe…

    Submitted 5 September, 2023; originally announced September 2023.