
Showing 1–50 of 173 results for author: Xu, S

Searching in archive eess.
  1. arXiv:2406.11006  [pdf, other]

    cs.SD cs.AI eess.AS

    SPEAR: Receiver-to-Receiver Acoustic Neural Warping Field

    Authors: Yuhang He, Shitong Xu, Jia-Xing Zhong, Sangyun Shin, Niki Trigoni, Andrew Markham

    Abstract: We present SPEAR, a continuous receiver-to-receiver acoustic neural warping field for spatial acoustic effects prediction in an acoustic 3D space with a single stationary audio source. Unlike traditional source-to-receiver modelling methods that require prior space acoustic properties knowledge to rigorously model audio propagation from source to receiver, we propose to predict by warping the spat…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures in main paper

  2. arXiv:2406.09426  [pdf]

    cs.SD eess.AS

    Analyzing phonetic structure of Mandarin using Audacity

    Authors: Shizheng Xu

    Abstract: Mandarin Chinese is the official language in China, Taiwan, and Singapore. It is also the main non-official language spoken predominantly at home in Toronto and Vancouver. This article employs the audio software Audacity and leverages theoretical knowledge to conduct a comprehensive analysis of Mandarin Chinese. The study initiates with an overview of the fundamental principles underlying Mandarin…

    Submitted 14 April, 2024; originally announced June 2024.

    Comments: audio source: https://leetcafe.com/language-analysis/

  3. arXiv:2406.07421  [pdf, other]

    cs.SD eess.AS

    A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition

    Authors: Zhenyu Zhou, Shibiao Xu, Shi Yin, Lantian Li, Dong Wang

    Abstract: Data augmentation (DA) has played a pivotal role in the success of deep speaker recognition. Current DA techniques primarily focus on speaker-preserving augmentation, which does not change the speaker trait of the speech and does not create new speakers. Recent research has shed light on the potential of speaker augmentation, which generates new speakers to enrich the training dataset. In this stu…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: to be published in INTERSPEECH 2024

  4. arXiv:2406.07310  [pdf, other]

    eess.AS cs.CL cs.SD

    MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting

    Authors: Zhiqi Ai, Zhiyong Chen, Shugong Xu

    Abstract: In this paper, we propose MM-KWS, a novel approach to user-defined keyword spotting leveraging multi-modal enrollments of text and speech templates. Unlike previous methods that focus solely on either text or speech features, MM-KWS extracts phoneme, text, and speech embeddings from both modalities. These embeddings are then compared with the query speech embedding to detect the target keywords. T…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted at INTERSPEECH 2024

  5. arXiv:2406.02438  [pdf, other]

    eess.AS cs.MM cs.SD

    CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection

    Authors: Yongyi Zang, Jiatong Shi, You Zhang, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Shengyuan Xu, Wenxiao Zhao, Jing Guo, Tomoki Toda, Zhiyao Duan

    Abstract: Recent singing voice synthesis and conversion advancements necessitate robust singing voice deepfake detection (SVDD) models. Current SVDD datasets face challenges due to limited controllability, diversity in deepfake methods, and licensing restrictions. Addressing these gaps, we introduce CtrSVDD, a large-scale, diverse collection of bonafide and deepfake singing vocals. These vocals are synthesi…

    Submitted 18 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  6. arXiv:2405.19450  [pdf, other]

    cs.CV eess.IV

    FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining

    Authors: Dong Li, Yidi Liu, Xueyang Fu, Senyan Xu, Zheng-Jun Zha

    Abstract: Image deraining aims to remove rain streaks from rainy images and restore clear backgrounds. Currently, some research that employs the Fourier transform has proved to be effective for image deraining, because it acts as an effective frequency prior for capturing rain streaks. However, although a dependency exists between low and high frequencies in images, these Fourier-based methods rarely…

    Submitted 29 May, 2024; originally announced May 2024.

  7. arXiv:2405.07689  [pdf, other]

    cs.MM cs.NI eess.SY

    Quality of Experience Optimization for Real-time XR Video Transmission with Energy Constraints

    Authors: Guangjin Pan, Shugong Xu, Shunqing Zhang, Xiaojing Chen, Yanzan Sun

    Abstract: Extended Reality (XR) is an important service in the 5G network and in future 6G networks. In contrast to traditional video on demand services, real-time XR video is transmitted frame-by-frame, requiring low latency and being highly sensitive to network fluctuations. In this paper, we model the quality of experience (QoE) for real-time XR video transmission on a frame-by-frame basis. Based on the…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 6 pages, 5 figures

  8. arXiv:2405.04867  [pdf, other]

    eess.IV cs.CV

    MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

    Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, et al. (24 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: MIPI@CVPR2024. Website: https://mipi-challenge.org/MIPI2024/

  9. arXiv:2404.16223  [pdf, other]

    cs.CV eess.IV

    Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey

    Authors: Marcos V. Conde, Florin-Alexandru Vasluianu, Radu Timofte, Jianxing Zhang, Jia Li, Fan Wang, Xiaopeng Li, Zikun Liu, Hyunhee Park, Sejun Song, Changho Kim, Zhijuan Huang, Hongyuan Yu, Cheng Wan, Wending Xiang, Jiamin Lin, Hang Zhong, Qiaosong Zhang, Yue Sun, Xuanwu Yin, Kunlong Zuo, Senyan Xu, Siyuan Jiang, Zhijing Sun, Jiaying Zhu, et al. (10 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 RAW Image Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines; however, this problem is not as explored as in the RGB domain. The goal of this challenge is to upscale RAW Bayer images by 2x, considering unknown degradations such as nois…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - NTIRE Workshop

  10. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  11. arXiv:2404.09905  [pdf, other]

    cs.NI cs.MM eess.IV eess.SY

    Quality of Experience Oriented Cross-layer Optimization for Real-time XR Video Transmission

    Authors: Guangjin Pan, Shugong Xu, Shunqing Zhang, Xiaojing Chen, Yanzan Sun

    Abstract: Extended reality (XR) is one of the most important applications of beyond 5G and 6G networks. Real-time XR video transmission presents challenges in terms of data rate and delay. In particular, the frame-by-frame transmission mode of XR video makes real-time XR video very sensitive to dynamic network environments. To improve the users' quality of experience (QoE), we design a cross-layer transmiss…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 14 pages, 13 figures. arXiv admin note: text overlap with arXiv:2402.01180

  12. arXiv:2403.19251  [pdf, other]

    quant-ph eess.SY

    Arbitrary State Transition of Open Qubit System Based on Switching Control

    Authors: Guangpu Wu, Shibei Xue, Shan Ma, Sen Kuang, Daoyi Dong, Ian R. Petersen

    Abstract: We present a switching control strategy based on Lyapunov control for arbitrary state transitions in open qubit systems. With coherent vector representation, we propose a switching control strategy, which can prevent the state of the qubit from entering invariant sets and singular value sets, effectively driving the system ultimately to a sufficiently small neighborhood of target states. In compar…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 12 pages, 7 figures

  13. arXiv:2403.16643  [pdf, other]

    eess.IV cs.CV

    Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution

    Authors: Qingping Zheng, Ling Zheng, Yuanfan Guo, Ying Li, Songcen Xu, Jiankang Deng, Hang Xu

    Abstract: Artifact-free super-resolution (SR) aims to translate low-resolution images into their high-resolution counterparts with a strict integrity of the original content, eliminating any distortions or synthetic details. While traditional diffusion-based SR techniques have demonstrated remarkable abilities to enhance image detail, they are prone to artifact introduction during iterative procedures. Such…

    Submitted 25 March, 2024; originally announced March 2024.

  14. arXiv:2403.15145  [pdf, ps, other]

    cs.IT eess.SP

    Robust Resource Allocation for STAR-RIS Assisted SWIPT Systems

    Authors: Guangyu Zhu, Xidong Mu, Li Guo, Ao Huang, Shibiao Xu

    Abstract: A simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted simultaneous wireless information and power transfer (SWIPT) system is proposed. More particularly, an STAR-RIS is deployed to assist in the information/power transfer from a multi-antenna access point (AP) to multiple single-antenna information users (IUs) and energy users (EUs), where two practica…

    Submitted 22 March, 2024; originally announced March 2024.

  15. arXiv:2403.13245  [pdf, other]

    eess.SY cs.AI cs.DC cs.LG cs.RO

    Federated reinforcement learning for robot motion planning with zero-shot generalization

    Authors: Zhenyuan Yuan, Siyuan Xu, Minghui Zhu

    Abstract: This paper considers the problem of learning a control policy for robot motion planning with zero-shot generalization, i.e., no data collection and policy adaptation is needed when the learned policy is deployed in new environments. We develop a federated reinforcement learning framework that enables collaborative learning of multiple learners and a central server, i.e., the Cloud, without sharing…

    Submitted 7 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  16. arXiv:2402.18070  [pdf, other]

    cs.AR eess.SP

    A Hierarchical Dataflow-Driven Heterogeneous Architecture for Wireless Baseband Processing

    Authors: Limin Jiang, Yi Shi, Haiqin Hu, Qingyu Deng, Siyi Xu, Yintao Liu, Feng Yuan, Si Wang, Yihao Shen, Fangfang Ye, Shan Cao, Zhiyuan Jiang

    Abstract: Wireless baseband processing (WBP) is a key element of wireless communications, with a series of signal processing modules to improve data throughput and counter channel fading. Conventional hardware solutions, such as digital signal processors (DSPs) and more recently, graphic processing units (GPUs), provide various degrees of parallelism, yet they both fail to take into account the cyclical and…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 7 pages, 7 figures, conference

  17. arXiv:2402.01180  [pdf, other]

    cs.NI cs.MM eess.SP

    Real-time Extended Reality Video Transmission Optimization Based on Frame-priority Scheduling

    Authors: Guangjin Pan, Shugong Xu, Shunqing Zhang, Xiaojing Chen, Yanzan Sun

    Abstract: Extended reality (XR) is one of the most important applications of 5G. For real-time XR video transmission in 5G networks, a low latency and high data rate are required. In this paper, we propose a resource allocation scheme based on frame-priority scheduling to meet these requirements. The optimization problem is modelled as a frame-priority-based radio resource scheduling problem to improve tran…

    Submitted 7 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 6 pages, 7 figures

  18. arXiv:2401.09552  [pdf, other]

    physics.app-ph eess.SP

    Centralized active reconfigurable intelligent surface: Architecture, path loss analysis and experimental verification

    Authors: Changhao Liu, Fan Yang, Shenheng Xu, Yezhen Li, Maokun Li

    Abstract: Reconfigurable intelligent surfaces (RISs) are a promising candidate for 6G communication. Recently, active RIS has been proposed to compensate for the multiplicative fading effect inherent in passive RISs. However, conventional distributed active RISs, with at least one amplifier per element, are costly, complex, and power-intensive. To address these challenges, this paper proposes a novel architec…

    Submitted 18 January, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  19. arXiv:2401.00135  [pdf]

    eess.IV cs.CV

    Deep Radon Prior: A Fully Unsupervised Framework for Sparse-View CT Reconstruction

    Authors: Shuo Xu, Yucheng Zhang, Gang Chen, Xincheng Xiang, Peng Cong, Yuewen Sun

    Abstract: Although sparse-view computed tomography (CT) has significantly reduced radiation dose, it also introduces severe artifacts which degrade the image quality. In recent years, deep learning-based methods for inverse problems have made remarkable progress and have become increasingly popular in CT reconstruction. However, most of these methods suffer several limitations: dependence on high-quality tr…

    Submitted 29 December, 2023; originally announced January 2024.

    Comments: 11 pages, 12 figures, Journal paper

  20. arXiv:2312.09063  [pdf, other]

    eess.IV cs.CV

    Image Demoireing in RAW and sRGB Domains

    Authors: Shuning Xu, Binbin Song, Xiangyu Chen, Xina Liu, Jiantao Zhou

    Abstract: Moire patterns frequently appear when capturing screens with smartphones or cameras, potentially compromising image quality. Previous studies suggest that moire pattern elimination in the RAW domain offers greater effectiveness compared to demoireing in the sRGB domain. Nevertheless, relying solely on RAW data for image demoireing is insufficient in mitigating the color cast due to the absence of…

    Submitted 15 March, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

  21. arXiv:2312.05256  [pdf, other]

    eess.IV cs.AI

    Holistic Evaluation of GPT-4V for Biomedical Imaging

    Authors: Zhengliang Liu, Hanqi Jiang, Tianyang Zhong, Zihao Wu, Chong Ma, Yiwei Li, Xiaowei Yu, Yutong Zhang, Yi Pan, Peng Shu, Yanjun Lyu, Lu Zhang, Junjie Yao, Peixin Dong, Chao Cao, Zhenxiang Xiao, Jiaqi Wang, Huan Zhao, Shaochen Xu, Yaonai Wei, Jingyuan Chen, Haixing Dai, Peilong Wang, Hao He, Zewei Wang, et al. (25 additional authors not shown)

    Abstract: In this paper, we present a large-scale evaluation probing GPT-4V's capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain. We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and mor…

    Submitted 10 November, 2023; originally announced December 2023.

  22. arXiv:2312.01042  [pdf, ps, other]

    cs.IT eess.SP

    Covert Communications in STAR-RIS-Aided Rate-Splitting Multiple Access Systems

    Authors: Heng Chang, Hai Yang, Shuobo Xu, Xiyu Pang, Hongwu Liu

    Abstract: In this paper, we investigate covert communications in a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-aided rate-splitting multiple access (RSMA) system. Under the RSMA principles, the messages for the covert user (Bob) and public user (Grace) are converted to the common and private streams at the legitimate transmitter (Alice) to realize downlink transm…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: 17 pages, submitted to journal

  23. arXiv:2311.04116  [pdf, other]

    eess.IV

    Improved Topological Preservation in 3D Axon Segmentation and Centerline Detection using Geometric Assessment-driven Topological Smoothing (GATS)

    Authors: Nina I. Shamsi, Alex S. Xu, Lars A. Gjesteby, Laura J. Brattain

    Abstract: Automated axon tracing via fully supervised learning requires large amounts of 3D brain imagery, which is time consuming and laborious to obtain. It also requires expertise. Thus, there is a need for more efficient segmentation and centerline detection techniques to use in conjunction with automated annotation tools. Topology-preserving methods ensure that segmented components maintain geometric c…

    Submitted 7 November, 2023; originally announced November 2023.

  24. arXiv:2311.00263  [pdf, other]

    eess.SY

    The bottleneck and ceiling effects in quantized tracking control of heterogeneous multi-agent systems under DoS attacks

    Authors: Shuai Feng, Maopeng Ran, Baoyong Zhang, Lihua Xie, Shengyuan Xu

    Abstract: In this paper, we investigate tracking control of heterogeneous multi-agent systems under Denial-of-Service (DoS) attacks and state quantization. Dynamic quantized mechanisms are designed for inter-follower communication and leader-follower communication. Zooming-in and out factors, and data rates of both mechanisms for preventing quantizer saturation are provided. Our results show that by tuning…

    Submitted 31 October, 2023; originally announced November 2023.

  25. arXiv:2310.01565  [pdf, other]

    cs.LG cs.IR eess.IV

    Causality-informed Rapid Post-hurricane Building Damage Detection in Large Scale from InSAR Imagery

    Authors: Chenguang Wang, Yepeng Liu, Xiaojian Zhang, Xuechun Li, Vladimir Paramygin, Arthriya Subgranon, Peter Sheng, Xilei Zhao, Susu Xu

    Abstract: Timely and accurate assessment of hurricane-induced building damage is crucial for effective post-hurricane response and recovery efforts. Recently, remote sensing technologies provide large-scale optical or Interferometric Synthetic Aperture Radar (InSAR) imagery data immediately after a disastrous event, which can be readily used to conduct rapid building damage assessment. Compared to optical s…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: 6 pages, 3 figures

  26. arXiv:2310.00593  [pdf, other]

    eess.SP

    Nonlinear Multi-Carrier System with Signal Clipping: Measurement, Analysis, and Optimization

    Authors: Yuyang Du, Liang Hao, Yiming Lei, Qun Yang, Shiqi Xu

    Abstract: Signal clipping is a classic technique for reducing peak-to-average power ratio (PAPR) in orthogonal frequency division multiplexing (OFDM) systems. It has been widely applied in consumer electronic devices owing to its low complexity and high efficiency. Although clipping reduces the nonlinear distortion caused by power amplifiers (PAs), it induces additional clipping distortion. Optimizing the j…

    Submitted 16 February, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

  27. arXiv:2309.13819  [pdf, other]

    eess.AS cs.SD

    A Two-Step Approach for Narrowband Source Localization in Reverberant Rooms

    Authors: Wei-Ting Lai, Lachlan Birnie, Thushara Abhayapala, Amy Bastine, Shaoheng Xu, Prasanga Samarasinghe

    Abstract: This paper presents a two-step approach for narrowband source localization within reverberant rooms. The first step involves dereverberation by modeling the homogeneous component of the sound field by an equivalent decomposition of planewaves using Iteratively Reweighted Least Squares (IRLS), while the second step focuses on source localization by modeling the dereverberated component as a sparse…

    Submitted 24 September, 2023; originally announced September 2023.

  28. arXiv:2309.07152  [pdf]

    eess.SP physics.med-ph

    Novel Smart N95 Filtering Facepiece Respirator with Real-time Adaptive Fit Functionality and Wireless Humidity Monitoring for Enhanced Wearable Comfort

    Authors: Kangkyu Kwon, Yoon Jae Lee, Yeongju Jung, Ira Soltis, Chanyeong Choi, Yewon Na, Lissette Romero, Myung Chul Kim, Nathan Rodeheaver, Hodam Kim, Michael S. Lloyd, Ziqing Zhuang, William King, Susan Xu, Seung-Hwan Ko, Jinwoo Lee, Woon-Hong Yeo

    Abstract: The widespread emergence of the COVID-19 pandemic has transformed our lifestyle, and facial respirators have become an essential part of daily life. Nevertheless, the current respirators possess several limitations such as poor respirator fit because they are incapable of covering diverse human facial sizes and shapes, potentially diminishing the effect of wearing respirators. In addition, the cur…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 20 pages, 5 figures, 1 table, submitted for possible publication

    MSC Class: 92C55

  29. arXiv:2309.04335  [pdf, ps, other]

    cs.IT eess.SP

    On the performance of an integrated communication and localization system: an analytical framework

    Authors: Yuan Gao, Haonan Hu, Jiliang Zhang, Yanliang Jin, Shugong Xu, Xiaoli Chu

    Abstract: Quantifying the performance bound of an integrated localization and communication (ILAC) system and the trade-off between communication and localization performance is critical. In this letter, we consider an ILAC system that can perform communication and localization via time-domain or frequency-domain resource allocation. We develop an analytical framework to derive the closed-form expression of…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 5 pages, 3 figures

  30. arXiv:2308.16612  [pdf, other]

    cs.CV eess.IV

    Neural Gradient Regularizer

    Authors: Shuang Xu, Yifan Wang, Zixiang Zhao, Jiangjun Peng, Xiangyong Cao, Deyu Meng, Yulun Zhang, Radu Timofte, Luc Van Gool

    Abstract: Owing to its significant success, the prior imposed on gradient maps has consistently been a subject of great interest in the field of image processing. Total variation (TV), one of the most representative regularizers, is known for its ability to capture the intrinsic sparsity prior underlying gradient maps. Nonetheless, TV and its variants often underestimate the gradient maps, leading to the we…

    Submitted 13 September, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

  31. arXiv:2308.12617  [pdf, ps, other]

    eess.SY cs.MA

    Quantized distributed Nash equilibrium seeking under DoS attacks: A quantized consensus based approach

    Authors: Shuai Feng, Maojiao Ye, Lihua Xie, Shengyuan Xu

    Abstract: This paper studies distributed Nash equilibrium (NE) seeking under Denial-of-Service (DoS) attacks and quantization. The players can only exchange information with their own direct neighbors. The transmitted information is subject to quantization and packet losses induced by malicious DoS attacks. We propose a quantized distributed NE seeking strategy based on the approach of dynamic quantized con…

    Submitted 24 August, 2023; originally announced August 2023.

  32. arXiv:2308.10181  [pdf]

    eess.SY

    Stochastic Optimization of Coupled Power Distribution-Urban Transportation Network Operations with Autonomous Mobility on Demand Systems

    Authors: Han Wang, Xiaoyuan Xu, Yue Chen, Zheng Yan, Mohammad Shahidehpour, Jiaqi Li, Shaolun Xu

    Abstract: Autonomous mobility on demand systems (AMoDS) will significantly affect the operation of coupled power distribution-urban transportation networks (PTNs) by the optimal dispatch of electric vehicles (EVs). This paper proposes an uncertainty method to analyze the operational states of PTNs with AMoDS. First, a PTN operation framework is designed considering the controllable EVs dispatched by AMoDS a…

    Submitted 20 August, 2023; originally announced August 2023.

    Comments: 10 pages, 13 figures

  33. arXiv:2308.04163  [pdf, other]

    cs.CV eess.IV

    Under-Display Camera Image Restoration with Scattering Effect

    Authors: Binbin Song, Xiangyu Chen, Shuning Xu, Jiantao Zhou

    Abstract: The under-display camera (UDC) provides consumers with a full-screen visual experience without any obstruction due to notches or punched holes. However, the semi-transparent nature of the display inevitably introduces severe degradation into UDC images. In this work, we address the UDC image restoration problem with the specific consideration of the scattering effect caused by the display. We…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV2023

  34. arXiv:2308.01317  [pdf]

    cs.CV eess.IV

    ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders

    Authors: Shawn Xu, Lin Yang, Christopher Kelly, Marcin Sieniek, Timo Kohlberger, Martin Ma, Wei-Hung Weng, Atilla Kiraly, Sahar Kazemzadeh, Zakkai Melamed, Jungyeon Park, Patricia Strachan, Yun Liu, Chuck Lau, Preeti Singh, Christina Chen, Mozziyar Etemadi, Sreenivasa Raju Kalidindi, Yossi Matias, Katherine Chou, Greg S. Corrado, Shravya Shetty, Daniel Tse, Shruthi Prabhakara, Daniel Golden, et al. (3 additional authors not shown)

    Abstract: In this work, we present an approach, which we call Embeddings for Language/Image-aligned X-Rays, or ELIXR, that leverages a language-aligned image encoder combined or grafted onto a fixed LLM, PaLM 2, to perform a broad range of chest X-ray tasks. We train this lightweight adapter architecture using images paired with corresponding free-text radiology reports from the MIMIC-CXR dataset. ELIXR ach…

    Submitted 7 September, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

  35. arXiv:2307.12027  [pdf, other]

    cs.CV eess.IV

    On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement

    Authors: Xin Luo, Yunan Zhu, Shunxin Xu, Dong Liu

    Abstract: Several recent studies advocate the use of spectral discriminators, which evaluate the Fourier spectra of images for generative modeling. However, the effectiveness of the spectral discriminators is not well interpreted yet. We tackle this issue by examining the spectral discriminators in the context of perceptual image super-resolution (i.e., GAN-based SR), as SR image quality is susceptible to s…

    Submitted 16 August, 2023; v1 submitted 22 July, 2023; originally announced July 2023.

    Comments: Accepted to ICCV 2023. Code and Models are publicly available at https://github.com/Luciennnnnnn/DualFormer

  36. arXiv:2307.04827  [pdf, other]

    cs.SD cs.CL cs.MM eess.AS

    LaunchpadGPT: Language Model as Music Visualization Designer on Launchpad

    Authors: Siting Xu, Yunlong Tang, Feng Zheng

    Abstract: Launchpad is a musical instrument that allows users to create and perform music by pressing illuminated buttons. To assist and inspire the design of the Launchpad light effect, and provide a more accessible approach for beginners to create music visualization with this instrument, we proposed the LaunchpadGPT model to generate music visualization designs on Launchpad automatically. Based on the la…

    Submitted 23 July, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: Accepted by International Computer Music Conference (ICMC) 2023

  37. arXiv:2307.00307  [pdf, other]

    eess.IV

    Spatio-Temporal Classification of Lung Ventilation Patterns using 3D EIT Images: A General Approach for Individualized Lung Function Evaluation

    Authors: Shuzhe Chen, Li Li, Zhichao Lin, Ke Zhang, Ying Gong, Lu Wang, Xu Wu, Maokun Li, Yuanlin Song, Fan Yang, Shenheng Xu

    Abstract: The Pulmonary Function Test (PFT) is a widely utilized and rigorous classification test for lung function evaluation, serving as a comprehensive tool for lung diagnosis. Meanwhile, Electrical Impedance Tomography (EIT) is a rapidly advancing clinical technique that visualizes conductivity distribution induced by ventilation. EIT provides additional spatial and temporal information on lung ventila…

    Submitted 1 July, 2023; originally announced July 2023.

  38. arXiv:2306.17797  [pdf, other]

    cs.CV eess.IV

    HIDFlowNet: A Flow-Based Deep Network for Hyperspectral Image Denoising

    Authors: Li Pang, Weizhen Gu, Xiangyong Cao, Xiangyu Rui, Jiangjun Peng, Shuang Xu, Gang Yang, Deyu Meng

    Abstract: Hyperspectral image (HSI) denoising is essentially ill-posed since a noisy HSI can be degraded from multiple clean HSIs. However, current deep learning-based approaches ignore this fact and restore the clean image with deterministic mapping (i.e., the network receives a noisy HSI and outputs a clean HSI). To alleviate this issue, this paper proposes a flow-based HSI denoising network (HIDFlowNet)…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: 10 pages, 8 figures

  39. arXiv:2306.10146  [pdf, other]

    cs.CV eess.IV

    Multi-task 3D building understanding with multi-modal pretraining

    Authors: Shicheng Xu

    Abstract: This paper explores various learning strategies for 3D building type classification and part segmentation on the BuildingNet dataset. ULIP with PointNeXt and PointNeXt segmentation are extended for the classification and segmentation task on BuildingNet dataset. The best multi-task PointNeXt-s model with multi-modal pretraining achieves 59.36 overall accuracy for 3D building type classification, a…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: 8 pages, 9 figures, 9 tables

  40. arXiv:2306.04242  [pdf, other]

    eess.SP cs.RO

    4D Millimeter-Wave Radar in Autonomous Driving: A Survey

    Authors: Zeyu Han, Jiahao Wang, Zikun Xu, Shuocheng Yang, Lei He, Shaobing Xu, Jianqiang Wang, Keqiang Li

    Abstract: The 4D millimeter-wave (mmWave) radar, proficient in measuring the range, azimuth, elevation, and velocity of targets, has attracted considerable interest within the autonomous driving community. This is attributed to its robustness in extreme environments and the velocity and elevation measurement capabilities. However, despite the rapid advancement in research related to its sensing theory and a…

    Submitted 26 April, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

  41. Dynamic quantized consensus under DoS attacks: Towards a tight zooming-out factor

    Authors: Shuai Feng, Maopeng Ran, Hideaki Ishii, Shengyuan Xu

    Abstract: This paper deals with dynamic quantized consensus of dynamical agents in a general form under packet losses induced by Denial-of-Service (DoS) attacks. The communication channel has limited bandwidth and hence the transmitted signals over the network are subject to quantization. To deal with agent's output, an observer is implemented at each node. The state of the observer is quantized by a finite…

    Submitted 31 May, 2023; originally announced June 2023.

  42. arXiv:2305.14694  [pdf, other]

    eess.SY

    Analysis of Contagion Dynamics with Active Cyber Defenders

    Authors: Keith Paarporn, Shouhuai Xu

    Abstract: In this paper, we analyze the infection spreading dynamics of malware in a population of cyber nodes (i.e., computers or devices). Unlike most prior studies where nodes are reactive to infections, in our setting some nodes are active defenders meaning that they are able to clean up malware infections of their neighboring nodes, much like how spreading malware exploits the network connectivity prop…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: 3 figures

  43. arXiv:2305.05085  [pdf, other]

    physics.optics eess.IV

    Tensorial tomographic Fourier Ptychography with applications to muscle tissue imaging

    Authors: Shiqi Xu, Xiang Dai, Paul Ritter, Kyung Chul Lee, Xi Yang, Lucas Kreiss, Kevin C. Zhou, Kanghyun Kim, Amey Chaware, Jadee Neff, Carolyn Glass, Seung Ah Lee, Oliver Friedrich, Roarke Horstmeyer

    Abstract: We report Tensorial tomographic Fourier Ptychography (ToFu), a new non-scanning label-free tomographic microscopy method for simultaneous imaging of quantitative phase and anisotropic specimen information in 3D. Built upon Fourier Ptychography, a quantitative phase imaging technique, ToFu additionally highlights the vectorial nature of light. The imaging setup consists of a standard microscope equ…

    Submitted 13 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Journal ref: Tensorial tomographic Fourier Ptychography with applications to muscle tissue imaging, Adv. Photon. 6(2), 026004 (2024)

  44. arXiv:2305.04160  [pdf, other]

    cs.CL cs.AI cs.CV eess.AS

    X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages

    Authors: Feilong Chen, Minglun Han, Haozhi Zhao, Qingyang Zhang, Jing Shi, Shuang Xu, Bo Xu

    Abstract: Large language models (LLMs) have demonstrated remarkable language abilities. GPT-4, based on advanced LLMs, exhibits extraordinary multimodal capabilities beyond previous visual language models. We attribute this to the use of more advanced LLMs compared with previous multimodal models. Unfortunately, the model architecture and training strategies of GPT-4 are unknown. To endow LLMs with multimod…

    Submitted 21 May, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

  45. arXiv:2305.01968  [pdf]

    eess.IV cs.CV cs.LG

    DPSeq: A Novel and Efficient Digital Pathology Classifier for Predicting Cancer Biomarkers using Sequencer Architecture

    Authors: Min Cen, Xingyu Li, Bangwei Guo, Jitendra Jonnagaddala, Hong Zhang, Xu Steven Xu

    Abstract: In digital pathology tasks, transformers have achieved state-of-the-art results, surpassing convolutional neural networks (CNNs). However, transformers are usually complex and resource intensive. In this study, we developed a novel and efficient digital pathology classifier called DPSeq, to predict cancer biomarkers through fine-tuning a sequencer architecture integrating horizon and vertical bidi…

    Submitted 3 May, 2023; originally announced May 2023.

  46. arXiv:2304.05175  [pdf]

    math.OC eess.SY

    Sufficient Conditions for the Exact Relaxation of Complementarity Constraints for Storages in Multi-period OPF Problems

    Authors: Qi Wang, Wenchuan Wu, Chenhui Lin, Shuwei Xu, Xueliang Li

    Abstract: Storage-concerned Optimal Power Flow (OPF) with complementarity constraints is highly non-convex and intractable. In this paper, we propose two generalized sufficient conditions which guarantee no simultaneous charging and discharging (SCD) in the relaxed multi-period OPF excluding the complementarity constraints. Moreover, we prove that the regions on the locational marginal prices (LMPs) formed…

    Submitted 10 November, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

  47. arXiv:2303.15759  [pdf]

    cs.NI eess.SP

    Performance Analysis of Non-ideal Wireless PBFT Networks with mmWave and Terahertz Signals

    Authors: Haoxiang Luo, Xiangyue Yang, Hongfang Yu, Gang Sun, Shizhong Xu, Long Luo

    Abstract: Due to advantages in security and privacy, blockchain is considered a key enabling technology to support 6G communications. Practical Byzantine Fault Tolerance (PBFT) is seen as the most applicable consensus mechanism in blockchain-enabled wireless networks. However, previous studies on PBFT do not consider the channel performance of the physical layer, such as path loss and channel fading, result…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: IEEE International Conference on Metaverse Computing, Networking and Applications (MetaCom) 2023

  48. arXiv:2303.08140  [pdf, other]

    eess.IV cs.LG physics.bio-ph

    Digital staining in optical microscopy using deep learning -- a review

    Authors: Lucas Kreiss, Shaowei Jiang, Xiang Li, Shiqi Xu, Kevin C. Zhou, Alexander Mühlberg, Kyung Chul Lee, Kanghyun Kim, Amey Chaware, Michael Ando, Laura Barisoni, Seung Ah Lee, Guoan Zheng, Kyle Lafata, Oliver Friedrich, Roarke Horstmeyer

    Abstract: Until recently, conventional biochemical staining had the undisputed status as well-established benchmark for most biomedical problems related to clinical diagnostics, fundamental research and biotechnology. Despite this role as gold-standard, staining protocols face several challenges, such as a need for extensive, manual processing of samples, substantial time delays, altered tissue homeostasis,…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Review article, 4 main Figures, 3 Tables, 2 supplementary figures

  49. arXiv:2303.07089   

    eess.SP

    Range Resolution Enhanced Method with Spectral Properties for Hyperspectral Lidar

    Authors: Yuhao Xia, Shilong Xu, Hui Shao, Ahui Hou, Jiajie Fang, Fei Han, Youlong Chen, Jiaqi Wen, Yuwei Chen, Yihua Hu

    Abstract: Waveform decomposition is needed as a first step in the extraction of various types of geometric and spectral information from hyperspectral full-waveform LiDAR echoes. We present a new approach to deal with the "Pseudo-monopulse" waveform formed by the overlapped waveforms from multi-targets when they are very close. We use one single skew-normal distribution (SND) model to fit waveforms of all s…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

  50. arXiv:2302.10406  [pdf]

    cs.CL cs.CV cs.LG eess.IV q-bio.QM

    Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning Pipelines

    Authors: Min Cen, Xingyu Li, Bangwei Guo, Jitendra Jonnagaddala, Hong Zhang, Xu Steven Xu

    Abstract: NLP-based computer vision models, particularly vision transformers, have been shown to outperform CNN models in many imaging tasks. However, most digital pathology artificial-intelligence models are based on CNN architectures, probably owing to a lack of data regarding NLP models for pathology images. In this study, we developed digital pathology pipelines to benchmark the five most recently propo…

    Submitted 20 February, 2023; originally announced February 2023.