Showing 1–50 of 148 results for author: Liu, B

Searching in archive eess.
  1. arXiv:2406.14440  [pdf, other]

    eess.SP

    LLM4CP: Adapting Large Language Models for Channel Prediction

    Authors: Boxun Liu, Xuanyu Liu, Shijian Gao, Xiang Cheng, Liuqing Yang

    Abstract: Channel prediction is an effective approach for reducing the feedback or estimation overhead in massive multi-input multi-output (m-MIMO) systems. However, existing channel prediction methods lack precision due to model mismatch errors or network generalization issues. Large language models (LLMs) have demonstrated powerful modeling and generalization abilities, and have been successfully applied…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2406.13150  [pdf]

    eess.IV cs.CV

    MCAD: Multi-modal Conditioned Adversarial Diffusion Model for High-Quality PET Image Reconstruction

    Authors: Jiaqi Cui, Xinyi Zeng, Pinxian Zeng, Bo Liu, Xi Wu, Jiliu Zhou, Yan Wang

    Abstract: Radiation hazards associated with standard-dose positron emission tomography (SPET) images remain a concern, whereas the quality of low-dose PET (LPET) images fails to meet clinical requirements. Therefore, there is great interest in reconstructing SPET images from LPET images. However, prior studies focus solely on image data, neglecting vital complementary information from other modalities, e.g.…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Early accepted by MICCAI2024

  3. arXiv:2406.12707  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction

    Authors: Haoqiu Yan, Yongxin Zhu, Kai Zheng, Bing Liu, Haoyu Cao, Deqiang Jiang, Linli Xu

    Abstract: Large Language Model (LLM)-enhanced agents become increasingly prevalent in Human-AI communication, offering vast potential from entertainment to professional domains. However, current multi-modal dialogue systems overlook the acoustic information present in speech, which is crucial for understanding human communication nuances. This oversight can lead to misinterpretations of speakers' intentions…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 9 pages, 3 figures, ACL24 accepted

  4. arXiv:2406.05359  [pdf, other]

    eess.AS cs.SD

    Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization

    Authors: Bei Liu, Haoyu Wang, Yanmin Qian

    Abstract: Modern speaker verification (SV) systems typically demand expensive storage and computing resources, thereby hindering their deployment on mobile devices. In this paper, we explore adaptive neural network quantization for lightweight speaker verification. Firstly, we propose a novel adaptive uniform precision quantization method which enables the dynamic generation of quantization centroids custom…

    Submitted 18 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

    Comments: submitted to IEEE/ACM Transactions on Audio Speech and Language Processing (Under Review)

  5. arXiv:2405.12487  [pdf, other]

    cs.CV eess.IV

    3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification

    Authors: Yan He, Bing Tu, Bo Liu, Jun Li, Antonio Plaza

    Abstract: Hyperspectral image (HSI) classification constitutes the fundamental research in remote sensing fields. Convolutional Neural Networks (CNNs) and Transformers have demonstrated impressive capability in capturing spectral-spatial contextual dependencies. However, these architectures suffer from limited receptive fields and quadratic computational complexity, respectively. Fortunately, recent Mamba a…

    Submitted 21 May, 2024; originally announced May 2024.

  6. arXiv:2405.09778  [pdf, other]

    eess.SP

    Beam Pattern Modulation Embedded Hybrid Transceiver Optimization for Integrated Sensing and Communication

    Authors: Boxun Liu, Shijian Gao, Zonghui Yang, Xiang Cheng, Liuqing Yang

    Abstract: Integrated sensing and communication (ISAC) emerges as a promising technology for B5G/6G, particularly in the millimeter-wave (mmWave) band. However, the widely utilized hybrid architecture in mmWave systems compromises multiplexing gain due to the constraints of limited radio frequency chains. Moreover, additional sensing functionalities exacerbate the impairment of spectrum efficiency (SE). In t…

    Submitted 15 May, 2024; originally announced May 2024.

  7. arXiv:2405.09298  [pdf]

    eess.IV cs.CV

    Deep Blur Multi-Model (DeepBlurMM) -- a strategy to mitigate the impact of image blur on deep learning model performance in histopathology image analysis

    Authors: Yujie Xiang, Bojing Liu, Mattias Rantalainen

    Abstract: AI-based analysis of histopathology whole slide images (WSIs) is central in computational pathology. However, image quality, including unsharp areas of WSIs, impacts model performance. We investigate the impact of blur and propose a multi-model approach to mitigate negative impact of unsharp image areas. In this study, we use a simulation approach, evaluating model performance under varying levels…

    Submitted 23 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    ACM Class: I.4; J.3

  8. arXiv:2404.17736  [pdf, other]

    eess.SP cs.CV cs.IT eess.IV

    Diffusion-Aided Joint Source Channel Coding For High Realism Wireless Image Transmission

    Authors: Mingyu Yang, Bowen Liu, Boyang Wang, Hun-Seok Kim

    Abstract: Deep learning-based joint source-channel coding (deep JSCC) has been demonstrated as an effective approach for wireless image transmission. Nevertheless, current research has concentrated on minimizing a standard distortion metric such as Mean Squared Error (MSE), which does not necessarily improve the perceptual quality. To address this issue, we propose DiffJSCC, a novel framework that leverages…

    Submitted 26 April, 2024; originally announced April 2024.

  9. arXiv:2404.07556  [pdf, other]

    eess.IV cs.CV

    Attention-Aware Laparoscopic Image Desmoking Network with Lightness Embedding and Hybrid Guided Embedding

    Authors: Ziteng Liu, Jiahua Zhu, Bainan Liu, Hao Liu, Wenpeng Gao, Yili Fu

    Abstract: This paper presents a novel method of smoke removal from the laparoscopic images. Due to the heterogeneous nature of surgical smoke, a two-stage network is proposed to estimate the smoke distribution and reconstruct a clear, smoke-free surgical scene. The utilization of the lightness channel plays a pivotal role in providing vital information pertaining to smoke density. The reconstruction of smok…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: ISBI2024

  10. arXiv:2403.15418  [pdf, other]

    eess.SP

    Stochastic Analysis of Touch-Tone Frequency Recognition in Two-Way Radio Systems for Dialed Telephone Number Identification

    Authors: Liqiang Yu, Chen Li, Bo Liu, Chang Che

    Abstract: This paper focuses on recognizing dialed numbers in a touch-tone telephone system based on the Dual Tone MultiFrequency (DTMF) signaling technique with analysis of stochastic aspects during the noise and random duration of characters. Each dialed digit's acoustic profile is derived from a composite of two carrier frequencies, distinctly assigned to represent that digit. The identification of each…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: It is accepted by The 7th International Conference on Advanced Algorithms and Control Engineering (ICAACE 2024)

  11. arXiv:2403.15238  [pdf]

    eess.IV cs.CV stat.ME

    WEEP: A method for spatial interpretation of weakly supervised CNN models in computational pathology

    Authors: Abhinav Sharma, Bojing Liu, Mattias Rantalainen

    Abstract: Deep learning enables the modelling of high-resolution histopathology whole-slide images (WSI). Weakly supervised learning of tile-level data is typically applied for tasks where labels only exist on the patient or WSI level (e.g. patient outcomes or histological grading). In this context, there is a need for improved spatial interpretability of predictions from such models. We propose a novel met…

    Submitted 8 April, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  12. arXiv:2403.13290  [pdf, other]

    math.OC eess.SY

    A Log-domain Interior Point Method for Convex Quadratic Games

    Authors: Bingqi Liu, Dominic Liao-McPherson

    Abstract: In this paper, we propose an equilibrium-seeking algorithm for finding generalized Nash equilibria of non-cooperative monotone convex quadratic games. Specifically, we recast the Nash equilibrium-seeking problem as a variational inequality problem that we solve using a log-domain interior point method and provide a general purpose solver based on this algorithm. This approach is suitable for non-pot…

    Submitted 20 March, 2024; originally announced March 2024.

  13. arXiv:2403.10417  [pdf, other]

    eess.SP

    Beam Pattern Modulation Embedded mmWave Hybrid Transceiver Design Towards ISAC

    Authors: Boxun Liu, Shijian Gao, Zonghui Yang, Xiang Cheng

    Abstract: Integrated Sensing and Communication (ISAC) emerges as a promising technology for B5G/6G, particularly in the millimeter-wave (mmWave) band. However, the widespread adoption of hybrid architecture in mmWave systems compromises multiplexing gain due to limited radio-frequency chains, resulting in mediocre performance when embedding sensing functionality. To avoid sacrificing the spectrum efficiency…

    Submitted 15 March, 2024; originally announced March 2024.

  14. arXiv:2402.09442  [pdf]

    eess.SP cs.AI

    Progress in artificial intelligence applications based on the combination of self-driven sensors and deep learning

    Authors: Weixiang Wan, Wenjian Sun, Qiang Zeng, Linying Pan, Jingyu Xu, Bo Liu

    Abstract: In the era of Internet of Things, how to develop a smart sensor system with sustainable power supply, easy deployment and flexible use has become a difficult problem to be solved. The traditional power supply has problems such as frequent replacement or charging when in use, which limits the development of wearable devices. The contact-to-separate friction nanogenerator (TENG) was prepared by usin…

    Submitted 12 March, 2024; v1 submitted 30 January, 2024; originally announced February 2024.

    Comments: This article was accepted by an IEEE conference

  15. arXiv:2402.04267  [pdf]

    physics.med-ph cs.AI cs.CV eess.IV

    Application analysis of ai technology combined with spiral CT scanning in early lung cancer screening

    Authors: Shulin Li, Liqiang Yu, Bo Liu, Qunwei Lin, Jiaxin Huang

    Abstract: At present, the incidence and fatality rate of lung cancer in China rank first among all malignant tumors. Despite the continuous development and improvement of China's medical level, the overall 5-year survival rate of lung cancer patients is still lower than 20% and is staged. A number of studies have confirmed that early diagnosis and treatment of early stage lung cancer is of great significanc…

    Submitted 26 January, 2024; originally announced February 2024.

    Comments: This article was accepted by Frontiers in Computing and Intelligent Systems https://drpress.org/ojs/index.php/fcis/article/view/15781. arXiv admin note: text overlap with arXiv:nlin/0508031 by other authors

  16. arXiv:2402.02327  [pdf, other]

    cs.CV cs.SD eess.AS

    Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues

    Authors: Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Le Lu, Jieping Ye, Nenghai Yu

    Abstract: How to effectively interact audio with vision has garnered considerable interest within the multi-modality research field. Recently, a novel audio-visual segmentation (AVS) task has been proposed, aiming to segment the sounding objects in video frames under the guidance of audio cues. However, most existing AVS methods are hindered by a modality imbalance where the visual features tend to dominate…

    Submitted 6 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  17. arXiv:2401.06272  [pdf, other]

    eess.IV cs.CV

    Segmentation of Mediastinal Lymph Nodes in CT with Anatomical Priors

    Authors: Tejas Sudharshan Mathai, Bohan Liu, Ronald M. Summers

    Abstract: Purpose: Lymph nodes (LNs) in the chest have a tendency to enlarge due to various pathologies, such as lung cancer or pneumonia. Clinicians routinely measure nodal size to monitor disease progression, confirm metastatic cancer, and assess treatment response. However, variations in their shapes and appearances make it cumbersome to identify LNs, which reside outside of most organs. Methods: We prop…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: Submitted to CARS 2024

  18. arXiv:2401.05698  [pdf, other]

    cs.CV cs.HC cs.MM cs.SD eess.AS

    HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition

    Authors: Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao

    Abstract: Audio-Visual Emotion Recognition (AVER) has garnered increasing attention in recent years for its critical role in creating emotion-aware intelligent machines. Previous efforts in this area are dominated by the supervised learning paradigm. Despite significant progress, supervised learning is meeting its bottleneck due to the longstanding data scarcity issue in AVER. Motivated by recent advances in…

    Submitted 1 April, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: Accepted by Information Fusion. The code is available at https://github.com/sunlicai/HiCMAE

    Journal ref: Information Fusion, 2024

  19. arXiv:2312.15628  [pdf, other]

    cs.SD eess.AS

    Balanced SNR-Aware Distillation for Guided Text-to-Audio Generation

    Authors: Bingzhi Liu, Yin Cao, Haohe Liu, Yi Zhou

    Abstract: Diffusion models have demonstrated promising results in text-to-audio generation tasks. However, their practical usability is hindered by slow sampling speeds, limiting their applicability in high-throughput scenarios. To address this challenge, progressive distillation methods have been effective in producing more compact and efficient models. Nevertheless, these methods encounter issues with unb…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: 5 pages

  20. arXiv:2312.02199  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV stat.AP

    USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery

    Authors: Jeremy Irvin, Lucas Tao, Joanne Zhou, Yuntao Ma, Langston Nashold, Benjamin Liu, Andrew Y. Ng

    Abstract: Large, self-supervised vision models have led to substantial advancements for automatically interpreting natural images. Recent works have begun tailoring these methods to remote sensing data which has rich structure with multi-sensor, multi-spectral, and temporal information providing massive amounts of self-labeled data that can be used for self-supervised pre-training. In this work, we develop…

    Submitted 2 December, 2023; originally announced December 2023.

  21. arXiv:2311.00996  [pdf, other]

    eess.IV cs.CV

    VCISR: Blind Single Image Super-Resolution with Video Compression Synthetic Data

    Authors: Boyang Wang, Bowen Liu, Shiyu Liu, Fengyu Yang

    Abstract: In the blind single image super-resolution (SISR) task, existing works have been successful in restoring image-level unknown degradations. However, when a single video frame becomes the input, these works usually fail to address degradations caused by video compression, such as mosquito noise, ringing, blockiness, and staircase noise. In this work, we present, for the first time, a video compressio…

    Submitted 22 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

  22. arXiv:2310.08080  [pdf]

    eess.IV cs.CV

    RT-SRTS: Angle-Agnostic Real-Time Simultaneous 3D Reconstruction and Tumor Segmentation from Single X-Ray Projection

    Authors: Miao Zhu, Qiming Fu, Bo Liu, Mengxi Zhang, Bojian Li, Xiaoyan Luo, Fugen Zhou

    Abstract: Radiotherapy is one of the primary treatment methods for tumors, but the organ movement caused by respiration limits its accuracy. Recently, 3D imaging from a single X-ray projection has received extensive attention as a promising approach to address this issue. However, current methods can only reconstruct 3D images without directly locating the tumor and are only validated for fixed-angle imagin…

    Submitted 28 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

  23. arXiv:2309.07198  [pdf, other]

    eess.IV physics.app-ph physics.optics

    Temporal compressive edge imaging enabled by a lensless diffuser camera

    Authors: Ze Zheng, Baolei Liu, Jiaqi Song, Lei Ding, Xiaolan Zhong, David Mcgloin, Fan Wang

    Abstract: Lensless imagers based on diffusers or encoding masks enable high-dimensional imaging from a single shot measurement and have been applied in various applications. However, to further extract image information such as edge detection, conventional post-processing filtering operations are needed after the reconstruction of the original object images in the diffuser imaging systems. Here, we present…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: 5 pages, 4 figures

    Journal ref: Optics Letters, 49(11), 3058-3061 (2024)

  24. arXiv:2308.13234  [pdf, other]

    cs.HC cs.AI eess.SP q-bio.NC

    Decoding Natural Images from EEG for Object Recognition

    Authors: Yonghao Song, Bingchuan Liu, Xiang Li, Nanlin Shi, Yijun Wang, Xiaorong Gao

    Abstract: Electroencephalography (EEG) signals, known for convenient non-invasive acquisition but low signal-to-noise ratio, have recently gained substantial attention due to the potential to decode natural images. This paper presents a self-supervised framework to demonstrate the feasibility of learning image representations from EEG signals, particularly for object recognition. The framework utilizes imag…

    Submitted 4 April, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: ICLR, 2024

  25. arXiv:2308.09223  [pdf, other]

    eess.IV cs.CV cs.LG

    DMCVR: Morphology-Guided Diffusion Model for 3D Cardiac Volume Reconstruction

    Authors: Xiaoxiao He, Chaowei Tan, Ligong Han, Bo Liu, Leon Axel, Kang Li, Dimitris N. Metaxas

    Abstract: Accurate 3D cardiac reconstruction from cine magnetic resonance imaging (cMRI) is crucial for improved cardiovascular disease diagnosis and understanding of the heart's motion. However, current cardiac MRI-based reconstruction technology used in clinical settings is 2D with limited through-plane resolution, resulting in low-quality reconstructed cardiac volumes. To better reconstruct 3D cardiac vo…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Accepted in MICCAI 2023

  26. arXiv:2306.14125  [pdf, other]

    eess.SP

    M$^3$SC: A Generic Dataset for Mixed Multi-Modal (MMM) Sensing and Communication Integration

    Authors: Xiang Cheng, Ziwei Huang, Lu Bai, Haotian Zhang, Mingran Sun, Boxun Liu, Sijiang Li, Jianan Zhang, Minson Lee

    Abstract: The sixth generation (6G) of mobile communication system is witnessing a new paradigm shift, i.e., integrated sensing-communication system. A comprehensive dataset is a prerequisite for 6G integrated sensing-communication research. This paper develops a novel simulation dataset, named M3SC, for mixed multi-modal (MMM) sensing-communication integration, and the generation framework of the M3SC data…

    Submitted 25 June, 2023; originally announced June 2023.

    Comments: 12 pages, 12 figures

  27. arXiv:2306.07505  [pdf]

    q-bio.TO eess.IV

    Deep learning radiomics for assessment of gastroesophageal varices in people with compensated advanced chronic liver disease

    Authors: Lan Wang, Ruiling He, Lili Zhao, Jia Wang, Zhengzi Geng, Tao Ren, Guo Zhang, Peng Zhang, Kaiqiang Tang, Chaofei Gao, Fei Chen, Liting Zhang, Yonghe Zhou, Xin Li, Fanbin He, Hui Huan, Wenjuan Wang, Yunxiao Liang, Juan Tang, Fang Ai, Tingyu Wang, Liyun Zheng, Zhongwei Zhao, Jiansong Ji, Wei Liu, et al. (22 additional authors not shown)

    Abstract: Objective: Bleeding from gastroesophageal varices (GEV) is a medical emergency associated with high mortality. We aim to construct an artificial intelligence-based model of two-dimensional shear wave elastography (2D-SWE) of the liver and spleen to precisely assess the risk of GEV and high-risk gastroesophageal varices (HRV). Design: A prospective multicenter study was conducted in patients with…

    Submitted 12 June, 2023; originally announced June 2023.

  28. arXiv:2305.18205  [pdf, other]

    eess.SP cs.LG nucl-ex

    Pulse shape discrimination based on the Tempotron: a powerful classifier on GPU

    Authors: Haoran Liu, Peng Li, Ming-Zhe Liu, Kai-Ming Wang, Zhuo Zuo, Bing-Qi Liu

    Abstract: This study introduces the Tempotron, a powerful classifier based on a third-generation neural network model, for pulse shape discrimination. By eliminating the need for manual feature extraction, the Tempotron model can process pulse signals directly, generating discrimination results based on learned prior knowledge. The study performed experiments using GPU acceleration, resulting in over a 500…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 14 pages, 7 figures

  29. arXiv:2305.10788  [pdf, other]

    cs.SD cs.CL eess.AS

    Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR

    Authors: Hang Shao, Wei Wang, Bei Liu, Xun Gong, Haoyu Wang, Yanmin Qian

    Abstract: Due to the rapid development of computing hardware resources and the dramatic growth of data, pre-trained models in speech recognition, such as Whisper, have significantly improved the performance of speech recognition tasks. However, these models usually have a high computational overhead, making it difficult to execute effectively on resource-constrained devices. To speed up inference and reduce…

    Submitted 18 May, 2023; originally announced May 2023.

  30. arXiv:2304.12804  [pdf, other]

    cs.IT eess.SP

    Channel Estimation and Signal Detection for NLOS Ultraviolet Scattering Communication with Space Division Multiple Access

    Authors: Yubo Zhang, Yuchen Pan, Chen Gong, Beiyuan Liu, Zhengyuan Xu

    Abstract: We design a receiver assembling several photomultipliers (PMTs) as an array to increase the field of view (FOV) of the receiver and adapt to multiuser situations over non-line-of-sight (NLOS) ultraviolet (UV) channels. Channel estimation and signal detection have been investigated according to the space division characteristics of the structure. Firstly, we adopt the balanced structure on the pilo…

    Submitted 25 April, 2023; originally announced April 2023.

  31. arXiv:2304.09607  [pdf, other]

    cs.SD cs.CL eess.AS

    CB-Conformer: Contextual biasing Conformer for biased word recognition

    Authors: Yaoxun Xu, Baiji Liu, Qiaochu Huang, Xingchen Song, Zhiyong Wu, Shiyin Kang, Helen Meng

    Abstract: Due to the mismatch between the source and target domains, how to better utilize the biased word information to improve the performance of the automatic speech recognition model in the target domain becomes a hot research topic. Previous approaches either decode with a fixed external language model or introduce a sizeable biasing module, which leads to poor adaptability and slow inference. In this…

    Submitted 25 April, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

  32. arXiv:2304.02273  [pdf, other]

    eess.IV cs.CV

    MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding

    Authors: Bowen Liu, Yu Chen, Rakesh Chowdary Machineni, Shiyu Liu, Hun-Seok Kim

    Abstract: Learning-based video compression has been extensively studied over the past years, but it still has limitations in adapting to various motion patterns and entropy models. In this paper, we propose multi-mode video compression (MMVC), a block wise mode ensemble deep video compression framework that selects the optimal mode for feature domain prediction adapting to different motion patterns. Propose…

    Submitted 5 April, 2023; originally announced April 2023.

  33. arXiv:2303.14357  [pdf, other]

    eess.IV cs.CV cs.LG

    Dealing With Heterogeneous 3D MR Knee Images: A Federated Few-Shot Learning Method With Dual Knowledge Distillation

    Authors: Xiaoxiao He, Chaowei Tan, Bo Liu, Liping Si, Weiwu Yao, Liang Zhao, Di Liu, Qilong Zhangli, Qi Chang, Kang Li, Dimitris N. Metaxas

    Abstract: Federated Learning has gained popularity among medical institutions since it enables collaborative training between clients (e.g., hospitals) without aggregating data. However, due to the high cost associated with creating annotations, especially for large 3D image datasets, clinical institutions do not have enough supervised data for training locally. Thus, the performance of the collaborative mo…

    Submitted 17 April, 2023; v1 submitted 25 March, 2023; originally announced March 2023.

  34. arXiv:2302.13161  [pdf, other]

    cs.CR cs.LG eess.SY

    Cybersecurity Challenges of Power Transformers

    Authors: Hossein Rahimpour, Joe Tusek, Alsharif Abuadbba, Aruna Seneviratne, Toan Phung, Ahmed Musleh, Boyu Liu

    Abstract: The rise of cyber threats on critical infrastructure, and its potential for devastating consequences, has significantly increased. The dependency of new power grid technology on information, data analytic and communication systems makes the entire electricity network vulnerable to cyber threats. Power transformers play a critical role within the power grid and are now commonly enhanced through facto…

    Submitted 25 March, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: 11 pages

  35. A Convolutional-Transformer Network for Crack Segmentation with Boundary Awareness

    Authors: Huaqi Tao, Bingxi Liu, Jinqiang Cui, Hong Zhang

    Abstract: Cracks play a crucial role in assessing the safety and durability of manufactured buildings. However, the long and sharp topological features and complex background of cracks make the task of crack segmentation extremely challenging. In this paper, we propose a novel convolutional-transformer network based on encoder-decoder architecture to solve this challenge. Particularly, we designed a Dilated…

    Submitted 11 November, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: Accepted to ICIP 2023

  36. arXiv:2302.07269  [pdf, other]

    eess.IV physics.optics

    Dual-mode adaptive-SVD ghost imaging

    Authors: Dajing Wang, Baolei Liu, Jiaqi Song, Yao Wang, Xuchen Shan, Fan Wang

    Abstract: In this paper, we present a dual-mode adaptive singular value decomposition ghost imaging (A-SVD GI), which can be easily switched between the modes of imaging and edge detection. It can adaptively localize the foreground pixels via a threshold selection method. Then only the foreground region is illuminated by the singular value decomposition (SVD) - based patterns, consequently retrieving high-q…

    Submitted 14 February, 2023; originally announced February 2023.

  37. arXiv:2302.05726  [pdf, other]

    eess.SY

    Enhance Local Consistency in Federated Learning: A Multi-Step Inertial Momentum Approach

    Authors: Yixing Liu, Yan Sun, Zhengtao Ding, Li Shen, Bo Liu, Dacheng Tao

    Abstract: Federated learning (FL), as a collaborative distributed training paradigm with several edge computing devices under the coordination of a centralized server, is plagued by inconsistent local stationary points due to the heterogeneity of the local partial participation clients, which precipitates the local client-drifts problems and sparks off the unstable and slow convergence, especially on the ag…

    Submitted 11 February, 2023; originally announced February 2023.

  38. arXiv:2301.00657  [pdf, other]

    eess.AS cs.AI cs.CL

    MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset

    Authors: Kailin Liang, Bin Liu, Yifan Hu, Rui Liu, Feilong Bao, Guanglai Gao

    Abstract: Text-to-Speech (TTS) synthesis for low-resource languages is an attractive research issue in academia and industry nowadays. Mongolian is the official language of the Inner Mongolia Autonomous Region and a representative low-resource language spoken by over 10 million people worldwide. However, there is a relative lack of open-source datasets for Mongolian TTS. Therefore, we make public an open-so…

    Submitted 11 December, 2022; originally announced January 2023.

    Comments: Accepted by NCMMSC'2022 (https://ncmmsc2022.ustc.edu.cn/main.htm)

  39. arXiv:2301.00308  [pdf, other]

    eess.SP eess.SY

    High-Accuracy Absolute-Position-Aided Code Phase Tracking Based on RTK/INS Deep Integration in Challenging Static Scenarios

    Authors: Yiran Luo, Li-Ta Hsu, Yang Jiang, Baoyu Liu, Zhetao Zhang, Yan Xiang, Naser El-Sheimy

    Abstract: Many multi-sensor navigation systems urgently demand accurate positioning initialization from global navigation satellite systems (GNSSs) in challenging static scenarios. However, ground blockages against line-of-sight (LOS) signal reception make it difficult for GNSS users. Steering local codes in GNSS basebands is a desirable way to correct instantaneous signal phase misalignment, efficiently gat…

    Submitted 31 December, 2022; originally announced January 2023.

    Comments: 27 pages, 18 figures

  40. Personalized local heating neutralizing individual, spatial and temporal thermo-physiological variances in extreme cold environments

    Authors: Yi Ju, Xinyuan Ju, Hui Zhang, Bin Cao, Bin Liu, Yingxin Zhu

    Abstract: In this paper, we investigate the feasibility, robustness and optimization of introducing personal comfort systems (PCS), apparatuses that promise energy savings and comfort improvement, into a broader range of environments. We report a series of laboratory experiments systematically examining the effect of personalized heating in neutralizing individual, spatial and temporal variations of ther…

    Submitted 27 December, 2022; v1 submitted 11 December, 2022; originally announced December 2022.

    Journal ref: Building and Environment, 109950 (2022)

  41. arXiv:2211.10721  [pdf]

    eess.SP

    Multi-timescale Event Detection in Nonintrusive Load Monitoring based on MDL Principle

    Authors: Bo Liu, Jianfeng Zhang, Wenpeng Luan, Zishuai Liu, Yixin Yu

    Abstract: Load event detection is the fundamental step for the event-based non-intrusive load monitoring (NILM). However, existing event detection methods with fixed parameters may fail in coping with the inherent multi-timescale characteristics of events and their event detection accuracy is easily affected by the load fluctuation. In this regard, this paper extends our previously designed two-stage event…

    Submitted 19 November, 2022; originally announced November 2022.

    Comments: 11 pages, 16 figures

  42. arXiv:2211.06161  [pdf, other]

    cs.LG eess.IV

    Spatial Temporal Graph Convolution with Graph Structure Self-learning for Early MCI Detection

    Authors: Yunpeng Zhao, Fugen Zhou, Bin Guo, Bo Liu

    Abstract: Graph neural networks (GNNs) have been successfully applied to early mild cognitive impairment (EMCI) detection, with the usage of elaborately designed features constructed from blood oxygen level-dependent (BOLD) time series. However, few works explored the feasibility of using BOLD signals directly as features. Meanwhile, existing GNN-based methods primarily rely on hand-crafted explicit brain t…

    Submitted 11 November, 2022; originally announced November 2022.

    Comments: 5 pages, 3 figures

  43. arXiv:2211.04238  [pdf, other]

    eess.IV

    HDRfeat: A Feature-Rich Network for High Dynamic Range Image Reconstruction

    Authors: Lingkai Zhu, Fei Zhou, Bozhi Liu, Orcun Göksel

    Abstract: A major challenge for high dynamic range (HDR) image reconstruction from multi-exposed low dynamic range (LDR) images, especially with dynamic scenes, is the extraction and merging of relevant contextual features in order to suppress any ghosting and blurring artifacts from moving objects. To tackle this, in this work we propose a novel network for HDR reconstruction with deep and rich feature ext…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: 4 pages, 5 figures

  44. arXiv:2211.00815  [pdf, other]

    cs.SD eess.AS

    Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022

    Authors: Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian

    Abstract: Many speaker recognition challenges have been held to assess the speaker verification system in the wild and probe the performance limit. VoxCeleb Speaker Recognition Challenge (VoxSRC), based on VoxCeleb, is the most popular. Besides, another challenge called CN-Celeb Speaker Recognition Challenge (CNSRC) is also held this year, which is based on the Chinese celebrity multi-genre dataset CN-C…

    Submitted 1 June, 2023; v1 submitted 1 November, 2022; originally announced November 2022.

    Comments: Accepted by InterSpeech 2023

  45. arXiv:2210.16791  [pdf, other]

    cs.SD cs.AI eess.AS

    Adaptive Speech Quality Aware Complex Neural Network for Acoustic Echo Cancellation with Supervised Contrastive Learning

    Authors: Bozhong Liu, Xiaoxi Yu, Hantao Huang

    Abstract: Acoustic echo cancellation (AEC) is designed to remove echoes, reverberation, and unwanted added sounds from the microphone signal while maintaining the quality of the near-end speaker's speech. This paper proposes adaptive speech quality complex neural networks to focus on specific tasks for real-time acoustic echo cancellation. Specifically, we propose a complex modularize neural network with dif…

    Submitted 9 November, 2022; v1 submitted 30 October, 2022; originally announced October 2022.

    Comments: Submitted to International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2023. Under review

  46. arXiv:2209.09076  [pdf, other]

    cs.SD eess.AS

    SJTU-AISPEECH System for VoxCeleb Speaker Recognition Challenge 2022

    Authors: Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian

    Abstract: This report describes the SJTU-AISPEECH system for the VoxCeleb Speaker Recognition Challenge 2022. For track1, we implemented two kinds of systems, the online system and the offline system. Different ResNet-based backbones and loss functions are explored. Our final fusion system achieved 3rd place in track1. For track3, we implemented statistic adaptation and jointly training based domain adaptat…

    Submitted 20 September, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: System description of VoxSRC 2022

  47. arXiv:2209.07618  [pdf, other]

    cs.GT cs.AI cs.MA eess.SY

    Differentiable Bilevel Programming for Stackelberg Congestion Games

    Authors: Jiayang Li, Jing Yu, Qianni Wang, Boyi Liu, Zhaoran Wang, Yu Marco Nie

    Abstract: In a Stackelberg congestion game (SCG), a leader aims to maximize their own gain by anticipating and manipulating the equilibrium state at which the followers settle by playing a congestion game. Often formulated as bilevel programs, large-scale SCGs are well known for their intractability and complexity. Here, we attempt to tackle this computational challenge by marrying traditional methodologies…

    Submitted 13 May, 2024; v1 submitted 15 September, 2022; originally announced September 2022.

  48. arXiv:2208.02792  [pdf]

    cs.RO eess.SY

    A Cooperative Perception Environment for Traffic Operations and Control

    Authors: Hanlin Chen, Brian Liu, Xumiao Zhang, Feng Qian, Z. Morley Mao, Yiheng Feng

    Abstract: Existing data collection methods for traffic operations and control usually rely on infrastructure-based loop detectors or probe vehicle trajectories. Connected and automated vehicles (CAVs) not only can report data about themselves but also can provide the status of all detected surrounding vehicles. Integration of perception data from multiple CAVs as well as infrastructure sensors (e.g., LiDAR)…

    Submitted 4 August, 2022; originally announced August 2022.

  49. arXiv:2207.14174  [pdf, other]

    eess.SP cs.AI

    Bayesian Optimization-Based Beam Alignment for MmWave MIMO Communication Systems

    Authors: Songjie Yang, Baojuan Liu, Zhiqin Hong, Zhongpei Zhang

    Abstract: Due to the very narrow beam used in millimeter wave communication (mmWave), beam alignment (BA) is a critical issue. In this work, we investigate the issue of mmWave BA and present a novel beam alignment scheme on the basis of a machine learning strategy, Bayesian optimization (BO). In this context, we consider the beam alignment issue to be a black box function and then use BO to find the possibl…

    Submitted 28 July, 2022; originally announced July 2022.

  50. Low-complexity Sparse Array Synthesis Based on Off-grid Compressive Sensing

    Authors: Songjie Yang, Baojuan Liu, Zhiqin Hong, Zhongpei Zhang

    Abstract: A novel sparse array synthesis method for non-uniform planar arrays is proposed, which belongs to compressive sensing (CS)-based synthesis. Particularly, we propose an off-grid refinement technique to simultaneously optimize the antenna element positions and excitations with a low complexity, in response to the antenna position optimization problem that is difficult for standard CS. More important…

    Submitted 28 July, 2022; originally announced July 2022.