Showing 1–50 of 72 results for author: Zhu, G

Searching in archive eess.
  1. arXiv:2406.16967  [pdf, other]

    eess.SP eess.SY

    Remaining useful life prediction of rolling bearings based on refined composite multi-scale attention entropy and dispersion entropy

    Authors: Yunchong Long, Qinkang Pang, Guangjie Zhu, Junxian Cheng, Xiangshun Li

    Abstract: Remaining useful life (RUL) prediction based on vibration signals is crucial for ensuring the safe operation and effective health management of rotating machinery. Existing studies often extract health indicators (HI) from time domain and frequency domain features to analyze complex vibration signals, but these features may not accurately capture the degradation process. In this study, we propose…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 12 pages, 9 figures

  2. arXiv:2406.16929  [pdf, other]

    eess.SP cs.AI

    Modelling the 5G Energy Consumption using Real-world Data: Energy Fingerprint is All You Need

    Authors: Tingwei Chen, Yantao Wang, Hanzhi Chen, Zijian Zhao, Xinhao Li, Nicola Piovesan, Guangxu Zhu, Qingjiang Shi

    Abstract: The introduction of fifth-generation (5G) radio technology has revolutionized communications, bringing unprecedented automation, capacity, connectivity, and ultra-fast, reliable communications. However, this technological leap comes with a substantial increase in energy consumption, presenting a significant challenge. To improve the energy efficiency of 5G networks, it is imperative to develop sop…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2404.19182  [pdf, other]

    eess.SP

    Robust Proximity Detection using On-Device Gait Monitoring

    Authors: Yuqian Hu, Guozhen Zhu, Beibei Wang, K. J. Ray Liu

    Abstract: Proximity detection in indoor environments based on WiFi signals has gained significant attention in recent years. Existing works rely on the dynamic signal reflections and their extracted features are dependent on motion strength. To address this issue, we design a robust WiFi-based proximity detector by considering gait monitoring. Specifically, we propose a gait score that accurately evaluates…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: This work has been accepted in IEEE 9th World Forum on Internet of Things (WFIoT)

  4. arXiv:2404.06007  [pdf, other]

    cs.IT cs.AI cs.LG eess.SP

    Collaborative Edge AI Inference over Cloud-RAN

    Authors: Pengfei Zhang, Dingzhu Wen, Guangxu Zhu, Qimei Chen, Kaifeng Han, Yuanming Shi

    Abstract: In this paper, a cloud radio access network (Cloud-RAN) based collaborative edge AI inference architecture is proposed. Specifically, geographically distributed devices capture real-time noise-corrupted sensory data samples and extract the noisy local feature vectors, which are then aggregated at each remote radio head (RRH) to suppress sensing noise. To realize efficient uplink feature aggregatio…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: This paper is accepted by IEEE Transactions on Communications on 08-Apr-2024

  5. arXiv:2403.16397  [pdf, other]

    eess.SP cs.AI

    RadioGAT: A Joint Model-based and Data-driven Framework for Multi-band Radiomap Reconstruction via Graph Attention Networks

    Authors: Xiaojie Li, Songyang Zhang, Hang Li, Xiaoyang Li, Lexi Xu, Haigao Xu, Hui Mei, Guangxu Zhu, Nan Qi, Ming Xiao

    Abstract: Multi-band radiomap reconstruction (MB-RMR) is a key component in wireless communications for tasks such as spectrum management and network planning. However, traditional machine-learning-based MB-RMR methods, which rely heavily on simulated data or complete structured ground truth, face significant deployment challenges. These challenges stem from the differences between simulated and actual data…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: submitted to IEEE journal for possible publication

  6. arXiv:2403.15145  [pdf, ps, other]

    cs.IT eess.SP

    Robust Resource Allocation for STAR-RIS Assisted SWIPT Systems

    Authors: Guangyu Zhu, Xidong Mu, Li Guo, Ao Huang, Shibiao Xu

    Abstract: A simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted simultaneous wireless information and power transfer (SWIPT) system is proposed. More particularly, an STAR-RIS is deployed to assist in the information/power transfer from a multi-antenna access point (AP) to multiple single-antenna information users (IUs) and energy users (EUs), where two practica…

    Submitted 22 March, 2024; originally announced March 2024.

  7. arXiv:2403.12400  [pdf, other]

    cs.LG cs.AI eess.SP

    Finding the Missing Data: A BERT-inspired Approach Against Package Loss in Wireless Sensing

    Authors: Zijian Zhao, Tingwei Chen, Fanyi Meng, Hang Li, Xiaoyang Li, Guangxu Zhu

    Abstract: Despite the development of various deep learning methods for Wi-Fi sensing, package loss often results in noncontinuous estimation of the Channel State Information (CSI), which negatively impacts the performance of the learning models. To overcome this challenge, we propose a deep learning model based on Bidirectional Encoder Representations from Transformers (BERT) for CSI recovery, named CSI-BER…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 6 pages, accepted by IEEE INFOCOM Deepwireless Workshop 2024

  8. arXiv:2403.11693  [pdf, other]

    cs.IT eess.SP

    Beamforming Design for Semantic-Bit Coexisting Communication System

    Authors: Maojun Zhang, Guangxu Zhu, Richeng Jin, Xiaoming Chen, Qingjiang Shi, Caijun Zhong, Kaibin Huang

    Abstract: Semantic communication (SemCom) is emerging as a key technology for future sixth-generation (6G) systems. Unlike traditional bit-level communication (BitCom), SemCom directly optimizes performance at the semantic level, leading to superior communication efficiency. Nevertheless, the task-oriented nature of SemCom renders it challenging to completely replace BitCom. Consequently, it is desired to c…

    Submitted 22 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Submitted to IEEE for possible publication

  9. arXiv:2403.10493  [pdf, other]

    cs.SD eess.AS eess.SP

    MusicHiFi: Fast High-Fidelity Stereo Vocoding

    Authors: Ge Zhu, Juan-Pablo Caceres, Zhiyao Duan, Nicholas J. Bryan

    Abstract: Diffusion-based audio and music generation models commonly generate music by constructing an image representation of audio (e.g., a mel-spectrogram) and then converting it to audio using a phase reconstruction model or vocoder. Typical vocoders, however, produce monophonic audio at lower resolutions (e.g., 16-24 kHz), which limits their effectiveness. We propose MusicHiFi -- an efficient high-fide…

    Submitted 20 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  10. arXiv:2403.01956  [pdf, ps, other]

    cs.IT eess.SP

    Hybrid Active-Passive RIS Transmitter Enabled Energy-Efficient Multi-User Communications

    Authors: Ao Huang, Xidong Mu, Li Guo, Guangyu Zhu

    Abstract: A novel hybrid active-passive reconfigurable intelligent surface (RIS) transmitter enabled downlink multi-user communication system is investigated. Specifically, RISs are exploited to serve as transmitter antennas, where each element can flexibly switch between active and passive modes to deliver information to multiple users. The system energy efficiency (EE) maximization problem is formulated b…

    Submitted 4 March, 2024; originally announced March 2024.

  11. arXiv:2402.17780  [pdf, other]

    eess.SP cs.LG physics.med-ph

    Constraint Latent Space Matters: An Anti-anomalous Waveform Transformation Solution from Photoplethysmography to Arterial Blood Pressure

    Authors: Cheng Bian, Xiaoyu Li, Qi Bi, Guangpu Zhu, Jiegeng Lyu, Weile Zhang, Yelei Li, Zijing Zeng

    Abstract: Arterial blood pressure (ABP) holds substantial promise for proactive cardiovascular health management. Notwithstanding its potential, the invasive nature of ABP measurements confines their utility primarily to clinical environments, limiting their applicability for continuous monitoring beyond medical facilities. The conversion of photoplethysmography (PPG) signals into ABP equivalents has garner…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI-2024, main track

  12. arXiv:2402.06986  [pdf, other]

    cs.SD eess.AS

    Cacophony: An Improved Contrastive Audio-Text Model

    Authors: Ge Zhu, Jordan Darefsky, Zhiyao Duan

    Abstract: Despite recent advancements in audio-text modeling, audio-text contrastive models still lag behind their image-text counterparts in scale and performance. We propose a method to improve both the scale and the training of audio-text contrastive models. Specifically, we craft a large-scale audio-text dataset containing 13,000 hours of text-labeled audio, using pretrained language models to process n…

    Submitted 29 April, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: Work in Progress

  13. arXiv:2402.02729  [pdf, ps, other]

    cs.IT cs.CV cs.LG eess.IV

    Fast and Accurate Cooperative Radio Map Estimation Enabled by GAN

    Authors: Zezhong Zhang, Guangxu Zhu, Junting Chen, Shuguang Cui

    Abstract: In the 6G era, real-time radio resource monitoring and management are urged to support diverse wireless-empowered applications. This calls for fast and accurate estimation on the distribution of the radio resources, which is usually represented by the spatial signal power strength over the geographical environment, known as a radio map. In this paper, we present a cooperative radio map estimation…

    Submitted 5 February, 2024; originally announced February 2024.

  14. arXiv:2311.16192  [pdf, other]

    cs.LG cs.AI eess.SP

    Utilizing Multiple Inputs Autoregressive Models for Bearing Remaining Useful Life Prediction

    Authors: Junliang Wang, Qinghua Zhang, Guanhua Zhu, Guoxi Sun

    Abstract: Accurate prediction of the Remaining Useful Life (RUL) of rolling bearings is crucial in industrial production, yet existing models often struggle with limited generalization capabilities due to their inability to fully process all vibration signal patterns. We introduce a novel multi-input autoregressive model to address this challenge in RUL prediction for bearings. Our approach uniquely integra…

    Submitted 26 November, 2023; originally announced November 2023.

  15. arXiv:2311.10525  [pdf, other]

    cs.LG eess.SY

    Utilizing VQ-VAE for End-to-End Health Indicator Generation in Predicting Rolling Bearing RUL

    Authors: Junliang Wang, Qinghua Zhang, Guanhua Zhu, Guoxi Sun

    Abstract: The prediction of the remaining useful life (RUL) of rolling bearings is a pivotal issue in industrial production. A crucial approach to tackling this issue involves transforming vibration signals into health indicators (HI) to aid model training. This paper presents an end-to-end HI construction method, vector quantised variational autoencoder (VQ-VAE), which addresses the need for dimensionality…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: 17 figures

  16. arXiv:2311.09028  [pdf, other]

    cs.IT eess.SP

    Integrating Sensing, Communication, and Power Transfer: Multiuser Beamforming Design

    Authors: Ziqin Zhou, Xiaoyang Li, Guangxu Zhu, Jie Xu, Kaibin Huang, Shuguang Cui

    Abstract: In the sixth-generation (6G) networks, massive low-power devices are expected to sense environment and deliver tremendous data. To enhance the radio resource efficiency, the integrated sensing and communication (ISAC) technique exploits the sensing and communication functionalities of signals, while the simultaneous wireless information and power transfer (SWIPT) techniques utilizes the same signa…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: This paper has been submitted to IEEE for possible publication

  17. arXiv:2311.08667  [pdf, other]

    cs.SD eess.AS

    EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis

    Authors: Ge Zhu, Yutong Wen, Marc-André Carbonneau, Zhiyao Duan

    Abstract: Audio diffusion models can synthesize a wide variety of sounds. Existing models often operate on the latent domain with cascaded phase recovery modules to reconstruct waveform. This poses challenges when generating high-fidelity audio. In this paper, we propose EDMSound, a diffusion-based generative model in spectrogram domain under the framework of elucidated diffusion models (EDM). Combining wit…

    Submitted 18 November, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Accepted at NeurIPS Workshop: Machine Learning for Audio (Camera Ready)

  18. arXiv:2310.07906  [pdf, ps, other]

    eess.SY math.AP math.OC

    Power Tracking Control of Heterogeneous Populations of TCLs with Partially Measured States

    Authors: Zhenhe Zhang, Jun Zheng, Guchuan Zhu

    Abstract: This paper presents a new aggregate power tracking control scheme for populations of thermostatically controlled loads (TCLs). The control design is performed in the framework of partial differential equations (PDEs) based on a late-lumping procedure without truncating the infinite-dimensional model describing the dynamics of the TCL population. An input-output linearization control scheme, which…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2305.09118

    Journal ref: IEEE Access, 2024

  19. Successive Pose Estimation and Beam Tracking for mmWave Vehicular Communication Systems

    Authors: Cen Liu, Guangxu Zhu, Fan Liu, Yuanwei Liu, Kaibin Huang

    Abstract: The millimeter wave (mmWave) radar sensing-aided communications in vehicular mobile communication systems is investigated. To alleviate the beam training overhead under high mobility scenarios, a successive pose estimation and beam tracking (SPEBT) scheme is proposed to facilitate mmWave communications with the assistance of mmWave radar sensing. The proposed SPEBT scheme first resorts to a Fast C…

    Submitted 5 August, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

    Comments: An extended version of a conference submission. 7 pages, 5 figures

    Journal ref: IEEE Global Communications Conference Workshops (GC Wkshps) 2023

  20. arXiv:2306.12298  [pdf, other]

    cs.CV cs.LG eess.IV

    StarVQA+: Co-training Space-Time Attention for Video Quality Assessment

    Authors: Fengchuang Xing, Yuan-Gen Wang, Weixuan Tang, Guopu Zhu, Sam Kwong

    Abstract: Self-attention based Transformer has achieved great success in many computer vision tasks. However, its application to video quality assessment (VQA) has not been satisfactory so far. Evaluating the quality of in-the-wild videos is challenging due to the unknown of pristine reference and shooting distortion. This paper presents a co-trained Space-Time Attention network for the VQA problem, termed…

    Submitted 21 June, 2023; originally announced June 2023.

  21. arXiv:2306.06603  [pdf, ps, other]

    cs.IT cs.LG eess.SP

    Task-Oriented Integrated Sensing, Computation and Communication for Wireless Edge AI

    Authors: Hong Xing, Guangxu Zhu, Dongzhu Liu, Haifeng Wen, Kaibin Huang, Kaishun Wu

    Abstract: With the advent of emerging IoT applications such as autonomous driving, digital-twin and metaverse etc. featuring massive data sensing, analyzing and inference as well critical latency in beyond 5G (B5G) networks, edge artificial intelligence (AI) has been proposed to provide high-performance computation of a conventional cloud down to the network edge. Recently, convergence of wireless sensing,…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: 18 pages, 6 figures, submitted for possible journal publication

  22. arXiv:2306.02990  [pdf, other]

    cs.IT cs.LG eess.SP

    Integrated Sensing, Computation, and Communication for UAV-assisted Federated Edge Learning

    Authors: Yao Tang, Guangxu Zhu, Wei Xu, Man Hon Cheung, Tat-Ming Lok, Shuguang Cui

    Abstract: Federated edge learning (FEEL) enables privacy-preserving model training through periodic communication between edge devices and the server. Unmanned Aerial Vehicle (UAV)-mounted edge devices are particularly advantageous for FEEL due to their flexibility and mobility in efficient data collection. In UAV-assisted FEEL, sensing, computation, and communication are coupled and compete for limited onb…

    Submitted 5 June, 2023; originally announced June 2023.

  23. arXiv:2305.04152  [pdf, ps, other]

    cs.IT eess.SP

    Bayesian Over-the-Air FedAvg via Channel Driven Stochastic Gradient Langevin Dynamics

    Authors: Boning Zhang, Dongzhu Liu, Osvaldo Simeone, Guangxu Zhu

    Abstract: The recent development of scalable Bayesian inference methods has renewed interest in the adoption of Bayesian learning as an alternative to conventional frequentist learning that offers improved model calibration via uncertainty quantification. Recently, federated averaging Langevin dynamics (FALD) was introduced as a variant of federated averaging that can efficiently implement distributed Bayes…

    Submitted 9 May, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

    Comments: 6 pages, 4 figures, 26 references, submitted

  24. arXiv:2303.16404  [pdf, other]

    eess.SY

    Robust Andrew's sine estimate adaptive filtering

    Authors: Lu Lu, Yi Yu, Zongsheng Zheng, Guangya Zhu, Xiaomin Yang

    Abstract: The Andrew's sine function is a robust estimator, which has been used in outlier rejection and robust statistics. However, the performance of such estimator does not receive attention in the field of adaptive filtering techniques. Two Andrew's sine estimator (ASE)-based robust adaptive filtering algorithms are proposed in this brief. Specifically, to achieve improved performance and reduced comput…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: 5 pages, 5 figures

  25. arXiv:2303.11319  [pdf, other]

    cs.IT eess.SP

    Over-the-Air Federated Edge Learning with Error-Feedback One-Bit Quantization and Power Control

    Authors: Yuding Liu, Dongzhu Liu, Guangxu Zhu, Qingjiang Shi, Caijun Zhong

    Abstract: Over-the-air federated edge learning (Air-FEEL) is a communication-efficient framework for distributed machine learning using training data distributed at edge devices. This framework enables all edge devices to transmit model updates simultaneously over the entire available bandwidth, allowing for over-the-air aggregation. A one-bit digital over-the-air aggregation (OBDA) scheme has been recently…

    Submitted 20 March, 2023; originally announced March 2023.

  26. arXiv:2303.06475  [pdf, other]

    eess.AS cs.CL

    Transcription free filler word detection with Neural semi-CRFs

    Authors: Ge Zhu, Yujia Yan, Juan-Pablo Caceres, Zhiyao Duan

    Abstract: Non-linguistic filler words, such as "uh" or "um", are prevalent in spontaneous speech and serve as indicators for expressing hesitation or uncertainty. Previous works for detecting certain non-linguistic filler words are highly dependent on transcriptions from a well-established commercial automatic speech recognition (ASR) system. However, certain ASR systems are not universally accessible from…

    Submitted 11 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  27. arXiv:2212.00227  [pdf, other]

    cs.IT eess.SP

    Wireless Image Transmission with Semantic and Security Awareness

    Authors: Maojun Zhang, Yang Li, Zezhong Zhang, Guangxu Zhu, Caijun Zhong

    Abstract: Semantic communication is an increasingly popular framework for wireless image transmission due to its high communication efficiency. With the aid of the joint-source-and-channel (JSC) encoder implemented by neural network, semantic communication directly maps original images into symbol sequences containing semantic information. Compared with the traditional separate source and channel coding des…

    Submitted 30 November, 2022; originally announced December 2022.

    Comments: Submitted to IEEE WCL for possible publication

  28. arXiv:2211.01255  [pdf, other]

    cs.IT cs.AI cs.LG eess.SP

    Task-Oriented Over-the-Air Computation for Multi-Device Edge AI

    Authors: Dingzhu Wen, Xiang Jiao, Peixi Liu, Guangxu Zhu, Yuanming Shi, Kaibin Huang

    Abstract: Departing from the classic paradigm of data-centric designs, the 6G networks for supporting edge AI features task-oriented techniques that focus on effective and efficient execution of AI task. Targeting end-to-end system performance, such techniques are sophisticated as they aim to seamlessly integrate sensing (data acquisition), communication (data transmission), and computation (data processing…

    Submitted 2 November, 2022; originally announced November 2022.

  29. arXiv:2210.09659  [pdf, ps, other]

    eess.SP cs.IT

    Large-Scale Bandwidth and Power Optimization for Multi-Modal Edge Intelligence Autonomous Driving

    Authors: Xinrao Li, Tong Zhang, Shuai Wang, Guangxu Zhu, Rui Wang, Tsung-Hui Chang

    Abstract: Edge intelligence autonomous driving (EIAD) offers computing resources in autonomous vehicles for training deep neural networks. However, wireless channels between the edge server and the autonomous vehicles are time-varying due to the high-mobility of vehicles. Moreover, the required number of training samples for different data modalities, e.g., images, point-clouds, is diverse. Consequently, wh…

    Submitted 7 December, 2022; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: Submitted to IEEE

  30. arXiv:2207.14166  [pdf, ps, other]

    cs.CV cs.LG eess.IV

    RHA-Net: An Encoder-Decoder Network with Residual Blocks and Hybrid Attention Mechanisms for Pavement Crack Segmentation

    Authors: Guijie Zhu, Zhun Fan, Jiacheng Liu, Duan Yuan, Peili Ma, Meihua Wang, Weihua Sheng, Kelvin C. P. Wang

    Abstract: The acquisition and evaluation of pavement surface data play an essential role in pavement condition evaluation. In this paper, an efficient and effective end-to-end network for automatic pavement crack segmentation, called RHA-Net, is proposed to improve the pavement crack segmentation accuracy. The RHA-Net is built by integrating residual blocks (ResBlocks) and hybrid attention blocks into the e…

    Submitted 28 July, 2022; originally announced July 2022.

  31. Toward Ambient Intelligence: Federated Edge Learning with Task-Oriented Sensing, Computation, and Communication Integration

    Authors: Peixi Liu, Guangxu Zhu, Shuai Wang, Wei Jiang, Wu Luo, H. Vincent Poor, Shuguang Cui

    Abstract: In this paper, we address the problem of joint sensing, computation, and communication (SC$^{2}$) resource allocation for federated edge learning (FEEL) via a concrete case study of human motion recognition based on wireless sensing in ambient intelligence. First, by analyzing the wireless sensing process in human motion recognition, we find that there exists a thresholding value for the sensing t…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: 13 pages, submitted to IEEE for possible publication

  32. arXiv:2204.09079  [pdf, other]

    eess.AS cs.SD eess.SP

    Music Source Separation with Generative Flow

    Authors: Ge Zhu, Jordan Darefsky, Fei Jiang, Anton Selitskiy, Zhiyao Duan

    Abstract: Fully-supervised models for source separation are trained on parallel mixture-source data and are currently state-of-the-art. However, such parallel data is often difficult to obtain, and it is cumbersome to adapt trained models to mixtures with new sources. Source-only supervised models, in contrast, only require individual source data for training. In this paper, we first leverage flow-based gen…

    Submitted 16 October, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: Accepted by Signal Processing Letters

  33. arXiv:2204.08169  [pdf, ps, other]

    cs.NI eess.SP

    Actions at the Edge: Jointly Optimizing the Resources in Multi-access Edge Computing

    Authors: Yiqin Deng, Xianhao Chen, Guangyu Zhu, Yuguang Fang, Zhigang Chen, Xiaoheng Deng

    Abstract: Multi-access edge computing (MEC) is an emerging paradigm that pushes resources for sensing, communications, computing, storage and intelligence (SCCSI) to the premises closer to the end users, i.e., the edge, so that they could leverage the nearby rich resources to improve their quality of experience (QoE). Due to the growing emerging applications targeting at intelligentizing life-sustaining cyb…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 7 pages, 2 figures, accepted by IEEE Wireless Communications

  34. arXiv:2203.15135  [pdf, other]

    cs.CL cs.SD eess.AS

    Filler Word Detection and Classification: A Dataset and Benchmark

    Authors: Ge Zhu, Juan-Pablo Caceres, Justin Salamon

    Abstract: Filler words such as `uh' or `um' are sounds or words people use to signal they are pausing to think. Finding and removing filler words from recordings is a common and tedious task in media editing. Automatically detecting and classifying filler words could greatly aid in this task, but few studies have been published on this problem to date. A key reason is the absence of a dataset with annotated…

    Submitted 1 July, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: To appear at Interspeech 2022

  35. arXiv:2202.11490  [pdf, other]

    cs.LG cs.DC eess.SP

    Towards Tailored Models on Private AIoT Devices: Federated Direct Neural Architecture Search

    Authors: Chunhui Zhang, Xiaoming Yuan, Qianyun Zhang, Guangxu Zhu, Lei Cheng, Ning Zhang

    Abstract: Neural networks often encounter various stringent resource constraints while deploying on edge devices. To tackle these problems with less human efforts, automated machine learning becomes popular in finding various neural architectures that fit diverse Artificial Intelligence of Things (AIoT) scenarios. Recently, to prevent the leakage of private information while enable automated machine intelli…

    Submitted 23 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: text overlap with arXiv:2011.03372

  36. arXiv:2202.05253  [pdf, other]

    eess.AS cs.SD

    A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification

    Authors: You Zhang, Ge Zhu, Zhiyao Duan

    Abstract: The performance of automatic speaker verification (ASV) systems could be degraded by voice spoofing attacks. Most existing works aimed to develop standalone spoofing countermeasure (CM) systems. Relatively little work targeted at developing an integrated spoofing aware speaker verification (SASV) system. In the recent SASV challenge, the organizers encourage the development of such integration by…

    Submitted 24 April, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: 8 pages, 5 figures, to appear in Odyssey 2022

  37. arXiv:2201.12581  [pdf, other]

    cs.IT eess.SP

    Integrated Sensing, Communication, and Computation Over-the-Air: MIMO Beamforming Design

    Authors: Xiaoyang Li, Fan Liu, Ziqin Zhou, Guangxu Zhu, Shuai Wang, Kaibin Huang, Yi Gong

    Abstract: To support the unprecedented growth of the Internet of Things (IoT) applications, tremendous data need to be collected by the IoT devices and delivered to the server for further computation. By utilizing the same signals for both radar sensing and data communication, the integrated sensing and communication (ISAC) technique has broken the barriers between data collection and delivery in the physic…

    Submitted 22 February, 2022; v1 submitted 29 January, 2022; originally announced January 2022.

    Comments: This paper has been submitted to IEEE for possible publication

  38. arXiv:2201.08512  [pdf, other]

    eess.SP cs.CV cs.NI

    Vertical Federated Edge Learning with Distributed Integrated Sensing and Communication

    Authors: Peixi Liu, Guangxu Zhu, Wei Jiang, Wu Luo, Jie Xu, Shuguang Cui

    Abstract: This letter studies a vertical federated edge learning (FEEL) system for collaborative objects/human motion recognition by exploiting the distributed integrated sensing and communication (ISAC). In this system, distributed edge devices first send wireless signals to sense targeted objects/human, and then exchange intermediate computed vectors (instead of raw sensing data) for collaborative recogni…

    Submitted 6 June, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

    Comments: 5 pages, 7 figures, accepted by IEEE Communications Letters

  39. arXiv:2110.04265  [pdf, other]

    eess.AS cs.SD

    A study of the robustness of raw waveform based speaker embeddings under mismatched conditions

    Authors: Ge Zhu, Frank Cwitkowitz, Zhiyao Duan

    Abstract: In this paper, we conduct a cross-dataset study on parametric and non-parametric raw-waveform based speaker embeddings through speaker verification experiments. In general, we observe a more significant performance degradation of these raw-waveform systems compared to spectral based systems. We then propose two strategies to improve the performance of raw-waveform based systems on cross-dataset te…

    Submitted 11 October, 2021; v1 submitted 8 October, 2021; originally announced October 2021.

  40. arXiv:2110.02857  [pdf, other]

    cs.IT eess.SP

    Joint Maneuver and Beamforming Design for UAV-Enabled Integrated Sensing and Communication

    Authors: Zhonghao Lyu, Guangxu Zhu, Jie Xu

    Abstract: This paper studies the UAV-enabled integrated sensing and communication (ISAC), in which UAVs are dispatched as aerial dual-functional access points (APs) for efficient ISAC. In particular, we consider a scenario with one UAV-AP equipped with a vertically placed uniform linear array (ULA), which sends combined information and sensing signals to communicate with multiple users and sense potential t…

    Submitted 6 October, 2021; originally announced October 2021.

    Comments: 30 pages, 19 figures

  41. arXiv:2107.12018  [pdf, other]

    eess.AS cs.SD

    UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021

    Authors: Xinhui Chen, You Zhang, Ge Zhu, Zhiyao Duan

    Abstract: In this paper, we present UR-AIR system submission to the logical access (LA) and the speech deepfake (DF) tracks of the ASVspoof 2021 Challenge. The LA and DF tasks focus on synthetic speech detection (SSD), i.e. detecting text-to-speech and voice conversion as spoofing attacks. Different from previous ASVspoof challenges, the LA task this year presents codec and transmission channel variability,…

    Submitted 23 August, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: To appear in Proc. ASVspoof 2021 Workshop

  42. arXiv:2107.11588  [pdf, other]

    cs.NI cs.DC cs.IT cs.LG eess.SP

    Accelerating Federated Edge Learning via Optimized Probabilistic Device Scheduling

    Authors: Maojun Zhang, Guangxu Zhu, Shuai Wang, Jiamo Jiang, Caijun Zhong, Shuguang Cui

    Abstract: The popular federated edge learning (FEEL) framework allows privacy-preserving collaborative model training via frequent learning-updates exchange between edge devices and server. Due to the constrained bandwidth, only a subset of devices can upload their updates at each communication round. This has led to an active research area in FEEL studying the optimal device scheduling policy for minimizin…

    Submitted 24 July, 2021; originally announced July 2021.

    Comments: In Proc. IEEE SPAWC 2021

  43. arXiv:2107.09574  [pdf, other]

    eess.SP cs.AI

    Accelerating Edge Intelligence via Integrated Sensing and Communication

    Authors: Tong Zhang, Shuai Wang, Guoliang Li, Fan Liu, Guangxu Zhu, Rui Wang

    Abstract: Realizing edge intelligence consists of sensing, communication, training, and inference stages. Conventionally, the sensing and communication stages are executed sequentially, which results in an excessive amount of dataset generation and uploading time. This paper proposes to accelerate edge intelligence via integrated sensing and communication (ISAC). As such, the sensing and communication stages a…

    Submitted 22 January, 2022; v1 submitted 20 July, 2021; originally announced July 2021.

    Comments: Accepted by IEEE ICC 2022. 7 Pages

  44. arXiv:2106.12864  [pdf, other]

    eess.IV cs.CV cs.LG

    A Systematic Collection of Medical Image Datasets for Deep Learning

    Authors: Johann Li, Guangming Zhu, Cong Hua, Mingtao Feng, Basheer Bennamoun, ** Li, Xiaoyuan Lu, Juan Song, Peiyi Shen, Xu Xu, Lin Mei, Liang Zhang, Syed Afaq Ali Shah, Mohammed Bennamoun

    Abstract: The astounding success made by artificial intelligence (AI) in healthcare and other fields proves that AI can achieve human-like performance. However, success always comes with challenges. Deep learning algorithms are data-dependent and require large datasets for training. The lack of data in the medical imaging field creates a bottleneck for the application of deep learning to medical image analy…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: This paper has been submitted to one journal

  45. arXiv:2106.09316  [pdf, ps, other]

    cs.IT eess.SP

    Optimized Power Control Design for Over-the-Air Federated Edge Learning

    Authors: Xiaowen Cao, Guangxu Zhu, Jie Xu, Zhiqin Wang, Shuguang Cui

    Abstract: This paper investigates transmission power control in an over-the-air federated edge learning (Air-FEEL) system. Different from conventional power control designs (e.g., to minimize the individual mean squared error (MSE) of the over-the-air aggregation at each round), we consider a new power control design aiming at directly maximizing the convergence speed. Towards this end, we first analyze th…

    Submitted 8 November, 2021; v1 submitted 17 June, 2021; originally announced June 2021.

    Comments: This paper is an extension of a conference paper and to appear in IEEE JSAC

  46. arXiv:2104.10095  [pdf, other]

    cs.IT cs.DC cs.LG cs.NI eess.SP

    Turning Channel Noise into an Accelerator for Over-the-Air Principal Component Analysis

    Authors: Zezhong Zhang, Guangxu Zhu, Rui Wang, Vincent K. N. Lau, Kaibin Huang

    Abstract: In recent years, attempts to distill mobile data into useful knowledge have led to the deployment of machine learning algorithms at the network edge. Principal component analysis (PCA) is a classic technique for extracting the linear structure of a dataset, which is useful for feature extraction and data compression. In this work, we propose the deployment of distributed PCA over a multi…

    Submitted 1 April, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: 16 pages, 9 figures, accepted by IEEE TWC for publication

  47. arXiv:2104.01320  [pdf, other]

    eess.AS cs.SD

    An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems

    Authors: You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan

    Abstract: Spoofing countermeasure (CM) systems are critical in speaker verification; they aim to discern spoofing attacks from bona fide speech trials. In practice, however, acoustic condition variability in speech utterances may significantly degrade the performance of CM systems. In this paper, we conduct a cross-dataset study on several state-of-the-art CM systems and observe significant performance degr…

    Submitted 10 October, 2021; v1 submitted 3 April, 2021; originally announced April 2021.

    Comments: 5 pages, 6 figures, in Proc. INTERSPEECH 2021

  48. arXiv:2011.05587  [pdf, ps, other]

    cs.IT eess.SP

    Optimized Power Control for Over-the-Air Federated Edge Learning

    Authors: Xiaowen Cao, Guangxu Zhu, Jie Xu, Shuguang Cui

    Abstract: Over-the-air federated edge learning (Air-FEEL) is a communication-efficient solution for privacy-preserving distributed learning over wireless networks. Air-FEEL allows "one-shot" over-the-air aggregation of gradient/model-updates by exploiting the waveform superposition property of wireless channels, and thus promises an extremely low aggregation latency that is independent of the network size.…

    Submitted 11 November, 2020; originally announced November 2020.

  49. arXiv:2010.12951  [pdf, other]

    eess.AS cs.SD

    Y-Vector: Multiscale Waveform Encoder for Speaker Embedding

    Authors: Ge Zhu, Fei Jiang, Zhiyao Duan

    Abstract: State-of-the-art text-independent speaker verification systems typically use cepstral features or filter bank energies as speech features. Recent studies attempted to extract speaker embeddings directly from raw waveforms and have shown competitive results. In this paper, we propose a novel multi-scale waveform encoder that uses three convolution branches with different time scales to compute spee…

    Submitted 8 June, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: Accepted by Interspeech 2021

  50. arXiv:2009.02181  [pdf, other]

    cs.NI cs.IT eess.SP

    Over-the-Air Computing for Wireless Data Aggregation in Massive IoT

    Authors: Guangxu Zhu, Jie Xu, Kaibin Huang, Shuguang Cui

    Abstract: Wireless data aggregation (WDA), referring to aggregating data distributed at devices (e.g., sensors and smartphones), is a common operation in 5G-and-beyond machine-type communications to support Internet-of-Things (IoT), which lays the foundation for diversified applications such as distributed sensing, learning, and control. Conventional WDA techniques that are designed based on a separated-comm…

    Submitted 14 November, 2020; v1 submitted 4 September, 2020; originally announced September 2020.

    Comments: An introductory paper on over-the-air computing