Showing 1–50 of 52 results for author: Zhu, P

Searching in archive eess.
  1. arXiv:2406.09609  [pdf, other]

    eess.SY

    Hierarchical Control for Vehicle Repositioning in Autonomous Mobility on Demand Systems

    Authors: Pengbo Zhu, Giancarlo Ferrari-Trecate, Nikolas Geroliminis

    Abstract: Balancing passenger demand and vehicle availability is crucial for ensuring the sustainability and effectiveness of urban transportation systems. To address this challenge, we propose a novel hierarchical strategy for the efficient distribution of empty vehicles in urban areas. The proposed approach employs a data-enabled predictive control algorithm to develop a high-level controller, which guide…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2406.08268  [pdf, other]

    eess.SY

    Multi-Static ISAC based on Network-Assisted Full-Duplex Cell-Free Networks: Performance Analysis and Duplex Mode Optimization

    Authors: Fan Zeng, Ruoyun Liu, Xiaoyu Sun, Jingxuan Yu, Jiamin Li, Pengchen Zhu, Dongming Wang, Xiaohu You

    Abstract: Multi-static integrated sensing and communication (ISAC) technology, which can achieve a wider coverage range and avoid self-interference, is an important trend for the future development of ISAC. Existing multi-static ISAC designs are unable to support the asymmetric uplink (UL)/downlink (DL) communication requirements in the scenario while simultaneously achieving optimal sensing performance. Th…

    Submitted 12 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  3. arXiv:2406.07846  [pdf, other]

    eess.AS

    DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion

    Authors: Ziqian Ning, Shuai Wang, Pengcheng Zhu, Zhichao Wang, Jixun Yao, Lei Xie, Mengxiao Bi

    Abstract: Streaming voice conversion has become increasingly popular for its potential in real-time applications. The recently proposed DualVC 2 has achieved robust and high-quality streaming voice conversion with a latency of about 180ms. Nonetheless, the recognition-synthesis framework hinders end-to-end optimization, and the instability of automatic speech recognition (ASR) model with short chunks makes…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  4. arXiv:2405.04000  [pdf, other]

    cs.RO eess.SY

    Distributed Invariant Kalman Filter for Cooperative Localization using Matrix Lie Groups

    Authors: Yizhi Zhou, Yufan Liu, Pengxiang Zhu, Xuan Wang

    Abstract: This paper studies the problem of Cooperative Localization (CL) for multi-robot systems, where a group of mobile robots jointly localize themselves by using measurements from onboard sensors and shared information from other robots. We propose a novel distributed invariant Kalman Filter (DInEKF) based on the Lie group theory, to solve the CL problem in a 3-D environment. Unlike the standard EKF wh…

    Submitted 7 May, 2024; originally announced May 2024.

  5. arXiv:2404.19242  [pdf, other]

    cs.CV eess.IV stat.ME

    A Minimal Set of Parameters Based Depth-Dependent Distortion Model and Its Calibration Method for Stereo Vision Systems

    Authors: Xin Ma, Puchen Zhu, Xiao Li, Xiaoyin Zheng, Jianshu Zhou, Xuchen Wang, Kwok Wai Samuel Au

    Abstract: Depth position highly affects lens distortion, especially in close-range photography, which limits the measurement accuracy of existing stereo vision systems. Moreover, traditional depth-dependent distortion models and their calibration methods have remained complicated. In this work, we propose a minimal set of parameters based depth-dependent distortion model (MDM), which considers the radial an…

    Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted for publication in IEEE Transactions on Instrumentation and Measurement

  6. arXiv:2404.07721  [pdf, other]

    eess.SP cs.IT

    Trainable Joint Channel Estimation, Detection and Decoding for MIMO URLLC Systems

    Authors: Yi Sun, Hong Shen, Bingqing Li, Wei Xu, Pengcheng Zhu, Nan Hu, Chunming Zhao

    Abstract: The receiver design for multi-input multi-output (MIMO) ultra-reliable and low-latency communication (URLLC) systems can be a tough task due to the use of short channel codes and few pilot symbols. Consequently, error propagation can occur in traditional turbo receivers, leading to performance degradation. Moreover, the processing delay induced by information exchange between different modules may…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 17 pages, 12 figures, accepted by IEEE Transactions on Wireless Communications

  7. arXiv:2312.16850  [pdf, other]

    cs.SD eess.AS

    Accent-VITS: accent transfer for end-to-end TTS

    Authors: Linhan Ma, Yongmao Zhang, Xinfa Zhu, Yi Lei, Ziqian Ning, Pengcheng Zhu, Lei Xie

    Abstract: Accent transfer aims to transfer an accent from a source speaker to synthetic speech in the target speaker's voice. The main challenge is how to effectively disentangle speaker timbre and accent which are entangled in speech. This paper presents a VITS-based end-to-end accent transfer model named Accent-VITS. Based on the main structure of VITS, Accent-VITS makes substantial improvements to enable…

    Submitted 29 December, 2023; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Accepted by NCMMSC2023

  8. A Coverage Control-based Idle Vehicle Rebalancing Approach for Autonomous Mobility-on-Demand Systems

    Authors: Pengbo Zhu, Isik Ilber Sirmatel, Giancarlo Ferrari-Trecate, Nikolas Geroliminis

    Abstract: As an emerging mode of urban transportation, Autonomous Mobility-on-Demand (AMoD) systems show the potential in improving mobility in cities through timely and door-to-door services. However, the spatiotemporal imbalances between mobility demand and supply may lead to inefficiencies and a low quality of service. Vehicle rebalancing (i.e., dispatching idle vehicles to high-demand areas), is a poten…

    Submitted 20 November, 2023; originally announced November 2023.

  9. arXiv:2311.06003  [pdf, ps, other]

    eess.SP

    Passive Integrated Sensing and Communication Scheme based on RF Fingerprint Information Extraction for Cell-Free RAN

    Authors: Jingxuan Yu, Fan Zeng, Jiamin Li, Feiyang Liu, Pengcheng Zhu, Dongming Wang, Xiaohu You

    Abstract: This paper investigates how to achieve integrated sensing and communication (ISAC) based on a cell-free radio access network (CF-RAN) architecture with a minimum footprint of communication resources. We propose a new passive sensing scheme. The scheme is based on the radio frequency (RF) fingerprint learning of the RF radio unit (RRU) to build an RF fingerprint library of RRUs. The source RRU is i…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: 11 pages, 6 figures, submitted on 28-Feb-2023, China Communication, Accepted on 14-Sep-2023

  10. arXiv:2311.03419  [pdf, other]

    eess.AS cs.LG cs.SD

    Personalizing Keyword Spotting with Speaker Information

    Authors: Beltrán Labrador, Pai Zhu, Guanlong Zhao, Angelo Scorza Scarpati, Quan Wang, Alicia Lozano-Diez, Alex Park, Ignacio López Moreno

    Abstract: Keyword spotting systems often struggle to generalize to a diverse population with various accents and age groups. To address this challenge, we propose a novel approach that integrates speaker information into keyword spotting using Feature-wise Linear Modulation (FiLM), a recent method for learning from multiple sources of information. We explore both Text-Dependent and Text-Independent speaker…

    Submitted 6 November, 2023; originally announced November 2023.

  11. arXiv:2309.15496  [pdf, other]

    eess.AS cs.SD

    DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion

    Authors: Ziqian Ning, Yuepeng Jiang, Pengcheng Zhu, Shuai Wang, Jixun Yao, Lei Xie, Mengxiao Bi

    Abstract: Voice conversion is becoming increasingly popular, and a growing number of application scenarios require models with streaming inference capabilities. The recently proposed DualVC attempts to achieve this objective through streaming model architecture design and intra-model knowledge distillation along with hybrid predictive coding to compensate for the lack of future information. However, DualVC…

    Submitted 18 January, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: Accepted by ICASSP2024

  12. arXiv:2309.00559  [pdf, other]

    eess.SP cs.IT

    Signal Processing and Learning for Next Generation Multiple Access in 6G

    Authors: Wei Chen, Yuanwei Liu, Hamid Jafarkhani, Yonina C. Eldar, Peiying Zhu, Khaled B Letaief

    Abstract: Wireless communication systems to date primarily rely on the orthogonality of resources to facilitate the design and implementation, from user access to data transmission. Emerging applications and scenarios in the sixth generation (6G) wireless systems will require massive connectivity and transmission of a deluge of data, which calls for more flexibility in the design concept that goes beyond or…

    Submitted 9 September, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

  13. arXiv:2308.13789  [pdf]

    eess.SP

    Sensiverse: A dataset for ISAC study

    Authors: Jia** Luo, Baojian Zhou, Yang Yu, ** Zhang, Xiaohui Peng, Jianglei Ma, Peiying Zhu, Jianmin Lu, Wen Tong

    Abstract: In order to address the lack of applicable channel models for ISAC research and evaluation, we release Sensiverse, a dataset that can be used for ISAC research. In this paper, we present the method of generating Sensiverse, including the acquisition and formatting of the 3D scene models, the generation of the channel data and associations with Tx/Rx deployment. The file structure and usage of the…

    Submitted 26 August, 2023; originally announced August 2023.

  14. arXiv:2308.10428  [pdf, other]

    eess.AS cs.SD

    Multi-GradSpeech: Towards Diffusion-based Multi-Speaker Text-to-speech Using Consistent Diffusion Models

    Authors: Heyang Xue, Shuai Guo, Pengcheng Zhu, Mengxiao Bi

    Abstract: Despite imperfect score-matching causing drift in training and sampling distributions of diffusion models, recent advances in diffusion-based acoustic models have revolutionized data-sufficient single-speaker Text-to-Speech (TTS) approaches, with Grad-TTS being a prime example. However, the sampling drift problem leads to these approaches struggling in multi-speaker scenarios in practice due to mo…

    Submitted 31 August, 2023; v1 submitted 20 August, 2023; originally announced August 2023.

  15. arXiv:2305.16105  [pdf, ps, other]

    eess.SP

    Joint Uplink and Downlink Resource Allocation Towards Energy-efficient Transmission for URLLC

    Authors: Kang Li, Pengcheng Zhu, Yan Wang, Fu-Chun Zheng, Xiaohu You

    Abstract: Ultra-reliable and low-latency communications (URLLC) is firstly proposed in 5G networks, and expected to support applications with the most stringent quality-of-service (QoS). However, since the wireless channels vary dynamically, the transmit power for ensuring the QoS requirements of URLLC may be very high, which conflicts with the power limitation of a real system. To fulfill the successful UR…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: 16 pages, 11 figures

  16. arXiv:2305.15985  [pdf, other]

    cs.IT eess.SP

    Resource Allocation in Cell-Free MU-MIMO Multicarrier System with Finite and Infinite Blocklength

    Authors: Jiafei Fu, Pengcheng Zhu, Bo Ai, Jiangzhou Wang, Xiaohu You

    Abstract: The explosive growth of data results in more scarce spectrum resources. It is important to optimize the system performance under limited resources. In this paper, we investigate how to achieve weighted throughput (WTP) maximization for cell-free (CF) multiuser MIMO (MU-MIMO) multicarrier (MC) systems through resource allocation (RA), in the cases of finite blocklength (FBL) and infinite blocklengt…

    Submitted 25 May, 2023; originally announced May 2023.

  17. arXiv:2305.15935  [pdf, ps, other]

    cs.IT eess.SP

    Grouping Method for mmWave Massive MIMO System: Exploitation of Angular Multiplexing Gain

    Authors: Peng Jiang, Pengcheng Zhu, Jiamin Li, Dongming Wang

    Abstract: A future millimeter-wave (mmWave) massive multiple-input and multiple-output (MIMO) system may serve hundreds or thousands of users at the same time; thus, research on multiple access technology is particularly important. Moreover, due to the short-wavelength nature of a mmWave, large-scale arrays are easier to implement than microwaves, while their directivity and sparseness make the physical beam…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: 12 pages, 16 figures

  18. arXiv:2305.12425  [pdf, other]

    eess.AS cs.SD

    DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding

    Authors: Ziqian Ning, Yuepeng Jiang, Pengcheng Zhu, Jixun Yao, Shuai Wang, Lei Xie, Mengxiao Bi

    Abstract: Voice conversion is an increasingly popular technology, and the growing number of real-time applications requires models with streaming conversion capabilities. Unlike typical (non-streaming) voice conversion, which can leverage the entire utterance as full context, streaming voice conversion faces significant challenges due to the missing future information, resulting in degraded intelligibility,…

    Submitted 30 May, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

  19. arXiv:2305.08882  [pdf, other]

    eess.IV physics.med-ph physics.optics

    Model-driven CT reconstruction algorithm for nano-resolution X-ray phase contrast imaging

    Authors: Xuebao Cai, Yuhang Tan, Ting Su, Dong Liang, Hairong Zheng, **you Xu, Peiping Zhu, Yongshuai Ge

    Abstract: The low-density imaging performance of a zone plate based nano-resolution hard X-ray computed tomography (CT) system can be significantly improved by incorporating a grating-based Lau interferometer. Due to the diffraction, however, the acquired nano-resolution phase signal may suffer splitting problem, which impedes the direct reconstruction of phase contrast CT (nPCT) images. To overcome, a new…

    Submitted 13 October, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

  20. arXiv:2302.14536  [pdf, other]

    eess.SP

    On the Road to 6G: Visions, Requirements, Key Technologies and Testbeds

    Authors: Cheng-Xiang Wang, Xiaohu You, Xiqi Gao, Xiuming Zhu, Zixin Li, Chuan Zhang, Haiming Wang, Yongming Huang, Yunfei Chen, Harald Haas, John S. Thompson, Erik G. Larsson, Marco Di Renzo, Wen Tong, Peiying Zhu, Xuemin Shen, H. Vincent Poor, Lajos Hanzo

    Abstract: Fifth generation (5G) mobile communication systems have entered the stage of commercial development, providing users with new services and improved user experiences as well as offering a host of novel opportunities to various industries. However, 5G still faces many challenges. To address these challenges, international industrial, academic, and standards organizations have commenced research on s…

    Submitted 28 February, 2023; originally announced February 2023.

  21. arXiv:2302.08107  [pdf, other]

    cs.IT eess.SP

    Spectral Efficiency and Scalability Analysis for Multi-Level Cooperative Cell-Free Massive MIMO Systems

    Authors: Jiamin Li, Xiaoyu Sun, Pengcheng Zhu, Dongming Wang, Xiaohu You

    Abstract: This paper proposes a multi-level cooperative architecture to balance the spectral efficiency and scalability of cell-free massive multiple-input multiple-output (MIMO) systems. In the proposed architecture, spatial expansion units (SEUs) are introduced to avoid a large amount of computation at the access points (APs) and increase the degree of cooperation among APs. We first derive the closed-for…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: 5 pages, 3 figures

  22. arXiv:2302.05571  [pdf, other]

    cs.IT eess.SP

    Network-Assisted Full-Duplex Cell-Free mmWave Massive MIMO Systems with DAC Quantization and Fronthaul Compression

    Authors: Jiamin Li, Qingrui Fan, Yu Zhang, Pengcheng Zhu, Dongming Wang, Hao Wu, Xiaohu You

    Abstract: In this paper, we investigate network-assisted full-duplex (NAFD) cell-free millimeter-wave (mmWave) massive multiple-input multiple-output (MIMO) systems with digital-to-analog converter (DAC) quantization and fronthaul compression. We propose to maximize the weighted uplink and downlink sum rate by jointly optimizing the power allocation of both the transmitting remote antenna units (T-RAUs) and…

    Submitted 17 February, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: accepted by China Communications

  23. arXiv:2302.04456  [pdf, other]

    cs.SD cs.AI cs.CL cs.MM eess.AS

    ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models

    Authors: Pengfei Zhu, Chao Pang, Yekun Chai, Lei Li, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu

    Abstract: In recent years, the burgeoning interest in diffusion models has led to significant advances in image and speech generation. Nevertheless, the direct synthesis of music waveforms from unrestricted textual prompts remains a relatively underexplored domain. In response to this lacuna, this paper introduces a pioneering contribution in the form of a text-to-waveform music generation model, underpinne…

    Submitted 21 September, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted by AACL demo 2023

  24. arXiv:2301.08863  [pdf, other]

    cs.NI eess.SP

    HAPS for 6G Networks: Potential Use Cases, Open Challenges, and Possible Solutions

    Authors: Omid Abbasi, Animesh Yadav, Halim Yanikomeroglu, Ngoc Dung Dao, Gamini Senarath, Peiying Zhu

    Abstract: High altitude platform station (HAPS), which is deployed in the stratosphere at an altitude of 20-50 kilometres, has attracted much attention in recent years due to their large footprint, line-of-sight links, and fixed position relative to the Earth. Compared with existing network infrastructure, HAPS has a much larger coverage area than terrestrial base stations and is much closer than satellites…

    Submitted 11 April, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

  25. Optimization of the energy efficiency in Smart Internet of Vehicles assisted by MEC

    Authors: Jiafei Fu, Pengcheng Zhu, Jingyu Hua, Jiamin Li, Jiangang Wen

    Abstract: Smart Internet of Vehicles (IoV) as a promising application in Internet of Things (IoT) emerges with the development of the fifth generation mobile communication (5G). Nevertheless, the heterogeneous requirements of sufficient battery capacity, powerful computing ability and energy efficiency for electric vehicles face great challenges due to the explosive data growth in 5G and the sixth generatio…

    Submitted 13 January, 2023; originally announced January 2023.

    Comments: 17 pages, 9 figures, EURASIP J. Adv. Signal Process

    Journal ref: EURASIP J. Adv. Signal Process. 2022 (2022) 13

  26. Performance Analysis and Optimization of Network-Assisted Full-Duplex Systems under Low-Resolution ADCs

    Authors: Xiangning Song, Zhenhao Ji, Jiamin Li, Pengcheng Zhu, Dongming Wang, Xiaohu You

    Abstract: Network-assisted full-duplex (NAFD) distributed massive multiple input multiple output (M-MIMO) enables the in-band full-duplex with existing half-duplex devices at the network level, which exceptionally improves spectral efficiency. This paper analyzes the impact of low-resolution analog-to-digital converters (ADCs) on NAFD distributed M-MIMO and designs an efficient bit allocation algorithm for…

    Submitted 17 December, 2022; originally announced December 2022.

  27. arXiv:2211.04710  [pdf, other]

    eess.AS cs.SD

    Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features

    Authors: Ziqian Ning, Qicong Xie, Pengcheng Zhu, Zhichao Wang, Liumeng Xue, Jixun Yao, Lei Xie, Mengxiao Bi

    Abstract: Voice conversion for highly expressive speech is challenging. Current approaches struggle with the balancing between speaker similarity, intelligibility and expressiveness. To address this problem, we propose Expressive-VC, a novel end-to-end voice conversion framework that leverages advantages from both neural bottleneck feature (BNF) approach and information perturbation approach. Specifically,…

    Submitted 9 November, 2022; originally announced November 2022.

  28. arXiv:2211.03545  [pdf, other]

    eess.AS cs.CL cs.SD

    ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech

    Authors: Xiaoran Fan, Chao Pang, Tian Yuan, He Bai, Renjie Zheng, Pengfei Zhu, Shuohuan Wang, Junkun Chen, Zeyu Chen, Liang Huang, Yu Sun, Hua Wu

    Abstract: Speech representation learning has improved both speech understanding and speech synthesis tasks for single language. However, its ability in cross-lingual scenarios has not been explored. In this paper, we extend the pretraining method for cross-lingual multi-speaker speech synthesis tasks, including cross-lingual multi-speaker voice cloning and cross-lingual multi-speaker speech editing. We prop…

    Submitted 4 December, 2022; v1 submitted 7 November, 2022; originally announced November 2022.

  29. arXiv:2206.13722   

    eess.SP cs.IT

    Low Altitude 3-D Coverage Performance Analysis in Cell-Free Distributed Collaborative Massive MIMO Systems

    Authors: Jiamin Li, Qijun Pan, Pengcheng Zhu, Dongming Wang, Xiaohu You

    Abstract: To improve the poor performance of distributed operation and non-scalability of centralized operation in traditional cell-free massive MIMO, we propose a cell-free distributed collaborative (CFDC) massive multiple-input multiple-output (MIMO) system based on a novel two-layer model to take advantages of the distributed cloud-edge-end collaborative architecture in beyond 5G (B5G) internet of things…

    Submitted 28 March, 2023; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: The work is further studied and the content of the paper is updated. So, temporarily withdrawn for these reasons

  30. arXiv:2205.03883  [pdf]

    eess.IV cs.CV

    WKGM: Weight-K-space Generative Model for Parallel Imaging Reconstruction

    Authors: Zongjiang Tu, Die Liu, Xiaoqing Wang, Chen Jiang, Pengwen Zhu, Minghui Zhang, Shanshan Wang, Dong Liang, Qiegen Liu

    Abstract: Deep learning based parallel imaging (PI) has made great progresses in recent years to accelerate magnetic resonance imaging (MRI). Nevertheless, it still has some limitations, such as the robustness and flexibility of existing methods have great deficiency. In this work, we propose a method to explore the k-space domain learning via robust generative modeling for flexible calibration-less PI reco…

    Submitted 24 November, 2022; v1 submitted 8 May, 2022; originally announced May 2022.

    Comments: 11 pages, 12 figures

  31. arXiv:2204.08013  [pdf, other]

    physics.med-ph eess.IV

    One-step Method for Material Quantitation using In-line Tomography with Single Scanning

    Authors: Suyu Liao, Shiwo Deng, Yining Zhu, Huitao Zhang, Peiping Zhu, Kai Zhang, Xing Zhao

    Abstract: Objective: Quantitative technique based on In-line phase-contrast computed tomography with single scanning attracts more attention in application due to the flexibility of the implementation. However, the quantitative results usually suffer from artifacts and noise, since the phase retrieval and reconstruction are independent ("two-steps") without feedback from the original data. Our goal is to de…

    Submitted 17 April, 2022; originally announced April 2022.

    Journal ref: IEEE Transactions on Biomedical Engineering, 2022

  32. arXiv:2203.16408  [pdf, other]

    cs.SD eess.AS

    Learn2Sing 2.0: Diffusion and Mutual Information-Based Target Speaker SVS by Learning from Singing Teacher

    Authors: Heyang Xue, Xinsheng Wang, Yongmao Zhang, Lei Xie, Pengcheng Zhu, Mengxiao Bi

    Abstract: Building a high-quality singing corpus for a person who is not good at singing is non-trivial, thus making it challenging to create a singing voice synthesizer for this person. Learn2Sing is dedicated to synthesizing the singing voice of a speaker without his or her singing data by learning from data recorded by others, i.e., the singing teacher. Inspired by the fact that pitch is the key style fa…

    Submitted 26 May, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: Submitted to INTERSPEECH 2022

  33. arXiv:2203.05956  [pdf, other]

    eess.IV cs.CV cs.LG

    Label-efficient Hybrid-supervised Learning for Medical Image Segmentation

    Authors: Junwen Pan, Qi Bi, Yanzhan Yang, Pengfei Zhu, Cheng Bian

    Abstract: Due to the lack of expertise for medical image annotation, the investigation of label-efficient methodology for medical image segmentation becomes a heated topic. Recent progresses focus on the efficient utilization of weak annotations together with few strongly-annotated labels so as to achieve comparable segmentation performance in many unprofessional scenarios. However, these approaches only co…

    Submitted 10 March, 2022; originally announced March 2022.

    Comments: Accepted to AAAI 2022

  34. arXiv:2201.07429  [pdf, other]

    cs.SD cs.DB eess.AS

    Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis

    Authors: Yu Wang, Xinsheng Wang, Pengcheng Zhu, Jie Wu, Hanzhao Li, Heyang Xue, Yongmao Zhang, Lei Xie, Mengxiao Bi

    Abstract: This paper introduces Opencpop, a publicly available high-quality Mandarin singing corpus designed for singing voice synthesis (SVS). The corpus consists of 100 popular Mandarin songs performed by a female professional singer. Audio files are recorded with studio quality at a sampling rate of 44,100 Hz and the corresponding lyrics and musical scores are provided. All singing recordings have been p…

    Submitted 19 January, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

    Comments: will be submitted to Interspeech 2022

  35. arXiv:2111.12277  [pdf, other]

    eess.AS cs.SD

    One-shot Voice Conversion For Style Transfer Based On Speaker Adaptation

    Authors: Zhichao Wang, Qicong Xie, Tao Li, Hongqiang Du, Lei Xie, Pengcheng Zhu, Mengxiao Bi

    Abstract: One-shot style transfer is a challenging task, since training on one utterance makes model extremely easy to over-fit to training data and causes low speaker similarity and lack of expressiveness. In this paper, we build on the recognition-synthesis framework and propose a one-shot voice conversion approach for style transfer based on speaker adaptation. First, a speaker normalization module is ad…

    Submitted 21 February, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: Accepted by ICASSP 2022

  36. arXiv:2110.08813  [pdf, other]

    eess.AS cs.SD

    VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis

    Authors: Yongmao Zhang, Jian Cong, Heyang Xue, Lei Xie, Pengcheng Zhu, Mengxiao Bi

    Abstract: In this paper, we propose VISinger, a complete end-to-end high-quality singing voice synthesis (SVS) system that directly generates audio waveform from lyrics and musical score. Our approach is inspired by VITS, which adopts VAE-based posterior encoder augmented with normalizing flow-based prior encoder and adversarial decoder to realize complete end-to-end speech generation. VISinger follows the…

    Submitted 24 February, 2022; v1 submitted 17 October, 2021; originally announced October 2021.

    Comments: 5 pages, ICASSP 2022

  37. arXiv:2110.00896  [pdf, other]

    eess.IV cs.CV

    Disarranged Zone Learning (DZL): An unsupervised and dynamic automatic stenosis recognition methodology based on coronary angiography

    Authors: Yanan Dai, Pengxiong Zhu, Bangde Xue, Yun Ling, Xibao Shi, Liang Geng, Qi Zhang, Jun Liu

    Abstract: We proposed a novel unsupervised methodology named Disarranged Zone Learning (DZL) to automatically recognize stenosis in coronary angiography. The methodology firstly disarranges the frames in a video, secondly it generates an effective zone and lastly trains an encoder-decoder GRU model to learn the capability to recover disarranged frames. The breakthrough of our study is to discover and valida…

    Submitted 2 October, 2021; originally announced October 2021.

  38. Cell Multi-Bernoulli (Cell-MB) Sensor Control for Multi-object Search-While-Tracking (SWT)

    Authors: Keith A. LeGrand, Pingping Zhu, Silvia Ferrari

    Abstract: Information-driven control can be used to develop intelligent sensors that can optimize their measurement value based on environmental feedback. In object tracking applications, sensor actions are chosen based on the expected reduction in uncertainty also known as information gain. Random finite set (RFS) theory provides a formalism for quantifying and estimating information gain in multi-object t…

    Submitted 11 July, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 45, No. 6, 01 June 2023, 7195 - 7207

  39. Spectral Efficiency Analysis of Cell-free Distributed Massive MIMO Systems with Imperfect Covariance Matrix

    Authors: Feng Ye, Jiamin Li, Pengcheng Zhu, Dongming Wang, Xiaohu You

    Abstract: In this paper, the impacts of imperfect channel covariance matrix on the spectral efficiency (SE) of cell-free distributed massive multiple-input multiple-output (MIMO) systems are analyzed. We propose to estimate the channel covariance matrix by alternately using the assigned pilots and their phase-shifted pilots in different coherent blocks, which improves the accuracy of channel estimation with…

    Submitted 24 August, 2021; originally announced August 2021.

  40. arXiv:2104.02447  [pdf, ps, other]

    cs.IT eess.SP

    On Performance Loss of DOA Measurement Using Massive MIMO Receiver with Mixed-ADCs

    Authors: Baihua Shi, Lingling Zhu, Wenlong Cai, Nuo Chen, Tong Shen, Pengcheng Zhu, Feng Shu, Jiangzhou Wang

    Abstract: High hardware cost and high power consumption of massive multiple-input and multiple output (MIMO) are two challenges for the future wireless communications including beyond fifth generation (B5G) and sixth generation (6G). Adopting the low-resolution analog-to-digital converter (ADC) is viewed as a promising solution. Additionally, the direction of arrival (DOA) estimation is an indispensable tec…

    Submitted 18 January, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: 5 pages, 4 pages

  41. arXiv:2101.10703  [pdf, ps, other]

    cs.IT eess.SP

    Privacy-preserving Channel Estimation in Cell-free Hybrid Massive MIMO Systems

    Authors: Jun Xu, Xiaodong Wang, Pengcheng Zhu, Xiaohu You

    Abstract: We consider a cell-free hybrid massive multiple-input multiple-output (MIMO) system with $K$ users and $M$ access points (APs), each with $N_a$ antennas and $N_r< N_a$ radio frequency (RF) chains. When $K\ll M{N_a}$, efficient uplink channel estimation and data detection with reduced number of pilots can be performed based on low-rank matrix completion. However, such a scheme requires the central…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: 30 pages, 10 figures

  42. arXiv:2009.02771  [pdf, other]

    cs.NI eess.SY

    A Vision of Self-Evolving Network Management for Future Intelligent Vertical HetNet

    Authors: Tasneem Darwish, Gunes Karabulut Kurt, Halim Yanikomeroglu, Gamini Senarath, Peiying Zhu

    Abstract: Future integrated terrestrial-aerial-satellite networks will have to exhibit some unprecedented characteristics for the provision of both communications and computation services, and security for a tremendous number of devices with very broad and demanding requirements across multiple networks, operators, and ecosystems. Although 3GPP introduced the concept of self-organization networks (SONs) in…

    Submitted 9 March, 2021; v1 submitted 6 September, 2020; originally announced September 2020.

  43. arXiv:2007.08747  [pdf, other]

    eess.SP

    High Altitude Platform Station based Super Macro Base Station (HAPS-SMBS) Constellations

    Authors: Md Sahabul Alam, Gunes Karabulut Kurt, Halim Yanikomeroglu, Peiying Zhu, Ngoc Dũng Đào

    Abstract: High altitude platform station (HAPS) systems have recently attracted renewed attention. While terrestrial and satellite technologies are well-established for providing connectivity services, they face certain shortcomings and challenges, which could be addressed by complementing them with HAPS systems. In this paper, we envision a HAPS as a super macro base station, which we refer to as HAPS-SMBS…

    Submitted 22 September, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

  44. arXiv:2006.09328  [pdf, ps, other]

    eess.SP eess.SY

    Aerial Platforms with Reconfigurable Smart Surfaces for 5G and Beyond

    Authors: Safwan Alfattani, Wael Jaafar, Yassine Hmamouche, Halim Yanikomeroglu, Abbas Yongaçoglu, Ngọc Dũng Đào, Peiying Zhu

    Abstract: Aerial platforms are expected to deliver enhanced and seamless connectivity in the fifth generation (5G) wireless networks and beyond (B5G). This is generally achievable by supporting advanced onboard communication features embedded in heavy and energy-intensive equipment. Alternatively, reconfigurable smart surfaces (RSS), which smartly exploit/recycle signal reflections in the environment, are i…

    Submitted 4 November, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: To appear in IEEE Communications Magazine

  45. arXiv:2005.11684  [pdf]

    eess.SP

    Deep Learning-based Modulation Detection for NOMA Systems

    Authors: Wenwu Xie, Jian Xiao, Jinxia Yang, Xin Peng, Chao Yu, Peng Zhu

    Abstract: Since the signal with strong power should be demodulated first for successive interference cancellation (SIC) demodulation in non-orthogonal multiple access (NOMA) systems, the base station (BS) should inform the near user terminal (UT), which has allocated higher power, of the modulation mode of the far user terminal. To avoid unnecessary signaling overhead in this process, a blind detection algorith…

    Submitted 16 October, 2020; v1 submitted 24 May, 2020; originally announced May 2020.

  46. arXiv:2005.10406  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Training Keyword Spotting Models on Non-IID Data with Federated Learning

    Authors: Andrew Hard, Kurt Partridge, Cameron Nguyen, Niranjan Subrahmanya, Aishanee Shah, Pai Zhu, Ignacio Lopez Moreno, Rajiv Mathews

    Abstract: We demonstrate that a production-quality keyword-spotting model can be trained on-device using federated learning and achieve comparable false accept and false reject rates to a centrally-trained model. To overcome the algorithmic constraints associated with fitting on-device data (which are inherently non-independent and identically distributed), we conduct thorough empirical studies of optimizat…

    Submitted 4 June, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

    Comments: Submitted to Interspeech 2020

  47. arXiv:2003.02437  [pdf, other]

    cs.CV cs.LG eess.IV

    Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning

    Authors: Yiming Sun, Bing Cao, Pengfei Zhu, Qinghua Hu

    Abstract: Drone-based vehicle detection aims at finding the vehicle locations and categories in an aerial image. It empowers smart city traffic management and disaster rescue. Researchers have put a great deal of effort into this area and achieved considerable progress. Nevertheless, it is still a challenge when the objects are hard to distinguish, especially in low light conditions. To tackle this problem, we con…

    Submitted 14 October, 2021; v1 submitted 5 March, 2020; originally announced March 2020.

  48. SEAN: Image Synthesis with Semantic Region-Adaptive Normalization

    Authors: Peihao Zhu, Rameen Abdal, Yipeng Qin, Peter Wonka

    Abstract: We propose semantic region-adaptive normalization (SEAN), a simple but effective building block for Generative Adversarial Networks conditioned on segmentation masks that describe the semantic regions in the desired output image. Using SEAN normalization, we can build a network architecture that can control the style of each semantic region individually, e.g., we can specify one style reference im…

    Submitted 24 May, 2020; v1 submitted 28 November, 2019; originally announced November 2019.

    Comments: Accepted as a CVPR 2020 oral paper. The interactive demo is available at https://youtu.be/0Vbj9xFgoUw

  49. arXiv:1909.01124  [pdf, ps, other]

    eess.SP

    Power Minimization for Wireless Backhaul Based Ultra-Dense Cache-enabled C-RAN

    Authors: Jun Xu, Pengcheng Zhu, Jiamin Li, Xiaohu You

    Abstract: This correspondence paper investigates joint design of small base station (SBS) clustering, multicast beamforming for access and backhaul links, as well as frequency allocation in backhaul transmission to minimize the total power consumption for wireless backhaul based ultra-dense cache-enabled cloud radio access network (C-RAN). To solve this nontrivial problem, we develop a low-complexity algori…

    Submitted 3 September, 2019; originally announced September 2019.

  50. Deep Learning Based Pilot Design for Multi-user Distributed Massive MIMO Systems

    Authors: Jun Xu, Pengcheng Zhu, Jiamin Li, Xiaohu You

    Abstract: This letter proposes a deep learning based pilot design scheme to minimize the sum mean square error (MSE) of channel estimation for multi-user distributed massive multiple-input multiple-output (MIMO) systems. The pilot signal of each user is expressed as a weighted superposition of orthonormal pilot sequence basis, where the power assigned to each pilot sequence is the corresponding weight. A mu…

    Submitted 18 March, 2019; originally announced March 2019.

    Comments: 4 Pages, 4 figures, accepted by IEEE Wireless Communications Letters