Skip to main content

Showing 1–8 of 8 results for author: Shang, S

Searching in archive eess. Search in all archives.
.
  1. arXiv:2310.10992  [pdf, other

    cs.SD eess.AS

    A High Fidelity and Low Complexity Neural Audio Coding

    Authors: Wenzhe Liu, Wei Xiao, Meng Wang, Shan Yang, Yupeng Shi, Yuyong Kang, Dan Su, Shidong Shang, Dong Yu

    Abstract: Audio coding is an essential module in the real-time communication system. Neural audio codecs can compress audio samples with a low bitrate due to the strong modeling and generative capabilities of deep neural networks. To address the poor high-frequency expression and high computational cost and storage consumption, we proposed an integrated framework that utilizes a neural network to model wide… ▽ More

    Submitted 17 October, 2023; originally announced October 2023.

  2. arXiv:2305.05599  [pdf, other

    cs.SD cs.HC eess.AS

    Inter-SubNet: Speech Enhancement with Subband Interaction

    Authors: Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Zhiyong Wu, Yannan Wang, Shidong Shang, Helen Meng

    Abstract: Subband-based approaches process subbands in parallel through the model with shared parameters to learn the commonality of local spectrums for noise reduction. In this way, they have achieved remarkable results with fewer parameters. However, in some complex environments, the lack of global spectral information has a negative impact on the performance of these subband-based approaches. To this end… ▽ More

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted by ICASSP 2023

  3. arXiv:2303.07704  [pdf, other

    eess.AS cs.SD

    TEA-PSE 3.0: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System For ICASSP 2023 DNS Challenge

    Authors: Yukai Ju, Jun Chen, Shimin Zhang, Shulin He, Wei Rao, Weixin Zhu, Yannan Wang, Tao Yu, Shidong Shang

    Abstract: This paper introduces the Unbeatable Team's submission to the ICASSP 2023 Deep Noise Suppression (DNS) Challenge. We expand our previous work, TEA-PSE, to its upgraded version -- TEA-PSE 3.0. Specifically, TEA-PSE 3.0 incorporates a residual LSTM after squeezed temporal convolution network (S-TCN) to enhance sequence modeling capabilities. Additionally, the local-global representation (LGR) struct… ▽ More

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  4. arXiv:2211.05432  [pdf, other

    cs.SD eess.AS

    Speech Enhancement with Fullband-Subband Cross-Attention Network

    Authors: Jun Chen, Wei Rao, Zilin Wang, Zhiyong Wu, Yannan Wang, Tao Yu, Shidong Shang, Helen Meng

    Abstract: FullSubNet has shown its promising performance on speech enhancement by utilizing both fullband and subband information. However, the relationship between fullband and subband in FullSubNet is achieved by simply concatenating the output of fullband model and subband units. It only supplements the subband units with a small quantity of global information and has not considered the interaction betwe… ▽ More

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: Accepted by InterSpeech 2022. arXiv admin note: text overlap with arXiv:2203.12188

  5. arXiv:2210.15853  [pdf, other

    cs.SD eess.AS

    Speech Enhancement with Intelligent Neural Homomorphic Synthesis

    Authors: Shulin He, Wei Rao, **jiang Liu, Jun Chen, Yukai Ju, Xueliang Zhang, Yannan Wang, Shidong Shang

    Abstract: Most neural network speech enhancement models ignore speech production mathematical models by directly map** Fourier transform spectrums or waveforms. In this work, we propose a neural source filter network for speech enhancement. Specifically, we use homomorphic signal processing and cepstral analysis to obtain noisy speech's excitation and vocal tract. Unlike traditional signal processing, we… ▽ More

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  6. arXiv:2203.16032  [pdf, other

    cs.SD eess.AS

    ConferencingSpeech 2022 Challenge: Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge for Online Conferencing Applications

    Authors: Gaoxiong Yi, Wei Xiao, Yiming Xiao, Babak Naderi, Sebastian Möller, Wafaa Wardah, Gabriel Mittag, Ross Cutler, Zhuohuang Zhang, Donald S. Williamson, Fei Chen, Fuzheng Yang, Shidong Shang

    Abstract: With the advances in speech communication systems such as online conferencing applications, we can seamlessly work with people regardless of where they are. However, during online meetings, speech quality can be significantly affected by background noise, reverberation, packet loss, network jitter, etc. Because of its nature, speech quality is traditionally assessed in subjective tests in laborato… ▽ More

    Submitted 31 March, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

  7. arXiv:2104.00960  [pdf, other

    eess.AS cs.SD

    INTERSPEECH 2021 ConferencingSpeech Challenge: Towards Far-field Multi-Channel Speech Enhancement for Video Conferencing

    Authors: Wei Rao, Yihui Fu, Yanxin Hu, Xin Xu, Yvkai Jv, Jiangyu Han, Zhongjie Jiang, Lei Xie, Yannan Wang, Shinji Watanabe, Zheng-Hua Tan, Hui Bu, Tao Yu, Shidong Shang

    Abstract: The ConferencingSpeech 2021 challenge is proposed to stimulate research on far-field multi-channel speech enhancement for video conferencing. The challenge consists of two separate tasks: 1) Task 1 is multi-channel speech enhancement with single microphone array and focusing on practical application with real-time requirement and 2) Task 2 is multi-channel speech enhancement with multiple distribu… ▽ More

    Submitted 2 April, 2021; originally announced April 2021.

    Comments: 5 pages, submitted to INTERSPEECH 2021

  8. An Upper Bound on the Convergence Time for Quantized Consensus of Arbitrary Static Graphs

    Authors: Shang Shang, Paul Cuff, Pan Hui, Sanjeev Kulkarni

    Abstract: We analyze a class of distributed quantized consensus algorithms for arbitrary static networks. In the initial setting, each node in the network has an integer value. Nodes exchange their current estimate of the mean value in the network, and then update their estimation by communicating with their neighbors in a limited capacity channel in an asynchronous clock setting. Eventually, all nodes reac… ▽ More

    Submitted 24 September, 2014; originally announced September 2014.

    Comments: to appear in IEEE Trans. on Automatic Control, January, 2015. arXiv admin note: substantial text overlap with arXiv:1208.0788

    Journal ref: IEEE Trans. on Automatic Control, 60(4):1127-32, April, 2015