Showing 1–14 of 14 results for author: Zong, Y

Searching in archive eess.
  1. arXiv:2404.12695  [pdf, other]

    eess.SY

    Electrification of Clay Calcination: A First Look into Dynamic Modeling and Energy Management for Integration with Sustainable Power Grids

    Authors: Bruno Laurini, Nicola Cantisani, Wilson R. Leal da Silva, Yi Zong, John Bagterp Jørgensen

    Abstract: This article explores the electrification in clay calcination, proposing a dynamic model and energy management strategy for the integration of electrified calcination plants into sustainable power grids. A theoretical dynamic modeling of the electrified calcination process is introduced, aiming at outlining temperature profiles and energy usage - thus exploring the feasibility of electrification.…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Submitted to International Conference on Calcinated Clays for Sustainable Concrete 2024

  2. arXiv:2403.01494  [pdf, other]

    eess.AS cs.SD eess.SP

    PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion

    Authors: Tianhua Qi, Wenming Zheng, Cheng Lu, Yuan Zong, Hailun Lian

    Abstract: In this paper, we propose Prosody-aware VITS (PAVITS) for emotional voice conversion (EVC), aiming to achieve two major objectives of EVC: high content naturalness and high emotional naturalness, which are crucial for meeting the demands of human perception. To improve the content naturalness of converted audio, we have developed an end-to-end EVC architecture inspired by the high audio quality of…

    Submitted 3 March, 2024; originally announced March 2024.

    Comments: Accepted to ICASSP2024

  3. arXiv:2401.12925  [pdf, other]

    cs.SD eess.AS

    Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition

    Authors: Yan Zhao, Jincen Wang, Cheng Lu, Sunan Li, Björn Schuller, Yuan Zong, Wenming Zheng

    Abstract: Cross-corpus speech emotion recognition (SER) aims to transfer emotional knowledge from a labeled source corpus to an unlabeled corpus. However, prior methods require access to source data during adaptation, which is unattainable in real-life scenarios due to data privacy protection concerns. This paper tackles a more practical task, namely source-free cross-corpus SER, where a pre-trained source…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  4. arXiv:2401.09752  [pdf, other]

    cs.SD cs.LG eess.AS

    Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation

    Authors: Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Björn Schuller, Wenming Zheng

    Abstract: In speaker-independent speech emotion recognition, the training and testing samples are collected from diverse speakers, leading to a multi-domain shift challenge across the feature distributions of data from different speakers. Consequently, when the trained model is confronted with data from new speakers, its performance tends to degrade. To address the issue, we propose a Dynamic Joint Distribu…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  5. arXiv:2312.06466  [pdf, other]

    cs.SD eess.AS

    Towards Domain-Specific Cross-Corpus Speech Emotion Recognition Approach

    Authors: Yan Zhao, Yuan Zong, Hailun Lian, Cheng Lu, Jingang Shi, Wenming Zheng

    Abstract: Cross-corpus speech emotion recognition (SER) poses a challenge due to feature distribution mismatch, potentially degrading the performance of established SER methods. In this paper, we tackle this challenge by proposing a novel transfer subspace learning method called acoustic knowledge-guided transfer linear regression (AKTLR). Unlike existing approaches, which often overlook domain-specific know…

    Submitted 11 December, 2023; originally announced December 2023.

  6. arXiv:2310.03992  [pdf, other]

    cs.SD eess.AS

    Layer-Adapted Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

    Authors: Yan Zhao, Yuan Zong, Jincen Wang, Hailun Lian, Cheng Lu, Li Zhao, Wenming Zheng

    Abstract: In this paper, we propose a new unsupervised domain adaptation (DA) method called layer-adapted implicit distribution alignment networks (LIDAN) to address the challenge of cross-corpus speech emotion recognition (SER). LIDAN extends our previous ICASSP work, deep implicit distribution alignment networks (DIDAN), whose key contribution lies in the introduction of a novel regularization term called…

    Submitted 5 October, 2023; originally announced October 2023.

  7. arXiv:2309.14761  [pdf, other]

    eess.AS cs.SD

    Optimization Techniques for a Physical Model of Human Vocalisation

    Authors: Mateo Cámara, Zhiyuan Xu, Yisu Zong, José Luis Blanco, Joshua D. Reiss

    Abstract: We present a non-supervised approach to optimize and evaluate the synthesis of non-speech audio effects from a speech production model. We use the Pink Trombone synthesizer as a case study of a simplified production model of the vocal tract to target non-speech human audio signals --yawnings. We selected and optimized the control parameters of the synthesizer to minimize the difference between rea…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Accepted to DAFx 2023

  8. arXiv:2308.14568  [pdf, other]

    cs.SD eess.AS

    Time-Frequency Transformer: A Novel Time Frequency Joint Learning Method for Speech Emotion Recognition

    Authors: Yong Wang, Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Sunan Li

    Abstract: In this paper, we propose a novel time-frequency joint learning method for speech emotion recognition, called Time-Frequency Transformer. Its advantage is that the Time-Frequency Transformer can excavate global emotion patterns in the time-frequency domain of speech signal while modeling the local emotional correlations in the time domain and frequency domain respectively. For the purpose, we firs…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted by International Conference on Neural Information Processing (ICONIP2023)

  9. arXiv:2302.08921  [pdf, other]

    cs.SD cs.CL eess.AS

    Deep Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

    Authors: Yan Zhao, Jincen Wang, Yuan Zong, Wenming Zheng, Hailun Lian, Li Zhao

    Abstract: In this paper, we propose a novel deep transfer learning method called deep implicit distribution alignment networks (DIDAN) to deal with cross-corpus speech emotion recognition (SER) problem, in which the labeled training (source) and unlabeled testing (target) speech signals come from different corpora. Specifically, DIDAN first adopts a simple deep regression network consisting of a set of conv…

    Submitted 17 February, 2023; originally announced February 2023.

  10. arXiv:2210.12430  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Speech Emotion Recognition via an Attentive Time-Frequency Neural Network

    Authors: Cheng Lu, Wenming Zheng, Hailun Lian, Yuan Zong, Chuangao Tang, Sunan Li, Yan Zhao

    Abstract: Spectrogram is commonly used as the input feature of deep neural networks to learn the high(er)-level time-frequency pattern of speech signal for speech emotion recognition (SER). Generally, different emotions correspond to specific energy activations both within frequency bands and time frames on spectrogram, which indicates the frequency and time domains are both essential to r…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: This paper has been accepted as a regular paper on IEEE Transactions on Computational Social Systems

  11. arXiv:2210.01725  [pdf, other]

    cs.LG cs.AI eess.IV

    MEDFAIR: Benchmarking Fairness for Medical Imaging

    Authors: Yongshuo Zong, Yongxin Yang, Timothy Hospedales

    Abstract: A multitude of work has shown that machine learning-based medical diagnosis systems can be biased against certain subgroups of people. This has motivated a growing number of bias mitigation algorithms that aim to address fairness issues in machine learning. However, it is difficult to compare their effectiveness in medical imaging for two reasons. First, there is little consensus on the criteria t…

    Submitted 17 February, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: Accepted to ICLR 2023

  12. arXiv:2208.04945  [pdf, other]

    eess.IV

    Multiscale Autoencoder with Structural-Functional Attention Network for Alzheimer's Disease Prediction

    Authors: Yongcheng Zong, Changhong Jing, Qiankun Zuo

    Abstract: The application of machine learning algorithms to the diagnosis and analysis of Alzheimer's disease (AD) from multimodal neuroimaging data is a current research hotspot. It remains a formidable challenge to learn brain region information and discover disease mechanisms from various magnetic resonance images (MRI). In this paper, we propose a simple but highly efficient end-to-end model, a multisca…

    Submitted 8 August, 2022; originally announced August 2022.

  13. Modelling and Synchronisation of Delayed Packet-Coupled Oscillators in Industrial Wireless Sensor Networks

    Authors: Yan Zong, Xuewu Dai, Pep Canyelles-Pericas, Krishna Busawon, Richard Binns, Zhiwei Gao

    Abstract: In this paper, a Packet-Coupled Oscillators (PkCOs) synchronisation protocol is proposed for time-sensitive Wireless Sensor Networks (WSNs) based on Pulse-Coupled Oscillators (PCO) in mathematical biology. The effects of delays on synchronisation performance are studied through mathematical modelling and analysis of packet exchange and processing delays. The delay compensation strategy (i.e., feed…

    Submitted 18 April, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

  14. arXiv:1906.01704  [pdf, other]

    q-bio.NC eess.SP

    A Novel Bi-hemispheric Discrepancy Model for EEG Emotion Recognition

    Authors: Yang Li, Wenming Zheng, Lei Wang, Yuan Zong, Lei Qi, Zhen Cui, Tong Zhang, Tengfei Song

    Abstract: The neuroscience study has revealed the discrepancy of emotion expression between left and right hemispheres of human brain. Inspired by this study, in this paper, we propose a novel bi-hemispheric discrepancy model (BiHDM) to learn the asymmetric differences between two hemispheres for electroencephalograph (EEG) emotion recognition. Concretely, we first employ four directed recurrent neural netw…

    Submitted 10 May, 2019; originally announced June 2019.