Showing 1–6 of 6 results for author: Guizzo, E

  1. arXiv:2204.02385  [pdf, other]

    eess.AS cs.LG cs.SD

    Learning Speech Emotion Representations in the Quaternion Domain

    Authors: Eric Guizzo, Tillman Weyde, Simone Scardapane, Danilo Comminiello

    Abstract: The modeling of human emotion expression in speech signals is an important, yet challenging task. The high resource demand of speech emotion recognition models, combined with the general scarcity of emotion-labelled data, are obstacles to the development and application of effective solutions in this field. In this paper, we present an approach to jointly circumvent these difficulties. Our meth…

    Submitted 3 March, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Accepted for Publication in IEEE/ACM Transactions on Audio, Speech and Language Processing

  2. arXiv:2202.10372  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment

    Authors: Eric Guizzo, Christian Marinoni, Marco Pennese, Xinlei Ren, Xiguang Zheng, Chen Zhang, Bruno Masiero, Aurelio Uncini, Danilo Comminiello

    Abstract: The L3DAS22 Challenge is aimed at encouraging the development of machine learning strategies for 3D speech enhancement and 3D sound localization and detection in office-like environments. This challenge improves and extends the tasks of the L3DAS21 edition. We generated a new dataset, which maintains the same general characteristics of L3DAS21 datasets, but with an extended number of data points a…

    Submitted 21 February, 2022; originally announced February 2022.

    Comments: Accepted to 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022). arXiv admin note: substantial text overlap with arXiv:2104.05499

    Journal ref: 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 9186-9190

  3. arXiv:2104.05499  [pdf, ps, other]

    eess.AS cs.LG cs.SD eess.SP

    L3DAS21 Challenge: Machine Learning for 3D Audio Signal Processing

    Authors: Eric Guizzo, Riccardo F. Gramaccioni, Saeid Jamili, Christian Marinoni, Edoardo Massaro, Claudia Medaglia, Giuseppe Nachira, Leonardo Nucciarelli, Ludovica Paglialunga, Marco Pennese, Sveva Pepe, Enrico Rocchi, Aurelio Uncini, Danilo Comminiello

    Abstract: The L3DAS21 Challenge is aimed at encouraging and fostering collaborative research on machine learning for 3D audio signal processing, with particular focus on 3D speech enhancement (SE) and 3D sound localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates the data usage and results s…

    Submitted 29 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: Documentation paper for the L3DAS21 Challenge for IEEE MLSP 2021. Further information on www.l3das.com/mlsp2021

    Journal ref: 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP), 2021, pp. 1-6

  4. arXiv:2006.06494  [pdf, other]

    cs.LG cs.NE cs.SD eess.AS stat.ML

    Anti-Transfer Learning for Task Invariance in Convolutional Neural Networks for Speech Processing

    Authors: Eric Guizzo, Tillman Weyde, Giacomo Tarroni

    Abstract: We introduce the novel concept of anti-transfer learning for speech processing with convolutional neural networks. While transfer learning assumes that the learning process for a target task will benefit from re-using representations learned for another task, anti-transfer avoids the learning of representations that have been learned for an orthogonal task, i.e., one that is not relevant and poten…

    Submitted 13 January, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: Neural Networks Journal

  5. arXiv:2003.03375  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Multi-Time-Scale Convolution for Emotion Recognition from Speech Audio Signals

    Authors: Eric Guizzo, Tillman Weyde, Jack Barnett Leveson

    Abstract: Robustness against temporal variations is important for emotion recognition from speech audio, since emotion is expressed through complex spectral patterns that can exhibit significant local dilation and compression on the time axis depending on speaker and context. To address this and potentially other tasks, we introduce the multi-time-scale (MTS) method to create flexibility towards temporal v…

    Submitted 6 March, 2020; originally announced March 2020.

  6. arXiv:2003.03160  [pdf]

    cs.SD eess.AS

    A Neural Network Based Framework for Archetypical Sound Synthesis

    Authors: Eric Guizzo, Alberto Novello

    Abstract: This paper describes a preliminary approach to algorithmically reproduce the archetypical structure adopted by humans to classify sounds. In particular, we propose an approach to predict the human-perceived chaos/order level in a sound and synthesize new timbres that present the desired amount of this feature. We adopted a neural-network-based method in order to exploit its inner predisposition t…

    Submitted 10 March, 2020; v1 submitted 6 March, 2020; originally announced March 2020.