
Showing 1–17 of 17 results for author: Antonacci, F

Searching in archive eess.
  1. arXiv:2407.04333  [pdf, ps, other]

    cs.SD eess.AS

    PAGURI: a user experience study of creative interaction with text-to-music models

    Authors: Francesca Ronchini, Luca Comanducci, Gabriele Perego, Fabio Antonacci

    Abstract: In recent years, text-to-music models have been the biggest breakthrough in automatic music generation. While they are unquestionably a showcase of technological progress, it is not clear yet how they can be realistically integrated into the artistic practice of musicians and music practitioners. This paper aims to address this question via Prompt Audio Generation User Research Investigation (PAGU…

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2404.03436  [pdf, other]

    eess.AS cs.SD

    Interpreting End-to-End Deep Learning Models for Speech Source Localization Using Layer-wise Relevance Propagation

    Authors: Luca Comanducci, Fabio Antonacci, Augusto Sarti

    Abstract: Deep learning models are widely applied in the signal processing community, yet their inner working procedure is often treated as a black box. In this paper, we investigate the use of eXplainable Artificial Intelligence (XAI) techniques to learning-based end-to-end speech source localization models. We consider the Layer-wise Relevance Propagation (LRP) technique, which aims to determine which par…

    Submitted 26 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  3. arXiv:2403.17864  [pdf, other]

    eess.AS cs.SD eess.SP

    Synthetic training set generation using text-to-audio models for environmental sound classification

    Authors: Francesca Ronchini, Luca Comanducci, Fabio Antonacci

    Abstract: In recent years, text-to-audio models have revolutionized the field of automatic audio generation. This paper investigates their application in generating synthetic datasets for training data-driven models. Specifically, this study analyzes the performance of two environmental sound classification systems trained with data generated from text-to-audio models. We considered three scenarios: a) augm…

    Submitted 6 July, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  4. arXiv:2403.09524  [pdf, other]

    eess.AS

    Physics-Informed Neural Network for Volumetric Sound field Reconstruction of Speech Signals

    Authors: Marco Olivieri, Xenofon Karakonstantis, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti, Efren Fernandez-Grande

    Abstract: Recent developments in acoustic signal processing have seen the integration of deep learning methodologies, alongside the continued prominence of classical wave expansion-based approaches, particularly in sound field reconstruction. Physics-Informed Neural Networks (PINNs) have emerged as a novel framework, bridging the gap between data-driven and model-based techniques for addressing physical phe…

    Submitted 23 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  5. arXiv:2402.13896  [pdf, other]

    eess.AS eess.SP

    HOMULA-RIR: A Room Impulse Response Dataset for Teleconferencing and Spatial Audio Applications Acquired Through Higher-Order Microphones and Uniform Linear Microphone Arrays

    Authors: Federico Miotello, Paolo Ostan, Mirco Pezzoli, Luca Comanducci, Alberto Bernardini, Fabio Antonacci, Augusto Sarti

    Abstract: In this paper, we present HOMULA-RIR, a dataset of room impulse responses (RIRs) acquired using both higher-order microphones (HOMs) and a uniform linear array (ULA), in order to model a remote attendance teleconferencing scenario. Specifically, measurements were performed in a seminar room, where a 64-microphone ULA was used as a multichannel audio acquisition system in the proximity of the speak…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted for publication at ICASSP 2024 - HSCMA Workshop

  6. arXiv:2402.04866  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Room Transfer Function Reconstruction Using Complex-valued Neural Networks and Irregularly Distributed Microphones

    Authors: Francesca Ronchini, Luca Comanducci, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti

    Abstract: Reconstructing the room transfer functions needed to calculate the complex sound field in a room has several important real-world applications. However, an unpractical number of microphones is often required. Recently, in addition to classical signal processing methods, deep learning techniques have been applied to reconstruct the room transfer function starting from a very limited set of measurem…

    Submitted 11 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted at EUSIPCO 2024

  7. arXiv:2312.08821  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Reconstruction of Sound Field through Diffusion Models

    Authors: Federico Miotello, Luca Comanducci, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti

    Abstract: Reconstructing the sound field in a room is an important task for several applications, such as sound control and augmented (AR) or virtual reality (VR). In this paper, we propose a data-driven generative model for reconstructing the magnitude of acoustic fields in rooms with a focus on the modal frequency range. We introduce, for the first time, the use of a conditional Denoising Diffusion Probab…

    Submitted 21 February, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted for publication at ICASSP 2024

  8. arXiv:2307.04586  [pdf, other]

    eess.AS

    Timbre transfer using image-to-image denoising diffusion implicit models

    Authors: Luca Comanducci, Fabio Antonacci, Augusto Sarti

    Abstract: Timbre transfer techniques aim at converting the sound of a musical piece generated by one instrument into the same one as if it was played by another instrument, while maintaining as much as possible the content in terms of musical characteristics such as melody and dynamics. Following their recent breakthroughs in deep learning-based generation, we apply Denoising Diffusion Models (DDMs) to perf…

    Submitted 28 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

  9. arXiv:2306.11509  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Implicit neural representation with physics-informed neural networks for the reconstruction of the early part of room impulse responses

    Authors: Mirco Pezzoli, Fabio Antonacci, Augusto Sarti

    Abstract: Recently deep learning and machine learning approaches have been widely employed for various applications in acoustics. Nonetheless, in the area of sound field processing and reconstruction classic methods based on the solutions of wave equation are still widespread. Recently, physics-informed neural networks have been proposed as a deep learning paradigm for solving partial differential equations…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: Accepted for publication at Forum Acusticum 2023

  10. Acoustic source localization in the spherical harmonics domain exploiting low-rank approximations

    Authors: Maximo Cobos, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti

    Abstract: Acoustic signal processing in the spherical harmonics domain (SHD) is an active research area that exploits the signals acquired by higher order microphone arrays. A very important task is that concerning the localization of active sound sources. In this paper, we propose a simple yet effective method to localize prominent acoustic sources in adverse acoustic scenarios. By using a proper normaliza…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: To appear in ICASSP 2023

    Journal ref: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  11. Synthesis of Soundfields through Irregular Loudspeaker Arrays Based on Convolutional Neural Networks

    Authors: Luca Comanducci, Fabio Antonacci, Augusto Sarti

    Abstract: Most soundfield synthesis approaches deal with extensive and regular loudspeaker arrays, which are often not suitable for home audio systems, due to physical space constraints. In this article we propose a technique for soundfield synthesis through more easily deployable irregular loudspeaker arrays, i.e. where the spacing between loudspeakers is not constant, based on deep learning. The input are…

    Submitted 11 September, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

  12. arXiv:2103.16935  [pdf, other]

    cs.SD cs.LG eess.AS

    Near field Acoustic Holography on arbitrary shapes using Convolutional Neural Network

    Authors: Marco Olivieri, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti

    Abstract: Near-field Acoustic Holography (NAH) is a well-known problem aimed at estimating the vibrational velocity field of a structure by means of acoustic measurements. In this paper, we propose a NAH technique based on Convolutional Neural Network (CNN). The devised CNN predicts the vibrational field on the surface of arbitrary shaped plates (violin plates) with orthotropic material properties from a li…

    Submitted 29 June, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

    Comments: Accepted for publication at EUSIPCO 2021

  13. arXiv:2103.14895  [pdf, ps, other]

    cs.SD cs.LG eess.AS eess.SP

    Feature-based Representation for Violin Bridge Admittances

    Authors: R. Malvermi, S. Gonzalez, M. Quintavalla, F. Antonacci, A. Sarti, J. A. Torres, R. Corradi

    Abstract: Frequency Response Functions (FRFs) are one of the cornerstones of musical acoustic experimental research. They describe the way in which musical instruments vibrate in a wide range of frequencies and are used to predict and understand the acoustic differences between them. In the specific case of stringed musical instruments such as violins, FRFs evaluated at the bridge are known to capture the o…

    Submitted 27 March, 2021; originally announced March 2021.

    Comments: 8 pages, 6 figures, submitted to "The 27th International Congress on Sound and Vibration" (ICSV)

  14. arXiv:2102.07133  [pdf, other]

    cs.SD cs.LG eess.AS

    Parametric Optimization of Violin Top Plates using Machine Learning

    Authors: Davide Salvi, Sebastian Gonzalez, Fabio Antonacci, Augusto Sarti

    Abstract: We recently developed a neural network that receives as input the geometrical and mechanical parameters that define a violin top plate and gives as output its first ten eigenfrequencies computed in free boundary conditions. In this manuscript, we use the network to optimize several error functions, with the goal of analyzing the relationship between the eigenspectrum problem for violin top plates…

    Submitted 18 February, 2021; v1 submitted 14 February, 2021; originally announced February 2021.

  15. arXiv:2102.04254  [pdf, other]

    cs.CE cs.AI cs.LG cs.SD eess.AS

    A Data-Driven Approach to Violin Making

    Authors: Sebastian Gonzalez, Davide Salvi, Daniel Baeza, Fabio Antonacci, Augusto Sarti

    Abstract: Of all the characteristics of a violin, those that concern its shape are probably the most important ones, as the violin maker has complete control over them. Contemporary violin making, however, is still based more on tradition than understanding, and a definitive scientific study of the specific relations that exist between shape and vibrational properties is yet to come and sorely missed. In th…

    Submitted 2 February, 2021; originally announced February 2021.

  16. arXiv:2002.00641  [pdf, ps, other]

    eess.AS cs.SD

    Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks

    Authors: Luca Comanducci, Maximo Cobos, Fabio Antonacci, Augusto Sarti

    Abstract: The interest in deep learning methods for solving traditional signal processing tasks has been steadily growing in the last years. Time delay estimation (TDE) in adverse scenarios is a challenging problem, where classical approaches based on generalized cross-correlations (GCCs) have been widely used for decades. Recently, the frequency-sliding GCC (FS-GCC) was proposed as a novel technique for TD…

    Submitted 3 February, 2020; originally announced February 2020.

    Comments: Paper accepted for presentation in ICASSP 2020

    MSC Class: 94A12; 68T10

    ACM Class: I.2.0; I.5.4

  17. arXiv:1910.08838  [pdf, other]

    eess.AS

    Frequency-Sliding Generalized Cross-Correlation: A Sub-band Time Delay Estimation Approach

    Authors: Maximo Cobos, Fabio Antonacci, Luca Comanducci, Augusto Sarti

    Abstract: The generalized cross correlation (GCC) is regarded as the most popular approach for estimating the time difference of arrival (TDOA) between the signals received at two sensors. Time delay estimates are obtained by maximizing the GCC output, where the direct-path delay is usually observed as a prominent peak. Moreover, GCCs play also an important role in steered response power (SRP) localization…

    Submitted 24 March, 2020; v1 submitted 19 October, 2019; originally announced October 2019.

    Comments: Article accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processing