Search | arXiv e-print repository

Data-Driven Room Acoustic Modeling Via Differentiable Feedback Delay Networks With Learnable Delay Lines

Authors: Alessandro Ilic Mezza, Riccardo Giampiccolo, Enzo De Sena, Alberto Bernardini

Abstract: Over the past few decades, extensive research has been devoted to the design of artificial reverberation algorithms aimed at emulating the room acoustics of physical environments. Despite significant advancements, automatic parameter tuning of delay-network models remains an open challenge. We introduce a novel method for finding the parameters of a Feedback Delay Network (FDN) such that its outpu… ▽ More Over the past few decades, extensive research has been devoted to the design of artificial reverberation algorithms aimed at emulating the room acoustics of physical environments. Despite significant advancements, automatic parameter tuning of delay-network models remains an open challenge. We introduce a novel method for finding the parameters of a Feedback Delay Network (FDN) such that its output renders target attributes of a measured room impulse response. The proposed approach involves the implementation of a differentiable FDN with trainable delay lines, which, for the first time, allows us to simultaneously learn each and every delay-network parameter via backpropagation. The iterative optimization process seeks to minimize a perceptually-motivated time-domain loss function incorporating differentiable terms accounting for energy decay and echo density. Through experimental validation, we show that the proposed method yields time-invariant frequency-independent FDNs capable of closely matching the desired acoustical characteristics, and outperforms existing methods based on genetic algorithms and analytical FDN design. △ Less

Submitted 17 May, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

Comments: The article has been submitted to EURASIP Journal on Audio, Speech, and Music Processing on Jan 02, 2024 and is currently under review

arXiv:2402.13896 [pdf, other]

HOMULA-RIR: A Room Impulse Response Dataset for Teleconferencing and Spatial Audio Applications Acquired Through Higher-Order Microphones and Uniform Linear Microphone Arrays

Authors: Federico Miotello, Paolo Ostan, Mirco Pezzoli, Luca Comanducci, Alberto Bernardini, Fabio Antonacci, Augusto Sarti

Abstract: In this paper, we present HOMULA-RIR, a dataset of room impulse responses (RIRs) acquired using both higher-order microphones (HOMs) and a uniform linear array (ULA), in order to model a remote attendance teleconferencing scenario. Specifically, measurements were performed in a seminar room, where a 64-microphone ULA was used as a multichannel audio acquisition system in the proximity of the speak… ▽ More In this paper, we present HOMULA-RIR, a dataset of room impulse responses (RIRs) acquired using both higher-order microphones (HOMs) and a uniform linear array (ULA), in order to model a remote attendance teleconferencing scenario. Specifically, measurements were performed in a seminar room, where a 64-microphone ULA was used as a multichannel audio acquisition system in the proximity of the speakers, while HOMs were used to model 25 attendees actually present in the seminar room. The HOMs cover a wide area of the room, making the dataset suitable also for applications of virtual acoustics. Through the measurement of the reverberation time and clarity index, and sample applications such as source localization and separation, we demonstrate the effectiveness of the HOMULA-RIR dataset. △ Less

Submitted 21 February, 2024; originally announced February 2024.

Comments: Accepted for publication at ICASSP 2024 - HSCMA Workshop

arXiv:2312.09663 [pdf, other]

doi 10.1016/j.patrec.2024.04.026

Toward Deep Drum Source Separation

Authors: Alessandro Ilic Mezza, Riccardo Giampiccolo, Alberto Bernardini, Augusto Sarti

Abstract: In the past, the field of drum source separation faced significant challenges due to limited data availability, hindering the adoption of cutting-edge deep learning methods that have found success in other related audio applications. In this manuscript, we introduce StemGMD, a large-scale audio dataset of isolated single-instrument drum stems. Each audio clip is synthesized from MIDI recordings of… ▽ More In the past, the field of drum source separation faced significant challenges due to limited data availability, hindering the adoption of cutting-edge deep learning methods that have found success in other related audio applications. In this manuscript, we introduce StemGMD, a large-scale audio dataset of isolated single-instrument drum stems. Each audio clip is synthesized from MIDI recordings of expressive drums performances using ten real-sounding acoustic drum kits. Totaling 1224 hours, StemGMD is the largest audio dataset of drums to date and the first to comprise isolated audio clips for every instrument in a canonical nine-piece drum kit. We leverage StemGMD to develop LarsNet, a novel deep drum source separation model. Through a bank of dedicated U-Nets, LarsNet can separate five stems from a stereo drum mixture faster than real-time and is shown to significantly outperform state-of-the-art nonnegative spectro-temporal factorization methods. △ Less

Submitted 20 May, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: 9 pages, 2 figures, 3 tables. Published in Pattern Recognition Letters, vol. 183, pp. 86-91, 2024

Journal ref: Pattern Recognit. Lett. 183 (2024) 86-91

arXiv:2312.08821 [pdf, other]

Reconstruction of Sound Field through Diffusion Models

Authors: Federico Miotello, Luca Comanducci, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti

Abstract: Reconstructing the sound field in a room is an important task for several applications, such as sound control and augmented (AR) or virtual reality (VR). In this paper, we propose a data-driven generative model for reconstructing the magnitude of acoustic fields in rooms with a focus on the modal frequency range. We introduce, for the first time, the use of a conditional Denoising Diffusion Probab… ▽ More Reconstructing the sound field in a room is an important task for several applications, such as sound control and augmented (AR) or virtual reality (VR). In this paper, we propose a data-driven generative model for reconstructing the magnitude of acoustic fields in rooms with a focus on the modal frequency range. We introduce, for the first time, the use of a conditional Denoising Diffusion Probabilistic Model (DDPM) trained in order to reconstruct the sound field (SF-Diff) over an extended domain. The architecture is devised in order to be conditioned on a set of limited available measurements at different frequencies and generate the sound field in target, unknown, locations. The results show that SF-Diff is able to provide accurate reconstructions, outperforming a state-of-the-art baseline based on kernel interpolation. △ Less

Submitted 21 February, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: Accepted for publication at ICASSP 2024

arXiv:2009.14192 [pdf]

doi 10.5121/csit.2020.101115

Analysis of the displacement of terrestrial mobile robots in corridors using paraconsistent annotated evidential logic eτ

Authors: Flavio Amadeu Bernardini, Marcia Terra da Silva, Jair Minoro Abe, Luiz Antonio de Lima, Kanstantsin Miatluk

Abstract: This article proposes an algorithm for a servo motor that controls the movement of an autonomous terrestrial mobile robot using Paraconsistent Logic. The design process of mechatronic systems guided the robot construction phases. The project intends to monitor the robot through its sensors that send positioning signals to the microcontroller. The signals are adjusted by an embedded technology inte… ▽ More This article proposes an algorithm for a servo motor that controls the movement of an autonomous terrestrial mobile robot using Paraconsistent Logic. The design process of mechatronic systems guided the robot construction phases. The project intends to monitor the robot through its sensors that send positioning signals to the microcontroller. The signals are adjusted by an embedded technology interface maintained in the concepts of Paraconsistent Annotated Logic acting directly on the servo steering motor. The electric signals sent to the servo motor were analyzed, and it indicates that the algorithm paraconsistent can contribute to the increase of precision of movements of servo motors. △ Less

Submitted 29 September, 2020; originally announced September 2020.

Showing 1–5 of 5 results for author: Bernardini, A