Search | arXiv e-print repository

Data-Driven Room Acoustic Modeling Via Differentiable Feedback Delay Networks With Learnable Delay Lines

Authors: Alessandro Ilic Mezza, Riccardo Giampiccolo, Enzo De Sena, Alberto Bernardini

Abstract: Over the past few decades, extensive research has been devoted to the design of artificial reverberation algorithms aimed at emulating the room acoustics of physical environments. Despite significant advancements, automatic parameter tuning of delay-network models remains an open challenge. We introduce a novel method for finding the parameters of a Feedback Delay Network (FDN) such that its output renders target attributes of a measured room impulse response. The proposed approach involves the implementation of a differentiable FDN with trainable delay lines, which, for the first time, allows us to simultaneously learn each and every delay-network parameter via backpropagation. The iterative optimization process seeks to minimize a perceptually-motivated time-domain loss function incorporating differentiable terms accounting for energy decay and echo density. Through experimental validation, we show that the proposed method yields time-invariant frequency-independent FDNs capable of closely matching the desired acoustical characteristics, and outperforms existing methods based on genetic algorithms and analytical FDN design.

Submitted 17 May, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

Comments: The article has been submitted to EURASIP Journal on Audio, Speech, and Music Processing on Jan 02, 2024 and is currently under review

arXiv:2312.09663 [pdf, other]

doi 10.1016/j.patrec.2024.04.026

Toward Deep Drum Source Separation

Authors: Alessandro Ilic Mezza, Riccardo Giampiccolo, Alberto Bernardini, Augusto Sarti

Abstract: In the past, the field of drum source separation faced significant challenges due to limited data availability, hindering the adoption of cutting-edge deep learning methods that have found success in other related audio applications. In this manuscript, we introduce StemGMD, a large-scale audio dataset of isolated single-instrument drum stems. Each audio clip is synthesized from MIDI recordings of expressive drum performances using ten real-sounding acoustic drum kits. Totaling 1224 hours, StemGMD is the largest audio dataset of drums to date and the first to comprise isolated audio clips for every instrument in a canonical nine-piece drum kit. We leverage StemGMD to develop LarsNet, a novel deep drum source separation model. Through a bank of dedicated U-Nets, LarsNet can separate five stems from a stereo drum mixture faster than real-time and is shown to significantly outperform state-of-the-art nonnegative spectro-temporal factorization methods.

Submitted 20 May, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: 9 pages, 2 figures, 3 tables. Published in Pattern Recognition Letters, vol. 183, pp. 86-91, 2024

Journal ref: Pattern Recognit. Lett. 183 (2024) 86-91

arXiv:2312.08821 [pdf, other]

Reconstruction of Sound Field through Diffusion Models

Authors: Federico Miotello, Luca Comanducci, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti

Abstract: Reconstructing the sound field in a room is an important task for several applications, such as sound control and augmented (AR) or virtual reality (VR). In this paper, we propose a data-driven generative model for reconstructing the magnitude of acoustic fields in rooms with a focus on the modal frequency range. We introduce, for the first time, the use of a conditional Denoising Diffusion Probabilistic Model (DDPM) trained in order to reconstruct the sound field (SF-Diff) over an extended domain. The architecture is devised in order to be conditioned on a set of limited available measurements at different frequencies and generate the sound field in target, unknown, locations. The results show that SF-Diff is able to provide accurate reconstructions, outperforming a state-of-the-art baseline based on kernel interpolation.

Submitted 21 February, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: Accepted for publication at ICASSP 2024

arXiv:2302.05179 [pdf, other]

doi 10.1016/j.artmed.2021.102133

AIOSA: An approach to the automatic identification of obstructive sleep apnea events based on deep learning

Authors: Andrea Bernardini, Andrea Brunello, Gian Luigi Gigli, Angelo Montanari, Nicola Saccomanno

Abstract: Obstructive Sleep Apnea Syndrome (OSAS) is the most common sleep-related breathing disorder. It is caused by an increased upper airway resistance during sleep, which determines episodes of partial or complete interruption of airflow. The detection and treatment of OSAS is particularly important in stroke patients, because the presence of severe OSAS is associated with higher mortality, worse neurological deficits, worse functional outcome after rehabilitation, and a higher likelihood of uncontrolled hypertension. The gold standard test for diagnosing OSAS is polysomnography (PSG). Unfortunately, performing a PSG in an electrically hostile environment, like a stroke unit, on neurologically impaired patients is a difficult task; also, the number of strokes per day outnumbers the availability of polysomnographs and dedicated healthcare professionals. Thus, a simple and automated recognition system to identify OSAS among acute stroke patients, relying on routinely recorded vital signs, is desirable. The majority of the work done so far focuses on data recorded in ideal conditions and highly selected patients, and thus it is hardly exploitable in real-life settings, where it would be of actual use. In this paper, we propose a convolutional deep learning architecture able to reduce the temporal resolution of raw waveform data, like physiological signals, extracting key features that can be used for further processing. We exploit models based on such an architecture to detect OSAS events in stroke unit recordings obtained from the monitoring of unselected patients. Unlike existing approaches, annotations are performed at one-second granularity, allowing physicians to better interpret the model outcome. Results are considered to be satisfactory by the domain experts. Moreover, based on a widely-used benchmark, we show that the proposed approach outperforms current state-of-the-art solutions.

Submitted 10 February, 2023; originally announced February 2023.

Comments: Final article published on Artificial Intelligence in Medicine Journal

Journal ref: Artificial Intelligence in Medicine, Volume 118, 2021

arXiv:2009.14192 [pdf]

doi 10.5121/csit.2020.101115

Analysis of the displacement of terrestrial mobile robots in corridors using paraconsistent annotated evidential logic eτ

Authors: Flavio Amadeu Bernardini, Marcia Terra da Silva, Jair Minoro Abe, Luiz Antonio de Lima, Kanstantsin Miatluk

Abstract: This article proposes an algorithm for a servo motor that controls the movement of an autonomous terrestrial mobile robot using Paraconsistent Logic. The design process of mechatronic systems guided the robot construction phases. The project intends to monitor the robot through its sensors, which send positioning signals to the microcontroller. The signals are adjusted by an embedded technology interface based on the concepts of Paraconsistent Annotated Logic, acting directly on the servo steering motor. The electric signals sent to the servo motor were analyzed, and the analysis indicates that the paraconsistent algorithm can contribute to increasing the precision of servo motor movements.

Submitted 29 September, 2020; originally announced September 2020.

Showing 1–5 of 5 results for author: Bernardini, A