Spectrogram-Based Detection of Auto-Tuned Vocals in Music Recordings
Authors:
Mahyar Gohari,
Paolo Bestagini,
Sergio Benini,
Nicola Adami
Abstract:
In the domain of music production and audio processing, the implementation of automatic pitch correction of the singing voice, also known as Auto-Tune, has significantly transformed the landscape of vocal performance. While auto-tuning technology has offered musicians the ability to tune their vocal pitches and achieve a desired level of precision, its use has also sparked debates regarding its im…
▽ More
In the domain of music production and audio processing, the implementation of automatic pitch correction of the singing voice, also known as Auto-Tune, has significantly transformed the landscape of vocal performance. While auto-tuning technology has offered musicians the ability to tune their vocal pitches and achieve a desired level of precision, its use has also sparked debates regarding its impact on authenticity and artistic integrity. As a result, detecting and analyzing Auto-Tuned vocals in music recordings has become essential for music scholars, producers, and listeners. However, to the best of our knowledge, no prior effort has been made in this direction. This study introduces a data-driven approach leveraging triplet networks for the detection of Auto-Tuned songs, backed by the creation of a dataset composed of original and Auto-Tuned audio clips. The experimental results demonstrate the superiority of the proposed method in both accuracy and robustness compared to Rawnet2, an end-to-end model proposed for anti-spoofing and widely used for other audio forensic tasks.
△ Less
Submitted 8 March, 2024;
originally announced March 2024.
BS-Net: learning COVID-19 pneumonia severity on a large Chest X-Ray dataset
Authors:
Alberto Signoroni,
Mattia Savardi,
Sergio Benini,
Nicola Adami,
Riccardo Leonardi,
Paolo Gibellini,
Filippo Vaccher,
Marco Ravanelli,
Andrea Borghesi,
Roberto Maroldi,
Davide Farina
Abstract:
In this work we design an end-to-end deep learning architecture for predicting, on Chest X-rays images (CXR), a multi-regional score conveying the degree of lung compromise in COVID-19 patients. Such semi-quantitative scoring system, namely Brixia~score, is applied in serial monitoring of such patients, showing significant prognostic value, in one of the hospitals that experienced one of the highe…
▽ More
In this work we design an end-to-end deep learning architecture for predicting, on Chest X-rays images (CXR), a multi-regional score conveying the degree of lung compromise in COVID-19 patients. Such semi-quantitative scoring system, namely Brixia~score, is applied in serial monitoring of such patients, showing significant prognostic value, in one of the hospitals that experienced one of the highest pandemic peaks in Italy. To solve such a challenging visual task, we adopt a weakly supervised learning strategy structured to handle different tasks (segmentation, spatial alignment, and score estimation) trained with a "from-the-part-to-the-whole" procedure involving different datasets. In particular, we exploit a clinical dataset of almost 5,000 CXR annotated images collected in the same hospital. Our BS-Net demonstrates self-attentive behavior and a high degree of accuracy in all processing stages. Through inter-rater agreement tests and a gold standard comparison, we show that our solution outperforms single human annotators in rating accuracy and consistency, thus supporting the possibility of using this tool in contexts of computer-assisted monitoring. Highly resolved (super-pixel level) explainability maps are also generated, with an original technique, to visually help the understanding of the network activity on the lung areas. We also consider other scores proposed in literature and provide a comparison with a recently proposed non-specific approach. We eventually test the performance robustness of our model on an assorted public COVID-19 dataset, for which we also provide Brixia~score annotations, observing good direct generalization and fine-tuning capabilities that highlight the portability of BS-Net in other clinical settings. The CXR dataset along with the source code and the trained model are publicly released for research purposes.
△ Less
Submitted 3 April, 2021; v1 submitted 8 June, 2020;
originally announced June 2020.