Showing 1–19 of 19 results for author: Bogdanov, D

Searching in archive cs.

  1. arXiv:2402.09318  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio

    Authors: Pablo Alonso-Jiménez, Leonardo Pepino, Roser Batlle-Roca, Pablo Zinemanas, Dmitry Bogdanov, Xavier Serra, Martín Rocamora

    Abstract: We present PECMAE, an interpretable model for music audio classification based on prototype learning. Our model is based on a previous method, APNet, which jointly learns an autoencoder and a prototypical network. Instead, we propose to decouple both training processes. This enables us to leverage existing self-supervised autoencoders pre-trained on much larger data (EnCodecMAE), providing represe…

    Submitted 14 February, 2024; originally announced February 2024.

  2. arXiv:2312.05994  [pdf, other]

    cs.SD cs.IR eess.AS

    mir_ref: A Representation Evaluation Framework for Music Information Retrieval Tasks

    Authors: Christos Plachouras, Pablo Alonso-Jiménez, Dmitry Bogdanov

    Abstract: Music Information Retrieval (MIR) research is increasingly leveraging representation learning to obtain more compact, powerful music audio representations for various downstream MIR tasks. However, current representation evaluation methods are fragmented due to discrepancies in audio and label preprocessing, downstream model and metric implementations, data availability, and computational resource…

    Submitted 12 December, 2023; v1 submitted 10 December, 2023; originally announced December 2023.

    Comments: Machine Learning for Audio Workshop, Neural Information Processing Systems (NeurIPS) 2023, New Orleans, LA

  3. arXiv:2311.10057  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation

    Authors: Ilaria Manco, Benno Weck, SeungHeon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, Juhan Nam

    Abstract: We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models. The dataset consists of 1.1k human-written natural language descriptions of 706 music recordings, all publicly accessible and released under Creative Commons licenses. To showcase the use of our dataset, we benchmark popular models o…

    Submitted 22 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted to NeurIPS 2023 Workshop on Machine Learning for Audio

  4. arXiv:2309.16418  [pdf, other]

    cs.SD eess.AS

    Efficient Supervised Training of Audio Transformers for Music Representation Learning

    Authors: Pablo Alonso-Jiménez, Xavier Serra, Dmitry Bogdanov

    Abstract: In this work, we address music representation learning using convolution-free transformers. We build on top of existing spectrogram-based audio transformers such as AST and train our models on a supervised task using patchout training similar to PaSST. In contrast to previous works, we study how specific design decisions affect downstream music tagging tasks instead of focusing on the training tas…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: Accepted at the 2023 International Society for Music Information Retrieval Conference (ISMIR'23)

  5. arXiv:2304.12257  [pdf, other]

    cs.SD eess.AS

    Pre-Training Strategies Using Contrastive Learning and Playlist Information for Music Classification and Similarity

    Authors: Pablo Alonso-Jiménez, Xavier Favory, Hadrien Foroughmand, Grigoris Bourdalas, Xavier Serra, Thomas Lidy, Dmitry Bogdanov

    Abstract: In this work, we investigate an approach that relies on contrastive learning and music metadata as a weak source of supervision to train music representation models. Recent studies show that contrastive learning can be used with editorial metadata (e.g., artist or album name) to learn audio representations that are useful for different classification tasks. In this paper, we extend this idea to us…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: Accepted at the 2023 International Conference on Acoustics, Speech, and Signal Processing (ICASSP'23)

  6. arXiv:2301.06167  [pdf]

    cs.CY cs.CR

    UN Handbook on Privacy-Preserving Computation Techniques

    Authors: David W. Archer, Borja de Balle Pigem, Dan Bogdanov, Mark Craddock, Adria Gascon, Ronald Jansen, Matjaž Jug, Kim Laine, Robert McLellan, Olga Ohrimenko, Mariana Raykova, Andrew Trask, Simon Wardley

    Abstract: This paper describes privacy-preserving approaches for the statistical analysis. It describes motivations for privacy-preserving approaches for the statistical analysis of sensitive data, presents examples of use cases where such methods may apply and describes relevant technical capabilities to assure privacy preservation while still allowing analysis of sensitive data. Our focus is on methods th…

    Submitted 15 January, 2023; originally announced January 2023.

    Comments: 50 pages

  7. arXiv:2203.15448  [pdf, other]

    cs.PL cs.CR

    ZK-SecreC: a Domain-Specific Language for Zero Knowledge Proofs

    Authors: Dan Bogdanov, Joosep Jääger, Peeter Laud, Härmel Nestra, Martin Pettai, Jaak Randmets, Ville Sokk, Kert Tali, Sandhra-Mirella Valdma

    Abstract: We present ZK-SecreC, a domain-specific language for zero-knowledge proofs. We present the rationale for its design, its syntax and semantics, and demonstrate its usefulness on the basis of a number of non-trivial examples. The design features a type system, where each piece of data is assigned both a confidentiality and an integrity type, which are not orthogonal to each other. We perform an empi…

    Submitted 26 August, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: 75 pp

  8. arXiv:2104.00437  [pdf, other]

    cs.SD cs.IR cs.MM eess.AS

    Enriched Music Representations with Multiple Cross-modal Contrastive Learning

    Authors: Andres Ferraro, Xavier Favory, Konstantinos Drossos, Yuntae Kim, Dmitry Bogdanov

    Abstract: Modeling various aspects that make a music piece unique is a challenging task, requiring the combination of multiple sources of information. Deep learning is commonly used to obtain representations using various sources of information, such as the audio, interactions between users and songs, or associated genre metadata. Recently, contrastive learning has led to representations that generalize bet…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: Accepted for publication to IEEE Signal Processing Letters

    Report number: SPL-30069-2021

  9. arXiv:2102.00201  [pdf, other]

    cs.SD cs.IR cs.LG cs.MM eess.AS

    Melon Playlist Dataset: a public dataset for audio-based playlist generation and music tagging

    Authors: Andres Ferraro, Yuntae Kim, Soohyeon Lee, Biho Kim, Namjun Jo, Semi Lim, Suyon Lim, Jungtaek Jang, Sehwan Kim, Xavier Serra, Dmitry Bogdanov

    Abstract: One of the main limitations in the field of audio signal processing is the lack of large public datasets with audio representations and high-quality annotations due to restrictions of copyrighted commercial music. We present Melon Playlist Dataset, a public dataset of mel-spectrograms for 649,091 tracks and 148,826 associated playlists annotated by 30,652 different tags. All the data is gathered fr…

    Submitted 30 January, 2021; originally announced February 2021.

    Comments: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing

  10. arXiv:2012.12927  [pdf]

    cs.CY

    Towards a common performance and effectiveness terminology for digital proximity tracing applications

    Authors: Justus Benzler, Dan Bogdanov, Göran Kirchner, Wouter Lueks, Raquel Lucas, Rui Oliveira, Bart Preneel, Marcel Salathe, Carmela Troncoso, Viktor von Wyl

    Abstract: Digital proximity tracing (DPT) for SARS-CoV-2 pandemic mitigation is a complex intervention with the primary goal to notify app users about possible risk exposures to infected persons. Policymakers and DPT operators need to know whether their system works as expected in terms of speed or yield (performance) and whether DPT is making an effective contribution to pandemic mitigation (also in compar…

    Submitted 23 December, 2020; originally announced December 2020.

  11. arXiv:2008.11507  [pdf, other]

    eess.AS cs.SD

    The Freesound Loop Dataset and Annotation Tool

    Authors: Antonio Ramires, Frederic Font, Dmitry Bogdanov, Jordan B. L. Smith, Yi-Hsuan Yang, Joann Ching, Bo-Yu Chen, Yueh-Kao Wu, Hsu Wei-Han, Xavier Serra

    Abstract: Music loops are essential ingredients in electronic music production, and there is a high demand for pre-recorded loops in a variety of styles. Several commercial and community databases have been created to meet this demand, but most are not suitable for research due to their strict licensing. We present the Freesound Loop Dataset (FSLD), a new large-scale dataset of music loops annotated by expe…

    Submitted 23 September, 2020; v1 submitted 26 August, 2020; originally announced August 2020.

    Comments: This work will be presented in the 21st International Society for Music Information Retrieval (ISMIR2020). Annotator website: http://mtg.upf.edu/fslannotator Dataset: https://zenodo.org/record/3967852

  12. arXiv:2006.00751  [pdf, other]

    eess.AS cs.SD

    Evaluation of CNN-based Automatic Music Tagging Models

    Authors: Minz Won, Andres Ferraro, Dmitry Bogdanov, Xavier Serra

    Abstract: Recent advances in deep learning accelerated the development of content-based automatic music tagging systems. Music information retrieval (MIR) researchers proposed various architecture designs, mainly based on convolutional neural networks (CNNs), that achieve state-of-the-art results in this multi-label binary classification task. However, due to the differences in experimental setups followed…

    Submitted 1 June, 2020; originally announced June 2020.

    Comments: 7 pages, 2 figures, Sound and Music Computing 2020 (SMC 2020)

  13. arXiv:2003.07393  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    TensorFlow Audio Models in Essentia

    Authors: Pablo Alonso-Jiménez, Dmitry Bogdanov, Jordi Pons, Xavier Serra

    Abstract: Essentia is a reference open-source C++/Python library for audio and music analysis. In this work, we present a set of algorithms that employ TensorFlow in Essentia, allow predictions with pre-trained deep learning models, and are designed to offer flexibility of use, easy extensibility, and real-time inference. To show the potential of this new interface with TensorFlow, we provide a number of pr…

    Submitted 16 March, 2020; originally announced March 2020.

  14. arXiv:1911.04827  [pdf, other]

    cs.IR

    Artist and style exposure bias in collaborative filtering based music recommendations

    Authors: Andres Ferraro, Dmitry Bogdanov, Xavier Serra, Jason Yoon

    Abstract: Algorithms have an increasing influence on the music that we consume and understanding their behavior is fundamental to make sure they give a fair exposure to all artists across different styles. In this on-going work we contribute to this research direction analyzing the impact of collaborative filtering recommendations from the perspective of artist and music style exposure given by the system.…

    Submitted 12 November, 2019; originally announced November 2019.

    Comments: Presented at Workshop on Designing Human-Centric MIR Systems, ISMIR 2019

  15. arXiv:1911.04824  [pdf, other]

    cs.IR cs.SD eess.AS

    How Low Can You Go? Reducing Frequency and Time Resolution in Current CNN Architectures for Music Auto-tagging

    Authors: Andres Ferraro, Dmitry Bogdanov, Xavier Serra, Jay Ho Jeon, Jason Yoon

    Abstract: Automatic tagging of music is an important research topic in Music Information Retrieval and audio analysis algorithms proposed for this task have achieved improvements with advances in deep learning. In particular, many state-of-the-art systems use Convolutional Neural Networks and operate on mel-spectrogram representations of the audio. In this paper, we compare commonly used mel-spectrogram rep…

    Submitted 28 June, 2020; v1 submitted 12 November, 2019; originally announced November 2019.

    Comments: The 28th European Signal Processing Conference (EUSIPCO)

  16. arXiv:1903.11833  [pdf, ps, other]

    cs.IR

    Skip prediction using boosting trees based on acoustic features of tracks in sessions

    Authors: Andrés Ferraro, Dmitry Bogdanov, Xavier Serra

    Abstract: The Spotify Sequential Skip Prediction Challenge focuses on predicting if a track in a session will be skipped by the user or not. In this paper, we describe our approach to this problem and the final system that was submitted to the challenge by our team from the Music Technology Group (MTG) under the name "aferraro". This system consists in combining the predictions of multiple boosting trees mo…

    Submitted 28 March, 2019; originally announced March 2019.

    Journal ref: WSDM Cup 2019 Workshop on the 12th ACM International Conference on Web Search and Data Mining

  17. arXiv:1901.02296  [pdf, other]

    cs.IR

    Using offline metrics and user behavior analysis to combine multiple systems for music recommendation

    Authors: Andres Ferraro, Dmitry Bogdanov, Kyumin Choi, Xavier Serra

    Abstract: There are many offline metrics that can be used as a reference for evaluation and optimization of the performance of recommender systems. Hybrid recommendation approaches are commonly used to improve some of those metrics by combining different systems. In this work we focus on music recommendation and propose a new way to improve recommendations, with respect to a desired metric of choice, by com…

    Submitted 8 January, 2019; originally announced January 2019.

    Journal ref: Conference on Recommender Systems (RecSys) 2018, REVEAL Workshop

  18. Automatic playlist continuation using a hybrid recommender system combining features from text and audio

    Authors: Andres Ferraro, Dmitry Bogdanov, Jisang Yoon, KwangSeob Kim, Xavier Serra

    Abstract: The ACM RecSys Challenge 2018 focuses on music recommendation in the context of automatic playlist continuation. In this paper, we describe our approach to the problem and the final hybrid system that was submitted to the challenge by our team Cocoplaya. This system consists in combining the recommendations produced by two different models using ranking fusion. The first model is based on Matrix F…

    Submitted 2 January, 2019; originally announced January 2019.

    Comments: 5 pages

    Journal ref: RecSys Challenge '18: Proceedings of the ACM Recommender Systems Challenge 2018

  19. arXiv:1604.03603  [pdf, other]

    cs.CG cs.SC

    Algorithmic computation of polynomial amoebas

    Authors: D. V. Bogdanov, A. A. Kytmanov, T. M. Sadykov

    Abstract: We present algorithms for computation and visualization of amoebas, their contours, compactified amoebas and sections of three-dimensional amoebas by two-dimensional planes. We also provide a method and an algorithm for the computation of polynomials whose amoebas exhibit the most complicated topology among all polynomials with a fixed Newton polytope. The presented algorithms are implemented in com…

    Submitted 12 April, 2016; originally announced April 2016.

    MSC Class: 14Q10; 14Q05; 14Q15

    ACM Class: G.4