Showing 1–6 of 6 results for author: Quinton, E

  1. arXiv:2311.10057  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation

    Authors: Ilaria Manco, Benno Weck, SeungHeon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, Juhan Nam

    Abstract: We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models. The dataset consists of 1.1k human-written natural language descriptions of 706 music recordings, all publicly accessible and released under Creative Commons licenses. To showcase the use of our dataset, we benchmark popular models o…

    Submitted 22 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted to NeurIPS 2023 Workshop on Machine Learning for Audio

  2. arXiv:2310.11165  [pdf, other]

    cs.SD cs.LG eess.AS

    Serenade: A Model for Human-in-the-loop Automatic Chord Estimation

    Authors: Hendrik Vincent Koops, Gianluca Micchi, Ilaria Manco, Elio Quinton

    Abstract: Computational harmony analysis is important for MIR tasks such as automatic segmentation, corpus analysis and automatic chord label estimation. However, recent research into the ambiguous nature of musical harmony, causing limited inter-rater agreement, has made apparent that there is a glass ceiling for common metrics such as accuracy. Commonly, these issues are addressed either in the training d…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted at MMRP23. 7 pages, 5 figures, 2 tables

  3. arXiv:2209.01478  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    Equivariant Self-Supervision for Musical Tempo Estimation

    Authors: Elio Quinton

    Abstract: Self-supervised methods have emerged as a promising avenue for representation learning in recent years since they alleviate the need for labeled datasets, which are scarce and expensive to acquire. Contrastive methods are a popular choice for self-supervision in the audio domain, and typically provide a learning signal by forcing the model to be invariant to some transformations of the input.…

    Submitted 3 September, 2022; originally announced September 2022.

    Comments: Accepted at ISMIR 2022

  4. arXiv:2208.12208  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Contrastive Audio-Language Learning for Music

    Authors: Ilaria Manco, Emmanouil Benetos, Elio Quinton, György Fazekas

    Abstract: As one of the most intuitive interfaces known to humans, natural language has the potential to mediate many tasks that involve human-computer interaction, especially in application-focused fields like Music Information Retrieval. In this work, we explore cross-modal learning in an attempt to bridge audio and language in the music domain. To this end, we propose MusCALL, a framework for Music Contr…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: Accepted to ISMIR 2022

  5. arXiv:2112.04214  [pdf, other]

    cs.SD cs.CL cs.IR cs.LG eess.AS

    Learning music audio representations via weak language supervision

    Authors: Ilaria Manco, Emmanouil Benetos, Elio Quinton, Gyorgy Fazekas

    Abstract: Audio representations for music information retrieval are typically learned via supervised learning in a task-specific fashion. Although effective at producing state-of-the-art results, this scheme lacks flexibility with respect to the range of applications a model can have and requires extensively annotated datasets. In this work, we pose the question of whether it may be possible to exploit weak…

    Submitted 17 February, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Accepted to ICASSP 2022

  6. arXiv:2104.11984  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    MusCaps: Generating Captions for Music Audio

    Authors: Ilaria Manco, Emmanouil Benetos, Elio Quinton, Gyorgy Fazekas

    Abstract: Content-based music information retrieval has seen rapid progress with the adoption of deep learning. Current approaches to high-level music description typically make use of classification models, such as in auto-tagging or genre and mood classification. In this work, we propose to address music description via audio captioning, defined as the task of generating a natural language description of…

    Submitted 24 April, 2021; originally announced April 2021.

    Comments: Accepted to IJCNN 2021 for the Special Session on Representation Learning for Audio, Speech, and Music Processing