Showing 1–7 of 7 results for author: McCallum, M

  1. arXiv:2401.08902  [pdf, other]

    cs.SD cs.DL cs.IR cs.LG eess.AS

    Similar but Faster: Manipulation of Tempo in Music Audio Embeddings for Tempo Prediction and Search

    Authors: Matthew C. McCallum, Florian Henkel, Jaehun Kim, Samuel E. Sandberg, Matthew E. P. Davies

    Abstract: Audio embeddings enable large scale comparisons of the similarity of audio files for applications such as search and recommendation. Due to the subjectivity of audio similarity, it can be desirable to design systems that answer not only whether audio is similar, but similar in what way (e.g., wrt. tempo, mood or genre). Previous works have proposed disentangled embedding spaces where subspaces rep…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted to the International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

  2. arXiv:2401.08891  [pdf, other]

    cs.SD cs.LG eess.AS

    Tempo estimation as fully self-supervised binary classification

    Authors: Florian Henkel, Jaehun Kim, Matthew C. McCallum, Samuel E. Sandberg, Matthew E. P. Davies

    Abstract: This paper addresses the problem of global tempo estimation in musical audio. Given that annotating tempo is time-consuming and requires certain musical expertise, few publicly available data sources exist to train machine learning models for this task. Towards alleviating this issue, we propose a fully self-supervised approach that does not rely on any human labeled data. Our method builds on the…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted to the International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

  3. arXiv:2401.08889  [pdf, other]

    cs.SD cs.IR cs.LG cs.MM eess.AS

    On the Effect of Data-Augmentation on Local Embedding Properties in the Contrastive Learning of Music Audio Representations

    Authors: Matthew C. McCallum, Matthew E. P. Davies, Florian Henkel, Jaehun Kim, Samuel E. Sandberg

    Abstract: Audio embeddings are crucial tools in understanding large catalogs of music. Typically embeddings are evaluated on the basis of the performance they provide in a wide range of downstream tasks, however few studies have investigated the local properties of the embedding spaces themselves which are important in nearest neighbor algorithms, commonly used in music search and recommendation. In this wo…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted to the International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

  4. arXiv:2210.03799  [pdf, other]

    cs.SD cs.AI cs.IR cs.LG cs.MM eess.AS

    Supervised and Unsupervised Learning of Audio Representations for Music Understanding

    Authors: Matthew C. McCallum, Filip Korzeniowski, Sergio Oramas, Fabien Gouyon, Andreas F. Ehmann

    Abstract: In this work, we provide a broad comparative analysis of strategies for pre-training audio understanding models for several tasks in the music domain, including labelling of genre, era, origin, mood, instrumentation, key, pitch, vocal characteristics, tempo and sonority. Specifically, we explore how the domain of pre-training datasets (music or generic audio) and the pre-training methodology (supe…

    Submitted 7 October, 2022; originally announced October 2022.

  5. arXiv:2111.00704  [pdf, other]

    cs.SD cs.IR eess.AS eess.SP

    A Novel 1D State Space for Efficient Music Rhythmic Analysis

    Authors: Mojtaba Heydari, Matthew McCallum, Andreas Ehmann, Zhiyao Duan

    Abstract: Inferring music time structures has a broad range of applications in music production, processing and analysis. Scholars have proposed various methods to analyze different aspects of time structures, such as beat, downbeat, tempo and meter. Many state-of-the-art (SOTA) methods, however, are computationally expensive. This makes them inapplicable in real-world industrial settings where the scale of…

    Submitted 20 February, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2022

  6. arXiv:2108.12955  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Unsupervised Learning of Deep Features for Music Segmentation

    Authors: Matthew C. McCallum

    Abstract: Music segmentation refers to the dual problem of identifying boundaries between, and labeling, distinct music segments, e.g., the chorus, verse, bridge etc. in popular music. The performance of a range of music segmentation algorithms has been shown to be dependent on the audio features chosen to represent the audio. Some approaches have proposed learning feature transformations from music segment…

    Submitted 29 August, 2021; originally announced August 2021.

    Journal ref: ICASSP 2019

  7. arXiv:2010.11512  [pdf, other]

    cs.SD cs.IR eess.AS

    Mood Classification Using Listening Data

    Authors: Filip Korzeniowski, Oriol Nieto, Matthew McCallum, Minz Won, Sergio Oramas, Erik Schmidt

    Abstract: The mood of a song is a highly relevant feature for exploration and recommendation in large collections of music. These collections tend to require automatic methods for predicting such moods. In this work, we show that listening-based features outperform content-based ones when classifying moods: embeddings obtained through matrix factorization of listening data appear to be more informative of a…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: Appears in Proc. of the International Society for Music Information Retrieval Conference 2020 (ISMIR 2020)