Showing 1–5 of 5 results for author: Thomé, C

Searching in archive cs.
  1. arXiv:2311.10384  [pdf, other]

    cs.SD eess.AS

    Retrieval Augmented Generation of Symbolic Music with LLMs

    Authors: Nicolas Jonason, Luca Casini, Carl Thomé, Bob L. T. Sturm

    Abstract: We explore the use of large language models (LLMs) for music generation using a retrieval system to select relevant examples. We find promising initial results for music generation in a dialogue with the user, especially considering the ease with which such a system can be implemented. The code is available online.

    Submitted 28 December, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: LBD @ ISMIR 2023

  2. arXiv:2202.02115  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Polyphonic pitch detection with convolutional recurrent neural networks

    Authors: Carl Thomé, Sven Ahlbäck

    Abstract: Recent directions in automatic speech recognition (ASR) research have shown that applying deep learning models from image recognition challenges in computer vision is beneficial. As automatic music transcription (AMT) is superficially similar to ASR, in the sense that methods often rely on transforming spectrograms to symbolic sequences of events (e.g. words or notes), deep learning should benefit…

    Submitted 4 February, 2022; originally announced February 2022.

    Comments: MIREX 2017

  3. arXiv:2202.02112  [pdf, other]

    cs.SD cs.IR cs.LG cs.MM eess.AS

    Musical Audio Similarity with Self-supervised Convolutional Neural Networks

    Authors: Carl Thomé, Sebastian Piwell, Oscar Utterbäck

    Abstract: We have built a music similarity search engine that lets video producers search by listenable music excerpts, as a complement to traditional full-text search. Our system suggests similar sounding track segments in a large music catalog by training a self-supervised convolutional neural network with triplet loss terms and musical transformations. Semi-structured user interviews demonstrate that we…

    Submitted 4 February, 2022; originally announced February 2022.

    Comments: ISMIR LBD 2021

  4. arXiv:2006.06287  [pdf, other]

    cs.SD cs.LG eess.AS

    Perceiving Music Quality with GANs

    Authors: Agrin Hilmkil, Carl Thomé, Anders Arpteg

    Abstract: Several methods have been developed to assess the perceptual quality of audio under transforms like lossy compression. However, they require paired reference signals of the unaltered content, limiting their use in applications where references are unavailable. This has hindered progress in audio generation and style transfer, where a no-reference quality assessment method would allow more reproduc…

    Submitted 4 April, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: Extended abstract (first version) accepted for the Northern Lights Deep Learning Workshop 2020

  5. arXiv:1602.08750  [pdf, other]

    cs.HC cs.SD

    Filtering Video Noise as Audio with Motion Detection to Form a Musical Instrument

    Authors: Carl Thomé

    Abstract: Even though they differ in the physical domain, digital video and audio share many characteristics. Both are temporal data streams often stored in buffers with 8-bit values. This paper investigates a method for creating harmonic sounds with a video signal as input. A musical instrument is proposed, that utilizes video in both a sound synthesis method, and in a controller interface for selecting mu…

    Submitted 28 February, 2016; originally announced February 2016.

    Comments: Received the 2015 best paper award in the KTH Royal Institute of Technology course "Musical Communication and Music Technology"