Showing 1–3 of 3 results for author: Rouard, S

  1. arXiv:2406.02315  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    An Independence-promoting Loss for Music Generation with Language Models

    Authors: Jean-Marie Lemercier, Simon Rouard, Jade Copet, Yossi Adi, Alexandre Défossez

    Abstract: Music generation schemes using language modeling rely on a vocabulary of audio tokens, generally provided as codes in a discrete latent space learnt by an auto-encoder. Multi-stage quantizers are often employed to produce these tokens; therefore, the decoding strategy used for token prediction must be adapted to account for multiple codebooks: either it should model the joint distribution over all…

    Submitted 9 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024

  2. arXiv:2211.08553  [pdf, other]

    eess.AS cs.SD

    Hybrid Transformers for Music Source Separation

    Authors: Simon Rouard, Francisco Massa, Alexandre Défossez

    Abstract: A natural question arising in Music Source Separation (MSS) is whether long-range contextual information is useful, or whether local acoustic features are sufficient. In other fields, attention-based Transformers have shown their ability to integrate information over long sequences. In this work, we introduce Hybrid Transformer Demucs (HT Demucs), a hybrid temporal/spectral bi-U-Net based on Hybr…

    Submitted 15 November, 2022; originally announced November 2022.

  3. arXiv:2106.07431  [pdf, other]

    cs.SD cs.LG eess.AS

    CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis

    Authors: Simon Rouard, Gaëtan Hadjeres

    Abstract: In this paper, we propose a novel score-based generative model for unconditional raw audio synthesis. Our proposal builds upon the latest developments in diffusion process modeling with stochastic differential equations, which have already demonstrated promising results on image generation. We motivate novel heuristics for the choice of the diffusion processes better suited for audio generation, and con…

    Submitted 14 June, 2021; originally announced June 2021.

    Comments: 12 pages, 11 figures