Showing 1–4 of 4 results for author: Shier, J

  1. arXiv:2309.06649  [pdf, other]

    cs.SD eess.AS

    Differentiable Modelling of Percussive Audio with Transient and Spectral Synthesis

    Authors: Jordie Shier, Franco Caspe, Andrew Robertson, Mark Sandler, Charalampos Saitis, Andrew McPherson

    Abstract: Differentiable digital signal processing (DDSP) techniques, including methods for audio synthesis, have gained attention in recent years and lend themselves to interpretability in the parameter space. However, current differentiable synthesis methods have not explicitly sought to model the transient portion of signals, which is important for percussive sounds. In this work, we present a unified sy…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: To be published in The Proceedings of Forum Acusticum, Sep 2023, Turin, Italy

  2. arXiv:2308.15422  [pdf, other]

    cs.SD eess.AS

    A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis

    Authors: Ben Hayes, Jordie Shier, György Fazekas, Andrew McPherson, Charalampos Saitis

    Abstract: The term "differentiable digital signal processing" describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music & speech synthesis. We catalogue applications to tasks including mu…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: Under review for Frontiers in Signal Processing

  3. arXiv:2203.03022  [pdf, ps, other]

    cs.SD cs.AI cs.LG eess.AS stat.ML

    HEAR: Holistic Evaluation of Audio Representations

    Authors: Joseph Turian, Jordie Shier, Humair Raj Khan, Bhiksha Raj, Björn W. Schuller, Christian J. Steinmetz, Colin Malloy, George Tzanetakis, Gissel Velarde, Kirk McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse Engel, Justin Salamon, Philippe Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk

    Abstract: What audio embedding approach generalizes best to a wide range of downstream tasks across a variety of everyday domains without fine-tuning? The aim of the HEAR benchmark is to develop a general-purpose audio representation that provides a strong basis for learning in a wide variety of tasks and scenarios. HEAR evaluates audio representations using a benchmark suite across a variety of domains, in…

    Submitted 29 May, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

    Comments: to appear in Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track

  4. arXiv:2104.12922  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS eess.SP

    One Billion Audio Sounds from GPU-enabled Modular Synthesis

    Authors: Joseph Turian, Jordie Shier, George Tzanetakis, Kirk McNally, Max Henry

    Abstract: We release synth1B1, a multi-modal audio corpus consisting of 1 billion 4-second synthesized sounds, paired with the synthesis parameters used to generate them. The dataset is 100x larger than any audio dataset in the literature. We also introduce torchsynth, an open source modular synthesizer that generates the synth1B1 samples on-the-fly at 16200x faster than real-time (714MHz) on a single GPU.…

    Submitted 20 July, 2021; v1 submitted 26 April, 2021; originally announced April 2021.