Skip to main content

Showing 1–14 of 14 results for author: Hawthorne, C

Searching in archive eess. Search in all archives.
.
  1. arXiv:2209.14458  [pdf, other

    cs.SD cs.IR cs.LG eess.AS

    The Chamber Ensemble Generator: Limitless High-Quality MIR Data via Generative Modeling

    Authors: Yusong Wu, Josh Gardner, Ethan Manilow, Ian Simon, Curtis Hawthorne, Jesse Engel

    Abstract: Data is the lifeblood of modern machine learning systems, including for those in Music Information Retrieval (MIR). However, MIR has long been mired by small datasets and unreliable labels. In this work, we propose to break this bottleneck using generative modeling. By pipelining a generative model of notes (Coconet trained on Bach Chorales) with a structured synthesis model of chamber ensembles (… ▽ More

    Submitted 28 September, 2022; originally announced September 2022.

  2. arXiv:2206.05408  [pdf, other

    cs.SD cs.LG eess.AS

    Multi-instrument Music Synthesis with Spectrogram Diffusion

    Authors: Curtis Hawthorne, Ian Simon, Adam Roberts, Neil Zeghidour, Josh Gardner, Ethan Manilow, Jesse Engel

    Abstract: An ideal music synthesizer should be both interactive and expressive, generating high-fidelity audio in realtime for arbitrary combinations of instruments and notes. Recent neural synthesizers have exhibited a tradeoff between domain-specific models that offer detailed control of only specific instruments, or raw waveform models that can train on any music but with minimal control and slow generat… ▽ More

    Submitted 12 December, 2022; v1 submitted 10 June, 2022; originally announced June 2022.

  3. arXiv:2203.15140  [pdf, other

    cs.SD eess.AS

    Improving Source Separation by Explicitly Modeling Dependencies Between Sources

    Authors: Ethan Manilow, Curtis Hawthorne, Cheng-Zhi Anna Huang, Bryan Pardo, Jesse Engel

    Abstract: We propose a new method for training a supervised source separation system that aims to learn the interdependent relationships between all combinations of sources in a mixture. Rather than independently estimating each source from a mix, we reframe the source separation problem as an Orderless Neural Autoregressive Density Estimator (NADE), and estimate each source from both the mix and a random s… ▽ More

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: To appear at ICASSP 2022

  4. arXiv:2202.07765  [pdf, other

    cs.LG cs.AI cs.CV cs.SD eess.AS

    General-purpose, long-context autoregressive modeling with Perceiver AR

    Authors: Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, Charlie Nash, Mateusz Malinowski, Sander Dieleman, Oriol Vinyals, Matthew Botvinick, Ian Simon, Hannah Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel

    Abstract: Real-world data is high-dimensional: a book, image, or musical performance can easily contain hundreds of thousands of elements even after compression. However, the most commonly used autoregressive models, Transformers, are prohibitively expensive to scale to the number of inputs and layers needed to capture this long-range structure. We develop Perceiver AR, an autoregressive, modality-agnostic… ▽ More

    Submitted 14 June, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

    Comments: ICML 2022

  5. arXiv:2111.03017  [pdf, other

    cs.SD cs.LG eess.AS

    MT3: Multi-Task Multitrack Music Transcription

    Authors: Josh Gardner, Ian Simon, Ethan Manilow, Curtis Hawthorne, Jesse Engel

    Abstract: Automatic Music Transcription (AMT), inferring musical notes from raw audio, is a challenging task at the core of music understanding. Unlike Automatic Speech Recognition (ASR), which typically focuses on the words of a single speaker, AMT often requires transcribing multiple instruments simultaneously, all while preserving fine-scale pitch and timing information. Further, many AMT datasets are "l… ▽ More

    Submitted 15 March, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

    Comments: ICLR 2022 camera-ready version

  6. arXiv:2107.09142  [pdf, other

    cs.SD cs.LG eess.AS

    Sequence-to-Sequence Piano Transcription with Transformers

    Authors: Curtis Hawthorne, Ian Simon, Rigel Swavely, Ethan Manilow, Jesse Engel

    Abstract: Automatic Music Transcription has seen significant progress in recent years by training custom deep neural networks on large datasets. However, these models have required extensive domain-specific design of network architectures, input/output representations, and complex decoding schemes. In this work, we show that equivalent performance can be achieved using a generic encoder-decoder Transformer… ▽ More

    Submitted 19 July, 2021; originally announced July 2021.

  7. arXiv:2103.16091  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Symbolic Music Generation with Diffusion Models

    Authors: Gautam Mittal, Jesse Engel, Curtis Hawthorne, Ian Simon

    Abstract: Score-based generative models and diffusion probabilistic models have been successful at generating high-quality samples in continuous domains such as images and audio. However, due to their Langevin-inspired sampling mechanisms, their application to discrete and sequential data has been limited. In this work, we present a technique for training diffusion models on sequential data by parameterizin… ▽ More

    Submitted 25 November, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: ISMIR 2021

  8. arXiv:1912.05537  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Encoding Musical Style with Transformer Autoencoders

    Authors: Kristy Choi, Curtis Hawthorne, Ian Simon, Monica Dinculescu, Jesse Engel

    Abstract: We consider the problem of learning high-level controls over the global structure of generated sequences, particularly in the context of symbolic music generation with complex language models. In this work, we present the Transformer autoencoder, which aggregates encodings of the input data across time to obtain a global representation of style from a given performance. We show it is possible to c… ▽ More

    Submitted 30 June, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

  9. arXiv:1907.06637  [pdf, other

    cs.SD cs.HC cs.LG eess.AS stat.ML

    The Bach Doodle: Approachable music composition with machine learning at scale

    Authors: Cheng-Zhi Anna Huang, Curtis Hawthorne, Adam Roberts, Monica Dinculescu, James Wexler, Leon Hong, Jacob Howcroft

    Abstract: To make music composition more approachable, we designed the first AI-powered Google Doodle, the Bach Doodle, where users can create their own melody and have it harmonized by a machine learning model Coconet (Huang et al., 2017) in the style of Bach. For users to input melodies, we designed a simplified sheet-music based interface. To support an interactive experience at scale, we re-implemented… ▽ More

    Submitted 14 July, 2019; originally announced July 2019.

    Comments: Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2019

  10. arXiv:1810.12247  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset

    Authors: Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, Douglas Eck

    Abstract: Generating musical audio directly with neural networks is notoriously difficult because it requires coherently modeling structure at many different timescales. Fortunately, most music is also highly structured and can be represented as discrete note events played on musical instruments. Herein, we show that by using notes as an intermediate representation, we can train a suite of models capable of… ▽ More

    Submitted 17 January, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Examples available at https://goo.gl/magenta/maestro-examples

  11. arXiv:1809.04281  [pdf, other

    cs.LG cs.SD eess.AS stat.ML

    Music Transformer

    Authors: Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck

    Abstract: Music relies heavily on repetition to build structure and meaning. Self-reference occurs on multiple timescales, from motifs to phrases to reusing of entire sections of music, such as in pieces with ABA structure. The Transformer (Vaswani et al., 2017), a sequence model based on self-attention, has achieved compelling results in many generation tasks that require maintaining long-range coherence.… ▽ More

    Submitted 12 December, 2018; v1 submitted 12 September, 2018; originally announced September 2018.

    Comments: Improved skewing section and accompanying figures. Previous titles are "An Improved Relative Self-Attention Mechanism for Transformer with Application to Music Generation" and "Music Transformer"

  12. arXiv:1806.00195  [pdf, other

    stat.ML cs.LG cs.SD eess.AS

    Learning a Latent Space of Multitrack Measures

    Authors: Ian Simon, Adam Roberts, Colin Raffel, Jesse Engel, Curtis Hawthorne, Douglas Eck

    Abstract: Discovering and exploring the underlying structure of multi-instrumental music using learning-based approaches remains an open problem. We extend the recent MusicVAE model to represent multitrack polyphonic measures as vectors in a latent space. Our approach enables several useful operations such as generating plausible measures from scratch, interpolating between measures in a musically meaningfu… ▽ More

    Submitted 1 June, 2018; originally announced June 2018.

  13. arXiv:1803.05428  [pdf, other

    cs.LG cs.SD eess.AS stat.ML

    A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music

    Authors: Adam Roberts, Jesse Engel, Colin Raffel, Curtis Hawthorne, Douglas Eck

    Abstract: The Variational Autoencoder (VAE) has proven to be an effective model for producing semantically meaningful latent representations for natural data. However, it has thus far seen limited application to sequential data, and, as we demonstrate, existing recurrent VAE models have difficulty modeling sequences with long-term structure. To address this issue, we propose the use of a hierarchical decode… ▽ More

    Submitted 11 November, 2019; v1 submitted 13 March, 2018; originally announced March 2018.

    Comments: ICML Camera Ready Version (w/ fixed typos)

    Journal ref: ICML 2018

  14. arXiv:1710.11153  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Onsets and Frames: Dual-Objective Piano Transcription

    Authors: Curtis Hawthorne, Erich Elsen, Jialin Song, Adam Roberts, Ian Simon, Colin Raffel, Jesse Engel, Sageev Oore, Douglas Eck

    Abstract: We advance the state of the art in polyphonic piano music transcription by using a deep convolutional and recurrent neural network which is trained to jointly predict onsets and frames. Our model predicts pitch onset events and then uses those predictions to condition framewise pitch predictions. During inference, we restrict the predictions from the framewise detector by not allowing a new note t… ▽ More

    Submitted 5 June, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    Comments: Examples available at https://goo.gl/magenta/onsets-frames-examples