Skip to main content

Showing 1–14 of 14 results for author: Engel, J

Searching in archive stat. Search in all archives.
.
  1. arXiv:2206.04615  [pdf, other

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza , et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur… ▽ More

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  2. arXiv:2203.03022  [pdf, ps, other

    cs.SD cs.AI cs.LG eess.AS stat.ML

    HEAR: Holistic Evaluation of Audio Representations

    Authors: Joseph Turian, Jordie Shier, Humair Raj Khan, Bhiksha Raj, Björn W. Schuller, Christian J. Steinmetz, Colin Malloy, George Tzanetakis, Gissel Velarde, Kirk McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse Engel, Justin Salamon, Philippe Esling, Pranay Manocha, Shinji Watanabe, Zeyu **, Yonatan Bisk

    Abstract: What audio embedding approach generalizes best to a wide range of downstream tasks across a variety of everyday domains without fine-tuning? The aim of the HEAR benchmark is to develop a general-purpose audio representation that provides a strong basis for learning in a wide variety of tasks and scenarios. HEAR evaluates audio representations using a benchmark suite across a variety of domains, in… ▽ More

    Submitted 29 May, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

    Comments: to appear in Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track

  3. arXiv:2108.04068  [pdf, other

    stat.OT stat.AP

    On the role of data, statistics and decisions in a pandemic

    Authors: Beate Jahn, Sarah Friedrich, Joachim Behnke, Joachim Engel, Ursula Garczarek, Ralf Münnich, Markus Pauly, Adalbert Wilhelm, Olaf Wolkenhauer, Markus Zwick, Uwe Siebert, Tim Friede

    Abstract: A pandemic poses particular challenges to decision-making because of the need to continuously adapt decisions to rapidly changing evidence and available data. For example, which countermeasures are appropriate at a particular stage of the pandemic? How can the severity of the pandemic be measured? What is the effect of vaccination in the population and which groups should be vaccinated first? The… ▽ More

    Submitted 8 March, 2022; v1 submitted 6 August, 2021; originally announced August 2021.

  4. arXiv:2103.16091  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Symbolic Music Generation with Diffusion Models

    Authors: Gautam Mittal, Jesse Engel, Curtis Hawthorne, Ian Simon

    Abstract: Score-based generative models and diffusion probabilistic models have been successful at generating high-quality samples in continuous domains such as images and audio. However, due to their Langevin-inspired sampling mechanisms, their application to discrete and sequential data has been limited. In this work, we present a technique for training diffusion models on sequential data by parameterizin… ▽ More

    Submitted 25 November, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: ISMIR 2021

  5. arXiv:2001.04643  [pdf, other

    cs.LG cs.SD eess.AS eess.SP stat.ML

    DDSP: Differentiable Digital Signal Processing

    Authors: Jesse Engel, Lamtharn Hantrakul, Chenjie Gu, Adam Roberts

    Abstract: Most generative models of audio directly generate samples in one of two domains: time or frequency. While sufficient to express any signal, these representations are inefficient, as they do not utilize existing knowledge of how sound is generated and perceived. A third approach (vocoders/synthesizers) successfully incorporates strong domain knowledge of signal processing and perception, but has be… ▽ More

    Submitted 14 January, 2020; originally announced January 2020.

  6. arXiv:1912.05537  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Encoding Musical Style with Transformer Autoencoders

    Authors: Kristy Choi, Curtis Hawthorne, Ian Simon, Monica Dinculescu, Jesse Engel

    Abstract: We consider the problem of learning high-level controls over the global structure of generated sequences, particularly in the context of symbolic music generation with complex language models. In this work, we present the Transformer autoencoder, which aggregates encodings of the input data across time to obtain a global representation of style from a given performance. We show it is possible to c… ▽ More

    Submitted 30 June, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

  7. arXiv:1905.06118  [pdf, other

    cs.SD cs.LG cs.MM eess.AS stat.ML

    Learning to Groove with Inverse Sequence Transformations

    Authors: Jon Gillick, Adam Roberts, Jesse Engel, Douglas Eck, David Bamman

    Abstract: We explore models for translating abstract musical ideas (scores, rhythms) into expressive performances using Seq2Seq and recurrent Variational Information Bottleneck (VIB) models. Though Seq2Seq models usually require painstakingly aligned corpora, we show that it is possible to adapt an approach from the Generative Adversarial Network (GAN) literature (e.g. Pix2Pix (Isola et al., 2017) and Vid2V… ▽ More

    Submitted 26 July, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: Blog post and links: https://g.co/magenta/groovae

    ACM Class: J.5; I.2

    Journal ref: Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2269-2279, 2019

  8. arXiv:1902.08710  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    GANSynth: Adversarial Neural Audio Synthesis

    Authors: Jesse Engel, Kumar Krishna Agrawal, Shuo Chen, Ishaan Gulrajani, Chris Donahue, Adam Roberts

    Abstract: Efficient audio synthesis is an inherently difficult machine learning task, as human perception is sensitive to both global structure and fine-scale waveform coherence. Autoregressive models, such as WaveNet, model local structure at the expense of global latent structure and slow iterative sampling, while Generative Adversarial Networks (GANs), have global latent conditioning and efficient parall… ▽ More

    Submitted 14 April, 2019; v1 submitted 22 February, 2019; originally announced February 2019.

    Comments: Colab Notebook: http://goo.gl/magenta/gansynth-demo

  9. arXiv:1902.08261  [pdf, other

    cs.LG cs.NE stat.ML

    Latent Translation: Crossing Modalities by Bridging Generative Models

    Authors: Yingtao Tian, Jesse Engel

    Abstract: End-to-end optimization has achieved state-of-the-art performance on many specific problems, but there is no straight-forward way to combine pretrained models for new problems. Here, we explore improving modularity by learning a post-hoc interface between two existing models to solve a new task. Specifically, we take inspiration from neural machine translation, and cast the challenging problem of… ▽ More

    Submitted 21 February, 2019; originally announced February 2019.

  10. arXiv:1810.12247  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset

    Authors: Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, Douglas Eck

    Abstract: Generating musical audio directly with neural networks is notoriously difficult because it requires coherently modeling structure at many different timescales. Fortunately, most music is also highly structured and can be represented as discrete note events played on musical instruments. Herein, we show that by using notes as an intermediate representation, we can train a suite of models capable of… ▽ More

    Submitted 17 January, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Examples available at https://goo.gl/magenta/maestro-examples

  11. arXiv:1806.00195  [pdf, other

    stat.ML cs.LG cs.SD eess.AS

    Learning a Latent Space of Multitrack Measures

    Authors: Ian Simon, Adam Roberts, Colin Raffel, Jesse Engel, Curtis Hawthorne, Douglas Eck

    Abstract: Discovering and exploring the underlying structure of multi-instrumental music using learning-based approaches remains an open problem. We extend the recent MusicVAE model to represent multitrack polyphonic measures as vectors in a latent space. Our approach enables several useful operations such as generating plausible measures from scratch, interpolating between measures in a musically meaningfu… ▽ More

    Submitted 1 June, 2018; originally announced June 2018.

  12. arXiv:1803.05428  [pdf, other

    cs.LG cs.SD eess.AS stat.ML

    A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music

    Authors: Adam Roberts, Jesse Engel, Colin Raffel, Curtis Hawthorne, Douglas Eck

    Abstract: The Variational Autoencoder (VAE) has proven to be an effective model for producing semantically meaningful latent representations for natural data. However, it has thus far seen limited application to sequential data, and, as we demonstrate, existing recurrent VAE models have difficulty modeling sequences with long-term structure. To address this issue, we propose the use of a hierarchical decode… ▽ More

    Submitted 11 November, 2019; v1 submitted 13 March, 2018; originally announced March 2018.

    Comments: ICML Camera Ready Version (w/ fixed typos)

    Journal ref: ICML 2018

  13. arXiv:1711.05772  [pdf, other

    cs.LG cs.NE stat.ML

    Latent Constraints: Learning to Generate Conditionally from Unconditional Generative Models

    Authors: Jesse Engel, Matthew Hoffman, Adam Roberts

    Abstract: Deep generative neural networks have proven effective at both conditional and unconditional modeling of complex data distributions. Conditional generation enables interactive control, but creating new controls often requires expensive retraining. In this paper, we develop a method to condition generation without retraining the model. By post-hoc learning latent constraints, value functions that id… ▽ More

    Submitted 21 December, 2017; v1 submitted 15 November, 2017; originally announced November 2017.

  14. arXiv:1710.11153  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Onsets and Frames: Dual-Objective Piano Transcription

    Authors: Curtis Hawthorne, Erich Elsen, Jialin Song, Adam Roberts, Ian Simon, Colin Raffel, Jesse Engel, Sageev Oore, Douglas Eck

    Abstract: We advance the state of the art in polyphonic piano music transcription by using a deep convolutional and recurrent neural network which is trained to jointly predict onsets and frames. Our model predicts pitch onset events and then uses those predictions to condition framewise pitch predictions. During inference, we restrict the predictions from the framewise detector by not allowing a new note t… ▽ More

    Submitted 5 June, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    Comments: Examples available at https://goo.gl/magenta/onsets-frames-examples