Showing 1–5 of 5 results for author: Novack, Z

Searching in archive cs.
  1. arXiv:2405.20289

    cs.SD cs.AI cs.LG

    DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation

    Authors: Zachary Novack, Julian McAuley, Taylor Berg-Kirkpatrick, Nicholas Bryan

    Abstract: Controllable music generation methods are critical for human-centered AI-based music creation, but are currently limited by speed, quality, and control design trade-offs. Diffusion Inference-Time T-optimization (DITTO), in particular, offers state-of-the-art results, but is over 10x slower than real-time, limiting practical use. We propose Distilled Diffusion Inference-Time T-Optimization (or DIT…

    Submitted 30 May, 2024; originally announced May 2024.

  2. arXiv:2401.12179

    cs.SD cs.AI cs.LG eess.AS

    DITTO: Diffusion Inference-Time T-Optimization for Music Generation

    Authors: Zachary Novack, Julian McAuley, Taylor Berg-Kirkpatrick, Nicholas J. Bryan

    Abstract: We propose Diffusion Inference-Time T-Optimization (DITTO), a general-purpose framework for controlling pre-trained text-to-music diffusion models at inference time via optimizing initial noise latents. Our method can be used to optimize through any differentiable feature matching loss to achieve a target (stylized) output and leverages gradient checkpointing for memory efficiency. We demonstrate…

    Submitted 3 June, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: Oral at ICML 2024

  3. arXiv:2310.10772

    cs.SD cs.LG cs.MM eess.AS

    Unsupervised Lead Sheet Generation via Semantic Compression

    Authors: Zachary Novack, Nikita Srivatsan, Taylor Berg-Kirkpatrick, Julian McAuley

    Abstract: Lead sheets have become commonplace in generative music research, being used as an initial compressed representation for downstream tasks like multitrack music generation and automatic arrangement. Despite this, researchers have often fallen back on deterministic reduction methods (such as the skyline algorithm) to generate lead sheets when seeking paired lead sheets and full scores, with little a…

    Submitted 16 October, 2023; originally announced October 2023.

  4. arXiv:2302.02551

    cs.CV cs.LG

    CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets

    Authors: Zachary Novack, Julian McAuley, Zachary C. Lipton, Saurabh Garg

    Abstract: Open vocabulary models (e.g. CLIP) have shown strong performance on zero-shot classification through their ability to generate embeddings for each class based on their (natural language) names. Prior work has focused on improving the accuracy of these models through prompt engineering or by incorporating a small amount of labeled downstream data (via finetuning). However, there has been little focus…

    Submitted 31 May, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023

  5. arXiv:2211.15853

    cs.LG

    Disentangling the Mechanisms Behind Implicit Regularization in SGD

    Authors: Zachary Novack, Simran Kaur, Tanya Marwah, Saurabh Garg, Zachary C. Lipton

    Abstract: A number of competing hypotheses have been proposed to explain why small-batch Stochastic Gradient Descent (SGD) leads to improved generalization over the full-batch regime, with recent work crediting the implicit regularization of various quantities throughout training. However, to date, empirical evidence assessing the explanatory power of these hypotheses is lacking. In this paper, we conduct an…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted as Spotlight at the NeurIPS 2022 Workshop for Higher Order Optimization in Machine Learning