Showing 1–2 of 2 results for author: Darefsky, J

  1. arXiv:2402.06986  [pdf, other]

    cs.SD eess.AS

    Cacophony: An Improved Contrastive Audio-Text Model

    Authors: Ge Zhu, Jordan Darefsky, Zhiyao Duan

    Abstract: Despite recent advancements in audio-text modeling, audio-text contrastive models still lag behind their image-text counterparts in scale and performance. We propose a method to improve both the scale and the training of audio-text contrastive models. Specifically, we craft a large-scale audio-text dataset containing 13,000 hours of text-labeled audio, using pretrained language models to process n…

    Submitted 29 April, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: Work in Progress

  2. arXiv:2204.09079  [pdf, other]

    eess.AS cs.SD eess.SP

    Music Source Separation with Generative Flow

    Authors: Ge Zhu, Jordan Darefsky, Fei Jiang, Anton Selitskiy, Zhiyao Duan

    Abstract: Fully-supervised models for source separation are trained on parallel mixture-source data and are currently state-of-the-art. However, such parallel data is often difficult to obtain, and it is cumbersome to adapt trained models to mixtures with new sources. Source-only supervised models, in contrast, only require individual source data for training. In this paper, we first leverage flow-based gen…

    Submitted 16 October, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: Accepted by Signal Processing Letters