
Showing 1–1 of 1 results for author: Yadala, N

  1. arXiv:2406.00626  [pdf, other]

    cs.MM cs.SD eess.AS

    Intelligent Text-Conditioned Music Generation

    Authors: Zhouyao Xie, Nikhil Yadala, Xinyi Chen, Jing Xi Liu

    Abstract: CLIP (Contrastive Language-Image Pre-Training) is a multimodal neural network trained on (text, image) pairs to predict the most relevant text caption given an image. It has been used extensively in image generation by connecting its output with a generative model such as VQGAN, with the most notable example being OpenAI's DALL-E 2. In this project, we apply a similar approach to bridge the gap bet…

    Submitted 2 June, 2024; originally announced June 2024.