Skip to main content

Showing 1–1 of 1 results for author: Adkins, R

Searching in archive cs. Search in all archives.
.
  1. arXiv:2312.15821  [pdf, other

    cs.SD cs.LG eess.AS

    Audiobox: Unified Audio Generation with Natural Language Prompts

    Authors: Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu

    Abstract: Audio is an essential part of our life, but creating it often requires expertise and is time-consuming. Research communities have made great progress over the past year advancing the performance of large scale audio generative models for a single modality (speech, sound, or music) through adopting more powerful generative models and scaling data. However, these models lack controllability in sever… ▽ More

    Submitted 25 December, 2023; originally announced December 2023.