Search | arXiv e-print repository

Audiobox: Unified Audio Generation with Natural Language Prompts

Authors: Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu

Abstract: Audio is an essential part of our life, but creating it often requires expertise and is time-consuming. Research communities have made great progress over the past year advancing the performance of large scale audio generative models for a single modality (speech, sound, or music) through adopting more powerful generative models and scaling data. However, these models lack controllability in several aspects: speech generation models cannot synthesize novel styles based on text description and are limited on domain coverage such as outdoor environments; sound generation models only provide coarse-grained control based on descriptions like "a person speaking" and would only generate mumbling human voices. This paper presents Audiobox, a unified model based on flow-matching that is capable of generating various audio modalities. We design description-based and example-based prompting to enhance controllability and unify speech and sound generation paradigms. We allow transcript, vocal, and other audio styles to be controlled independently when generating speech. To improve model generalization with limited labels, we adapt a self-supervised infilling objective to pre-train on large quantities of unlabeled audio. Audiobox sets new benchmarks on speech and sound generation (0.745 similarity on Librispeech for zero-shot TTS; 0.77 FAD on AudioCaps for text-to-sound) and unlocks new methods for generating audio with novel vocal and acoustic styles. We further integrate Bespoke Solvers, which speeds up generation by over 25 times compared to the default ODE solver for flow-matching, without loss of performance on several tasks. Our demo is available at https://audiobox.metademolab.com/

Submitted 25 December, 2023; originally announced December 2023.

arXiv:2209.14914 [pdf, ps, other]

Quantum invariants for the graph isomorphism problem

Authors: Hernán I. de la Cruz, Fernando L. Pelayo, Vicente Pascual, Jose J. Paulet, Fernando Cuartero, Luis Llana, Mauro Mezzini

Abstract: Graph Isomorphism is such an important problem in computer science, that it has been widely studied over the last decades. It is well known that it belongs to NP class, but is not NP-complete. It is thought to be of comparable difficulty to integer factorisation. The best known proved algorithm to solve this problem in general, was proposed by László Babai and Eugene Luks in 1983. Recently, there has been some research in the topic by using quantum computing, that also leads the present piece of research. In fact, we present a quantum computing algorithm that defines an invariant over Graph Isomorphism characterisation. This quantum algorithm is able to distinguish more non-isomorphic graphs than most of the known invariants so far. The proof of correctness and some hints illustrating the extent and reason of the improvement are also included in this paper.

Submitted 5 October, 2022; v1 submitted 29 September, 2022; originally announced September 2022.

arXiv:1912.12914 [pdf]

'Alexa, Do You Know Anything?' The Impact of an Intelligent Assistant on Team Interactions and Creative Performance Under Time Scarcity

Authors: Sonia Jawaid Shaikh, Ignacio Cruz

Abstract: Human-AI collaboration is on the rise with the deployment of AI-enabled intelligent assistants (e.g. Amazon Echo, Cortana, Siri, etc.) across organizational contexts. It is claimed that intelligent assistants can help people achieve more in less time (Personal Digital Assistant - Cortana, n.d.). However, despite the increasing presence of intelligent assistants in collaborative settings, there is a void in the literature on how the deployment of this technology intersects with time scarcity to impact team behaviors and performance. To fill this gap in the literature, we collected behavioral data from 56 teams who participated in a between-subjects 2 (Intelligent Assistant: Available vs. Not Available) x 2 (Time: Scarce vs. Not Scarce/Control) lab experiment. The results show that teams with an intelligent assistant had significantly fewer interactions between its members compared to teams without an intelligent assistant. Teams who faced time scarcity also used the intelligent assistant more often to seek its assistance during task completion compared to those in the control condition. Lastly, teams with an intelligent assistant underperformed on a creative task compared to those without the device. We discuss implications of this technology from theoretical, empirical, and practical perspectives.

Submitted 30 December, 2019; originally announced December 2019.

Comments: 29 pages, 1 figure

arXiv:1903.01589 [pdf, other]

Albatross: An optimistic consensus algorithm

Authors: Pascal Berrang, Inês Cruz, Bruno França, Philipp von Styp-Rekowsky, Marvin Wissfeld

Abstract: The consensus protocol is a critical component of distributed ledgers and blockchains. Achieving consensus over a decentralized network poses challenges to transaction finality and performance. Currently, the highest-performing consensus algorithms are speculative BFT algorithms, which, however, compromise on the transaction finality guarantees offered by their non-speculative counterparts. In this paper, we introduce Albatross, a Proof-of-Stake (PoS) blockchain consensus algorithm that aims to combine the best of both worlds. At its heart, Albatross is a high-performing, speculative BFT algorithm that offers strong probabilistic finality. We complement this by periodically guaranteeing finality through the Tendermint protocol. We prove our protocol to be secure under standard BFT assumptions and analyze its performance both on a theoretical and practical level. For that, we provide an open-source Rust implementation of Albatross. Our real-world measurements support that our protocol has a performance close to the theoretical maximum for single-chain Proof-of-Stake consensus algorithms.

Submitted 22 August, 2023; v1 submitted 4 March, 2019; originally announced March 2019.

arXiv:1710.04144 [pdf, other]

GUIDES - Geospatial Urban Infrastructure Data Engineering Solutions

Authors: Booma Sowkarthiga Balasubramani, Omar Belingheri, Eric S. Boria, Isabel F. Cruz, Sybil Derrible, Michael D. Siciliano

Abstract: As the underground infrastructure systems of cities age, maintenance and repair become an increasing concern. Cities face difficulties in planning maintenance, predicting and responding to infrastructure related issues, and in realizing their vision to be a smart city due to their incomplete understanding of the existing state of the infrastructure. Only few cities have accurate and complete digital information on their underground infrastructure (e.g., electricity, water, natural gas) systems, which poses problems to those planning and performing construction projects. To address these issues, we introduce GUIDES as a new data conversion and management framework for urban underground infrastructure systems that enable city administrators, workers, and contractors along with the general public and other users to query digitized and integrated data to make smarter decisions. This demo paper presents the GUIDES architecture and describes two of its central components: (i) mapping of underground infrastructure systems, and (ii) integration of heterogeneous geospatial data.

Submitted 11 October, 2017; originally announced October 2017.

Comments: 4 pages, SIGSPATIAL'17, November 7-10, 2017, Los Angeles Area, CA, USA

arXiv:1212.1625 [pdf, ps, other]

Testing the AgreementMaker System in the Anatomy Task of OAEI 2012

Authors: Daniel Faria, Catia Pesquita, Emanuel Santos, Francisco M. Couto, Cosmin Stroe, Isabel F. Cruz

Abstract: The AgreementMaker system was the leading system in the anatomy task of the Ontology Alignment Evaluation Initiative (OAEI) competition in 2011. While AgreementMaker did not compete in OAEI 2012, here we report on its performance in the 2012 anatomy task, using the same configurations of AgreementMaker submitted to OAEI 2011. Additionally, we also test AgreementMaker using an updated version of the UBERON ontology as a mediating ontology, and otherwise identical configurations. AgreementMaker achieved an F-measure of 91.8% with the 2011 configurations, and an F-measure of 92.2% with the updated UBERON ontology. Thus, AgreementMaker would have been the second best system had it competed in the anatomy task of OAEI 2012, and only 0.1% below the F-measure of the best system.

Submitted 7 December, 2012; originally announced December 2012.

Comments: 4 pages, 2 tables

Showing 1–6 of 6 results for author: Cruz, I