Showing 1–16 of 16 results for author: Williamson, M

Searching in archive cs.
  1. arXiv:2312.15821  [pdf, other]

    cs.SD cs.LG eess.AS

    Audiobox: Unified Audio Generation with Natural Language Prompts

    Authors: Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu

    Abstract: Audio is an essential part of our life, but creating it often requires expertise and is time-consuming. Research communities have made great progress over the past year advancing the performance of large scale audio generative models for a single modality (speech, sound, or music) through adopting more powerful generative models and scaling data. However, these models lack controllability in sever…

    Submitted 25 December, 2023; originally announced December 2023.

  2. arXiv:2312.08578  [pdf, other]

    cs.CV

    A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

    Authors: Jack Urbanek, Florian Bordes, Pietro Astolfi, Mary Williamson, Vasu Sharma, Adriana Romero-Soriano

    Abstract: Curation methods for massive vision-language datasets trade off between dataset size and quality. However, even the highest quality of available curated captions are far too short to capture the rich visual detail in an image. To show the value of dense and highly-aligned image-text pairs, we collect the Densely Captioned Images (DCI) dataset, containing 7805 natural images human-annotated with ma…

    Submitted 17 June, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

  3. arXiv:2312.05187  [pdf, other]

    cs.CL cs.SD eess.AS

    Seamless: Multilingual Expressive and Streaming Speech Translation

    Authors: Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Mark Duppenthaler, Paul-Ambroise Duquenne, Brian Ellis, Hady Elsahar, Justin Haaheim, John Hoffman, Min-Jae Hwang, Hirofumi Inaguma, Christopher Klaiber, Ilia Kulikov, Pengwei Li, Daniel Licht, Jean Maillard, Ruslan Mavlyutov, Alice Rakotoarison, Kaushik Ram Sadagopan, Abinesh Ramakrishnan, Tuan Tran, Guillaume Wenzek, et al. (40 additional authors not shown)

    Abstract: Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion. First, we contribute an improved version of the massively multilingual and multimodal SeamlessM4…

    Submitted 8 December, 2023; originally announced December 2023.

  4. arXiv:2306.15687  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale

    Authors: Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu

    Abstract: Large-scale generative models such as GPT and DALL-E have revolutionized the research community. These models not only generate high fidelity outputs, but are also generalists which can solve tasks not explicitly taught. In contrast, speech generative models are still primitive in terms of scale and task generalization. In this paper, we present Voicebox, the most versatile text-guided generative…

    Submitted 19 October, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: Accepted to NeurIPS 2023

  5. arXiv:2101.10384  [pdf, other]

    cs.RO cs.AI

    droidlet: modular, heterogenous, multi-modal agents

    Authors: Anurag Pratik, Soumith Chintala, Kavya Srinet, Dhiraj Gandhi, Rebecca Qian, Yuxuan Sun, Ryan Drew, Sara Elkafrawy, Anoushka Tiwari, Tucker Hart, Mary Williamson, Abhinav Gupta, Arthur Szlam

    Abstract: In recent years, there have been significant advances in building end-to-end Machine Learning (ML) systems that learn at scale. But most of these systems are: (a) isolated (perception, speech, or language only); (b) trained on static datasets. On the other hand, in the field of robotics, large-scale learning has always been difficult. Supervision is hard to gather and real world physical interacti…

    Submitted 25 January, 2021; originally announced January 2021.

  6. arXiv:2101.00390  [pdf, other]

    cs.CL eess.AS

    VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

    Authors: Changhan Wang, Morgane Rivière, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux

    Abstract: We introduce VoxPopuli, a large-scale multilingual corpus providing 100K hours of unlabelled speech data in 23 languages. It is the largest open data to date for unsupervised representation learning as well as semi-supervised learning. VoxPopuli also contains 1.8K hours of transcribed speeches in 16 languages and their aligned oral interpretations into 5 other languages totaling 5.1K hours. We pro…

    Submitted 27 July, 2021; v1 submitted 2 January, 2021; originally announced January 2021.

    Comments: Accepted to ACL 2021 (long paper)

  7. arXiv:2012.13391  [pdf, other]

    cs.CL cs.AI cs.LG

    I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling

    Authors: Yixin Nie, Mary Williamson, Mohit Bansal, Douwe Kiela, Jason Weston

    Abstract: To quantify how well natural language understanding models can capture consistency in a general conversation, we introduce the DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues. We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with…

    Submitted 28 December, 2020; v1 submitted 24 December, 2020; originally announced December 2020.

    Comments: 15 pages

  8. arXiv:2011.08298  [pdf, ps, other]

    cs.CL

    Facebook AI's WMT20 News Translation Task Submission

    Authors: Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, Mary Williamson, Jiatao Gu

    Abstract: This paper describes Facebook AI's submission to WMT20 shared news translation task. We focus on the low resource setting and participate in two language pairs, Tamil <-> English and Inuktitut <-> English, where there are limited out-of-domain bitext and monolingual data. We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the ta…

    Submitted 16 November, 2020; originally announced November 2020.

  9. arXiv:2007.01868  [pdf, other]

    astro-ph.IM cs.LG

    Dalek -- a deep-learning emulator for TARDIS

    Authors: Wolfgang E. Kerzendorf, Christian Vogl, Johannes Buchner, Gabriella Contardo, Marc Williamson, Patrick van der Smagt

    Abstract: Supernova spectral time series contain a wealth of information about the progenitor and explosion process of these energetic events. The modeling of these data requires the exploration of very high dimensional posterior probabilities with expensive radiative transfer codes. Even modest parametrizations of supernovae contain more than ten parameters and a detailed exploration demands at least sever…

    Submitted 3 July, 2020; originally announced July 2020.

    Comments: 6 pages; 5 figures; submitted to AAS Journals. Constructive criticism invited

  10. arXiv:2006.12442  [pdf, other]

    cs.CL cs.AI

    Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions

    Authors: Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson

    Abstract: We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet. We present a biased view, focusing on work done by our own group, while citing related work in each area. In particular, we discuss in detail the properties of cont…

    Submitted 13 July, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

  11. arXiv:2004.13637  [pdf, other]

    cs.CL cs.AI

    Recipes for building an open-domain chatbot

    Authors: Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston

    Abstract: Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of skills that an expert conversationalist blends in a…

    Submitted 30 April, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

  12. arXiv:2004.08449  [pdf, other]

    cs.CL

    Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills

    Authors: Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-Lan Boureau

    Abstract: Being engaging, knowledgeable, and empathetic are all desirable general qualities in a conversational agent. Previous work has introduced tasks and datasets that aim to help agents to learn those qualities in isolation and gauge how well they can express them. But rather than being specialized in one single quality, a good open-domain conversational agent should be able to seamlessly blend them al…

    Submitted 17 April, 2020; originally announced April 2020.

    Comments: Accepted to ACL 2020 (long paper)

  13. arXiv:1603.08627  [pdf, ps, other]

    cs.DS

    On the Shoshan-Zwick Algorithm for the All-Pairs Shortest Path Problem

    Authors: Pavlos Eirinakis, Matthew Williamson, K. Subramani

    Abstract: The Shoshan-Zwick algorithm solves the all pairs shortest paths problem in undirected graphs with integer edge costs in the range $\{1, 2, \dots, M\}$. It runs in $\tilde{O}(M\cdot n^{\omega})$ time, where $n$ is the number of vertices, $M$ is the largest integer edge cost, and $\omega < 2.3727$ is the exponent of matrix multiplication. It is the fastest known algorithm for this problem. This paper points out…

    Submitted 28 March, 2016; originally announced March 2016.

    Comments: 16 pages

  14. arXiv:1301.3902  [pdf]

    cs.AI stat.AP stat.ME

    Model Criticism of Bayesian Networks with Latent Variables

    Authors: David M. Williamson, Russell Almond, Robert Mislevy

    Abstract: The application of Bayesian networks (BNs) to cognitive assessment and intelligent tutoring systems poses new challenges for model construction. When cognitive task analyses suggest constructing a BN with several latent variables, empirical model criticism of the latent structure becomes both critical and complex. This paper introduces a methodology for criticizing models both globally (a BN in…

    Submitted 16 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000)

    Report number: UAI-P-2000-PG-634-643

  15. arXiv:0805.1844  [pdf, other]

    quant-ph cs.IT math.NA

    Practical recipes for the model order reduction, dynamical simulation, and compressive sampling of large-scale open quantum systems

    Authors: John A. Sidles, Joseph L. Garbini, Lee E. Harrell, Alfred O. Hero, Jonathan P. Jacky, Joseph R. Malcomb, Anthony G. Norman, Austin M. Williamson

    Abstract: This article presents numerical recipes for simulating high-temperature and non-equilibrium quantum spin systems that are continuously measured and controlled. The notion of a spin system is broadly conceived, in order to encompass macroscopic test masses as the limiting case of large-j spins. The simulation technique has three stages: first the deliberate introduction of noise into the simulati…

    Submitted 13 May, 2008; originally announced May 2008.

    Comments: 104 pages, 13 figures, 2 tables

  16. arXiv:cs/0407048  [pdf, ps, other]

    cs.NI cs.CY

    Technological networks and the spread of computer viruses

    Authors: Justin Balthrop, Stephanie Forrest, M. E. J. Newman, Matthew M. Williamson

    Abstract: Computer infections such as viruses and worms spread over networks of contacts between computers, with different types of networks being exploited by different types of infections. Here we analyze the structures of several of these networks, exploring their implications for modes of spread and the control of infection. We argue that vaccination strategies that focus on a limited number of networ…

    Submitted 19 July, 2004; originally announced July 2004.

    Comments: 9 pages, 1 figure

    Journal ref: Science 304, 527-529 (2004)