Showing 1–3 of 3 results for author: Wang, W Y

Searching in archive eess.

  1. arXiv:2403.11092

    cs.CL cs.AI cs.CV cs.CY eess.IV

    Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts

    Authors: Michael Saxon, Yiran Luo, Sharon Levy, Chitta Baral, Yezhou Yang, William Yang Wang

    Abstract: Benchmarks of the multilingual capabilities of text-to-image (T2I) models compare generated images prompted in a test language to an expected image distribution over a concept set. One such benchmark, "Conceptual Coverage Across Languages" (CoCo-CroLa), assesses the tangible noun inventory of T2I models by prompting them to generate pictures from a concept list translated to seven languages and co…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: NAACL 2024 Main Conference

  2. arXiv:2306.01735

    cs.CL cs.AI cs.CV eess.IV

    Multilingual Conceptual Coverage in Text-to-Image Models

    Authors: Michael Saxon, William Yang Wang

    Abstract: We propose "Conceptual Coverage Across Languages" (CoCo-CroLa), a technique for benchmarking the degree to which any generative text-to-image system provides multilingual parity to its training language in terms of tangible nouns. For each model we can assess "conceptual coverage" of a given target language relative to a source language by comparing the population of images generated for a series…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: ACL 2023 main conference; 16 pages, 13 figures

  3. arXiv:2305.10684

    eess.AS cs.SD

    Data Augmentation for Diverse Voice Conversion in Noisy Environments

    Authors: Avani Tanna, Michael Saxon, Amr El Abbadi, William Yang Wang

    Abstract: Voice conversion (VC) models have demonstrated impressive few-shot conversion quality on the clean, native speech populations they're trained on. However, when source or target speech accents, background noise conditions, or microphone characteristics differ from training, quality voice conversion is not guaranteed. These problems are often left unexamined in VC research, giving rise to frustratio…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: Interspeech 2023 Show and Tell, 2 pages