Showing 1–4 of 4 results for author: Havard, W N

Searching in archive cs.
  1. arXiv:2006.08387

    cs.CL

    Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech

    Authors: William N. Havard, Jean-Pierre Chevrot, Laurent Besacier

    Abstract: The language acquisition literature shows that children do not build their lexicon by segmenting the spoken input into phonemes and then building up words from them, but rather adopt a top-down approach and start by segmenting word-like units and then break them down into smaller units. This suggests that the ideal way of learning a language is by starting from full semantic units. In this paper,…

    Submitted 20 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: Accepted at CoNLL20

  2. arXiv:1909.08491

    cs.CL cs.LG

    Word Recognition, Competition, and Activation in a Model of Visually Grounded Speech

    Authors: William N. Havard, Jean-Pierre Chevrot, Laurent Besacier

    Abstract: In this paper, we study how word-like units are represented and activated in a recurrent neural model of visually grounded speech. The model used in our experiments is trained to project an image and its spoken description in a common representation space. We show that a recurrent model trained on spoken sentences implicitly segments its input into word-like units and reliably maps them to their c…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: Accepted at CoNLL2019

  3. arXiv:1907.12895

    cs.CL

    MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible

    Authors: Marcely Zanon Boito, William N. Havard, Mahault Garnerin, Éric Le Ferrand, Laurent Besacier

    Abstract: The CMU Wilderness Multilingual Speech Dataset (Black, 2019) is a newly published multilingual speech dataset based on recorded readings of the New Testament. It provides data to build Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) models for potentially 700 languages. However, the fact that the source content (the Bible) is the same for all the languages is not exploited to date. Ther…

    Submitted 26 February, 2020; v1 submitted 30 July, 2019; originally announced July 2019.

    Comments: Accepted to LREC2020

  4. arXiv:1902.03052

    cs.CL

    Models of Visually Grounded Speech Signal Pay Attention To Nouns: a Bilingual Experiment on English and Japanese

    Authors: William N. Havard, Jean-Pierre Chevrot, Laurent Besacier

    Abstract: We investigate the behaviour of attention in neural models of visually grounded speech trained on two languages: English and Japanese. Experimental results show that attention focuses on nouns and this behaviour holds true for two very typologically different languages. We also draw parallels between artificial neural attention and human attention and show that neural attention focuses on word end…

    Submitted 8 February, 2019; originally announced February 2019.

    Comments: 5 pages, 3 figures, accepted at ICASSP2019