Showing 1–8 of 8 results for author: Matusevych, Y

  1. arXiv:2403.13922

    cs.CL eess.AS

    Visually Grounded Speech Models have a Mutual Exclusivity Bias

    Authors: Leanne Nortje, Dan Oneaţă, Yevgen Matusevych, Herman Kamper

    Abstract: When children learn new words, they employ constraints such as the mutual exclusivity (ME) bias: a novel word is mapped to a novel object rather than a familiar one. This bias has been studied computationally, but only in models that use discrete word representations as input, ignoring the high variability of spoken words. We investigate the ME bias in the context of visually grounded speech model…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted to TACL, pre-MIT Press publication version

  2. arXiv:2103.10731

    cs.CL eess.AS

    Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation

    Authors: Christiaan Jacobs, Yevgen Matusevych, Herman Kamper

    Abstract: Acoustic word embeddings (AWEs) are fixed-dimensional representations of variable-length speech segments. For zero-resource languages where labelled data is not available, one AWE approach is to use unsupervised autoencoder-based recurrent models. Another recent approach is to use multilingual transfer: a supervised AWE model is trained on several well-resourced languages and then applied to an un…

    Submitted 19 March, 2021; originally announced March 2021.

    Comments: Accepted to SLT 2021

  3. arXiv:2101.11332

    cs.CL

    A phonetic model of non-native spoken word processing

    Authors: Yevgen Matusevych, Herman Kamper, Thomas Schatz, Naomi H. Feldman, Sharon Goldwater

    Abstract: Non-native speakers show difficulties with spoken word processing. Many studies attribute these difficulties to imprecise phonological encoding of words in the lexical memory. We test an alternative hypothesis: that some of these difficulties can arise from the non-native speakers' phonetic perception. We train a computational model of phonetic learning, which has no access to phonology, on either…

    Submitted 11 March, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: Accepted for publication in Proceedings of EACL-2021. 11 pages, 5 figures, 2 tables

  4. arXiv:2008.02888

    cs.CL cs.SD eess.AS

    Evaluating computational models of infant phonetic learning across languages

    Authors: Yevgen Matusevych, Thomas Schatz, Herman Kamper, Naomi H. Feldman, Sharon Goldwater

    Abstract: In the first year of life, infants' speech perception becomes attuned to the sounds of their native language. Many accounts of this early phonetic learning exist, but computational models predicting the attunement patterns observed in infants from the speech input they hear have been lacking. A recent study presented the first such model, drawing on algorithms proposed for unsupervised learning fr…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: 7 pages, 1 figure

    Journal ref: 2020. In S. Denison, M. Mack, Y. Xu, and B. Armstrong (Eds.), Proceedings of the 42nd Annual Conference of the Cognitive Science Society (pp. 571-577). Austin, TX: Cognitive Science Society

  5. arXiv:2006.02295

    cs.CL cs.SD eess.AS

    Improved acoustic word embeddings for zero-resource languages using multilingual transfer

    Authors: Herman Kamper, Yevgen Matusevych, Sharon Goldwater

    Abstract: Acoustic word embeddings are fixed-dimensional representations of variable-length speech segments. Such embeddings can form the basis for speech search, indexing and discovery systems when conventional speech recognition is not possible. In zero-resource settings where unlabelled speech is the only available resource, we need a method that gives robust embeddings on an arbitrary language. Here we…

    Submitted 5 February, 2021; v1 submitted 2 June, 2020; originally announced June 2020.

    Comments: 11 pages, 7 figures, 8 tables. arXiv admin note: text overlap with arXiv:2002.02109. Submitted to the IEEE Transactions on Audio, Speech and Language Processing

  6. arXiv:2004.01647

    cs.CL

    Analyzing autoencoder-based acoustic word embeddings

    Authors: Yevgen Matusevych, Herman Kamper, Sharon Goldwater

    Abstract: Recent studies have introduced methods for learning acoustic word embeddings (AWEs): fixed-size vector representations of words which encode their acoustic features. Despite the widespread use of AWEs in speech processing research, they have only been evaluated quantitatively in their ability to discriminate between whole word tokens. To better understand the applications of AWEs in various downs…

    Submitted 3 April, 2020; originally announced April 2020.

    Comments: 6 pages, 7 figures, accepted to BAICS workshop (ICLR2020)

  7. arXiv:2002.02109

    cs.CL eess.AS

    Multilingual acoustic word embedding models for processing zero-resource languages

    Authors: Herman Kamper, Yevgen Matusevych, Sharon Goldwater

    Abstract: Acoustic word embeddings are fixed-dimensional representations of variable-length speech segments. In settings where unlabelled speech is the only available resource, such embeddings can be used in "zero-resource" speech search, indexing and discovery systems. Here we propose to train a single supervised embedding model on labelled data from multiple well-resourced languages and then apply it to u…

    Submitted 21 February, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: 5 pages, 4 figures, 1 table; accepted to ICASSP 2020. arXiv admin note: text overlap with arXiv:1811.00403

  8. arXiv:1906.01280

    cs.CL

    Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection

    Authors: Maria Corkery, Yevgen Matusevych, Sharon Goldwater

    Abstract: The cognitive mechanisms needed to account for the English past tense have long been a subject of debate in linguistics and cognitive science. Neural network models were proposed early on, but were shown to have clear flaws. Recently, however, Kirov and Cotterell (2018) showed that modern encoder-decoder (ED) models overcome many of these flaws. They also presented evidence that ED models demonstr…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: Accepted at ACL 2019