Showing 1–6 of 6 results for author: Liberman, M

  1. arXiv:2204.04579

    cs.SD eess.AS

    Inferring Pitch from Coarse Spectral Features

    Authors: Danni Ma, Neville Ryant, Mark Liberman

    Abstract: Fundamental frequency (F0) has long been treated as the physical definition of "pitch" in phonetic analysis. But there have been many demonstrations that F0 is at best an approximation to pitch, both in production and in perception: pitch is not F0, and F0 is not pitch. Changes in the pitch involve many articulatory and acoustic covariates; pitch perception often deviates from what F0 analysis pre…

    Submitted 26 August, 2022; v1 submitted 9 April, 2022; originally announced April 2022.

  2. arXiv:2012.01477

    eess.AS cs.SD

    The Third DIHARD Diarization Challenge

    Authors: Neville Ryant, Prachi Singh, Venkat Krishnamohan, Rajat Varma, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Liberman

    Abstract: DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain. Speaker diarization was evaluated under two speech activity conditions (diarization from a reference speech activity vs. diarization from scratch) and 11 diverse domains. The domains span…

    Submitted 5 April, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

    Comments: arXiv admin note: text overlap with arXiv:1906.07839

  3. arXiv:2011.12649

    cs.CL eess.AS

    Neural Representations for Modeling Variation in Speech

    Authors: Martijn Bartelds, Wietse de Vries, Faraz Sanal, Caitlin Richter, Mark Liberman, Martijn Wieling

    Abstract: Variation in speech is often quantified by comparing phonetic transcriptions of the same utterance. However, manually transcribing speech is time-consuming and error prone. As an alternative, therefore, we investigate the extraction of acoustic embeddings from several self-supervised neural models. We use these representations to compute word-based pronunciation differences between non-native and…

    Submitted 26 January, 2022; v1 submitted 25 November, 2020; originally announced November 2020.

    Comments: Submitted to Journal of Phonetics

  4. arXiv:2010.13007

    eess.AS cs.SD

    Probing Acoustic Representations for Phonetic Properties

    Authors: Danni Ma, Neville Ryant, Mark Liberman

    Abstract: Pre-trained acoustic representations such as wav2vec and DeCoAR have attained impressive word error rates (WER) for speech recognition benchmarks, particularly when labeled data is limited. But little is known about what phonetic properties these various representations acquire, and how well they encode transferable features of speech. We compare features from two conventional and four pre-trained…

    Submitted 14 February, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

  5. arXiv:2006.05815

    eess.AS cs.SD

    Third DIHARD Challenge Evaluation Plan

    Authors: Neville Ryant, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Liberman

    Abstract: This paper introduces the third DIHARD challenge, the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain. The challenge comprises two tracks evaluating diarization performance when starting from a reference speech segmentation (track 1) and diarization from ra…

    Submitted 2 December, 2020; v1 submitted 4 June, 2020; originally announced June 2020.

    Comments: Version 1.2: planned schedule updated; updated numbers in tables from final versions of development/evaluation sets; corrected typo

  6. arXiv:1906.07839

    eess.AS cs.CL

    The Second DIHARD Diarization Challenge: Dataset, task, and baselines

    Authors: Neville Ryant, Kenneth Church, Christopher Cieri, Alejandrina Cristia, Jun Du, Sriram Ganapathy, Mark Liberman

    Abstract: This paper introduces the second DIHARD challenge, the second in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain. The challenge comprises four tracks evaluating diarization performance under two input conditions (single channel vs. multi-channel) and two segmentatio…

    Submitted 18 June, 2019; originally announced June 2019.

    Comments: Accepted by Interspeech 2019