Showing 1–11 of 11 results for author: Wieling, M

  1. arXiv:2406.06208  [pdf, other]

    cs.SD eess.AS

    Quantifying the effect of speech pathology on automatic and human speaker verification

    Authors: Bence Mark Halpern, Thomas Tienkamp, Wen-Chin Huang, Lester Phillip Violeta, Teja Rebernik, Sebastiaan de Visscher, Max Witjes, Martijn Wieling, Defne Abur, Tomoki Toda

    Abstract: This study investigates how surgical intervention for speech pathology (specifically, as a result of oral cancer surgery) impacts the performance of an automatic speaker verification (ASV) system. Using two recently collected Dutch datasets with parallel pre- and post-surgery audio from the same speaker, NKI-OC-VC and SPOKE, we assess the extent to which speech pathology influences ASV performance,…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 5 pages, 2 figures, 2 tables. Accepted to Interspeech 2024

    ACM Class: I.2.7

  2. arXiv:2401.16429  [pdf, ps, other]

    cs.IR cs.CL cs.DL cs.LG

    Combining topic modelling and citation network analysis to study case law from the European Court on Human Rights on the right to respect for private and family life

    Authors: M. Mohammadi, L. M. Bruijn, M. Wieling, M. Vols

    Abstract: As legal case law databases such as HUDOC continue to grow rapidly, it has become essential for legal researchers to find efficient methods to handle such large-scale data sets. Such case law databases usually consist of the textual content of cases together with the citations between them. This paper focuses on case law from the European Court of Human Rights on Article 8 of the European Conventi…

    Submitted 19 January, 2024; originally announced January 2024.

  3. arXiv:2305.13026  [pdf, ps, other]

    cs.CL

    DUMB: A Benchmark for Smart Evaluation of Dutch Models

    Authors: Wietse de Vries, Martijn Wieling, Malvina Nissim

    Abstract: We introduce the Dutch Model Benchmark: DUMB. The benchmark includes a diverse set of datasets for low-, medium- and high-resource tasks. The total set of nine tasks includes four tasks that were previously not available in Dutch. Instead of relying on a mean score across tasks, we propose Relative Error Reduction (RER), which compares the DUMB performance of language models to a strong baseline w…

    Submitted 13 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 camera-ready

  4. arXiv:2305.10951  [pdf, other]

    cs.CL eess.AS

    Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation

    Authors: Martijn Bartelds, Nay San, Bradley McDonnell, Dan Jurafsky, Martijn Wieling

    Abstract: The performance of automatic speech recognition (ASR) systems has advanced substantially in recent years, particularly for languages for which a large amount of transcribed speech is available. Unfortunately, for low-resource languages, such as minority languages, regional languages or dialects, ASR performance generally remains much lower. In this study, we investigate whether data augmentation t…

    Submitted 18 May, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  5. arXiv:2209.06521  [pdf]

    physics.med-ph cs.CL

    Preregistered protocol for: Articulatory changes in speech following treatment for oral or oropharyngeal cancer: a systematic review

    Authors: Thomas B. Tienkamp, Teja Rebernik, Defne Abur, Rob J. J. H. van Son, Sebastiaan A. H. J. de Visscher, Max J. H. Witjes, Martijn Wieling

    Abstract: This document outlines a PROSPERO pre-registered protocol for a systematic review regarding articulatory changes in speech following oral or oropharyngeal cancer treatment. Treatment of tumours in the oral cavity may result in physiological changes that could lead to articulatory difficulties. The tongue becomes less mobile due to scar tissue and/or potential (postoperative) radiation therapy. Mor…

    Submitted 14 September, 2022; originally announced September 2022.

  6. arXiv:2205.02694  [pdf, other]

    cs.CL cs.SD eess.AS

    Quantifying Language Variation Acoustically with Few Resources

    Authors: Martijn Bartelds, Martijn Wieling

    Abstract: Deep acoustic models represent linguistic information based on massive amounts of data. Unfortunately, for regional languages and dialects such resources are mostly not available. However, deep acoustic models might have learned linguistic information that transfers to low-resource languages. In this study, we evaluate whether this is the case through the task of distinguishing low-resource (Dutch…

    Submitted 25 May, 2022; v1 submitted 5 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022

  7. arXiv:2203.17072  [pdf, other]

    cs.SD cs.CL eess.AS

    Manipulation of oral cancer speech using neural articulatory synthesis

    Authors: Bence Mark Halpern, Teja Rebernik, Thomas Tienkamp, Rob van Son, Michiel van den Brekel, Martijn Wieling, Max Witjes, Odette Scharenborg

    Abstract: We present an articulatory synthesis framework for the synthesis and manipulation of oral cancer speech for clinical decision making and alleviation of patient stress. Objective and subjective evaluations demonstrate that the framework has acceptable naturalness and is worth further investigation. A subsequent subjective vowel and consonant identification experiment showed that the articulatory sy…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: 5 pages, 4 tables, 1 figure. Submitted to Interspeech 2022

  8. arXiv:2110.07918  [pdf, other]

    cs.CL

    Estimating the Level and Direction of Phonetic Dialect Change in the Northern Netherlands

    Authors: Raoul Buurke, Hedwig Sekeres, Wilbert Heeringa, Remco Knooihuizen, Martijn Wieling

    Abstract: This article reports ongoing investigations into phonetic change of dialect groups in the northern Netherlandic language area, particularly the Frisian and Low Saxon dialect groups, which are known to differ in vitality. To achieve this, we combine existing phonetically transcribed corpora with dialectometric approaches that allow us to quantify change among older male dialect speakers in a real-t…

    Submitted 15 October, 2021; originally announced October 2021.

    Comments: Submitted to Taal & Tongval

  9. Adapting Monolingual Models: Data can be Scarce when Language Similarity is High

    Authors: Wietse de Vries, Martijn Bartelds, Malvina Nissim, Martijn Wieling

    Abstract: For many (minority) languages, the resources needed to train large models are not available. We investigate the performance of zero-shot transfer learning with as little data as possible, and the influence of language similarity in this process. We retrain the lexical layers of four BERT-based models using data from two low-resource target language varieties, while the Transformer layers are indep…

    Submitted 22 May, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

    Comments: Findings of ACL 2021 Camera Ready

    Journal ref: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

  10. arXiv:2012.10301  [pdf, ps, other]

    cs.LG cs.CY

    The Danger of Reverse-Engineering of Automated Judicial Decision-Making Systems

    Authors: Masha Medvedeva, Martijn Wieling, Michel Vols

    Abstract: In this paper we discuss the implications of using machine learning for judicial decision-making in situations where human rights may be infringed. We argue that the use of such tools in these situations should be limited due to inherent status quo bias and dangers of reverse-engineering. We discuss that these issues already exist in the judicial systems without using machine learning tools, but h…

    Submitted 18 December, 2020; originally announced December 2020.

  11. arXiv:2011.12649  [pdf, other]

    cs.CL eess.AS

    Neural Representations for Modeling Variation in Speech

    Authors: Martijn Bartelds, Wietse de Vries, Faraz Sanal, Caitlin Richter, Mark Liberman, Martijn Wieling

    Abstract: Variation in speech is often quantified by comparing phonetic transcriptions of the same utterance. However, manually transcribing speech is time-consuming and error prone. As an alternative, therefore, we investigate the extraction of acoustic embeddings from several self-supervised neural models. We use these representations to compute word-based pronunciation differences between non-native and…

    Submitted 26 January, 2022; v1 submitted 25 November, 2020; originally announced November 2020.

    Comments: Submitted to Journal of Phonetics