Showing 1–6 of 6 results for author: Doukhan, D

Searching in archive eess.
  1. arXiv:2406.10316  [pdf, ps, other]

    eess.AS cs.CY cs.MM cs.SD

    Gender Representation in TV and Radio: Automatic Information Extraction methods versus Manual Analyses

    Authors: David Doukhan, Lena Dodson, Manon Conan, Valentin Pelloin, Aurélien Clamouse, Mélina Lepape, Géraldine Van Hille, Cécile Méadel, Marlène Coulomb-Gully

    Abstract: This study investigates the relationship between automatic information extraction descriptors and manual analyses to describe gender representation disparities in TV and Radio. Automatic descriptors, including speech time, facial categorization and speech transcriptions are compared with channel reports on a vast 32,000-hour corpus of French broadcasts from 2023. Findings reveal systemic gender im…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Keywords: Gender representation, computational humanities, TV, Radio, face classification, speaker traits, ASR, media, SLU. Accepted to InterSpeech 2024, Kos Island, Greece, September 2024

  2. arXiv:2406.10073  [pdf, other]

    eess.AS cs.CL cs.HC cs.SD

    Detecting the terminality of speech-turn boundary for spoken interactions in French TV and Radio content

    Authors: Rémi Uro, Marie Tahon, David Doukhan, Antoine Laurent, Albert Rilliard

    Abstract: Transition Relevance Places are defined as the end of an utterance where the interlocutor may take the floor without interrupting the current speaker, i.e., a place where the turn is terminal. Analyzing turn terminality is useful to study the dynamics of turn-taking in spontaneous conversations. This paper presents an automatic classification of spoken utterances as Terminal or Non-Terminal in mul…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Keywords: Spoken interaction, Media, TV, Radio, Transition-Relevance Places, Turn Taking, Interruption. Accepted to InterSpeech 2024, Kos Island, Greece

  3. arXiv:2406.04429  [pdf, other]

    eess.AS cs.DL cs.MM cs.SD

    InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation

    Authors: David Doukhan, Christine Maertens, William Le Personnic, Ludovic Speroni, Reda Dehak

    Abstract: InaGVAD is an audio corpus collected from 10 French radio and 18 TV channels categorized into 4 groups: generalist radio, music radio, news TV, and generalist TV. It contains 277 1-minute-long annotated recordings aimed at representing the acoustic diversity of French audiovisual programs and was primarily designed to build systems able to monitor men's and women's speaking time in media. inaGVAD…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Voice Activity Detection (VAD), Speaker Gender Segmentation, Audiovisual Speech Resource, Speaker Traits, Speech Overlap, Benchmark, X-vector, Gender Representation in the Media, Dataset

    Journal ref: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 8963-8974, Torino, Italia. ELRA and ICCL

  4. arXiv:2404.17552  [pdf, other]

    eess.AS cs.CL cs.DL cs.LG cs.SD

    A Semi-Automatic Approach to Create Large Gender- and Age-Balanced Speaker Corpora: Usefulness of Speaker Diarization & Identification

    Authors: Rémi Uro, David Doukhan, Albert Rilliard, Laëtitia Larcher, Anissa-Claire Adgharouamane, Marie Tahon, Antoine Laurent

    Abstract: This paper presents a semi-automatic approach to create a diachronic corpus of voices balanced for speaker's age, gender, and recording period, according to 32 categories (2 genders, 4 age ranges and 4 recording periods). Corpora were selected at the French National Institute of Audiovisual (INA) to obtain at least 30 speakers per category (a total of 960 speakers; only 874 have been found so far). For eac…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Keywords: semi-automatic processing, corpus creation, diarization, speaker identification, gender-balanced, age-balanced, speaker corpus, diachrony

    Journal ref: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), pages 3271-3280, Marseille, 20-25 June 2022. European Language Resources Association (ELRA)

  5. arXiv:2404.16104  [pdf, other]

    eess.AS cs.CL cs.SD

    Evolution of Voices in French Audiovisual Media Across Genders and Age in a Diachronic Perspective

    Authors: Albert Rilliard, David Doukhan, Rémi Uro, Simon Devauchelle

    Abstract: We present a diachronic acoustic analysis of the voice of 1023 speakers from French media archives. The speakers are spread across 32 categories based on four periods (years 1955/56, 1975/76, 1995/96, 2015/16), four age groups (20-35; 36-50; 51-65; >65), and two genders. The fundamental frequency ($F_0$) and the first four formants (F1-4) were estimated. Procedures used to ensure the quality of th…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 5 pages, 2 figures, keywords: Gender, Diachrony, Vocal Tract Resonance, Vocal register, Broadcast speech

    Journal ref: Radek Skarnitzl & Jan Volín (Eds.), Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS), Prague 2023, pp. 753-757. Guarant International. ISBN 978-80-908114-2-3

  6. arXiv:2404.15176  [pdf, other]

    eess.AS cs.HC cs.LG cs.SD

    Voice Passing : a Non-Binary Voice Gender Prediction System for evaluating Transgender voice transition

    Authors: David Doukhan, Simon Devauchelle, Lucile Girard-Monneron, Mía Chávez Ruz, V. Chaddouk, Isabelle Wagner, Albert Rilliard

    Abstract: This paper presents software for describing voices using a continuous Voice Femininity Percentage (VFP). This system is intended for transgender speakers during their voice transition and for voice therapists supporting them in this process. A corpus of 41 French cis- and transgender speakers was recorded. A perceptual evaluation allowed 57 participants to estimate the VFP for each voice.…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 5 pages, 1 figure, keywords: Transgender voice, Gender perception, Speaker gender classification, CNN, X-Vector

    Journal ref: Proc. INTERSPEECH 2023, 5207-5211