Showing 1–6 of 6 results for author: Alisamir, S

Searching in archive cs.
  1. arXiv:2405.06728  [pdf, other]

    cs.HC

    THERADIA WoZ: An Ecological Corpus for Appraisal-based Affect Research in Healthcare

    Authors: Hippolyte Fournier, Sina Alisamir, Safaa Azzakhnini, Hanna Chainay, Olivier Koenig, Isabella Zsoldos, Eléonore Trân, Gérard Bailly, Frédéric Elisei, Béatrice Bouchot, Brice Varini, Patrick Constant, Joan Fruitet, Franck Tarpin-Bernard, Solange Rossato, François Portet, Fabien Ringeval

    Abstract: We present THERADIA WoZ, an ecological corpus designed for audiovisual research on affect in healthcare. Two groups of senior individuals, consisting of 52 healthy participants and 9 individuals with Mild Cognitive Impairment (MCI), performed Computerised Cognitive Training (CCT) exercises while receiving support from a virtual assistant, tele-operated by a human in the role of a Wizard-of-Oz (WoZ…

    Submitted 10 May, 2024; originally announced May 2024.

  2. arXiv:2309.05472  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech

    Authors: Titouan Parcollet, Ha Nguyen, Solene Evain, Marcely Zanon Boito, Adrien Pupier, Salima Mdhaffar, Hang Le, Sina Alisamir, Natalia Tomashenko, Marco Dinarelli, Shucong Zhang, Alexandre Allauzen, Maximin Coavoux, Yannick Esteve, Mickael Rouvier, Jerome Goulian, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier

    Abstract: Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different domains, including computer vision and natural language processing. Speech processing has benefitted drastically from SSL, as most of the current domain-related tasks are now being approached with pre-trained models. This work introduces LeBenchmark 2.0, an open-source framework for assessing and building SSL-…

    Submitted 18 March, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: Published in Computer Speech and Language. Preprint allowed.

  3. arXiv:2209.11061  [pdf, other]

    eess.AS cs.HC cs.LG

    Cross-domain Voice Activity Detection with Self-Supervised Representations

    Authors: Sina Alisamir, Fabien Ringeval, Francois Portet

    Abstract: Voice Activity Detection (VAD) aims at detecting speech segments in an audio signal, which is a necessary first step for many of today's speech-based applications. Current state-of-the-art methods focus on training a neural network exploiting features directly contained in the acoustics, such as Mel Filter Banks (MFBs). Such methods therefore require an extra normalisation step to adapt to a new doma…

    Submitted 22 September, 2022; originally announced September 2022.

  4. arXiv:2209.10223  [pdf, other]

    cs.SD cs.AI cs.HC eess.AS

    Dynamic Time-Alignment of Dimensional Annotations of Emotion using Recurrent Neural Networks

    Authors: Sina Alisamir, Fabien Ringeval, Francois Portet

    Abstract: Most automatic emotion recognition systems exploit time-continuous annotations of emotion to provide fine-grained descriptions of spontaneous expressions as observed in real-life interactions. As emotion is rather subjective, its annotation is usually performed by several annotators who provide a trace for a given dimension, i.e. a time-continuous series describing a dimension such as arousal or v…

    Submitted 21 September, 2022; originally announced September 2022.

  5. LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech

    Authors: Solene Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Esteve, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier

    Abstract: Self-Supervised Learning (SSL) using huge unlabeled data has been successfully explored for image and natural language processing. Recent works have also investigated SSL from speech. They were notably successful in improving performance on downstream tasks such as automatic speech recognition (ASR). While these works suggest it is possible to reduce dependence on labeled data for building efficient spee…

    Submitted 10 June, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

    Comments: Will be presented at Interspeech 2021

    Journal ref: Proc. Interspeech 2021

  6. arXiv:1907.11510  [pdf, ps, other]

    cs.HC cs.CV cs.IR cs.LG stat.ML

    AVEC 2019 Workshop and Challenge: State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect Recognition

    Authors: Fabien Ringeval, Björn Schuller, Michel Valstar, Nicholas Cummins, Roddy Cowie, Leili Tavabi, Maximilian Schmitt, Sina Alisamir, Shahin Amiriparian, Eva-Maria Messner, Siyang Song, Shuo Liu, Ziping Zhao, Adria Mallol-Ragolta, Zhao Ren, Mohammad Soleymani, Maja Pantic

    Abstract: The Audio/Visual Emotion Challenge and Workshop (AVEC 2019) "State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect Recognition" is the ninth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challen…

    Submitted 10 July, 2019; originally announced July 2019.