Showing 1–4 of 4 results for author: Zeeshan, O

  1. arXiv:2403.10488  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS

    Joint Multimodal Transformer for Emotion Recognition in the Wild

    Authors: Paul Waligora, Haseeb Aslam, Osama Zeeshan, Soufiane Belharbi, Alessandro Lameiras Koerich, Marco Pedersoli, Simon Bacon, Eric Granger

    Abstract: Multimodal emotion recognition (MMER) systems typically outperform unimodal systems by leveraging the inter- and intra-modal relationships between, e.g., visual, textual, physiological, and auditory modalities. This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention. This framework can exploit the complementary nature of dive…

    Submitted 20 April, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: 10 pages, 4 figures, 6 tables, CVPRw 2024

  2. arXiv:2401.15489  [pdf, other]

    cs.CV cs.AI

    Distilling Privileged Multimodal Information for Expression Recognition using Optimal Transport

    Authors: Muhammad Haseeb Aslam, Muhammad Osama Zeeshan, Soufiane Belharbi, Marco Pedersoli, Alessandro Koerich, Simon Bacon, Eric Granger

    Abstract: Deep learning models for multimodal expression recognition have reached remarkable performance in controlled laboratory environments because of their ability to learn complementary and redundant semantic information. However, these models struggle in the wild, mainly because of the unavailability and quality of modalities used for training. In practice, only a subset of the training-time modalitie…

    Submitted 28 April, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  3. arXiv:2312.05632  [pdf, other]

    cs.CV

    Subject-Based Domain Adaptation for Facial Expression Recognition

    Authors: Muhammad Osama Zeeshan, Muhammad Haseeb Aslam, Soufiane Belharbi, Alessandro Lameiras Koerich, Marco Pedersoli, Simon Bacon, Eric Granger

    Abstract: Adapting a deep learning model to a specific target individual is a challenging facial expression recognition (FER) task that may be achieved using unsupervised domain adaptation (UDA) methods. Although several UDA methods have been proposed to adapt deep FER models across source and target data sets, multiple subject-specific source domains are needed to accurately represent the intra- and inter-…

    Submitted 26 April, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

  4. arXiv:2203.14779  [pdf, other]

    cs.CV cs.HC cs.SD eess.AS

    A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition

    Authors: Gnana Praveen Rajasekar, Wheidima Carneiro de Melo, Nasib Ullah, Haseeb Aslam, Osama Zeeshan, Théo Denorme, Marco Pedersoli, Alessandro Koerich, Simon Bacon, Patrick Cardinal, Eric Granger

    Abstract: Multimodal emotion recognition has recently gained much attention since it can leverage diverse and complementary relationships over multiple modalities (e.g., audio, visual, biosignals, etc.), and can provide some robustness to noisy modalities. Most state-of-the-art methods for audio-visual (A-V) fusion rely on recurrent networks or conventional attention mechanisms that do not effectively lever…

    Submitted 20 April, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: arXiv admin note: text overlap with arXiv:2111.05222