
Showing 1–2 of 2 results for author: Aslam, H

  1. arXiv:2403.10488  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS

    Joint Multimodal Transformer for Emotion Recognition in the Wild

    Authors: Paul Waligora, Haseeb Aslam, Osama Zeeshan, Soufiane Belharbi, Alessandro Lameiras Koerich, Marco Pedersoli, Simon Bacon, Eric Granger

    Abstract: Multimodal emotion recognition (MMER) systems typically outperform unimodal systems by leveraging the inter- and intra-modal relationships between, e.g., visual, textual, physiological, and auditory modalities. This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention. This framework can exploit the complementary nature of dive…

    Submitted 20 April, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: 10 pages, 4 figures, 6 tables, CVPRw 2024

  2. arXiv:2203.14779  [pdf, other]

    cs.CV cs.HC cs.SD eess.AS

    A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition

    Authors: Gnana Praveen Rajasekar, Wheidima Carneiro de Melo, Nasib Ullah, Haseeb Aslam, Osama Zeeshan, Théo Denorme, Marco Pedersoli, Alessandro Koerich, Simon Bacon, Patrick Cardinal, Eric Granger

    Abstract: Multimodal emotion recognition has recently gained much attention since it can leverage diverse and complementary relationships over multiple modalities (e.g., audio, visual, biosignals, etc.), and can provide some robustness to noisy modalities. Most state-of-the-art methods for audio-visual (A-V) fusion rely on recurrent networks or conventional attention mechanisms that do not effectively lever…

    Submitted 20 April, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: arXiv admin note: text overlap with arXiv:2111.05222