Showing 1–3 of 3 results for author: Mahmood, J

Searching in archive cs.
  1. arXiv:2307.10814

    cs.CL cs.NE cs.SD eess.AS

    Cross-Corpus Multilingual Speech Emotion Recognition: Amharic vs. Other Languages

    Authors: Ephrem Afele Retta, Richard Sutcliffe, Jabar Mahmood, Michael Abebe Berwo, Eiad Almekhlafi, Sajjad Ahmed Khan, Shehzad Ashraf Chaudhry, Mustafa Mhamed, Jun Feng

    Abstract: In a conventional speech emotion recognition (SER) task, a classifier for a given language is trained on a pre-existing dataset for that same language. However, where training data for a language does not exist, data from other languages can be used instead. We experiment with cross-lingual and multilingual SER, working with Amharic, English, German and Urdu. For Amharic, we use our own publicly-a…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: 16 pages, 9 tables, 5 figures

  2. arXiv:2307.04610

    cs.CV

    SPLAL: Similarity-based pseudo-labeling with alignment loss for semi-supervised medical image classification

    Authors: Md Junaid Mahmood, Pranaw Raj, Divyansh Agarwal, Suruchi Kumari, Pravendra Singh

    Abstract: Medical image classification is a challenging task due to the scarcity of labeled samples and class imbalance caused by the high variance in disease prevalence. Semi-supervised learning (SSL) methods can mitigate these challenges by leveraging both labeled and unlabeled data. However, SSL methods for medical image classification need to address two key challenges: (1) estimating reliable pseudo-la…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: Under Review

  3. arXiv:2106.14118

    cs.CV cs.MM

    Hear Me Out: Fusional Approaches for Audio Augmented Temporal Action Localization

    Authors: Anurag Bagchi, Jazib Mahmood, Dolton Fernandes, Ravi Kiran Sarvadevabhatla

    Abstract: State-of-the-art architectures for untrimmed video Temporal Action Localization (TAL) have only considered RGB and Flow modalities, leaving the information-rich audio modality totally unexploited. Audio fusion has been explored for the related but arguably easier problem of trimmed (clip-level) action recognition. However, TAL poses a unique set of challenges. In this paper, we propose simple but…

    Submitted 17 October, 2021; v1 submitted 26 June, 2021; originally announced June 2021.