Showing 1–9 of 9 results for author: Yousaf, M

  1. arXiv:2404.09342  [pdf, other]

    cs.CV cs.SD eess.AS

    Face-voice Association in Multilingual Environments (FAME) Challenge 2024 Evaluation Plan

    Authors: Muhammad Saad Saeed, Shah Nawaz, Muhammad Salman Tahir, Rohan Kumar Das, Muhammad Zaigham Zaheer, Marta Moscati, Markus Schedl, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf

    Abstract: The advancements of technology have led to the use of multimodal systems in various real-world applications. Among them, the audio-visual systems are one of the widely used multimodal systems. In the recent years, associating face and voice of a person has gained attention due to presence of unique correlation between them. The Face-voice Association in Multilingual Environments (FAME) Challenge 2…

    Submitted 16 April, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: ACM Multimedia Conference - Grand Challenge

  2. arXiv:2309.17426  [pdf]

    cs.CV cs.AI

    Classification of Potholes Based on Surface Area Using Pre-Trained Models of Convolutional Neural Network

    Authors: Chauhdary Fazeel Ahmad, Abdullah Cheema, Waqas Qayyum, Rana Ehtisham, Muhammad Haroon Yousaf, Junaid Mir, Nasim Shakouri Mahmoudabadi, Afaq Ahmad

    Abstract: Potholes are fatal and can cause severe damage to vehicles as well as can cause deadly accidents. In South Asian countries, pavement distresses are the primary cause due to poor subgrade conditions, lack of subsurface drainage, and excessive rainfalls. The present research compares the performance of three pre-trained Convolutional Neural Network (CNN) models, i.e., ResNet 50, ResNet 18, and Mobil…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: 24 Pages, 26 Figures

  3. arXiv:2303.06129  [pdf, other]

    cs.CV

    Single-branch Network for Multimodal Training

    Authors: Muhammad Saad Saeed, Shah Nawaz, Muhammad Haris Khan, Muhammad Zaigham Zaheer, Karthik Nandakumar, Muhammad Haroon Yousaf, Arif Mahmood

    Abstract: With the rapid growth of social media platforms, users are sharing billions of multimedia posts containing audio, images, and text. Researchers have focused on building autonomous systems capable of processing such multimedia data to solve challenging multimodal tasks including cross-modal retrieval, matching, and verification. Existing works use separate networks to extract embeddings of each mod…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: Accepted at ICASSP 2023

  4. arXiv:2302.13033  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Speaker Recognition in Realistic Scenario Using Multimodal Data

    Authors: Saqlain Hussain Shah, Muhammad Saad Saeed, Shah Nawaz, Muhammad Haroon Yousaf

    Abstract: In recent years, an association is established between faces and voices of celebrities leveraging large scale audio-visual information from YouTube. The availability of large scale audio-visual datasets is instrumental in developing speaker recognition methods based on standard Convolutional Neural Networks. Thus, the aim of this paper is to leverage large scale audio-visual information to improve…

    Submitted 25 February, 2023; originally announced February 2023.

    Comments: Accepted at the International Conference on Artificial Intelligence (ICAI'2023)

  5. arXiv:2208.10238  [pdf, other]

    cs.CV

    Learning Branched Fusion and Orthogonal Projection for Face-Voice Association

    Authors: Muhammad Saad Saeed, Shah Nawaz, Muhammad Haris Khan, Sajid Javed, Muhammad Haroon Yousaf, Alessio Del Bue

    Abstract: Recent years have seen an increased interest in establishing association between faces and voices of celebrities leveraging audio-visual information from YouTube. Prior works adopt metric learning methods to learn an embedding space that is amenable for associated matching and verification tasks. Albeit showing some progress, such formulations are, however, restrictive due to dependency on distanc…

    Submitted 22 August, 2022; originally announced August 2022.

    Comments: Submitted: IEEE Transactions on Multimedia. arXiv admin note: substantial text overlap with arXiv:2112.10483

  6. arXiv:2112.10483  [pdf, other]

    cs.CV

    Fusion and Orthogonal Projection for Improved Face-Voice Association

    Authors: Muhammad Saad Saeed, Muhammad Haris Khan, Shah Nawaz, Muhammad Haroon Yousaf, Alessio Del Bue

    Abstract: We study the problem of learning association between face and voice, which is gaining interest in the computer vision community lately. Prior works adopt pairwise or triplet loss formulations to learn an embedding space amenable for associated matching and verification tasks. Albeit showing some progress, such loss formulations are, however, restrictive due to dependency on distance-dependent marg…

    Submitted 20 December, 2021; originally announced December 2021.

  7. arXiv:2004.13780  [pdf, other]

    cs.CV cs.CL cs.SD eess.AS

    Cross-modal Speaker Verification and Recognition: A Multilingual Perspective

    Authors: Muhammad Saad Saeed, Shah Nawaz, Pietro Morerio, Arif Mahmood, Ignazio Gallo, Muhammad Haroon Yousaf, Alessio Del Bue

    Abstract: Recent years have seen a surge in finding association between faces and voices within a cross-modal biometric application along with speaker recognition. Inspired from this, we introduce a challenging task in establishing association between faces and voices across multiple languages spoken by the same set of persons. The aim of this paper is to answer two closely related questions: "Is face-voice…

    Submitted 22 April, 2021; v1 submitted 28 April, 2020; originally announced April 2020.

    Comments: Accepted: CVPRW

  8. arXiv:1809.09617  [pdf, other]

    cs.NI

    UAV-Empowered Disaster-Resilient Edge Architecture for Delay-Sensitive Communication

    Authors: Zeeshan Kaleem, Muhammad Yousaf, Aamir Qamar, Ayaz Ahmad, Trung Q. Duong, Wan Choi, Abbas Jamalipour

    Abstract: The fifth-generation (5G) communication systems will enable enhanced mobile broadband, ultra-reliable low latency, and massive connectivity services. The broadband and low-latency services are indispensable to public safety (PS) communication during natural or man-made disasters. Recently, the third generation partnership project long term evolution (3GPPLTE) has emerged as a promising candidate t…

    Submitted 28 January, 2019; v1 submitted 26 September, 2018; originally announced September 2018.

    Comments: 9,5

  9. Evaluating Impact of Mobility on Wireless Routing Protocols

    Authors: N. Javaid, M. Yousaf, A. Ahmad, A. Naveed, K. Djouani

    Abstract: In this paper, we evaluate, analyze, and compare the impact of mobility on the behavior of three reactive protocols (AODV, DSR, DYMO) and three proactive protocols (DSDV, FSR, OLSR) in multi-hop wireless networks. We take into account throughput, end-to-end delay, and normalized routing load as performance parameters. Based upon the extensive simulation results in NS-2, we rank all of six protocol…

    Submitted 18 August, 2011; originally announced August 2011.

    Journal ref: IEEE Symposium on Wireless Telecommunications Applications (ISWTA) 2011