Showing 1–7 of 7 results for author: Kalayeh, M M

Searching in archive cs.
  1. arXiv:2205.00073  [pdf, other]

    cs.CV

    On Negative Sampling for Audio-Visual Contrastive Learning from Movies

    Authors: Mahdi M. Kalayeh, Shervin Ardeshir, Lingyi Liu, Nagendra Kamath, Ashok Chandrashekar

    Abstract: The abundance and ease of utilizing sound, along with the fact that auditory clues reveal a plethora of information about what happens in a scene, make the audio-visual space an intuitive choice for representation learning. In this paper, we explore the efficacy of audio-visual self-supervised learning from uncurated long-form content, i.e., movies. Studying its differences with conventional short-fo…

    Submitted 29 April, 2022; originally announced May 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2106.08513

  2. arXiv:2106.08513  [pdf, other]

    cs.CV

    Watching Too Much Television is Good: Self-Supervised Audio-Visual Representation Learning from Movies and TV Shows

    Authors: Mahdi M. Kalayeh, Nagendra Kamath, Lingyi Liu, Ashok Chandrashekar

    Abstract: The abundance and ease of utilizing sound, along with the fact that auditory clues reveal so much about what happens in the scene, make the audio-visual space a perfectly intuitive choice for self-supervised representation learning. However, the current literature suggests that training on uncurated data yields considerably poorer representations compared to the curated alternati…

    Submitted 15 June, 2021; originally announced June 2021.

  3. arXiv:1911.11612  [pdf, other]

    cs.CV

    On Symbiosis of Attribute Prediction and Semantic Segmentation

    Authors: Mahdi M. Kalayeh, Mubarak Shah

    Abstract: In this paper, we propose to employ semantic segmentation to improve person-related attribute prediction. The core idea lies in the fact that the probability of an attribute to appear in an image is far from being uniform in the spatial domain. We build our attribute prediction model jointly with a deep semantic segmentation network. This harnesses the localization cues learned by the semantic seg…

    Submitted 23 November, 2019; originally announced November 2019.

    Comments: Accepted for publication in PAMI. arXiv admin note: substantial text overlap with arXiv:1704.08740

  4. arXiv:1806.02892  [pdf, other]

    cs.LG cs.CV stat.ML

    Training Faster by Separating Modes of Variation in Batch-normalized Models

    Authors: Mahdi M. Kalayeh, Mubarak Shah

    Abstract: Batch Normalization (BN) is essential to effectively train state-of-the-art deep Convolutional Neural Networks (CNN). It normalizes inputs to the layers during training using the statistics of each mini-batch. In this work, we study BN from the viewpoint of Fisher kernels. We show that, assuming samples within a mini-batch are from the same probability density function, BN is identical to the…

    Submitted 14 November, 2018; v1 submitted 7 June, 2018; originally announced June 2018.

  5. arXiv:1804.00216  [pdf, other]

    cs.CV

    Human Semantic Parsing for Person Re-identification

    Authors: Mahdi M. Kalayeh, Emrah Basaran, Muhittin Gokmen, Mustafa E. Kamasak, Mubarak Shah

    Abstract: Person re-identification is a challenging task mainly due to factors such as background clutter, pose, illumination and camera point of view variations. These elements hinder the process of extracting robust and discriminative representations, hence preventing different identities from being successfully distinguished. To improve the representation learning, usually, local features from human body…

    Submitted 31 March, 2018; originally announced April 2018.

  6. arXiv:1704.08740  [pdf, other]

    cs.CV

    Improving Facial Attribute Prediction using Semantic Segmentation

    Authors: Mahdi M. Kalayeh, Boqing Gong, Mubarak Shah

    Abstract: Attributes are semantically meaningful characteristics whose applicability widely crosses category boundaries. They are particularly important in describing and recognizing concepts where no explicit training example is given, e.g., zero-shot learning. Additionally, since attributes are human describable, they can be used for efficient human-computer interaction. In this paper, we propose…

    Submitted 27 April, 2017; originally announced April 2017.

  7. arXiv:1501.00614  [pdf, other]

    cs.CV

    Understanding Trajectory Behavior: A Motion Pattern Approach

    Authors: Mahdi M. Kalayeh, Stephen Mussmann, Alla Petrakova, Niels da Vitoria Lobo, Mubarak Shah

    Abstract: Mining the underlying patterns in gigantic and complex data is of great importance to data analysts. In this paper, we propose a motion pattern approach to mine frequent behaviors in trajectory data. Motion patterns, defined by a set of highly similar flow vector groups in a spatial locality, have been shown to be very effective in extracting dominant motion behaviors in video sequences. Inspired…

    Submitted 3 January, 2015; originally announced January 2015.