Showing 1–10 of 10 results for author: Janjua, M K

  1. arXiv:2401.15235  [pdf, other]

    eess.IV cs.CV cs.LG

    CascadedGaze: Efficiency in Global Context Extraction for Image Restoration

    Authors: Amirhosein Ghasemabadi, Muhammad Kamran Janjua, Mohammad Salameh, Chunhua Zhou, Fengyu Sun, Di Niu

    Abstract: Image restoration tasks traditionally rely on convolutional neural networks. However, given the local nature of the convolutional operator, they struggle to capture global information. The promise of attention mechanisms in Transformers is to circumvent this problem, but it comes at the cost of intensive computational overhead. Many recent studies in image restoration have focused on solving the c…

    Submitted 7 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: Published in Transactions on Machine Learning Research (TMLR), 2024. 20 pages

  2. arXiv:2312.01624  [pdf, other]

    cs.LG cs.AI

    GVFs in the Real World: Making Predictions Online for Water Treatment

    Authors: Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White

    Abstract: In this paper we investigate the use of reinforcement-learning based prediction approaches for a real drinking-water treatment plant. Developing such a prediction system is a critical step on the path to optimizing and automating water treatment. Before that, there are many questions to answer about the predictability of the data, suitable neural network architectures, how to overcome partial obse…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: Published in Machine Learning (2023)

    Journal ref: Machine Learning (2023): 1-31

  3. arXiv:2010.09105  [pdf, other]

    cs.CV

    Movement-induced Priors for Deep Stereo

    Authors: Yuxin Hou, Muhammad Kamran Janjua, Juho Kannala, Arno Solin

    Abstract: We propose a method for fusing stereo disparity estimation with movement-induced prior information. Instead of independent inference frame-by-frame, we formulate the problem as a non-parametric learning task in terms of a temporal Gaussian process prior with a movement-driven kernel for inter-frame reasoning. We present a hierarchy of three Gaussian process kernels depending on the availability of…

    Submitted 18 October, 2020; originally announced October 2020.

  4. arXiv:1909.08685  [pdf, ps, other]

    cs.CV cs.SD eess.AS

    Deep Latent Space Learning for Cross-modal Mapping of Audio and Visual Signals

    Authors: Shah Nawaz, Muhammad Kamran Janjua, Ignazio Gallo, Arif Mahmood, Alessandro Calefati

    Abstract: We propose a novel deep training algorithm for joint representation of audio and visual information which consists of a single stream network (SSNet) coupled with a novel loss function to learn a shared deep latent space representation of multimodal information. The proposed framework characterizes the shared latent space by leveraging the class centers which helps to eliminate the need for pairwi…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: Accepted to DICTA 2019

  5. arXiv:1909.01976  [pdf, other]

    cs.CV

    Do Cross Modal Systems Leverage Semantic Relationships?

    Authors: Shah Nawaz, Muhammad Kamran Janjua, Ignazio Gallo, Arif Mahmood, Alessandro Calefati, Faisal Shafait

    Abstract: Current cross-modal retrieval systems are evaluated using R@K measure which does not leverage semantic relationships rather strictly follows the manually marked image text query pairs. Therefore, current systems do not generalize well for the unseen data in the wild. To handle this, we propose a new measure, SemanticMap, to evaluate the performance of cross-modal systems. Our proposed measure eval…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: Accepted to cross modal learning in real world in conjunction with ICCV 2019. arXiv admin note: text overlap with arXiv:1807.07364

  6. arXiv:1810.07037  [pdf, other]

    cs.CV

    Learning Inward Scaled Hypersphere Embedding: Exploring Projections in Higher Dimensions

    Authors: Muhammad Kamran Janjua, Shah Nawaz, Alessandro Calefati, Ignazio Gallo

    Abstract: Majority of the current dimensionality reduction or retrieval techniques rely on embedding the learned feature representations onto a computable metric space. Once the learned features are mapped, a distance metric aids the bridging of gaps between similar instances. Since the scaled projection is not exploited in these methods, discriminative embedding onto a hyperspace becomes a challenge. In th…

    Submitted 16 October, 2018; originally announced October 2018.

  7. arXiv:1810.02001  [pdf, ps, other]

    cs.CV

    Image and Encoded Text Fusion for Multi-Modal Classification

    Authors: Ignazio Gallo, Alessandro Calefati, Shah Nawaz, Muhammad Kamran Janjua

    Abstract: Multi-modal approaches employ data from multiple input streams such as textual and visual domains. Deep neural networks have been successfully employed for these approaches. In this paper, we present a novel multi-modal approach that fuses images and text descriptions to improve multi-modal classification performance in real-world scenarios. The proposed approach embeds an encoded text onto an ima…

    Submitted 3 October, 2018; originally announced October 2018.

    Comments: Accepted to DICTA 2018

  8. arXiv:1808.10822  [pdf, other]

    cs.CV

    Seeing Colors: Learning Semantic Text Encoding for Classification

    Authors: Shah Nawaz, Alessandro Calefati, Muhammad Kamran Janjua, Ignazio Gallo

    Abstract: The question we answer with this work is: can we convert a text document into an image to exploit best image classification models to classify documents? To answer this question we present a novel text classification method which converts a text document into an encoded image, using word embedding and capabilities of Convolutional Neural Networks (CNNs), successfully employed in image classificati…

    Submitted 31 August, 2018; originally announced August 2018.

    Comments: 9 pages. Under review at IJDAR

  9. arXiv:1807.08512  [pdf, other]

    cs.CV

    Git Loss for Deep Face Recognition

    Authors: Alessandro Calefati, Muhammad Kamran Janjua, Shah Nawaz, Ignazio Gallo

    Abstract: Convolutional Neural Networks (CNNs) have been widely used in computer vision tasks, such as face recognition and verification, and have achieved state-of-the-art results due to their ability to capture discriminative deep features. Conventionally, CNNs have been trained with softmax as supervision signal to penalize the classification loss. In order to further enhance the discriminative capabilit…

    Submitted 28 July, 2018; v1 submitted 23 July, 2018; originally announced July 2018.

    Comments: 12 pages. Accepted at BMVC2018

  10. arXiv:1807.07364  [pdf, other]

    cs.CV

    Revisiting Cross Modal Retrieval

    Authors: Shah Nawaz, Muhammad Kamran Janjua, Alessandro Calefati, Ignazio Gallo

    Abstract: This paper proposes a cross-modal retrieval system that leverages on image and text encoding. Most multimodal architectures employ separate networks for each modality to capture the semantic relationship between them. However, in our work image-text encoding can achieve comparable results in terms of cross-modal retrieval without having to use a separate network for each modality. We show that tex…

    Submitted 19 July, 2018; originally announced July 2018.

    Comments: 14 pages. Under review at ECCVW (MULA 2018)