Showing 1–12 of 12 results for author: Damirchi, H

  1. arXiv:2405.17139

    cs.CV cs.AI cs.LG

    Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling

    Authors: Cristian Rodriguez-Opazo, Ehsan Abbasnejad, Damien Teney, Edison Marrese-Taylor, Hamed Damirchi, Anton van den Hengel

    Abstract: Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning. Various architectures, from vision transformers (ViTs) to convolutional networks (ResNets) have been trained with CLIP to serve as general solutions to diverse vision tasks. This paper explores the differences across various CLIP-trained vision backbones. Despite using the same data an…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2312.14400

  2. arXiv:2312.14400

    cs.CV

    Unveiling Backbone Effects in CLIP: Exploring Representational Synergies and Variances

    Authors: Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Ehsan Abbasnejad, Hamed Damirchi, Ignacio M. Jara, Felipe Bravo-Marquez, Anton van den Hengel

    Abstract: Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning. Various neural architectures, spanning Transformer-based models like Vision Transformers (ViTs) to Convolutional Networks (ConvNets) like ResNets, are trained with CLIP and serve as universal backbones across diverse vision tasks. Despite utilizing the same data and training objectives…

    Submitted 21 December, 2023; originally announced December 2023.

  3. arXiv:2311.17949

    cs.CV

    Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines

    Authors: Hamed Damirchi, Cristian Rodríguez-Opazo, Ehsan Abbasnejad, Damien Teney, Javen Qinfeng Shi, Stephen Gould, Anton van den Hengel

    Abstract: Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box. The Web likely contains the information necessary to excel on any specific application, but identifying the right data a priori is challenging. This paper shows how to leverage recent advances in NLP and multi-modal le…

    Submitted 29 November, 2023; originally announced November 2023.

  4. arXiv:2307.03786

    cs.CV

    Context-aware Pedestrian Trajectory Prediction with Multimodal Transformer

    Authors: Haleh Damirchi, Michael Greenspan, Ali Etemad

    Abstract: We propose a novel solution for predicting future trajectories of pedestrians. Our method uses a multimodal encoder-decoder transformer architecture, which takes as input both pedestrian locations and ego-vehicle speeds. Notably, our decoder predicts the entire future trajectory in a single-pass and does not perform one-step-ahead prediction, which makes the method effective for embedded edge depl…

    Submitted 7 July, 2023; originally announced July 2023.

  5. arXiv:2306.01316

    cs.CV cs.LG

    Independent Modular Networks

    Authors: Hamed Damirchi, Forest Agostinelli, Pooyan Jamshidi

    Abstract: Monolithic neural networks that make use of a single set of weights to learn useful representations for downstream tasks explicitly dismiss the compositional nature of data generation processes. This characteristic exists in data where every instance can be regarded as the combination of an identity concept, such as the shape of an object, combined with modifying concepts, such as orientation, col…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: ICRA23 RAP4Robots Workshop

  6. Action Capsules: Human Skeleton Action Recognition

    Authors: Ali Farajzadeh Bavil, Hamed Damirchi, Hamid D. Taghirad

    Abstract: Due to the compact and rich high-level representations offered, skeleton-based human action recognition has recently become a highly active research topic. Previous studies have demonstrated that investigating joint relationships in spatial and temporal dimensions provides effective information critical to action recognition. However, effectively encoding global dependencies of joints during spati…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: 11 pages, 11 figures

    Journal ref: Computer Vision and Image Understanding Volume 233, August 2023, 103722

  7. arXiv:2202.09942

    cs.CV cs.AI

    Multiscale Crowd Counting and Localization By Multitask Point Supervision

    Authors: Mohsen Zand, Haleh Damirchi, Andrew Farley, Mahdiyar Molahasani, Michael Greenspan, Ali Etemad

    Abstract: We propose a multitask approach for crowd counting and person localization in a unified framework. As the detection and localization tasks are well-correlated and can be jointly tackled, our model benefits from a multitask solution by learning multiscale representations of encoded crowd images, and subsequently fusing them. In contrast to the relatively more popular density-based methods, our mode…

    Submitted 20 February, 2022; originally announced February 2022.

    Comments: 4 pages + references, 3 figures, 2 tables, Accepted by ICASSP 2022 Conference

  8. arXiv:2107.00366

    cs.LG cs.RO

    A Consistency-Based Loss for Deep Odometry Through Uncertainty Propagation

    Authors: Hamed Damirchi, Rooholla Khorrambakht, Hamid D. Taghirad, Behzad Moshiri

    Abstract: The incremental poses computed through odometry can be integrated over time to calculate the pose of a device with respect to an initial location. The resulting global pose may be used to formulate a second, consistency based, loss term in a deep odometry setting. In such cases where multiple losses are imposed on a network, the uncertainty over each output can be derived to weigh the different lo…

    Submitted 1 July, 2021; originally announced July 2021.

    Comments: 8 pages, 5 figures, 3 tables

    ACM Class: I.2.9; I.2.10; I.5.1

  9. arXiv:2101.07061

    cs.LG

    Deep Inertial Odometry with Accurate IMU Preintegration

    Authors: Rooholla Khorrambakht, Chris Xiaoxuan Lu, Hamed Damirchi, Zhenghua Chen, Zhengguo Li

    Abstract: Inertial Measurement Units (IMUs) are interoceptive modalities that provide ego-motion measurements independent of the environmental factors. They are widely adopted in various autonomous systems. Motivated by the limitations in processing the noisy measurements from these sensors using their mathematical models, researchers have recently proposed various deep learning architectures to estimate ine…

    Submitted 18 January, 2021; originally announced January 2021.

  10. arXiv:2011.08634

    cs.CV cs.LG cs.RO

    Exploring Self-Attention for Visual Odometry

    Authors: Hamed Damirchi, Rooholla Khorrambakht, Hamid D. Taghirad

    Abstract: Visual odometry networks commonly use pretrained optical flow networks in order to derive the ego-motion between consecutive frames. The features extracted by these networks represent the motion of all the pixels between frames. However, due to the existence of dynamic objects and texture-less surfaces in the scene, the motion information for every image region might not be reliable for inferring…

    Submitted 17 November, 2020; originally announced November 2020.

    Comments: 8 pages, 7 figures, 1 table

  11. arXiv:2007.03063

    cs.LG eess.SP

    ARC-Net: Activity Recognition Through Capsules

    Authors: Hamed Damirchi, Rooholla Khorrambakht, Hamid Taghirad

    Abstract: Human Activity Recognition (HAR) is a challenging problem that needs more advanced solutions than handcrafted features to achieve desirable performance. Deep learning has been proposed as a solution to obtain more accurate HAR systems that are robust against noise. In this paper, we introduce ARC-Net and propose the utilization of capsules to fuse the information from multiple inertial measurement…

    Submitted 6 July, 2020; originally announced July 2020.

    Comments: 6 pages, 6 figures

  12. arXiv:2007.02929

    cs.LG stat.ML

    IMU Preintegrated Features for Efficient Deep Inertial Odometry

    Authors: R. Khorrambakht, H. Damirchi, H. D. Taghirad

    Abstract: MEMS Inertial Measurement Units (IMUs) as ubiquitous proprioceptive motion measurement devices are available on various everyday gadgets and robotic platforms. Nevertheless, the direct inference of geometrical transformations or odometry based on these data alone is a challenging task. This is due to the hard-to-model imperfections and high noise characteristics of the sensor, which has motivated…

    Submitted 18 March, 2022; v1 submitted 6 July, 2020; originally announced July 2020.

    ACM Class: C.3; C.4; H.1; I.2.9; I.5.4; J.3; J.2