Showing 1–12 of 12 results for author: Ghodrati, A

Searching in archive cs.
  1. arXiv:2312.08128  [pdf, other]

    cs.CV

    Clockwork Diffusion: Efficient Generation With Model-Step Distillation

    Authors: Amirhossein Habibian, Amir Ghodrati, Noor Fathima, Guillaume Sautiere, Risheek Garrepalli, Fatih Porikli, Jens Petersen

    Abstract: This work aims to improve the efficiency of text-to-image diffusion models. While diffusion models use computationally expensive UNet-based denoising operations in every generation step, we identify that not all operations are equally relevant for the final output quality. In particular, we observe that UNet layers operating on high-res feature maps are relatively sensitive to small perturbations.…

    Submitted 20 February, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

  2. arXiv:2301.02240  [pdf, other]

    cs.CV

    Skip-Attention: Improving Vision Transformers by Paying Less Attention

    Authors: Shashanka Venkataramanan, Amir Ghodrati, Yuki M. Asano, Fatih Porikli, Amirhossein Habibian

    Abstract: This work aims to improve the efficiency of vision transformers (ViT). While ViTs use computationally expensive self-attention operations in every layer, we identify that these operations are highly correlated across layers -- a key redundancy that causes unnecessary computations. Based on this observation, we propose SkipAt, a method to reuse self-attention computation from preceding layers to ap…

    Submitted 17 January, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

  3. arXiv:2204.02397  [pdf, other]

    cs.CV

    SALISA: Saliency-based Input Sampling for Efficient Video Object Detection

    Authors: Babak Ehteshami Bejnordi, Amirhossein Habibian, Fatih Porikli, Amir Ghodrati

    Abstract: High-resolution images are widely adopted for high-performance object detection in videos. However, processing high-resolution inputs comes with high computation costs, and naive down-sampling of the input to reduce the computation costs quickly degrades the detection performance. In this paper, we propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detecti…

    Submitted 5 April, 2022; originally announced April 2022.

    Comments: 20 pages, 7 figures

  4. arXiv:2104.13400  [pdf, other]

    cs.CV cs.LG

    FrameExit: Conditional Early Exiting for Efficient Video Recognition

    Authors: Amir Ghodrati, Babak Ehteshami Bejnordi, Amirhossein Habibian

    Abstract: In this paper, we propose a conditional early exiting framework for efficient video recognition. While existing works focus on selecting a subset of salient frames to reduce the computation costs, we propose to use a simple sampling strategy combined with conditional early exiting to enable efficient recognition. Our model automatically learns to process fewer frames for simpler videos and more fr…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: CVPR 2021 | Oral paper

  5. arXiv:1903.03197  [pdf, other]

    cs.DM math.CO

    Well-indumatched Trees and Graphs of Bounded Girth

    Authors: S. Akbari, T. Ekim, A. H. Ghodrati, S. Zare

    Abstract: A graph G is called well-indumatched if all of its maximal induced matchings have the same size. In this paper we characterize all well-indumatched trees. We provide a linear time algorithm to decide if a tree is well-indumatched or not. Then, we characterize minimal well-indumatched graphs of girth at least 9 and show subsequently that for an odd integer g greater than or equal to 9 and different…

    Submitted 16 December, 2019; v1 submitted 7 March, 2019; originally announced March 2019.

  6. arXiv:1807.06980  [pdf, other]

    cs.CV

    Video Time: Properties, Encoders and Evaluation

    Authors: Amir Ghodrati, Efstratios Gavves, Cees G. M. Snoek

    Abstract: Time-aware encoding of frame sequences in a video is a fundamental problem in video understanding. While many attempted to model time in videos, an explicit study on quantifying video time is missing. To fill this lacuna, we aim to evaluate video time explicitly. We describe three properties of video time, namely a) temporal asymmetry, b) temporal continuity and c) temporal causality. Based on each…

    Submitted 18 July, 2018; originally announced July 2018.

    Comments: 14 pages, BMVC 2018

  7. arXiv:1803.07485  [pdf, other]

    cs.CV

    Actor and Action Video Segmentation from a Sentence

    Authors: Kirill Gavrilyuk, Amir Ghodrati, Zhenyang Li, Cees G. M. Snoek

    Abstract: This paper strives for pixel-level segmentation of actors and their actions in video content. Different from existing works, which all learn to segment from a fixed vocabulary of actor and action pairs, we infer the segmentation from a natural language input sentence. This allows to distinguish between fine-grained actors in the same super-category, identify actor and action instances, and segment…

    Submitted 20 March, 2018; originally announced March 2018.

    Comments: Accepted to CVPR 2018 as oral

  8. arXiv:1606.04702  [pdf, other]

    cs.CV

    DeepProposals: Hunting Objects and Actions by Cascading Deep Convolutional Layers

    Authors: Amir Ghodrati, Ali Diba, Marco Pedersoli, Tinne Tuytelaars, Luc Van Gool

    Abstract: In this paper, a new method for generating object and action proposals in images and videos is proposed. It builds on activations of different convolutional layers of a pretrained CNN, combining the localization accuracy of the early layers with the high informative-ness (and hence recall) of the later layers. To this end, we build an inverse cascade that, going backward from the later to the earl…

    Submitted 15 June, 2016; originally announced June 2016.

    Comments: 15 pages

  9. arXiv:1604.06506  [pdf, other]

    cs.CV

    Online Action Detection

    Authors: Roeland De Geest, Efstratios Gavves, Amir Ghodrati, Zhenyang Li, Cees Snoek, Tinne Tuytelaars

    Abstract: In online action detection, the goal is to detect the start of an action in a video stream as soon as it happens. For instance, if a child is chasing a ball, an autonomous car should recognize what is going on and respond immediately. This is a very challenging problem for four reasons. First, only partial actions are observed. Second, there is a large variability in negative data. Third, the star…

    Submitted 30 August, 2016; v1 submitted 21 April, 2016; originally announced April 2016.

    Comments: Project page: http://homes.esat.kuleuven.be/~rdegeest/OnlineActionDetection.html

  10. Rank Pooling for Action Recognition

    Authors: Basura Fernando, Efstratios Gavves, Jose Oramas, Amir Ghodrati, Tinne Tuytelaars

    Abstract: We propose a function-based temporal pooling method that captures the latent structure of the video sequence data - e.g. how frame-level features evolve over time in a video. We show how the parameters of a function that has been fit to the video data can serve as a robust new video representation. As a specific example, we learn a pooling function via ranking machines. By learning to rank the fra…

    Submitted 15 May, 2016; v1 submitted 6 December, 2015; originally announced December 2015.

    Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence

  11. arXiv:1511.08446  [pdf, other]

    cs.CV

    Towards Automatic Image Editing: Learning to See another You

    Authors: Amir Ghodrati, Xu Jia, Marco Pedersoli, Tinne Tuytelaars

    Abstract: Learning the distribution of images in order to generate new samples is a challenging task due to the high dimensionality of the data and the highly non-linear relations that are involved. Nevertheless, some promising results have been reported in the literature recently, building on deep network architectures. In this work, we zoom in on a specific type of image generation: given an image and know…

    Submitted 26 November, 2015; originally announced November 2015.

  12. arXiv:1510.04445  [pdf, other]

    cs.CV

    DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

    Authors: Amir Ghodrati, Ali Diba, Marco Pedersoli, Tinne Tuytelaars, Luc Van Gool

    Abstract: In this paper we evaluate the quality of the activation layers of a convolutional neural network (CNN) for the generation of object proposals. We generate hypotheses in a sliding-window fashion over different activation layers and show that the final convolutional layers can find the object of interest with high recall but poor localization due to the coarseness of the feature maps. Instead, the…

    Submitted 15 October, 2015; originally announced October 2015.

    Comments: ICCV 2015