Showing 1–10 of 10 results for author: Derakhshani, M M

  1. arXiv:2402.10099

    cs.CV

    Any-Shift Prompting for Generalization over Distributions

    Authors: Zehao Xiao, Jiayi Shen, Mohammad Mahdi Derakhshani, Shengcai Liao, Cees G. M. Snoek

    Abstract: Image-language models with prompt learning have shown remarkable advances in numerous downstream vision tasks. Nevertheless, conventional prompt learning methods overfit their training distribution and lose the generalization ability on test distributions. To improve generalization across various distribution shifts, we propose any-shift prompting: a general probabilistic inference framework that…

    Submitted 15 February, 2024; originally announced February 2024.

  2. arXiv:2311.17937

    cs.CV

    Unlocking Spatial Comprehension in Text-to-Image Diffusion Models

    Authors: Mohammad Mahdi Derakhshani, Menglin Xia, Harkirat Behl, Cees G. M. Snoek, Victor Rühle

    Abstract: We propose CompFuser, an image generation pipeline that enhances spatial comprehension and attribute assignment in text-to-image generative models. Our pipeline enables the interpretation of instructions defining spatial relationships between objects in a scene, such as "An image of a gray cat on the left of an orange dog", and the generation of corresponding images. This is especially important in order t…

    Submitted 28 November, 2023; originally announced November 2023.

  3. arXiv:2310.00500

    cs.CV

    Self-Supervised Open-Ended Classification with Small Visual Language Models

    Authors: Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring, Yuki M. Asano

    Abstract: We present Self-Context Adaptation (SeCAt), a self-supervised approach that unlocks few-shot abilities for open-ended classification with small visual language models. Our approach imitates image captions in a self-supervised way based on clustering a large pool of images followed by assigning semantically-unrelated names to clusters. By doing so, we construct a training signal consisting of inter…

    Submitted 6 December, 2023; v1 submitted 30 September, 2023; originally announced October 2023.

  4. arXiv:2303.05977

    cs.CV

    Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models

    Authors: Tom van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring

    Abstract: Medical Visual Question Answering (VQA) is an important challenge, as it would lead to faster and more accurate diagnoses and treatment decisions. Most existing methods approach it as a multi-class classification problem, which restricts the outcome to a predefined closed set of curated answers. We focus on open-ended VQA and, motivated by the recent advances in language models, consider it as a gen…

    Submitted 21 July, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    MSC Class: 68T07

  5. arXiv:2210.02390

    cs.CV cs.AI cs.LG

    Bayesian Prompt Learning for Image-Language Model Generalization

    Authors: Mohammad Mahdi Derakhshani, Enrique Sanchez, Adrian Bulat, Victor Guilherme Turrisi da Costa, Cees G. M. Snoek, Georgios Tzimiropoulos, Brais Martinez

    Abstract: Foundational image-language models have generated considerable interest due to their efficient adaptation to downstream tasks by prompt learning. Prompt learning treats part of the language model input as trainable while freezing the rest, and optimizes an Empirical Risk Minimization objective. However, Empirical Risk Minimization is known to suffer from distributional shifts which hurt generaliza…

    Submitted 20 August, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Accepted at ICCV 2023

  6. arXiv:2204.05737

    cs.CV

    LifeLonger: A Benchmark for Continual Disease Classification

    Authors: Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Tom van Sonsbeek, Xiantong Zhen, Dwarikanath Mahapatra, Marcel Worring, Cees G. M. Snoek

    Abstract: Deep learning models have shown great effectiveness in recognizing findings in medical images. However, they cannot handle the ever-changing clinical environment, which brings newly annotated medical data from different sources. To exploit the incoming streams of data, these models would benefit largely from sequentially learning from new samples, without forgetting the previously obtained knowle…

    Submitted 30 June, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

    MSC Class: 68T07

  7. arXiv:2112.13410

    cs.LG cs.AI

    Generative Kernel Continual learning

    Authors: Mohammad Mahdi Derakhshani, Xiantong Zhen, Ling Shao, Cees G. M. Snoek

    Abstract: Kernel continual learning by Derakhshani et al. (2021) has recently emerged as a strong continual learner due to its non-parametric ability to tackle task interference and catastrophic forgetting. Unfortunately, its success comes at the expense of an explicit memory to store samples from past tasks, which hampers scalability to continual learning settings with a large number of tasks. In this p…

    Submitted 26 December, 2021; originally announced December 2021.

    Comments: work in progress

  8. arXiv:2107.05757

    cs.LG cs.AI

    Kernel Continual Learning

    Authors: Mohammad Mahdi Derakhshani, Xiantong Zhen, Ling Shao, Cees G. M. Snoek

    Abstract: This paper introduces kernel continual learning, a simple but effective variant of continual learning that leverages the non-parametric nature of kernel methods to tackle catastrophic forgetting. We deploy an episodic memory unit that stores a subset of samples for each task to learn task-specific classifiers based on kernel ridge regression. This does not require memory replay and systematically…

    Submitted 14 July, 2021; v1 submitted 12 July, 2021; originally announced July 2021.

    Comments: accepted to ICML 2021

  9. arXiv:1906.05388

    cs.CV

    Assisted Excitation of Activations: A Learning Technique to Improve Object Detectors

    Authors: Mohammad Mahdi Derakhshani, Saeed Masoudnia, Amir Hossein Shaker, Omid Mersa, Mohammad Amin Sadeghi, Mohammad Rastegari, Babak N. Araabi

    Abstract: We present a simple and effective learning technique that significantly improves mAP of YOLO object detectors without compromising their speed. During network training, we carefully feed in localization information. We excite certain activations in order to help the network learn to better localize. In the later stages of training, we gradually reduce our assisted excitation to zero. We reached a…

    Submitted 12 June, 2019; originally announced June 2019.

  10. arXiv:1805.11091

    cs.CV

    BlockCNN: A Deep Network for Artifact Removal and Image Compression

    Authors: Danial Maleki, Soheila Nadalian, Mohammad Mahdi Derakhshani, Mohammad Amin Sadeghi

    Abstract: We present a general technique that performs both artifact removal and image compression. For artifact removal, we input a JPEG image and try to remove its compression artifacts. For compression, we input an image and process its 8 by 8 blocks in a sequence. For each block, we first try to predict its intensities based on previous blocks; then, we store a residual with respect to the input image.…

    Submitted 28 May, 2018; originally announced May 2018.

    Journal ref: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018, pp. 2555-2558