Skip to main content

Showing 1–3 of 3 results for author: Udhayanan, P

.
  1. arXiv:2309.00613  [pdf, other

    cs.CV cs.AI cs.LG

    Iterative Multi-granular Image Editing using Diffusion Models

    Authors: K J Joseph, Prateksha Udhayanan, Tripti Shukla, Aishwarya Agarwal, Srikrishna Karanam, Koustava Goswami, Balaji Vasan Srinivasan

    Abstract: Recent advances in text-guided image synthesis has dramatically changed how creative professionals generate artistic and aesthetically pleasing visual assets. To fully support such creative endeavors, the process should possess the ability to: 1) iteratively edit the generations and 2) control the spatial reach of desired changes (global, local or anything in between). We formalize this pragmatic… ▽ More

    Submitted 28 October, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024

  2. arXiv:2308.16649  [pdf, other

    cs.CV

    Learning with Multi-modal Gradient Attention for Explainable Composed Image Retrieval

    Authors: Prateksha Udhayanan, Srikrishna Karanam, Balaji Vasan Srinivasan

    Abstract: We consider the problem of composed image retrieval that takes an input query consisting of an image and a modification text indicating the desired changes to be made on the image and retrieves images that match these changes. Current state-of-the-art techniques that address this problem use global features for the retrieval, resulting in incorrect localization of the regions of interest to be mod… ▽ More

    Submitted 31 August, 2023; originally announced August 2023.

  3. arXiv:2307.00910  [pdf, other

    cs.CV cs.AI

    CoPL: Contextual Prompt Learning for Vision-Language Understanding

    Authors: Koustava Goswami, Srikrishna Karanam, Prateksha Udhayanan, K J Joseph, Balaji Vasan Srinivasan

    Abstract: Recent advances in multimodal learning has resulted in powerful vision-language models, whose representations are generalizable across a variety of downstream tasks. Recently, their generalization ability has been further extended by incorporating trainable prompts, borrowed from the natural language processing literature. While such prompt learning techniques have shown impressive results, we ide… ▽ More

    Submitted 12 December, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: Accepted at AAAI 2024