Showing 1–36 of 36 results for author: Sain, A

  1. arXiv:2407.01810

    cs.CV

    Freeview Sketching: View-Aware Fine-Grained Sketch-Based Image Retrieval

    Authors: Aneeshan Sain, Pinaki Nath Chowdhury, Subhadeep Koley, Ayan Kumar Bhunia, Yi-Zhe Song

    Abstract: In this paper, we delve into the intricate dynamics of Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) by addressing a critical yet overlooked aspect -- the choice of viewpoint during sketch creation. Unlike photo systems that seamlessly handle diverse views through extensive datasets, sketch systems, with limited data collected from fixed perspectives, face challenges. Our pilot study, employ…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted in European Conference on Computer Vision (ECCV) 2024

  2. arXiv:2405.18716

    cs.CV

    SketchDeco: Decorating B&W Sketches with Colour

    Authors: Chaitat Utintu, Pinaki Nath Chowdhury, Aneeshan Sain, Subhadeep Koley, Ayan Kumar Bhunia, Yi-Zhe Song

    Abstract: This paper introduces a novel approach to sketch colourisation, inspired by the universal childhood activity of colouring and its professional applications in design and story-boarding. Striking a balance between precision and convenience, our method utilises region masks and colour palettes to allow intuitive user control, steering clear of the meticulousness of manual colour assignments or the l…

    Submitted 28 May, 2024; originally announced May 2024.

  3. arXiv:2403.09480

    cs.CV cs.AI

    What Sketch Explainability Really Means for Downstream Tasks

    Authors: Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we explore the unique modality of sketch for explainability, emphasising the profound impact of human strokes compared to conventional pixel-oriented studies. Beyond explanations of network behavior, we discern the genuine implications of explainability across diverse downstream sketch-related tasks. We propose a lightweight and portable explainability solution -- a seamless plugin…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  4. arXiv:2403.09344

    cs.CV cs.AI

    SketchINR: A First Look into Sketches as Implicit Neural Representations

    Authors: Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Tao Xiang, Timothy Hospedales, Yi-Zhe Song

    Abstract: We propose SketchINR, to advance the representation of vector sketches with implicit neural models. A variable length vector sketch is compressed into a latent space of fixed dimension that implicitly encodes the underlying shape as a function of time and strokes. The learned function predicts the $xy$ point coordinates in a sketch at each time and stroke. Despite its simplicity, SketchINR outperf…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  5. arXiv:2403.07234

    cs.CV

    It's All About Your Sketch: Democratising Sketch Control in Diffusion Models

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Deeptanshu Sekhri, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: This paper unravels the potential of sketches for diffusion models, addressing the deceptive promise of direct sketch control in generative AI. We importantly democratise the process, enabling amateur sketches to generate precise images, living up to the commitment of "what you sketch is what you get". A pilot study underscores the necessity, revealing that deformities in existing models stem from…

    Submitted 20 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in CVPR 2024. Project page available at https://subhadeepkoley.github.io/StableSketching

  6. arXiv:2403.07222

    cs.CV

    You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Two primary input modalities prevail in image retrieval: sketch and text. While text is widely used for inter-category retrieval tasks, sketches have been established as the sole preferred modality for fine-grained image retrieval due to their ability to capture intricate visual details. In this paper, we question the reliance on sketches alone for fine-grained image retrieval by simultaneously ex…

    Submitted 20 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in CVPR 2024. Project page available at https://subhadeepkoley.github.io/Sketch2Word

  7. arXiv:2403.07214

    cs.CV

    Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: This paper, for the first time, explores text-to-image diffusion models for Zero-Shot Sketch-based Image Retrieval (ZS-SBIR). We highlight a pivotal discovery: the capacity of text-to-image diffusion models to seamlessly bridge the gap between sketches and photos. This proficiency is underpinned by their robust cross-modal capabilities and shape bias, findings that are substantiated through our pi…

    Submitted 20 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in CVPR 2024. Project page available at https://subhadeepkoley.github.io/DiffusionZSSBIR

  8. arXiv:2403.07203

    cs.CV

    How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we propose a novel abstraction-aware sketch-based image retrieval framework capable of handling sketch abstraction at varied levels. Prior works had mainly focused on tackling sub-factors such as drawing style and order; we instead attempt to model abstraction as a whole, and propose feature-level and retrieval granularity-level designs so that the system builds into its DNA the nec…

    Submitted 20 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in CVPR 2024. Project page available at https://subhadeepkoley.github.io/AbstractAway

  9. arXiv:2312.04364

    cs.CV

    DemoCaricature: Democratising Caricature Generation with a Rough Sketch

    Authors: Dar-Yen Chen, Ayan Kumar Bhunia, Subhadeep Koley, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song

    Abstract: In this paper, we democratise caricature generation, empowering individuals to effortlessly craft personalised caricatures with just a photo and a conceptual sketch. Our objective is to strike a delicate balance between abstraction and identity, while preserving the creativity and subjectivity inherent in a sketch. To achieve this, we present Explicit Rank-1 Model Editing alongside single-image pe…

    Submitted 24 March, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

  10. arXiv:2312.04043

    cs.CV cs.AI

    Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes

    Authors: Hmrishav Bandyopadhyay, Subhadeep Koley, Ayan Das, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we democratise 3D content creation, enabling precise generation of 3D shapes from abstract sketches while overcoming limitations tied to drawing skills. We introduce a novel part-level modelling and alignment framework that facilitates abstraction modelling and cross-modal correspondence. Leveraging the same part-level decoder, our approach seamlessly extends to sketch modelling by…

    Submitted 7 June, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: CVPR 2024, Project Page: https://hmrishavbandy.github.io/doodle23d/

  11. arXiv:2306.09206

    cs.CR eess.SY

    Concealing CAN Message Sequences to Prevent Schedule-based Bus-off Attacks

    Authors: Sunandan Adhikary, Ipsita Koley, Arkaprava Sain, Soumyadeep Das, Shuvam Saha, Soumyajit Dey

    Abstract: This work focuses on eliminating timing-side channels in real-time safety-critical cyber-physical network protocols like Controller Area Networks (CAN). Automotive Electronic Control Units (ECUs) implement predictable scheduling decisions based on task level response time estimation. Such levels of determinism expose timing information about task executions and therefore corresponding message tra…

    Submitted 15 June, 2023; originally announced June 2023.

  12. arXiv:2303.15149

    cs.CV

    What Can Human Sketches Do for Object Detection?

    Authors: Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

    Abstract: Sketches are highly expressive, inherently capturing subjective and fine-grained visual cues. The exploration of such innate properties of human sketches has, however, been limited to that of image retrieval. In this paper, for the first time, we cultivate the expressiveness of sketches but for the fundamental vision task of object detection. The end result is a sketch-enabled object detection fra…

    Submitted 28 October, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: Best Paper Finalist (Top 12 Best Papers). Presented in special single-track plenary sessions to all attendees in Computer Vision and Pattern Recognition (CVPR), 2023. Updated an error in Fig.3 (from Softmax to Cross Entropy). Thanks to the community for pointing it out

  13. arXiv:2303.13779

    cs.CV

    Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Soumitri Chattopadhyay, Tao Xiang, Yi-Zhe Song

    Abstract: This paper advances the fine-grained sketch-based image retrieval (FG-SBIR) literature by putting forward a strong baseline that overshoots prior state-of-the-arts by ~11%. This is not via complicated design though, but by addressing two critical issues facing the community (i) the gold standard triplet loss does not enforce holistic latent space geometry, and (ii) there are never enough sketches…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR 2023. Project page available at https://aneeshan95.github.io/Sketch_PVT/

  14. arXiv:2303.13440

    cs.CV

    CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we leverage CLIP for zero-shot sketch based image retrieval (ZS-SBIR). We are largely inspired by recent advances on foundation models and the unparalleled generalisation ability they seem to offer, but for the first time tailor it to benefit the sketch community. We put forward novel designs on how best to achieve this synergy, for both the category setting and the fine-grained set…

    Submitted 27 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR 2023. Project page available at https://aneeshan95.github.io/Sketch_LVM/

  15. arXiv:2303.11502

    cs.CV

    Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings

    Authors: Ayan Kumar Bhunia, Subhadeep Koley, Amandeep Kumar, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Human sketch has already proved its worth in various visual understanding tasks (e.g., retrieval, segmentation, image-captioning, etc). In this paper, we reveal a new trait of sketches - that they are also salient. This is intuitive as sketching is a natural attentive process at its core. More specifically, we aim to study how sketches can be used as a weak label to detect salient objects present…

    Submitted 30 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: CVPR 2023. Project page available at https://ayankumarbhunia.github.io/Sketch2Saliency/

  16. arXiv:2303.11162

    cs.CV

    Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Given an abstract, deformed, ordinary sketch from untrained amateurs like you and me, this paper turns it into a photorealistic image - just like those shown in Fig. 1(a), all non-cherry-picked. We differ significantly from prior art in that we do not dictate an edgemap-like sketch to start with, but aim to work with abstract free-hand human sketches. In doing so, we essentially democratise the sk…

    Submitted 30 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR 2023. Project page available at https://subhadeepkoley.github.io/PictureThatSketch

  17. arXiv:2211.17161

    cs.CV

    Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification

    Authors: Jijie Wu, Dongliang Chang, Aneeshan Sain, Xiaoxu Li, Zhanyu Ma, Jie Cao, Jun Guo, Yi-Zhe Song

    Abstract: The main challenge for fine-grained few-shot image classification is to learn feature representations with higher inter-class and lower intra-class variations, with a mere few labelled samples. Conventional few-shot learning methods however cannot be naively adopted for this fine-grained setting -- a quick pilot study reveals that they in fact push for the opposite (i.e., lower inter-class variati…

    Submitted 8 January, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Accepted in AAAI-23

  18. arXiv:2207.01723

    cs.CV

    Adaptive Fine-Grained Sketch-Based Image Retrieval

    Authors: Ayan Kumar Bhunia, Aneeshan Sain, Parth Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: The recent focus on Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) has shifted towards generalising a model to new categories without any training data from them. In real-world applications, however, a trained FG-SBIR model is often applied to both new categories and different human sketchers, i.e., different drawing styles. Although this complicates the generalisation problem, fortunately, a…

    Submitted 19 August, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted in ECCV 2022. Minor typos and Eq.4 corrected

  19. arXiv:2204.11964

    cs.CV

    SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text

    Authors: Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we extend scene understanding to include that of human sketch. The result is a complete trilogy of scene representation from three diverse and complementary modalities -- sketch, photo, and text. Instead of learning a rigid three-way embedding and be done with it, we focus on learning a flexible joint embedding that fully supports the "optionality" that this complementarity brings.…

    Submitted 26 March, 2023; v1 submitted 25 April, 2022; originally announced April 2022.

    Comments: Accepted in Computer Vision and Pattern Recognition (CVPR), 2023

  20. arXiv:2203.14843

    cs.CV

    Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches

    Authors: Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Subhadeep Koley, Rohit Kundu, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

    Abstract: The human visual system is remarkable in learning new visual concepts from just a few examples. This is precisely the goal behind few-shot class incremental learning (FSCIL), where the emphasis is additionally placed on ensuring the model does not suffer from "forgetting". In this paper, we push the boundary further for FSCIL by addressing two key questions that bottleneck its ubiquitous applicati…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 10 pages, 3 figures. Accepted in CVPR 2022

  21. arXiv:2203.14817

    cs.CV

    Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval

    Authors: Ayan Kumar Bhunia, Subhadeep Koley, Abdullah Faiz Ur Rahman Khilji, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Sketching enables many exciting applications, notably, image retrieval. The fear-to-sketch problem (i.e., "I can't sketch") has however proven to be fatal for its widespread adoption. This paper tackles this "fear" head on, and for the first time, proposes an auxiliary module for existing retrieval models that predominantly lets the users sketch without having to worry. We first conducted a pilot…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022 Code: https://github.com/AyanKumarBhunia/Stroke_Subset_Selector-for-FGSBIR

  22. arXiv:2203.14804

    cs.CV

    Partially Does It: Towards Scene-Level FG-SBIR with Partial Input

    Authors: Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

    Abstract: We scrutinise an important observation plaguing scene-level sketch research -- that a significant portion of scene sketches are "partial". A quick pilot study reveals: (i) a scene sketch does not necessarily contain all objects in the corresponding photo, due to the subjective holistic interpretation of scenes, (ii) there exist significant empty (white) regions as a result of object-level abstrac…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Accepted in CVPR 2022

  23. arXiv:2203.14691

    cs.CV

    Sketch3T: Test-Time Training for Zero-Shot SBIR

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Zero-shot sketch-based image retrieval typically asks for a trained model to be applied as is to unseen categories. In this paper, we argue that this setup is by definition not compatible with the inherent abstract and subjective nature of sketches, i.e., the model might transfer well to new categories, but will not understand sketches existing in different test-time distribution as a…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 10 pages, 5 figures. Accepted in CVPR 2022

  24. arXiv:2203.02113

    cs.CV

    FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context

    Authors: Pinaki Nath Chowdhury, Aneeshan Sain, Ayan Kumar Bhunia, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song

    Abstract: We advance sketch research to scenes with the first dataset of freehand scene sketches, FS-COCO. With practical applications in mind, we collect sketches that convey scene content well but can be sketched within a few minutes by a person with any sketching skills. Our dataset comprises 10,000 freehand scene vector sketches with per point space-time information by 100 non-expert individuals, offeri…

    Submitted 20 July, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: Accepted in ECCV 2022. Project Page: https://fscoco.github.io

  25. arXiv:2107.12090

    cs.CV

    Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition

    Authors: Ayan Kumar Bhunia, Aneeshan Sain, Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Yi-Zhe Song

    Abstract: Although text recognition has significantly evolved over the years, state-of-the-art (SOTA) models still struggle in the wild scenarios due to complex backgrounds, varying fonts, uncontrolled illuminations, distortions and other artefacts. This is because such models solely depend on visual information for text recognition, thus lacking semantic reasoning capabilities. In this paper, we argue that…

    Submitted 26 July, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: IEEE International Conference on Computer Vision (ICCV), 2021

  26. arXiv:2107.12087

    cs.CV

    Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation

    Authors: Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song

    Abstract: Text recognition remains a fundamental and extensively researched topic in computer vision, largely owing to its wide array of commercial applications. The challenging nature of the very problem however dictated a fragmentation of research efforts: Scene Text Recognition (STR) that deals with text in everyday scenes, and Handwriting Text Recognition (HTR) that tackles hand-written text. In this pa…

    Submitted 27 July, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: IEEE International Conference on Computer Vision (ICCV), 2021

  27. arXiv:2107.12081

    cs.CV

    Towards the Unseen: Iterative Text Recognition by Distilling from Errors

    Authors: Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

    Abstract: Visual text recognition is undoubtedly one of the most extensively researched topics in computer vision. Great progress has been made to date, with the latest models starting to focus on the more practical "in-the-wild" setting. However, a salient problem still hinders practical deployment -- prior arts mostly struggle with recognising unseen (or rarely seen) character sequences. In this paper, w…

    Submitted 26 July, 2021; originally announced July 2021.

    Comments: IEEE International Conference on Computer Vision (ICCV), 2021

  28. arXiv:2104.03589

    cs.CV

    PQA: Perceptual Question Answering

    Authors: Yonggang Qi, Kai Zhang, Aneeshan Sain, Yi-Zhe Song

    Abstract: Perceptual organization remains one of the very few established theories on the human visual system. It underpinned many pre-deep seminal works on segmentation and detection, yet research has seen a rapid decline since the preferential shift to learning deep models. Of the limited attempts, most aimed at interpreting complex visual scenes using perceptual organizational rules. This has however bee…

    Submitted 8 April, 2021; originally announced April 2021.

    Comments: Accepted by CVPR 2021

  29. arXiv:2104.01876

    cs.CV

    MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition

    Authors: Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

    Abstract: Handwritten Text Recognition (HTR) remains a challenging problem to date, largely due to the varying writing styles that exist amongst us. Prior works however generally operate with the assumption that there is a limited number of styles, most of which have already been captured by existing datasets. In this paper, we take a completely different perspective -- we work on the assumption that there…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021

  30. arXiv:2103.15706

    cs.CV

    StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

    Abstract: Sketch-based image retrieval (SBIR) is a cross-modal matching problem which is typically solved by learning a joint embedding space where the semantic content shared between photo and sketch modalities is preserved. However, a fundamental challenge in SBIR has been largely ignored so far, that is, sketches are drawn by humans and considerable style variations exist amongst different users. An eff…

    Submitted 31 March, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021

  31. arXiv:2103.13990

    cs.CV

    More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

    Authors: Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song

    Abstract: A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs. Whilst the number of photos can be easily scaled, each corresponding sketch still needs to be individually produced. In this paper, we aim to mitigate such an upper-bound on sketch data, and study…

    Submitted 25 March, 2021; originally announced March 2021.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021. Code: https://github.com/AyanKumarBhunia/semisupervised-FGSBIR

  32. arXiv:2007.15103

    cs.CV cs.IR

    Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

    Abstract: Sketch as an image search query is an ideal alternative to text in capturing the fine-grained visual details. Prior successes on fine-grained sketch-based image retrieval (FG-SBIR) have demonstrated the importance of tackling the unique traits of sketches as opposed to photos, e.g., temporal vs. static, strokes vs. pixels, and abstract vs. pixel-perfect. In this paper, we study a further trait of…

    Submitted 11 August, 2020; v1 submitted 29 July, 2020; originally announced July 2020.

    Comments: Accepted for ORAL presentation in BMVC 2020

  33. arXiv:2003.03787

    cs.CV

    Mind the Gap: Enlarging the Domain Gap in Open Set Domain Adaptation

    Authors: Dongliang Chang, Aneeshan Sain, Zhanyu Ma, Yi-Zhe Song, Jun Guo

    Abstract: Unsupervised domain adaptation aims to leverage labeled data from a source domain to learn a classifier for an unlabeled target domain. Among its many variants, open set domain adaptation (OSDA) is perhaps the most challenging, as it further assumes the presence of unknown classes in the target domain. In this paper, we study OSDA with a particular focus on enriching its ability to traverse across…

    Submitted 10 March, 2020; v1 submitted 8 March, 2020; originally announced March 2020.

  34. arXiv:1810.11120

    cs.CV

    Improving Document Binarization via Adversarial Noise-Texture Augmentation

    Authors: Ankan Kumar Bhunia, Ayan Kumar Bhunia, Aneeshan Sain, Partha Pratim Roy

    Abstract: Binarization of degraded document images is an elementary step in most of the problems in document image analysis domain. The paper re-visits the binarization problem by introducing an adversarial learning approach. We construct a Texture Augmentation Network that transfers the texture element of a degraded reference document image to a clean binary image. In this way, the network creates multiple…

    Submitted 1 May, 2019; v1 submitted 25 October, 2018; originally announced October 2018.

    Comments: IEEE International Conference on Image Processing (ICIP), 2019. The full source code of the proposed system is publicly available at https://github.com/ankanbhunia/AdverseBiNet

  35. arXiv:1802.08568

    cs.CV

    Indic Handwritten Script Identification using Offline-Online Multimodal Deep Network

    Authors: Ayan Kumar Bhunia, Subham Mukherjee, Aneeshan Sain, Ankan Kumar Bhunia, Partha Pratim Roy, Umapada Pal

    Abstract: In this paper, we propose a novel approach of word-level Indic script identification using only character-level data in training stage. The advantages of using character level data for training have been outlined in section I. Our method uses a multimodal deep network which takes both offline and online modality of the data as input in order to explore the information from both the modalities join…

    Submitted 15 October, 2019; v1 submitted 23 February, 2018; originally announced February 2018.

    Comments: Accepted in Information Fusion, Elsevier

  36. Multi-Oriented Text Detection and Verification in Video Frames and Scene Images

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Partha Pratim Roy, Umapada Pal

    Abstract: In this paper, we bring forth a novel approach of video text detection using Fourier-Laplacian filtering in the frequency domain that includes a verification technique using Hidden Markov Model (HMM). The proposed approach deals with the text region appearing not only in horizontal or vertical directions, but also in any other oblique or curved orientation in the image. Until now only a few method…

    Submitted 4 October, 2017; v1 submitted 22 July, 2017; originally announced July 2017.

    Comments: Accepted in Neurocomputing, Elsevier