
Showing 1–50 of 79 results for author: Bhunia, A

  1. arXiv:2407.01810  [pdf, other]

    cs.CV

    Freeview Sketching: View-Aware Fine-Grained Sketch-Based Image Retrieval

    Authors: Aneeshan Sain, Pinaki Nath Chowdhury, Subhadeep Koley, Ayan Kumar Bhunia, Yi-Zhe Song

    Abstract: In this paper, we delve into the intricate dynamics of Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) by addressing a critical yet overlooked aspect -- the choice of viewpoint during sketch creation. Unlike photo systems that seamlessly handle diverse views through extensive datasets, sketch systems, with limited data collected from fixed perspectives, face challenges. Our pilot study, employ…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted in European Conference on Computer Vision (ECCV) 2024

  2. arXiv:2406.20099  [pdf, other]

    cs.CV

    Odd-One-Out: Anomaly Detection by Comparing with Neighbors

    Authors: Ankan Bhunia, Changjian Li, Hakan Bilen

    Abstract: This paper introduces a novel anomaly detection (AD) problem that focuses on identifying `odd-looking' objects relative to the other instances within a scene. Unlike in traditional AD benchmarks, anomalies in our setting are scene-specific, defined by the regular instances that make up the majority. Since object instances are often partly visible from a single viewpoint, our sett…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Codes & Dataset at https://github.com/VICO-UoE/OddOneOutAD

  3. arXiv:2406.19393  [pdf, other]

    cs.CV

    Looking 3D: Anomaly Detection with 2D-3D Alignment

    Authors: Ankan Bhunia, Changjian Li, Hakan Bilen

    Abstract: Automatic anomaly detection based on visual cues holds practical significance in various domains, such as manufacturing and product quality assessment. This paper introduces a new conditional anomaly detection problem, which involves identifying anomalies in a query image by comparing it to a reference shape. To address this challenge, we have created a large dataset, BrokenChairs-180K, consisting…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted at CVPR'24. Codes & dataset available at https://github.com/VICO-UoE/Looking3D

  4. arXiv:2405.18716  [pdf, other]

    cs.CV

    SketchDeco: Decorating B&W Sketches with Colour

    Authors: Chaitat Utintu, Pinaki Nath Chowdhury, Aneeshan Sain, Subhadeep Koley, Ayan Kumar Bhunia, Yi-Zhe Song

    Abstract: This paper introduces a novel approach to sketch colourisation, inspired by the universal childhood activity of colouring and its professional applications in design and story-boarding. Striking a balance between precision and convenience, our method utilises region masks and colour palettes to allow intuitive user control, steering clear of the meticulousness of manual colour assignments or the l…

    Submitted 28 May, 2024; originally announced May 2024.

  5. arXiv:2404.05882  [pdf, other]

    quant-ph

    Strong quantum nonlocality: Unextendible biseparability beyond unextendible product basis

    Authors: Atanu Bhunia, Subrata Bera, Indranil Biswas, Indrani Chattopadhyay, Debasis Sarkar

    Abstract: An unextendible biseparable basis (UBB) is a set of orthogonal pure biseparable states which span a subspace of a given Hilbert space while the complementary subspace contains only genuinely entangled states. These biseparable bases are useful to produce genuinely entangled subspace in multipartite system. Such a subspace could be more beneficial for information theoretic applications if we are ab…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 15 pages, 4 figures, latex2e, comments welcome

  6. arXiv:2403.09480  [pdf, other]

    cs.CV cs.AI

    What Sketch Explainability Really Means for Downstream Tasks

    Authors: Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we explore the unique modality of sketch for explainability, emphasising the profound impact of human strokes compared to conventional pixel-oriented studies. Beyond explanations of network behavior, we discern the genuine implications of explainability across diverse downstream sketch-related tasks. We propose a lightweight and portable explainability solution -- a seamless plugin…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  7. arXiv:2403.09344  [pdf, other]

    cs.CV cs.AI

    SketchINR: A First Look into Sketches as Implicit Neural Representations

    Authors: Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Tao Xiang, Timothy Hospedales, Yi-Zhe Song

    Abstract: We propose SketchINR, to advance the representation of vector sketches with implicit neural models. A variable length vector sketch is compressed into a latent space of fixed dimension that implicitly encodes the underlying shape as a function of time and strokes. The learned function predicts the $xy$ point coordinates in a sketch at each time and stroke. Despite its simplicity, SketchINR outperf…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  8. arXiv:2403.08458  [pdf, other]

    quant-ph physics.app-ph physics.optics

    Dielectric microwave resonator with large optical apertures for spin-based quantum devices

    Authors: Tatsuki Hamamoto, Amit Bhunia, Rupak Kumar Bhattacharya, Hiroki Takahashi, Yuimaru Kubo

    Abstract: Towards a spin-based quantum microwave-optical photon transducer, we demonstrate a low-loss dielectric microwave resonator with an internal quality factor of $2.30\times10^4$ while accommodating optical apertures with a diameter of $8\, \mathrm{mm}$. The two seemingly conflicting requirements, high quality factor and large optical apertures, are satisfied thanks to the large dielectric constant of…

    Submitted 13 March, 2024; originally announced March 2024.

  9. arXiv:2403.07234  [pdf, other]

    cs.CV

    It's All About Your Sketch: Democratising Sketch Control in Diffusion Models

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Deeptanshu Sekhri, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: This paper unravels the potential of sketches for diffusion models, addressing the deceptive promise of direct sketch control in generative AI. We importantly democratise the process, enabling amateur sketches to generate precise images, living up to the commitment of "what you sketch is what you get". A pilot study underscores the necessity, revealing that deformities in existing models stem from…

    Submitted 20 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in CVPR 2024. Project page available at https://subhadeepkoley.github.io/StableSketching

  10. arXiv:2403.07222  [pdf, other]

    cs.CV

    You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Two primary input modalities prevail in image retrieval: sketch and text. While text is widely used for inter-category retrieval tasks, sketches have been established as the sole preferred modality for fine-grained image retrieval due to their ability to capture intricate visual details. In this paper, we question the reliance on sketches alone for fine-grained image retrieval by simultaneously ex…

    Submitted 20 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in CVPR 2024. Project page available at https://subhadeepkoley.github.io/Sketch2Word

  11. arXiv:2403.07214  [pdf, other]

    cs.CV

    Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: This paper, for the first time, explores text-to-image diffusion models for Zero-Shot Sketch-based Image Retrieval (ZS-SBIR). We highlight a pivotal discovery: the capacity of text-to-image diffusion models to seamlessly bridge the gap between sketches and photos. This proficiency is underpinned by their robust cross-modal capabilities and shape bias, findings that are substantiated through our pi…

    Submitted 20 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in CVPR 2024. Project page available at https://subhadeepkoley.github.io/DiffusionZSSBIR

  12. arXiv:2403.07203  [pdf, other]

    cs.CV

    How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we propose a novel abstraction-aware sketch-based image retrieval framework capable of handling sketch abstraction at varied levels. Prior works had mainly focused on tackling sub-factors such as drawing style and order; we instead attempt to model abstraction as a whole, and propose feature-level and retrieval granularity-level designs so that the system builds into its DNA the nec…

    Submitted 20 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in CVPR 2024. Project page available at https://subhadeepkoley.github.io/AbstractAway

  13. arXiv:2312.04364  [pdf, other]

    cs.CV

    DemoCaricature: Democratising Caricature Generation with a Rough Sketch

    Authors: Dar-Yen Chen, Ayan Kumar Bhunia, Subhadeep Koley, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song

    Abstract: In this paper, we democratise caricature generation, empowering individuals to effortlessly craft personalised caricatures with just a photo and a conceptual sketch. Our objective is to strike a delicate balance between abstraction and identity, while preserving the creativity and subjectivity inherent in a sketch. To achieve this, we present Explicit Rank-1 Model Editing alongside single-image pe…

    Submitted 24 March, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

  14. arXiv:2312.04043  [pdf, other]

    cs.CV cs.AI

    Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes

    Authors: Hmrishav Bandyopadhyay, Subhadeep Koley, Ayan Das, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we democratise 3D content creation, enabling precise generation of 3D shapes from abstract sketches while overcoming limitations tied to drawing skills. We introduce a novel part-level modelling and alignment framework that facilitates abstraction modelling and cross-modal correspondence. Leveraging the same part-level decoder, our approach seamlessly extends to sketch modelling by…

    Submitted 7 June, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: CVPR 2024, Project Page: https://hmrishavbandy.github.io/doodle23d/

  15. arXiv:2307.11433  [pdf]

    physics.optics cond-mat.mes-hall quant-ph

    Site-specific stable deterministic single photon emitters with low Huang-Rhys value in layered hexagonal boron nitride at room temperature

    Authors: Amit Bhunia, Pragya Joshi, Nitesh Singh, Biswanath Chakraborty, Rajesh V Nair

    Abstract: Development of stable room-temperature bright single-photon emitters using atomic defects in hexagonal-boron nitride flakes (h-BN) provides significant promises for quantum technologies. However, an outstanding challenge in h-BN is creating site-specific, stable, high emission rate single photon emitters with very low Huang-Rhys (HR) factor. Here, we discuss the photonic properties of site-specifi…

    Submitted 21 July, 2023; originally announced July 2023.

  16. arXiv:2304.01992  [pdf, other]

    eess.IV cs.CV

    Cross-modulated Few-shot Image Generation for Colorectal Tissue Classification

    Authors: Amandeep Kumar, Ankan Kumar Bhunia, Sanath Narayan, Hisham Cholakkal, Rao Muhammad Anwer, Jorma Laaksonen, Fahad Shahbaz Khan

    Abstract: In this work, we propose a few-shot colorectal tissue image generation method for addressing the scarcity of histopathological training data for rare cancer tissues. Our few-shot generation method, named XM-GAN, takes one base and a pair of reference tissue images as input and generates high-quality yet diverse images. Within our XM-GAN, a novel controllable fusion block densely aggregates local r…

    Submitted 4 July, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: Early Accept in MICCAI 2023

  17. arXiv:2304.01172  [pdf, other]

    cs.CV

    Generative Multiplane Neural Radiance for 3D-Aware Image Generation

    Authors: Amandeep Kumar, Ankan Kumar Bhunia, Sanath Narayan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan

    Abstract: We present a method to efficiently generate 3D-aware high-resolution images that are view-consistent across multiple target views. The proposed multiplane neural radiance model, named GMNR, consists of a novel α-guided view-dependent representation (α-VdR) module for learning view-dependent information. The α-VdR module, facilitated by an α-guided pixel sampling technique, computes the view-depende…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: Technical report

  18. arXiv:2303.15149  [pdf, other]

    cs.CV

    What Can Human Sketches Do for Object Detection?

    Authors: Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

    Abstract: Sketches are highly expressive, inherently capturing subjective and fine-grained visual cues. The exploration of such innate properties of human sketches has, however, been limited to that of image retrieval. In this paper, for the first time, we cultivate the expressiveness of sketches but for the fundamental vision task of object detection. The end result is a sketch-enabled object detection fra…

    Submitted 28 October, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: Best Paper Finalist (Top 12 Best Papers). Presented in special single-track plenary sessions to all attendees in Computer Vision and Pattern Recognition (CVPR), 2023. Updated an error in Fig.3 (from Softmax to Cross Entropy). Thanks to the community for pointing it out

  19. arXiv:2303.13779  [pdf, other]

    cs.CV

    Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Soumitri Chattopadhyay, Tao Xiang, Yi-Zhe Song

    Abstract: This paper advances the fine-grained sketch-based image retrieval (FG-SBIR) literature by putting forward a strong baseline that overshoots prior state-of-the-arts by ~11%. This is not via complicated design though, but by addressing two critical issues facing the community (i) the gold standard triplet loss does not enforce holistic latent space geometry, and (ii) there are never enough sketches…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR 2023. Project page available at https://aneeshan95.github.io/Sketch_PVT/

  20. arXiv:2303.13645  [pdf, ps, other]

    quant-ph

    More assistance of entanglement, less rounds of classical communication

    Authors: Atanu Bhunia, Indranil Biswas, Indrani Chattopadhyay, Debasis Sarkar

    Abstract: Classical communication plays a crucial role to distinguish locally a class of quantum states. Despite considerable advances, we have very little knowledge about the number of measurement and communication rounds needed to implement a discrimination task by local quantum operations and classical communications (in short, LOCC). In this letter, we are able to show the relation between round numbers…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: 11 pages, 3 figures, revtex, comments welcome

  21. arXiv:2303.13440  [pdf, other]

    cs.CV

    CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we leverage CLIP for zero-shot sketch based image retrieval (ZS-SBIR). We are largely inspired by recent advances on foundation models and the unparalleled generalisation ability they seem to offer, but for the first time tailor it to benefit the sketch community. We put forward novel designs on how best to achieve this synergy, for both the category setting and the fine-grained set…

    Submitted 27 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR 2023. Project page available at https://aneeshan95.github.io/Sketch_LVM/

  22. arXiv:2303.11502  [pdf, other]

    cs.CV

    Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings

    Authors: Ayan Kumar Bhunia, Subhadeep Koley, Amandeep Kumar, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Human sketch has already proved its worth in various visual understanding tasks (e.g., retrieval, segmentation, image-captioning, etc). In this paper, we reveal a new trait of sketches - that they are also salient. This is intuitive as sketching is a natural attentive process at its core. More specifically, we aim to study how sketches can be used as a weak label to detect salient objects present…

    Submitted 30 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: CVPR 2023. Project page available at https://ayankumarbhunia.github.io/Sketch2Saliency/

  23. arXiv:2303.11162  [pdf, other]

    cs.CV

    Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

    Authors: Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Given an abstract, deformed, ordinary sketch from untrained amateurs like you and me, this paper turns it into a photorealistic image - just like those shown in Fig. 1(a), all non-cherry-picked. We differ significantly from prior art in that we do not dictate an edgemap-like sketch to start with, but aim to work with abstract free-hand human sketches. In doing so, we essentially democratise the sk…

    Submitted 30 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR 2023. Project page available at https://subhadeepkoley.github.io/PictureThatSketch

  24. arXiv:2303.07775  [pdf, other]

    cs.CV

    Data-Free Sketch-Based Image Retrieval

    Authors: Abhra Chaudhuri, Ayan Kumar Bhunia, Yi-Zhe Song, Anjan Dutta

    Abstract: Rising concerns about privacy and anonymity preservation of deep learning models have facilitated research in data-free learning (DFL). For the first time, we identify that for data-scarce tasks like Sketch-Based Image Retrieval (SBIR), where the difficulty in acquiring paired photos and hand-drawn sketches limits data-dependent cross-modal learning algorithms, DFL can prove to be a much more prac…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Computer Vision and Pattern Recognition (CVPR) 2023

  25. Entangled state distillation from single copy mixed states beyond LOCC

    Authors: Indranil Biswas, Atanu Bhunia, Indrani Chattopadhyay, Debasis Sarkar

    Abstract: No pure entangled state can be distilled from a $2\otimes 2$ or $2\otimes 3$ mixed state by separable operations. In $3\otimes 3$, pure entanglement can be distilled by separable operation but not by LOCC. In this letter, we proved the conjecture [PRL. 103, 110502 (2009)] that it is possible to distill pure entanglement for $2\otimes 4$ system by LOCC and further improve these in higher dimensions…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: 6 pages, No figure, revtex, comments welcome

  26. arXiv:2211.12500  [pdf, other]

    cs.CV

    Person Image Synthesis via Denoising Diffusion Model

    Authors: Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Jorma Laaksonen, Mubarak Shah, Fahad Shahbaz Khan

    Abstract: The pose-guided person image generation task requires synthesizing photorealistic images of humans in arbitrary poses. The existing approaches use generative adversarial networks that do not necessarily maintain realistic textures or need dense correspondences that struggle to handle complex deformations and severe occlusions. In this work, we show how denoising diffusion models can be applied for…

    Submitted 28 February, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: Accepted to CVPR 2023

  27. arXiv:2210.15146  [pdf, other]

    cs.CV

    Towards Practicality of Sketch-Based Visual Understanding

    Authors: Ayan Kumar Bhunia

    Abstract: Sketches have been used to conceptualise and depict visual objects from pre-historic times. Sketch research has flourished in the past decade, particularly with the proliferation of touchscreen devices. Much of the utilisation of sketch has been anchored around the fact that it can be used to delineate visual concepts universally irrespective of age, race, language, or demography. The fine-grained…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: PhD thesis successfully defended by Ayan Kumar Bhunia, Supervisor: Prof. Yi-Zhe Song, Thesis Examiners: Prof Stella Yu and Prof Adrian Hilton

  28. arXiv:2207.01723  [pdf, other]

    cs.CV

    Adaptive Fine-Grained Sketch-Based Image Retrieval

    Authors: Ayan Kumar Bhunia, Aneeshan Sain, Parth Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: The recent focus on Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) has shifted towards generalising a model to new categories without any training data from them. In real-world applications, however, a trained FG-SBIR model is often applied to both new categories and different human sketchers, i.e., different drawing styles. Although this complicates the generalisation problem, fortunately, a…

    Submitted 19 August, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted in ECCV 2022. Minor typos and Eq.4 corrected

  29. arXiv:2204.11964  [pdf, other]

    cs.CV

    SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text

    Authors: Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song

    Abstract: In this paper, we extend scene understanding to include that of human sketch. The result is a complete trilogy of scene representation from three diverse and complementary modalities -- sketch, photo, and text. Instead of learning a rigid three-way embedding and be done with it, we focus on learning a flexible joint embedding that fully supports the "optionality" that this complementarity brings.…

    Submitted 26 March, 2023; v1 submitted 25 April, 2022; originally announced April 2022.

    Comments: Accepted in Computer Vision and Pattern Recognition (CVPR), 2023

  30. arXiv:2203.14843  [pdf, other]

    cs.CV

    Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches

    Authors: Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Subhadeep Koley, Rohit Kundu, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

    Abstract: The human visual system is remarkable in learning new visual concepts from just a few examples. This is precisely the goal behind few-shot class incremental learning (FSCIL), where the emphasis is additionally placed on ensuring the model does not suffer from "forgetting". In this paper, we push the boundary further for FSCIL by addressing two key questions that bottleneck its ubiquitous applicati…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 10 pages, 3 figures. Accepted in CVPR 2022

  31. arXiv:2203.14817  [pdf, other]

    cs.CV

    Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval

    Authors: Ayan Kumar Bhunia, Subhadeep Koley, Abdullah Faiz Ur Rahman Khilji, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Sketching enables many exciting applications, notably, image retrieval. The fear-to-sketch problem (i.e., "I can't sketch") has however proven to be fatal for its widespread adoption. This paper tackles this "fear" head on, and for the first time, proposes an auxiliary module for existing retrieval models that predominantly lets the users sketch without having to worry. We first conducted a pilot…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022 Code: https://github.com/AyanKumarBhunia/Stroke_Subset_Selector-for-FGSBIR

  32. arXiv:2203.14804  [pdf, other]

    cs.CV

    Partially Does It: Towards Scene-Level FG-SBIR with Partial Input

    Authors: Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

    Abstract: We scrutinise an important observation plaguing scene-level sketch research -- that a significant portion of scene sketches are "partial". A quick pilot study reveals: (i) a scene sketch does not necessarily contain all objects in the corresponding photo, due to the subjective holistic interpretation of scenes, (ii) there exists significant empty (white) regions as a result of object-level abstrac…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Accepted in CVPR 2022

  33. arXiv:2203.14691  [pdf, other]

    cs.CV

    Sketch3T: Test-Time Training for Zero-Shot SBIR

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

    Abstract: Zero-shot sketch-based image retrieval typically asks for a trained model to be applied as is to unseen categories. In this paper, we question this setup, arguing that it is by definition not compatible with the inherent abstract and subjective nature of sketches, i.e., the model might transfer well to new categories, but will not understand sketches existing in different test-time distribution as a…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 10 pages, 5 figures. Accepted in CVPR 2022

  34. arXiv:2203.02113  [pdf, other]

    cs.CV

    FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context

    Authors: Pinaki Nath Chowdhury, Aneeshan Sain, Ayan Kumar Bhunia, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song

    Abstract: We advance sketch research to scenes with the first dataset of freehand scene sketches, FS-COCO. With practical applications in mind, we collect sketches that convey scene content well but can be sketched within a few minutes by a person with any sketching skills. Our dataset comprises 10,000 freehand scene vector sketches with per point space-time information by 100 non-expert individuals, offeri…

    Submitted 20 July, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: Accepted in ECCV 2022. Project Page: https://fscoco.github.io

  35. arXiv:2112.03258  [pdf, other]

    cs.CV cs.GR

    DoodleFormer: Creative Sketch Drawing with Transformers

    Authors: Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen, Michael Felsberg

    Abstract: Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn. Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing the unseen composition of the visual-world objects. Here, we propose a novel coarse-to-fine two-stage fra…

    Submitted 15 September, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: Accepted to ECCV-2022. Project webpage: https://ankanbhunia.github.io/doodleformer/

  36. arXiv:2111.14399  [pdf, ps, other]

    quant-ph

    Nonlocality without entanglement: Party asymmetric case

    Authors: Atanu Bhunia, Indrani Chattopadhyay, Debasis Sarkar

    Abstract: A set of orthogonal product states of a composite Hilbert space is genuinely nonlocal if the states are locally indistinguishable across any bipartition. In this work, we construct a minimal set of party asymmetry genuine nonlocal set in arbitrary large dimensional composite quantum systems $C^d\otimes C^d\otimes C^d$. We provide a local discriminating protocol by using a three qubit GHZ state as…

    Submitted 29 November, 2021; originally announced November 2021.

    Comments: 17 pages, 2 figures, revtex, comments welcome

  37. arXiv:2107.13518  [pdf]

    cond-mat.quant-gas cond-mat.mes-hall quant-ph

    0D-2D Heterostructure for making very Large Quantum Registers using itinerant Bose-Einstein Condensate of Excitons

    Authors: Amit Bhunia, Mohit Kumar Singh, Maryam Al Huwayz, Mohamed Henini, Shouvik Datta

    Abstract: Presence of coherent resonant tunneling in quantum dot (zero-dimensional) - quantum well (two-dimensional) heterostructure is necessary to explain the collective oscillations of average electrical polarization of excitonic dipoles over a macroscopically large area. This was measured using photo excited capacitance as a function of applied voltage bias. Resonant tunneling in this heterostructure de…

    Submitted 19 June, 2023; v1 submitted 28 July, 2021; originally announced July 2021.

    Comments: 53 pages, Manuscript + 14 Figures

    Journal ref: Materials Today Electronics, Volume 4, 100039 (2023)

  38. arXiv:2107.12090  [pdf, other]

    cs.CV

    Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition

    Authors: Ayan Kumar Bhunia, Aneeshan Sain, Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Yi-Zhe Song

    Abstract: Although text recognition has significantly evolved over the years, state-of-the-art (SOTA) models still struggle in the wild scenarios due to complex backgrounds, varying fonts, uncontrolled illuminations, distortions and other artefacts. This is because such models solely depend on visual information for text recognition, thus lacking semantic reasoning capabilities. In this paper, we argue that…

    Submitted 26 July, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: IEEE International Conference on Computer Vision (ICCV), 2021

  39. arXiv:2107.12087  [pdf, other]

    cs.CV

    Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation

    Authors: Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song

    Abstract: Text recognition remains a fundamental and extensively researched topic in computer vision, largely owing to its wide array of commercial applications. The challenging nature of the very problem however dictated a fragmentation of research efforts: Scene Text Recognition (STR) that deals with text in everyday scenes, and Handwriting Text Recognition (HTR) that tackles hand-written text. In this pa…

    Submitted 27 July, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: IEEE International Conference on Computer Vision (ICCV), 2021

  40. arXiv:2107.12081  [pdf, other]

    cs.CV

    Towards the Unseen: Iterative Text Recognition by Distilling from Errors

    Authors: Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

    Abstract: Visual text recognition is undoubtedly one of the most extensively researched topics in computer vision. Great progress has been made to date, with the latest models starting to focus on the more practical "in-the-wild" setting. However, a salient problem still hinders practical deployment -- prior arts mostly struggle with recognising unseen (or rarely seen) character sequences. In this paper, w…

    Submitted 26 July, 2021; originally announced July 2021.

    Comments: IEEE International Conference on Computer Vision (ICCV), 2021

  41. arXiv:2104.03964  [pdf, other]

    cs.CV

    Handwriting Transformers

    Authors: Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Mubarak Shah

    Abstract: We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement as well as global and local writing style patterns. The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, the propos…

    Submitted 8 April, 2021; originally announced April 2021.

    Journal ref: ICCV 2021

  42. arXiv:2104.01876  [pdf, other]

    cs.CV

    MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition

    Authors: Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

    Abstract: Handwritten Text Recognition (HTR) remains a challenging problem to date, largely due to the varying writing styles that exist amongst us. Prior works however generally operate with the assumption that there is a limited number of styles, most of which have already been captured by existing datasets. In this paper, we take a completely different perspective -- we work on the assumption that there…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021

  43. arXiv:2103.15706  [pdf, other]

    cs.CV

    StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

    Abstract: Sketch-based image retrieval (SBIR) is a cross-modal matching problem which is typically solved by learning a joint embedding space where the semantic content shared between photo and sketch modalities is preserved. However, a fundamental challenge in SBIR has been largely ignored so far, that is, sketches are drawn by humans and considerable style variations exist amongst different users. An eff…

    Submitted 31 March, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021

  44. arXiv:2103.13990  [pdf, other]

    cs.CV

    More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

    Authors: Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song

    Abstract: A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is the data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs. Whilst the number of photos can be easily scaled, each corresponding sketch still needs to be individually produced. In this paper, we aim to mitigate such an upper-bound on sketch data, and study…

    Submitted 25 March, 2021; originally announced March 2021.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021. Code: https://github.com/AyanKumarBhunia/semisupervised-FGSBIR

  45. arXiv:2103.13716  [pdf, other]

    cs.CV

    Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting

    Authors: Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

    Abstract: Self-supervised learning has gained prominence due to its efficacy at learning powerful representations from unlabelled data that achieve excellent performance on many challenging downstream tasks. However supervision-free pre-text tasks are challenging to design and usually modality specific. Although there is a rich literature of self-supervised methods for either spatial (such as images) or tem…

    Submitted 25 March, 2021; originally announced March 2021.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021. Code: https://github.com/AyanKumarBhunia/Self-Supervised-Learning-for-Sketch

  46. arXiv:2011.03830  [pdf, ps, other]

    quant-ph

    Nonlocality of tripartite orthogonal product states

    Authors: Atanu Bhunia, Indrani Chattopadhyay, Debasis Sarkar

    Abstract: Local distinguishability of orthogonal product states is an area of active research in quantum information theory. However, most of the relevant results about local distinguishability are found in bipartite quantum systems and very few are known in multipartite systems. In this work, we construct a locally indistinguishable subset in…

    Submitted 7 November, 2020; originally announced November 2020.

    Comments: 15 pages, 7 figures, revtex, comments welcome

  47. arXiv:2007.15103  [pdf, other]

    cs.CV cs.IR

    Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval

    Authors: Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

    Abstract: Sketch as an image search query is an ideal alternative to text in capturing the fine-grained visual details. Prior successes on fine-grained sketch-based image retrieval (FG-SBIR) have demonstrated the importance of tackling the unique traits of sketches as opposed to photos, e.g., temporal vs. static, strokes vs. pixels, and abstract vs. pixel-perfect. In this paper, we study a further trait of…

    Submitted 11 August, 2020; v1 submitted 29 July, 2020; originally announced July 2020.

    Comments: Accepted for ORAL presentation in BMVC 2020

  48. arXiv:2003.03836  [pdf, other]

    cs.CV

    Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches

    Authors: Ruoyi Du, Dongliang Chang, Ayan Kumar Bhunia, Jiyang Xie, Zhanyu Ma, Yi-Zhe Song, Jun Guo

    Abstract: Fine-grained visual classification (FGVC) is much more challenging than traditional classification tasks due to the inherently subtle intra-class object variations. Recent works mainly tackle this problem by focusing on how to locate the most discriminative parts, more complementary parts, and parts of various granularities. However, less effort has been placed to which granularities are the most…

    Submitted 19 July, 2020; v1 submitted 8 March, 2020; originally announced March 2020.

  49. arXiv:2002.10310  [pdf, other]

    cs.CV

    Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval

    Authors: Ayan Kumar Bhunia, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

    Abstract: Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo instance given a user's query sketch. Its widespread applicability is however hindered by the fact that drawing a sketch takes time, and most people struggle to draw a complete and faithful sketch. In this paper, we reformulate the conventional FG-SBIR framework to tackle these challenges, wi…

    Submitted 11 May, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2020 [Oral Presentation] Code: https://github.com/AyanKumarBhunia/on-the-fly-FGSBIR

  50. The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification

    Authors: Dongliang Chang, Yifeng Ding, Jiyang Xie, Ayan Kumar Bhunia, Xiaoxu Li, Zhanyu Ma, Ming Wu, Jun Guo, Yi-Zhe Song

    Abstract: Key for solving fine-grained image categorization is finding discriminative and local regions that correspond to subtle visual traits. Great strides have been made, with complex networks designed specifically to learn part-level discriminative feature representations. In this paper, we show it is possible to cultivate subtle details without the need for overly complicated network designs or training m…

    Submitted 10 August, 2021; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: TIP2020. Code available at https://github.com/dongliangchang/Mutual-Channel-Loss