Showing 1–13 of 13 results for author: Saha, O

Searching in archive cs.
  1. arXiv:2406.16273

    cs.CV

    YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals

    Authors: Sandeep Mishra, Oindrila Saha, Alan C. Bovik

    Abstract: 3D generation guided by text-to-image diffusion models enables the creation of visually compelling assets. However previous methods explore generation based on image or text. The boundaries of creativity are limited by what can be expressed through words or the images that can be sourced. We present YouDream, a method to generate high-quality anatomically controllable animals. YouDream is guided u…

    Submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.11988

    cs.CV cs.AI cs.CY cs.LG

    Decomposed evaluations of geographic disparities in text-to-image models

    Authors: Abhishek Sureddy, Dishant Padalia, Nandhinee Periyakaruppa, Oindrila Saha, Adina Williams, Adriana Romero-Soriano, Megan Richards, Polina Kirichenko, Melissa Hall

    Abstract: Recent work has identified substantial disparities in generated images of different geographic regions, including stereotypical depictions of everyday objects like houses and cars. However, existing measures for these disparities have been limited to either human evaluations, which are time-consuming and costly, or automatic metrics evaluating full images, which are unable to attribute these dispa…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2406.07742

    cs.CV

    C3DAG: Controlled 3D Animal Generation using 3D pose guidance

    Authors: Sandeep Mishra, Oindrila Saha, Alan C. Bovik

    Abstract: Recent advancements in text-to-3D generation have demonstrated the ability to generate high quality 3D assets. However while generating animals these methods underperform, often portraying inaccurate anatomy and geometry. Towards ameliorating this defect, we present C3DAG, a novel pose-Controlled text-to-3D Animal Generation framework which generates a high quality 3D animal consistent with a give…

    Submitted 11 June, 2024; originally announced June 2024.

  4. arXiv:2404.16687

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  5. arXiv:2401.02460

    cs.CV

    Improved Zero-Shot Classification by Adapting VLMs with Text Descriptions

    Authors: Oindrila Saha, Grant Van Horn, Subhransu Maji

    Abstract: The zero-shot performance of existing vision-language models (VLMs) such as CLIP is limited by the availability of large-scale, aligned image and text datasets in specific domains. In this work, we leverage two complementary sources of information -- descriptions of categories generated by large language models (LLMs) and abundant, fine-grained image classification datasets -- to improve the zero-…

    Submitted 3 April, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

  6. arXiv:2309.13822

    cs.CV

    PARTICLE: Part Discovery and Contrastive Learning for Fine-grained Recognition

    Authors: Oindrila Saha, Subhransu Maji

    Abstract: We develop techniques for refining representations for fine-grained classification and segmentation tasks in a self-supervised manner. We find that fine-tuning methods based on instance-discriminative contrastive learning are not as effective, and posit that recognizing part-specific variations is crucial for fine-grained categorization. We present an iterative learning approach that incorporates…

    Submitted 24 September, 2023; originally announced September 2023.

  7. arXiv:2204.05393

    cs.CV

    Improving Few-Shot Part Segmentation using Coarse Supervision

    Authors: Oindrila Saha, Zezhou Cheng, Subhransu Maji

    Abstract: A significant bottleneck in training deep networks for part segmentation is the cost of obtaining detailed annotations. We propose a framework to exploit coarse labels such as figure-ground masks and keypoint locations that are readily available for some categories to improve part segmentation models. A key challenge is that these annotations were collected for different tasks and with different l…

    Submitted 27 July, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: ECCV'22 Camera Ready

  8. arXiv:2112.00854

    cs.CV

    GANORCON: Are Generative Models Useful for Few-shot Segmentation?

    Authors: Oindrila Saha, Zezhou Cheng, Subhransu Maji

    Abstract: Advances in generative modeling based on GANs has motivated the community to find their use beyond image generation and editing tasks. In particular, several recent works have shown that GAN representations can be re-purposed for discriminative tasks such as part segmentation, especially when training data is limited. But how do these improvements stack-up against recent advances in self-supervise…

    Submitted 28 April, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: CVPR 2022 Camera Ready Version

  9. arXiv:2110.09018

    cs.RO cs.AI cs.LG

    Reinforcement Learning-Based Coverage Path Planning with Implicit Cellular Decomposition

    Authors: Javad Heydari, Olimpiya Saha, Viswanath Ganapathy

    Abstract: Coverage path planning in a generic known environment is shown to be NP-hard. When the environment is unknown, it becomes more challenging as the robot is required to rely on its online map information built during coverage for planning its path. A significant research effort focuses on designing heuristic or approximate algorithms that achieve reasonable performance. Such algorithms have sub-opti…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: 20 pages

  10. arXiv:2110.05015

    cs.LG

    A Survey on Proactive Customer Care: Enabling Science and Steps to Realize it

    Authors: Viswanath Ganapathy, Sauptik Dhar, Olimpiya Saha, Pelin Kurt Garberson, Javad Heydari, Mohak Shah

    Abstract: In recent times, advances in artificial intelligence (AI) and IoT have enabled seamless and viable maintenance of appliances in home and building environments. Several studies have shown that AI has the potential to provide personalized customer support which could predict and avoid errors more reliably than ever before. In this paper, we have analyzed the various building blocks needed to enable…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:1912.07383, arXiv:2007.02500 by other authors

  11. arXiv:2008.13745

    cs.CV cs.LG

    RecSal : Deep Recursive Supervision for Visual Saliency Prediction

    Authors: Sandeep Mishra, Oindrila Saha

    Abstract: State-of-the-art saliency prediction methods develop upon model architectures or loss functions; while training to generate one target saliency map. However, publicly available saliency prediction datasets can be utilized to create more information for each stimulus than just a final aggregate saliency map. This information when utilized in a biologically inspired fashion can contribute in better…

    Submitted 31 August, 2020; originally announced August 2020.

    Comments: to appear in BMVC 2020

  12. arXiv:2002.11921

    cs.CV cs.LG

    RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference

    Authors: Oindrila Saha, Aditya Kusupati, Harsha Vardhan Simhadri, Manik Varma, Prateek Jain

    Abstract: Standard Convolutional Neural Networks (CNNs) designed for computer vision tasks tend to have large intermediate activation maps. These require large working memory and are thus unsuitable for deployment on resource-constrained devices typically used for inference on the edge. Aggressively downsampling the images via pooling or strided convolutions can address the problem but leads to a significan…

    Submitted 22 October, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: 25 pages, 8 figures. Published at Advances in Neural Information Processing Systems (NeurIPS) 2020

  13. arXiv:1902.03122

    cs.CV

    Fully Convolutional Neural Network for Semantic Segmentation of Anatomical Structure and Pathologies in Colour Fundus Images Associated with Diabetic Retinopathy

    Authors: Oindrila Saha, Rachana Sathish, Debdoot Sheet

    Abstract: Diabetic retinopathy (DR) is the most common form of diabetic eye disease. Retinopathy can affect all diabetic patients and becomes particularly dangerous, increasing the risk of blindness, if it is left untreated. The success rate of its curability solemnly depends on diagnosis at an early stage. The development of automated computer aided disease diagnosis tools could help in faster detection of…

    Submitted 7 February, 2019; originally announced February 2019.

    Comments: arXiv admin note: text overlap with arXiv:1511.00561 by other authors