Showing 1–50 of 73 results for author: Roy-Chowdhury, A K

  1. arXiv:2407.03549  [pdf, other]

    cs.CV

    POSTURE: Pose Guided Unsupervised Domain Adaptation for Human Body Part Segmentation

    Authors: Arindam Dutta, Rohit Lal, Yash Garg, Calvin-Khang Ta, Dripta S. Raychaudhuri, Hannah Dela Cruz, Amit K. Roy-Chowdhury

    Abstract: Existing algorithms for human body part segmentation have shown promising results on challenging datasets, primarily relying on end-to-end supervision. However, these algorithms exhibit severe performance drops in the face of domain shifts, leading to inaccurate segmentation masks. To tackle this issue, we introduce POSTURE: Pose Guided Unsupervised Domain Adapt…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2406.02575  [pdf, other]

    cs.CL cs.CR cs.LG

    Cross-Modal Safety Alignment: Is textual unlearning all you need?

    Authors: Trishna Chakraborty, Erfan Shayegani, Zikui Cai, Nael Abu-Ghazaleh, M. Salman Asif, Yue Dong, Amit K. Roy-Chowdhury, Chengyu Song

    Abstract: Recent studies reveal that integrating new modalities into Large Language Models (LLMs), such as Vision-Language Models (VLMs), creates a new attack surface that bypasses existing safety training techniques like Supervised Fine-tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF). While further SFT and RLHF-based safety training can be conducted in multi-modal settings, collecting mu…

    Submitted 27 May, 2024; originally announced June 2024.

  3. arXiv:2402.08769  [pdf, other]

    cs.LG cs.DC

    FLASH: Federated Learning Across Simultaneous Heterogeneities

    Authors: Xiangyu Chang, Sk Miraj Ahmed, Srikanth V. Krishnamurthy, Basak Guler, Ananthram Swami, Samet Oymak, Amit K. Roy-Chowdhury

    Abstract: The key premise of federated learning (FL) is to train ML models across a diverse set of data-owners (clients), without exchanging local data. An overarching challenge to this date is client heterogeneity, which may arise not only from variations in data distribution, but also in data quality, as well as compute/communication latency. An integrated view of these diverse and concurrent sources of h…

    Submitted 13 February, 2024; originally announced February 2024.

  4. arXiv:2401.04130  [pdf, other]

    cs.LG cs.AI

    Plug-and-Play Transformer Modules for Test-Time Adaptation

    Authors: Xiangyu Chang, Sk Miraj Ahmed, Srikanth V. Krishnamurthy, Basak Guler, Ananthram Swami, Samet Oymak, Amit K. Roy-Chowdhury

    Abstract: Parameter-efficient tuning (PET) methods such as LoRA, Adapter, and Visual Prompt Tuning (VPT) have found success in enabling adaptation to new domains by tuning small modules within a transformer model. However, the number of domains encountered during test time can be very large, and the data is usually unlabeled. Thus, adaptation to new domains is challenging; it is also impractical to generate…

    Submitted 8 February, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

  5. arXiv:2401.02561  [pdf, other]

    cs.LG

    MeTA: Multi-source Test Time Adaptation

    Authors: Sk Miraj Ahmed, Fahim Faisal Niloy, Dripta S. Raychaudhuri, Samet Oymak, Amit K. Roy-Chowdhury

    Abstract: Test time adaptation is the process of adapting, in an unsupervised manner, a pre-trained source model to each incoming batch of the test data (i.e., without requiring a substantial portion of the test data to be available, as in traditional domain adaptation) and without access to the source data. Since it works with each batch of test data, it is well-suited for dynamic environments where decisi…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: Under Review

  6. arXiv:2312.16221  [pdf, other]

    cs.CV

    STRIDE: Single-video based Temporally Continuous Occlusion Robust 3D Pose Estimation

    Authors: Rohit Lal, Saketh Bachu, Yash Garg, Arindam Dutta, Calvin-Khang Ta, Dripta S. Raychaudhuri, Hannah Dela Cruz, M. Salman Asif, Amit K. Roy-Chowdhury

    Abstract: The capability to accurately estimate 3D human poses is crucial for diverse fields such as action recognition, gait recognition, and virtual/augmented reality. However, a persistent and significant challenge within this field is the accurate prediction of human poses under conditions of severe occlusion. Traditional image-based estimators struggle with heavy occlusions due to a lack of temporal co…

    Submitted 13 March, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

  7. arXiv:2312.05407  [pdf, other]

    cs.CV

    Active Learning Guided Federated Online Adaptation: Applications in Medical Image Segmentation

    Authors: Md Shazid Islam, Sayak Nag, Arindam Dutta, Miraj Ahmed, Fahim Faisal Niloy, Amit K. Roy-Chowdhury

    Abstract: Data privacy, storage, and distribution shifts are major bottlenecks in medical image analysis. Data cannot be shared across patients, physicians, and facilities due to privacy concerns, usually requiring each patient's data to be analyzed in a discreet setting at a near real-time pace. However, one would like to take advantage of the accumulated knowledge across healthcare facilities as the compu…

    Submitted 8 December, 2023; originally announced December 2023.

  8. arXiv:2312.02420  [pdf, other]

    cs.CV

    Towards Granularity-adjusted Pixel-level Semantic Annotation

    Authors: Rohit Kundu, Sudipta Paul, Rohit Lal, Amit K. Roy-Chowdhury

    Abstract: Recent advancements in computer vision predominantly rely on learning-based systems, leveraging annotations as the driving force to develop specialized models. However, annotating pixel-level information, particularly in semantic segmentation, presents a challenging and labor-intensive task, prompting the need for autonomous processes. In this work, we propose GranSAM which distinguishes itself by…

    Submitted 4 December, 2023; originally announced December 2023.

  9. arXiv:2311.05077  [pdf, other]

    cs.CV

    POISE: Pose Guided Human Silhouette Extraction under Occlusions

    Authors: Arindam Dutta, Rohit Lal, Dripta S. Raychaudhuri, Calvin Khang Ta, Amit K. Roy-Chowdhury

    Abstract: Human silhouette extraction is a fundamental task in computer vision with applications in various downstream tasks. However, occlusions pose a significant challenge, leading to incomplete and distorted silhouettes. To address this challenge, we introduce POISE: Pose Guided Human Silhouette Extraction under Occlusions, a novel self-supervised fusion framework that enhances accuracy and robustness i…

    Submitted 8 November, 2023; originally announced November 2023.

    Journal ref: Winter Conference on Applications of Computer Vision, 2024

  10. arXiv:2311.04991  [pdf, other]

    cs.LG cs.CV

    Effective Restoration of Source Knowledge in Continual Test Time Adaptation

    Authors: Fahim Faisal Niloy, Sk Miraj Ahmed, Dripta S. Raychaudhuri, Samet Oymak, Amit K. Roy-Chowdhury

    Abstract: Traditional test-time adaptation (TTA) methods face significant challenges in adapting to dynamic environments characterized by continuously changing long-term target distributions. These challenges primarily stem from two factors: catastrophic forgetting of previously learned valuable source knowledge and gradual error accumulation caused by miscalibrated pseudo labels. To address these issues, t…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: WACV 2024

  11. arXiv:2309.11157  [pdf, other]

    cs.CV

    Learning Deformable 3D Graph Similarity to Track Plant Cells in Unregistered Time Lapse Images

    Authors: Md Shazid Islam, Arindam Dutta, Calvin-Khang Ta, Kevin Rodriguez, Christian Michael, Mark Alber, G. Venugopala Reddy, Amit K. Roy-Chowdhury

    Abstract: Tracking of plant cells in images obtained by microscope is a challenging problem due to biological phenomena such as large number of cells, non-uniform growth of different layers of the tightly packed plant cells and cell division. Moreover, images in deeper layers of the tissue being noisy and unavoidable systemic errors inherent in the imaging process further complicates the problem. In this pa…

    Submitted 21 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  12. arXiv:2308.13954  [pdf, other]

    cs.CV

    Prior-guided Source-free Domain Adaptation for Human Pose Estimation

    Authors: Dripta S. Raychaudhuri, Calvin-Khang Ta, Arindam Dutta, Rohit Lal, Amit K. Roy-Chowdhury

    Abstract: Domain adaptation methods for 2D human pose estimation typically require continuous access to the source data during adaptation, which can be challenging due to privacy, memory, or computational constraints. To address this limitation, we focus on the task of source-free domain adaptation for pose estimation, where a source model must adapt to a new target domain using only unlabeled target data.…

    Submitted 26 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCV 2023

  13. arXiv:2308.11880  [pdf, other]

    cs.CV cs.LG

    SUMMIT: Source-Free Adaptation of Uni-Modal Models to Multi-Modal Targets

    Authors: Cody Simons, Dripta S. Raychaudhuri, Sk Miraj Ahmed, Suya You, Konstantinos Karydis, Amit K. Roy-Chowdhury

    Abstract: Scene understanding using multi-modal data is necessary in many applications, e.g., autonomous navigation. To achieve this in a variety of situations, existing models must be able to adapt to shifting data distributions without arduous data annotation. Current approaches assume that the source data is available during adaptation and that the source consists of paired multi-modal data. Both these a…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: 12 pages, 5 figures, 9 tables, ICCV 2023

  14. arXiv:2308.11744  [pdf, other]

    cs.CV

    Efficient Controllable Multi-Task Architectures

    Authors: Abhishek Aich, Samuel Schulter, Amit K. Roy-Chowdhury, Manmohan Chandraker, Yumin Suh

    Abstract: We aim to train a multi-task model such that users can adjust the desired compute budget and relative importance of task performances after deployment, without retraining. This enables optimizing performance for dynamically varying user needs, without heavy computational overhead to train and save models for various scenarios. To this end, we propose a multi-task model consisting of a shared encod…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  15. arXiv:2307.04905  [pdf, other]

    cs.LG cs.DC

    FedYolo: Augmenting Federated Learning with Pretrained Transformers

    Authors: Xuechen Zhang, Mingchen Li, Xiangyu Chang, Jiasi Chen, Amit K. Roy-Chowdhury, Ananda Theertha Suresh, Samet Oymak

    Abstract: The growth and diversity of machine learning applications motivate a rethinking of learning with mobile and edge devices. How can we address diverse client goals and learn with scarce heterogeneous data? While federated learning aims to address these issues, it has challenges hindering a unified solution. Large transformer models have been shown to work across a variety of tasks achieving remarkab…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: 20 pages, 18 figures

  16. Collaborative Multi-Agent Video Fast-Forwarding

    Authors: Shuyue Lan, Zhilu Wang, Ermin Wei, Amit K. Roy-Chowdhury, Qi Zhu

    Abstract: Multi-agent applications have recently gained significant popularity. In many computer vision tasks, a network of agents, such as a team of robots with cameras, could work collaboratively to perceive the environment for efficient and accurate situation awareness. However, these agents often have limited computation, communication, and storage resources. Thus, reducing resource consumption while st…

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: IEEE Transactions on Multimedia, 2023. arXiv admin note: text overlap with arXiv:2008.04437

  17. arXiv:2212.07010  [pdf, other]

    cs.CV

    Cross-Domain Video Anomaly Detection without Target Domain Adaptation

    Authors: Abhishek Aich, Kuan-Chuan Peng, Amit K. Roy-Chowdhury

    Abstract: Most cross-domain unsupervised Video Anomaly Detection (VAD) works assume that at least few task-relevant target domain training data are available for adaptation from the source to the target domain. However, this requires laborious model-tuning by the end-user who may prefer to have a system that works "out-of-the-box." To address such practical scenarios, we identify a novel target domain (inf…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: Accepted at WACV 2023; Includes Supplementary Material

  18. arXiv:2210.07940  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    AVLEN: Audio-Visual-Language Embodied Navigation in 3D Environments

    Authors: Sudipta Paul, Amit K. Roy-Chowdhury, Anoop Cherian

    Abstract: Recent years have seen embodied visual navigation advance in two distinct directions: (i) in equipping the AI agent to follow natural language instructions, and (ii) in making the navigable world multimodal, e.g., audio-visual navigation. However, the real world is not only multimodal, but also often complex, and thus in spite of these advances, agents still need to understand the uncertainty in t…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Accepted at NeurIPS 2022

  19. arXiv:2210.01298  [pdf, other]

    cs.CV cs.RO

    Centroid Distance Keypoint Detector for Colored Point Clouds

    Authors: Hanzhe Teng, Dimitrios Chatziparaschis, Xinyue Kan, Amit K. Roy-Chowdhury, Konstantinos Karydis

    Abstract: Keypoint detection serves as the basis for many computer vision and robotics applications. Despite the fact that colored point clouds can be readily obtained, most existing keypoint detectors extract only geometry-salient keypoints, which can impede the overall performance of systems that intend to (or have the potential to) leverage color information. To promote advances in such systems, we propo…

    Submitted 15 June, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023; copyright will be transferred to IEEE upon publication

  20. arXiv:2209.09883  [pdf, other]

    cs.CV

    Leveraging Local Patch Differences in Multi-Object Scenes for Generative Adversarial Attacks

    Authors: Abhishek Aich, Shasha Li, Chengyu Song, M. Salman Asif, Srikanth V. Krishnamurthy, Amit K. Roy-Chowdhury

    Abstract: State-of-the-art generative model-based attacks against image classifiers overwhelmingly focus on single-object (i.e., single dominant object) images. Different from such settings, we tackle a more practical problem of generating adversarial perturbations using multi-object (i.e., multiple dominant objects) images as they are representative of most real-world scenes. Our goal is to design an attac…

    Submitted 3 October, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

    Comments: Accepted at WACV 2023 (Round 1), camera-ready version

  21. arXiv:2209.09502  [pdf, other]

    cs.CV

    GAMA: Generative Adversarial Multi-Object Scene Attacks

    Authors: Abhishek Aich, Calvin-Khang Ta, Akash Gupta, Chengyu Song, Srikanth V. Krishnamurthy, M. Salman Asif, Amit K. Roy-Chowdhury

    Abstract: The majority of methods for crafting adversarial attacks have focused on scenes with a single dominant object (e.g., images from ImageNet). On the other hand, natural scenes include multiple dominant objects that are semantically related. Thus, it is crucial to explore designing attack strategies that look beyond learning on single-object scenes or attack single-object victim classifiers. Due to t…

    Submitted 15 October, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

    Comments: Accepted at NeurIPS 2022; First two authors contributed equally; Includes Supplementary Material

  22. arXiv:2209.04027  [pdf, other]

    cs.CV

    Cross-Modal Knowledge Transfer Without Task-Relevant Source Data

    Authors: Sk Miraj Ahmed, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Amit K. Roy-Chowdhury

    Abstract: Cost-effective depth and infrared sensors as alternatives to usual RGB sensors are now a reality, and have some advantages over RGB in domains like autonomous navigation and remote sensing. As such, building computer vision and deep learning systems for depth and infrared data are crucial. However, large labeled datasets for these modalities are still lacking. In such cases, transferring knowledge…

    Submitted 8 September, 2022; originally announced September 2022.

  23. Poisson2Sparse: Self-Supervised Poisson Denoising From a Single Image

    Authors: Calvin-Khang Ta, Abhishek Aich, Akash Gupta, Amit K. Roy-Chowdhury

    Abstract: Image enhancement approaches often assume that the noise is signal independent, and approximate the degradation model as zero-mean additive Gaussian. However, this assumption does not hold for biomedical imaging systems where sensor-based sources of noise are proportional to signal strengths, and the noise is better represented as a Poisson process. In this work, we explore a sparsity and dictiona…

    Submitted 27 June, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: Accepted to MICCAI 2022

  24. arXiv:2204.00942  [pdf, other]

    cs.CV

    A-ACT: Action Anticipation through Cycle Transformations

    Authors: Akash Gupta, Jingen Liu, Liefeng Bo, Amit K. Roy-Chowdhury, Tao Mei

    Abstract: While action anticipation has garnered a lot of research interest recently, most of the works focus on anticipating future action directly through observed visual cues only. In this work, we take a step back to analyze how the human capability to anticipate the future can be transferred to machine learning algorithms. To incorporate this ability in intelligent systems a question worth pondering up…

    Submitted 2 April, 2022; originally announced April 2022.

  25. arXiv:2203.15230  [pdf, other]

    cs.CV cs.CR cs.LG

    Zero-Query Transfer Attacks on Context-Aware Object Detectors

    Authors: Zikui Cai, Shantanu Rane, Alejandro E. Brito, Chengyu Song, Srikanth V. Krishnamurthy, Amit K. Roy-Chowdhury, M. Salman Asif

    Abstract: Adversarial attacks perturb images such that a deep neural network produces incorrect classification results. A promising approach to defend against adversarial attacks on natural multi-object scenes is to impose a context-consistency check, wherein, if the detected objects are not consistent with an appropriately defined context, then an attack is suspected. Stronger attacks are needed to fool su…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: CVPR 2022 Accepted

  26. arXiv:2203.14949  [pdf, other]

    cs.CV cs.LG

    Controllable Dynamic Multi-Task Architectures

    Authors: Dripta S. Raychaudhuri, Yumin Suh, Samuel Schulter, Xiang Yu, Masoud Faraki, Amit K. Roy-Chowdhury, Manmohan Chandraker

    Abstract: Multi-task learning commonly encounters competition for resources among tasks, specifically when model capacity is limited. This challenge motivates models which allow control over the relative importance of tasks and total compute cost during inference time. In this work, we propose such a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired t…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Accepted at CVPR 2022

  27. arXiv:2112.03223  [pdf, other]

    cs.CV cs.AI cs.LG

    Context-Aware Transfer Attacks for Object Detection

    Authors: Zikui Cai, Xinxin Xie, Shasha Li, Mingjun Yin, Chengyu Song, Srikanth V. Krishnamurthy, Amit K. Roy-Chowdhury, M. Salman Asif

    Abstract: Blackbox transfer attacks for image classifiers have been extensively studied in recent years. In contrast, little progress has been made on transfer attacks for object detectors. Object detectors take a holistic view of the image and the detection of one object (or lack thereof) often depends on other objects in the scene. This makes such detectors inherently context-aware and adversarial attacks…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: accepted to AAAI 2022

  28. arXiv:2110.12321  [pdf, other]

    cs.CV cs.LG

    ADC: Adversarial attacks against object Detection that evade Context consistency checks

    Authors: Mingjun Yin, Shasha Li, Chengyu Song, M. Salman Asif, Amit K. Roy-Chowdhury, Srikanth V. Krishnamurthy

    Abstract: Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial examples, which are slightly perturbed input images which lead DNNs to make wrong predictions. To protect from such examples, various defense strategies have been proposed. A very recent defense strategy for detecting adversarial examples, that has been shown to be robust to current attacks, is to check for intrinsic conte…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: WACV'22 Accepted

  29. arXiv:2110.01823  [pdf, other]

    cs.CV

    Adversarial Attacks on Black Box Video Classifiers: Leveraging the Power of Geometric Transformations

    Authors: Shasha Li, Abhishek Aich, Shitong Zhu, M. Salman Asif, Chengyu Song, Amit K. Roy-Chowdhury, Srikanth V. Krishnamurthy

    Abstract: When compared to the image classification models, black-box adversarial attacks against video classification models have been largely understudied. This could be possible because, with video, the temporal dimension poses significant additional challenges in gradient estimation. Query-efficient black-box attacks rely on effectively estimated gradients towards maximizing the probability of misclassi…

    Submitted 26 October, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

    Comments: Accepted at NeurIPS 2021; First two authors contributed equally; Includes Supplementary Material

  30. arXiv:2108.09891  [pdf, other]

    cs.CV

    Multi-Expert Adversarial Attack Detection in Person Re-identification Using Context Inconsistency

    Authors: Xueping Wang, Shasha Li, Min Liu, Yaonan Wang, Amit K. Roy-Chowdhury

    Abstract: The success of deep neural networks (DNNs) has promoted the widespread applications of person re-identification (ReID). However, ReID systems inherit the vulnerability of DNNs to malicious attacks of visually inconspicuous adversarial perturbations. Detection of adversarial attacks is, therefore, a fundamental requirement for robust ReID systems. In this work, we propose a Multi-Expert Adversarial…

    Submitted 31 March, 2022; v1 submitted 22 August, 2021; originally announced August 2021.

    Comments: Accepted at IEEE ICCV 2021

  31. arXiv:2108.08421  [pdf, other]

    cs.CV cs.LG

    Exploiting Multi-Object Relationships for Detecting Adversarial Attacks in Complex Scenes

    Authors: Mingjun Yin, Shasha Li, Zikui Cai, Chengyu Song, M. Salman Asif, Amit K. Roy-Chowdhury, Srikanth V. Krishnamurthy

    Abstract: Vision systems that deploy Deep Neural Networks (DNNs) are known to be vulnerable to adversarial examples. Recent research has shown that checking the intrinsic consistencies in the input data is a promising way to detect adversarial attacks (e.g., by checking the object co-occurrence relationships in complex scenes). However, existing approaches are tied to specific models and do not offer genera…

    Submitted 18 August, 2021; originally announced August 2021.

    Comments: ICCV'21 Accepted

  32. arXiv:2108.02832  [pdf, other]

    eess.IV cs.CV

    Ada-VSR: Adaptive Video Super-Resolution with Meta-Learning

    Authors: Akash Gupta, Padmaja Jonnalagedda, Bir Bhanu, Amit K. Roy-Chowdhury

    Abstract: Most of the existing works in supervised spatio-temporal video super-resolution (STVSR) heavily rely on a large-scale external dataset consisting of paired low-resolution low-frame-rate (LR-LFR) and high-resolution high-frame-rate (HR-HFR) videos. Despite their remarkable performance, these methods make a prior assumption that the low-resolution video is obtained by down-scaling the high-resolution…

    Submitted 5 August, 2021; originally announced August 2021.

  33. arXiv:2108.00340  [pdf, other]

    cs.CV

    Reconstruction guided Meta-learning for Few Shot Open Set Recognition

    Authors: Sayak Nag, Dripta S. Raychaudhuri, Sujoy Paul, Amit K. Roy-Chowdhury

    Abstract: In many applications, we are constrained to learn classifiers from very limited data (few-shot classification). The task becomes even more challenging if it is also required to identify samples from unknown categories (open-set classification). Learning a good abstraction for a class with very few samples is extremely difficult, especially under open-set settings. As a result, open-set recognition…

    Submitted 30 September, 2023; v1 submitted 31 July, 2021; originally announced August 2021.

    Comments: Accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  34. arXiv:2107.14368  [pdf, other]

    eess.IV cs.LG

    Deep Quantized Representation for Enhanced Reconstruction

    Authors: Akash Gupta, Abhishek Aich, Kevin Rodriguez, G. Venugopala Reddy, Amit K. Roy-Chowdhury

    Abstract: While machine learning approaches have shown remarkable performance in biomedical image analysis, most of these methods rely on high-quality and accurate imaging data. However, collecting such data requires intensive and careful manual effort. One of the major challenges in imaging the Shoot Apical Meristem (SAM) of Arabidopsis thaliana, is that the deeper slices in the z-stack suffer from differe…

    Submitted 29 July, 2021; originally announced July 2021.

    Comments: Accepted to ISBI Workshop, 2020

  35. arXiv:2107.11878  [pdf, other]

    cs.CV

    Spatio-Temporal Representation Factorization for Video-based Person Re-Identification

    Authors: Abhishek Aich, Meng Zheng, Srikrishna Karanam, Terrence Chen, Amit K. Roy-Chowdhury, Ziyan Wu

    Abstract: Despite much recent progress in video-based person re-identification (re-ID), the current state-of-the-art still suffers from common real-world challenges such as appearance similarity among various people, occlusions, and frame misalignment. To alleviate these problems, we propose Spatio-Temporal Representation Factorization (STRF), a flexible new computational unit that can be used in conjunctio…

    Submitted 14 August, 2021; v1 submitted 25 July, 2021; originally announced July 2021.

    Comments: Accepted at IEEE ICCV 2021, Includes Supplementary Material

  36. arXiv:2105.10037  [pdf, other]

    cs.LG cs.AI

    Cross-domain Imitation from Observations

    Authors: Dripta S. Raychaudhuri, Sujoy Paul, Jeroen van Baar, Amit K. Roy-Chowdhury

    Abstract: Imitation learning seeks to circumvent the difficulty in designing proper reward functions for training agents by utilizing expert behavior. With environments modeled as Markov Decision Processes (MDP), most of the existing imitation algorithms are contingent on the availability of expert demonstrations in the same MDP as the one in which a new imitation policy is to be learned. In this paper, we…

    Submitted 20 May, 2021; originally announced May 2021.

    Comments: Accepted at ICML 2021 as a long presentation

  37. arXiv:2104.01845  [pdf, other]

    cs.LG cs.CV

    Unsupervised Multi-source Domain Adaptation Without Access to Source Data

    Authors: Sk Miraj Ahmed, Dripta S. Raychaudhuri, Sujoy Paul, Samet Oymak, Amit K. Roy-Chowdhury

    Abstract: Unsupervised Domain Adaptation (UDA) aims to learn a predictor model for an unlabeled domain by transferring knowledge from a separate labeled source domain. However, most of these conventional UDA approaches make the strong assumption of having access to the source data during training, which may not be very practical due to privacy, security and storage concerns. A recent line of work addressed…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: This paper will appear at CVPR 2021

  38. arXiv:2103.08134  [pdf, other]

    cs.CV

    Detection and Localization of Facial Expression Manipulations

    Authors: Ghazal Mazaheri, Amit K. Roy-Chowdhury

    Abstract: Concern regarding the wide-spread use of fraudulent images/videos in social media necessitates precise detection of such fraud. The importance of facial expressions in communication is widely known, and adversarial attacks often focus on manipulating the expression related features. Thus, it is important to develop methods that can detect manipulations in facial expressions, and localize the manip…

    Submitted 15 March, 2021; originally announced March 2021.

  39. arXiv:2102.01874  [pdf, other]

    cs.CV

    Learning to identify image manipulations in scientific publications

    Authors: Ghazal Mazaheri, Kevin Urrutia Avila, Amit K. Roy-Chowdhury

    Abstract: Adherence to scientific community standards ensures objectivity, clarity, reproducibility, and helps prevent bias, fabrication, falsification, and plagiarism. To help scientific integrity officers and journal/publisher reviewers monitor if researchers stick with these standards, it is important to have a solid procedure to detect duplication as one of the most frequent types of manipulation in sci…

    Submitted 2 February, 2021; originally announced February 2021.

  40. arXiv:2010.09066  [pdf, other]

    cs.CV cs.LG

    Exploiting Context for Robustness to Label Noise in Active Learning

    Authors: Sudipta Paul, Shivkumar Chandrasekaran, B. S. Manjunath, Amit K. Roy-Chowdhury

    Abstract: Several works in computer vision have demonstrated the effectiveness of active learning for adapting the recognition model when new unlabeled data becomes available. Most of these works consider that labels obtained from the annotator are correct. However, in a practical scenario, as the quality of the labels depends on the annotator, some of the labels might be wrong, which results in degraded re…

    Submitted 18 October, 2020; originally announced October 2020.

  41. arXiv:2009.01005  [pdf, other]

    cs.CV cs.LG eess.IV

    ALANET: Adaptive Latent Attention Network for Joint Video Deblurring and Interpolation

    Authors: Akash Gupta, Abhishek Aich, Amit K. Roy-Chowdhury

    Abstract: Existing works address the problem of generating high frame-rate sharp videos by separately learning the frame deblurring and frame interpolation modules. Most of these approaches have a strong prior assumption that all the input frames are blurry whereas in a real-world setting, the quality of frames varies. Moreover, such approaches are trained to perform either of the two tasks - deblurring or…

    Submitted 31 August, 2020; originally announced September 2020.

    Comments: Accepted to ACM-MM 2020

  42. arXiv:2008.11772  [pdf, ps, other]

    cs.CV

    Measurement-driven Security Analysis of Imperceptible Impersonation Attacks

    Authors: Shasha Li, Karim Khalil, Rameswar Panda, Chengyu Song, Srikanth V. Krishnamurthy, Amit K. Roy-Chowdhury, Ananthram Swami

    Abstract: The emergence of Internet of Things (IoT) brings about new security challenges at the intersection of cyber and physical spaces. One prime example is the vulnerability of Face Recognition (FR) based access control in IoT systems. While previous research has shown that Deep Neural Network (DNN)-based FR systems (FRS) are potentially susceptible to imperceptible impersonation attacks, the potency of…

    Submitted 26 August, 2020; originally announced August 2020.

    Comments: accepted and appears in ICCCN 2020

  43. Text-based Localization of Moments in a Video Corpus

    Authors: Sudipta Paul, Niluthpol Chowdhury Mithun, Amit K. Roy-Chowdhury

    Abstract: Prior works on text-based video moment localization focus on temporally grounding the textual query in an untrimmed video. These works assume that the relevant video is already known and attempt to localize the moment on that relevant video only. Different from such works, we relax this assumption and address the task of localizing moments in a corpus of videos for a given sentence query. This tas…

    Submitted 18 August, 2021; v1 submitted 19 August, 2020; originally announced August 2020.

  44. arXiv:2008.05746  [pdf, other]

    cs.CV

    Adversarial Knowledge Transfer from Unlabeled Data

    Authors: Akash Gupta, Rameswar Panda, Sujoy Paul, Jianming Zhang, Amit K. Roy-Chowdhury

    Abstract: While machine learning approaches to visual recognition offer great promise, most of the existing methods rely heavily on the availability of large quantities of labeled training data. However, in the vast majority of real-world settings, manually collecting such large labeled datasets is infeasible due to the cost of labeling data or the paucity of data in a given domain. In this paper, we presen…

    Submitted 13 August, 2020; originally announced August 2020.

    Comments: Accepted to ACM Multimedia 2020

  45. arXiv:2008.04437  [pdf, other]

    cs.CV

    Distributed Multi-agent Video Fast-forwarding

    Authors: Shuyue Lan, Zhilu Wang, Amit K. Roy-Chowdhury, Ermin Wei, Qi Zhu

    Abstract: In many intelligent systems, a network of agents collaboratively perceives the environment for better and more efficient situation awareness. As these agents often have limited resources, it could be greatly beneficial to identify the content overlapping among camera views from different agents and leverage it for reducing the processing, transmission and storage of redundant/unimportant video fra…

    Submitted 10 August, 2020; originally announced August 2020.

    Comments: To appear at ACM Multimedia 2020

  46. arXiv:2007.15176  [pdf, other]

    cs.CV

    Domain Adaptive Semantic Segmentation Using Weak Labels

    Authors: Sujoy Paul, Yi-Hsuan Tsai, Samuel Schulter, Amit K. Roy-Chowdhury, Manmohan Chandraker

    Abstract: Learning semantic segmentation models requires a huge amount of pixel-wise labeling. However, labeled data may only be available abundantly in a domain different from the desired target domain, which only has minimal or no annotations. In this work, we propose a novel framework for domain adaptation in semantic segmentation with image-level weak labels in the target domain. The weak labels may be…

    Submitted 12 August, 2020; v1 submitted 29 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  47. arXiv:2007.11149  [pdf, other]

    cs.CV

    Camera On-boarding for Person Re-identification using Hypothesis Transfer Learning

    Authors: Sk Miraj Ahmed, Aske R Lejbølle, Rameswar Panda, Amit K. Roy-Chowdhury

    Abstract: Most of the existing approaches for person re-identification consider a static setting where the number of cameras in the network is fixed. An interesting direction, which has received little attention, is to explore the dynamic nature of a camera network, where one tries to adapt the existing re-identification models after on-boarding new cameras, with little additional effort. There have been a…

    Submitted 5 August, 2020; v1 submitted 21 July, 2020; originally announced July 2020.

    Comments: Accepted to CVPR 2020

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020) 12144-12153

  48. arXiv:2007.11064  [pdf, other]

    cs.CV

    Exploiting Temporal Coherence for Self-Supervised One-shot Video Re-identification

    Authors: Dripta S. Raychaudhuri, Amit K. Roy-Chowdhury

    Abstract: While supervised techniques in re-identification are extremely effective, the need for large amounts of annotations makes them impractical for large camera networks. One-shot re-identification, which uses a singular labeled tracklet for each identity along with a pool of unlabeled tracklets, is a potential candidate towards reducing this labeling effort. Current one-shot re-identification methods…

    Submitted 21 July, 2020; originally announced July 2020.

    Comments: Accepted at ECCV 2020

  49. arXiv:2007.10631  [pdf, other]

    cs.CV

    Learning Person Re-identification Models from Videos with Weak Supervision

    Authors: Xueping Wang, Sujoy Paul, Dripta S. Raychaudhuri, Min Liu, Yaonan Wang, Amit K. Roy-Chowdhury

    Abstract: Most person re-identification methods, being supervised techniques, suffer from the burden of massive annotation requirement. Unsupervised methods overcome this need for labeled data, but perform poorly compared to the supervised alternatives. In order to cope with this issue, we introduce the problem of learning person re-identification models from videos with weak supervision. The weak nature of…

    Submitted 21 July, 2020; originally announced July 2020.

  50. arXiv:2003.09565  [pdf, other]

    eess.IV cs.CV

    Non-Adversarial Video Synthesis with Learned Priors

    Authors: Abhishek Aich, Akash Gupta, Rameswar Panda, Rakib Hyder, M. Salman Asif, Amit K. Roy-Chowdhury

    Abstract: Most of the existing works in video synthesis focus on generating videos using adversarial learning. Despite their success, these methods often require input reference frame or fail to generate diverse videos from the given data distribution, with little to no uniformity in the quality of videos that can be generated. Different from these methods, we focus on the problem of generating videos from…

    Submitted 17 April, 2020; v1 submitted 20 March, 2020; originally announced March 2020.

    Comments: Accepted to CVPR 2020