Search | arXiv e-print repository

POSTURE: Pose Guided Unsupervised Domain Adaptation for Human Body Part Segmentation

Authors: Arindam Dutta, Rohit Lal, Yash Garg, Calvin-Khang Ta, Dripta S. Raychaudhuri, Hannah Dela Cruz, Amit K. Roy-Chowdhury

Abstract: Existing algorithms for human body part segmentation have shown promising results on challenging datasets, primarily relying on end-to-end supervision. However, these algorithms exhibit severe performance drops in the face of domain shifts, leading to inaccurate segmentation masks. To tackle this issue, we introduce POSTURE: \underline{Po}se Guided Un\underline{s}upervised Domain Adap\underline{t}… ▽ More Existing algorithms for human body part segmentation have shown promising results on challenging datasets, primarily relying on end-to-end supervision. However, these algorithms exhibit severe performance drops in the face of domain shifts, leading to inaccurate segmentation masks. To tackle this issue, we introduce POSTURE: \underline{Po}se Guided Un\underline{s}upervised Domain Adap\underline{t}ation for H\underline{u}man Body Pa\underline{r}t S\underline{e}gmentation - an innovative pseudo-labelling approach designed to improve segmentation performance on the unlabeled target data. Distinct from conventional domain adaptive methods for general semantic segmentation, POSTURE stands out by considering the underlying structure of the human body and uses anatomical guidance from pose keypoints to drive the adaptation process. This strong inductive prior translates to impressive performance improvements, averaging 8\% over existing state-of-the-art domain adaptive semantic segmentation methods across three benchmark datasets. Furthermore, the inherent flexibility of our proposed approach facilitates seamless extension to source-free settings (SF-POSTURE), effectively mitigating potential privacy and computational concerns, with negligible drop in performance. △ Less

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2312.16221 [pdf, other]

STRIDE: Single-video based Temporally Continuous Occlusion Robust 3D Pose Estimation

Authors: Rohit Lal, Saketh Bachu, Yash Garg, Arindam Dutta, Calvin-Khang Ta, Dripta S. Raychaudhuri, Hannah Dela Cruz, M. Salman Asif, Amit K. Roy-Chowdhury

Abstract: The capability to accurately estimate 3D human poses is crucial for diverse fields such as action recognition, gait recognition, and virtual/augmented reality. However, a persistent and significant challenge within this field is the accurate prediction of human poses under conditions of severe occlusion. Traditional image-based estimators struggle with heavy occlusions due to a lack of temporal co… ▽ More The capability to accurately estimate 3D human poses is crucial for diverse fields such as action recognition, gait recognition, and virtual/augmented reality. However, a persistent and significant challenge within this field is the accurate prediction of human poses under conditions of severe occlusion. Traditional image-based estimators struggle with heavy occlusions due to a lack of temporal context, resulting in inconsistent predictions. While video-based models benefit from processing temporal data, they encounter limitations when faced with prolonged occlusions that extend over multiple frames. This challenge arises because these models struggle to generalize beyond their training datasets, and the variety of occlusions is hard to capture in the training data. Addressing these challenges, we propose STRIDE (Single-video based TempoRally contInuous occlusion Robust 3D Pose Estimation), a novel Test-Time Training (TTT) approach to fit a human motion prior for each video. This approach specifically handles occlusions that were not encountered during the model's training. By employing STRIDE, we can refine a sequence of noisy initial pose estimates into accurate, temporally coherent poses during test time, effectively overcoming the limitations of prior methods. Our framework demonstrates flexibility by being model-agnostic, allowing us to use any off-the-shelf 3D pose estimation method for improving robustness and temporal consistency. We validate STRIDE's efficacy through comprehensive experiments on challenging datasets like Occluded Human3.6M, Human3.6M, and OCMotion, where it not only outperforms existing single-image and video-based pose estimation models but also showcases superior handling of substantial occlusions, achieving fast, robust, accurate, and temporally consistent 3D pose estimates. △ Less

Submitted 13 March, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

arXiv:2310.06124 [pdf, other]

Factorized Tensor Networks for Multi-Task and Multi-Domain Learning

Authors: Yash Garg, Nebiyou Yismaw, Rakib Hyder, Ashley Prater-Bennette, M. Salman Asif

Abstract: Multi-task and multi-domain learning methods seek to learn multiple tasks/domains, jointly or one after another, using a single unified network. The key challenge and opportunity is to exploit shared information across tasks and domains to improve the efficiency of the unified network. The efficiency can be in terms of accuracy, storage cost, computation, or sample complexity. In this paper, we pr… ▽ More Multi-task and multi-domain learning methods seek to learn multiple tasks/domains, jointly or one after another, using a single unified network. The key challenge and opportunity is to exploit shared information across tasks and domains to improve the efficiency of the unified network. The efficiency can be in terms of accuracy, storage cost, computation, or sample complexity. In this paper, we propose a factorized tensor network (FTN) that can achieve accuracy comparable to independent single-task/domain networks with a small number of additional parameters. FTN uses a frozen backbone network from a source model and incrementally adds task/domain-specific low-rank tensor factors to the shared frozen network. This approach can adapt to a large number of target domains and tasks without catastrophic forgetting. Furthermore, FTN requires a significantly smaller number of task-specific parameters compared to existing methods. We performed experiments on widely used multi-domain and multi-task datasets. We show the experiments on convolutional-based architecture with different backbones and on transformer-based architecture. We observed that FTN achieves similar accuracy as single-task/domain methods while using only a fraction of additional parameters per task. △ Less

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2304.08597 [pdf, other]

eTOP: Early Termination of Pipelines for Faster Training of AutoML Systems

Authors: Haoxiang Zhang, Juliana Freire, Yash Garg

Abstract: Recent advancements in software and hardware technologies have enabled the use of AI/ML models in everyday applications has significantly improved the quality of service rendered. However, for a given application, finding the right AI/ML model is a complex and costly process, that involves the generation, training, and evaluation of multiple interlinked steps (called pipelines), such as data pre-p… ▽ More Recent advancements in software and hardware technologies have enabled the use of AI/ML models in everyday applications has significantly improved the quality of service rendered. However, for a given application, finding the right AI/ML model is a complex and costly process, that involves the generation, training, and evaluation of multiple interlinked steps (called pipelines), such as data pre-processing, feature engineering, selection, and model tuning. These pipelines are complex (in structure) and costly (both in compute resource and time) to execute end-to-end, with a hyper-parameter associated with each step. AutoML systems automate the search of these hyper-parameters but are slow, as they rely on optimizing the pipeline's end output. We propose the eTOP Framework which works on top of any AutoML system and decides whether or not to execute the pipeline to the end or terminate at an intermediate step. Experimental evaluation on 26 benchmark datasets and integration of eTOPwith MLBox4 reduces the training time of the AutoML system upto 40x than baseline MLBox. △ Less

Submitted 17 April, 2023; originally announced April 2023.

Comments: NA

arXiv:1701.05499 [pdf, ps, other]

Symmetry analysis and soliton solution of (2+1)- dimensional Zoomeron equation

Authors: Vishakha Jadaun, Sachin Kumar, Yogeeta Garg

Abstract: Traveling wave solutions of (2 + 1)-dimensional Zoomeron equation(ZE) are developed in terms of exponential functions involving free parameters. It is shown that the novel Lie group of transformations method is a competent and prominent tool in solving nonlinear partial differential equations(PDEs) in mathematical physics. The similarity transformation method(STM) is applied first on (2 + 1)-dimen… ▽ More Traveling wave solutions of (2 + 1)-dimensional Zoomeron equation(ZE) are developed in terms of exponential functions involving free parameters. It is shown that the novel Lie group of transformations method is a competent and prominent tool in solving nonlinear partial differential equations(PDEs) in mathematical physics. The similarity transformation method(STM) is applied first on (2 + 1)-dimensional ZE to find the infinitesimal generators. Discussing the different cases on these infinitesimal generators, STM reduce (2 + 1)-dimensional ZE into (1 + 1)-dimensional PDEs, later it reduces these PDEs into various ordinary differential equations(ODEs) and help to find exact solutions of (2 + 1)-dimensional ZE. △ Less

Submitted 19 January, 2017; originally announced January 2017.

Comments: 09 Pages

Showing 1–5 of 5 results for author: Garg, Y