Showing 1–14 of 14 results for author: Parmar, P

  1. arXiv:2404.01299

    cs.CV cs.AI cs.CL cs.LG

    CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes

    Authors: Paritosh Parmar, Eric Peh, Ruirui Chen, Ting En Lam, Yuhan Chen, Elston Tan, Basura Fernando

    Abstract: Causal video question answering (QA) has garnered increasing interest, yet existing datasets often lack depth in causal reasoning. To address this gap, we capitalize on the unique properties of cartoons and construct CausalChaos!, a novel, challenging causal Why-QA dataset built upon the iconic "Tom and Jerry" cartoon series. Cartoons use the principles of animation that allow animators to create…

    Submitted 14 June, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Project Page: https://github.com/LUNAProject22/CausalChaos

  2. arXiv:2403.13798

    cs.CV cs.AI cs.LG cs.SC

    Hierarchical NeuroSymbolic Approach for Comprehensive and Explainable Action Quality Assessment

    Authors: Lauren Okamoto, Paritosh Parmar

    Abstract: Action quality assessment (AQA) applies computer vision to quantitatively assess the performance or execution of a human action. Current AQA approaches are end-to-end neural models, which lack transparency and tend to be biased because they are trained on subjective human judgements as ground-truth. To address these issues, we introduce a neuro-symbolic paradigm for AQA, which uses neural networks…

    Submitted 24 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: CVPR 2024 CVSports (Oral Presentation; 3/3 Strong Accepts) + Selected for CVPR 2024 Demos

  3. arXiv:2401.10805

    cs.CV cs.AI cs.LG cs.RO

    Learning to Visually Connect Actions and their Effects

    Authors: Eric Peh, Paritosh Parmar, Basura Fernando

    Abstract: In this work, we introduce the novel concept of visually Connecting Actions and Their Effects (CATE) in video understanding. CATE can have applications in areas like task planning and learning from demonstration. We identify and explore two different aspects of the concept of CATE: Action Selection and Effect-Affinity Assessment, where video understanding models connect actions and effects at sema…

    Submitted 26 April, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

  4. arXiv:2205.04841

    cs.CV

    Object Detection in Indian Food Platters using Transfer Learning with YOLOv4

    Authors: Deepanshu Pandey, Purva Parmar, Gauri Toshniwal, Mansi Goel, Vishesh Agrawal, Shivangi Dhiman, Lavanya Gupta, Ganesh Bagler

    Abstract: Object detection is a well-known problem in computer vision. Despite this, its application to traditional Indian food dishes has been limited. In particular, recognizing the Indian food dishes present in a single photo is challenging for three reasons: (1) lack of annotated Indian food datasets, (2) non-distinct boundaries between the dishes, and (3) high intra-class variation. We solve these…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: 6 pages, 7 figures, 38th IEEE International Conference on Data Engineering, 2022, DECOR Workshop

  5. arXiv:2202.14019

    cs.CV cs.AI cs.HC cs.LG

    Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment

    Authors: Paritosh Parmar, Amol Gharat, Helge Rhodin

    Abstract: Maintaining proper form while exercising is important for preventing injuries and maximizing muscle mass gains. Detecting errors in workout form naturally requires estimating the human body pose. However, off-the-shelf pose estimators struggle to perform well on videos recorded in gym scenarios due to factors such as camera angles, occlusion from gym equipment, illumination, and clothing. To agg…

    Submitted 21 October, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

  6. arXiv:2102.07355

    cs.CV cs.LG

    Win-Fail Action Recognition

    Authors: Paritosh Parmar, Brendan Morris

    Abstract: Current video/action understanding systems have demonstrated impressive performance on large recognition tasks. However, they might be limiting themselves to learning to recognize spatiotemporal patterns, rather than attempting to thoroughly understand the actions. To spur progress in the direction of a truer, deeper understanding of videos, we introduce the task of win-fail action recognition --…

    Submitted 15 February, 2021; originally announced February 2021.

  7. arXiv:2101.04884

    cs.CV cs.LG cs.MM cs.SD eess.AS

    Piano Skills Assessment

    Authors: Paritosh Parmar, Jaiden Reddy, Brendan Morris

    Abstract: Can a computer determine a piano player's skill level? Is it preferable to base this assessment on visual analysis of the player's performance or should we trust our ears over our eyes? Since current CNNs have difficulty processing long videos, how can shorter clips be sampled to best reflect the player's skill level? In this work, we collect and release a first-of-its-kind dataset for multim…

    Submitted 20 June, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

    Comments: Dataset is available from: https://github.com/ParitoshParmar/Piano-Skills-Assessment

  8. arXiv:2007.10021

    cs.CL cs.LG

    Voice@SRIB at SemEval-2020 Task 9 and 12: Stacked Ensembling method for Sentiment and Offensiveness detection in Social Media

    Authors: Abhishek Singh, Surya Pratap Singh Parmar

    Abstract: On social-media platforms such as Twitter, Facebook, and Reddit, people prefer to use code-mixed languages such as Spanish-English and Hindi-English to express their opinions. In this paper, we describe the different models we used, the use of an external dataset to train embeddings, and ensembling methods for the Sentimix and OffensEval tasks. The use of pre-trained embeddings usually helps in multiple tasks such a…

    Submitted 11 October, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

    Comments: Changed title and few more changes. This version will be published in SemEval2020. Added code Link

  9. arXiv:1912.04430

    cs.CV

    HalluciNet-ing Spatiotemporal Representations Using a 2D-CNN

    Authors: Paritosh Parmar, Brendan Morris

    Abstract: Spatiotemporal representations learned using 3D convolutional neural networks (CNNs) are currently used in state-of-the-art approaches for action-related tasks. However, 3D-CNNs are notorious for being memory- and compute-intensive compared with simpler 2D-CNN architectures. We propose to hallucinate spatiotemporal representations from a 3D-CNN teacher with a 2D-CNN student. By requir…

    Submitted 21 October, 2020; v1 submitted 9 December, 2019; originally announced December 2019.

    Comments: Codebase: https://github.com/ParitoshParmar/HalluciNet

  10. arXiv:1904.04346

    cs.CV

    What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment

    Authors: Paritosh Parmar, Brendan Tran Morris

    Abstract: Can performance on the task of action quality assessment (AQA) be improved by exploiting a description of the action and its quality? Current AQA and skills assessment approaches propose to learn features that serve only one task - estimating the final score. In this paper, we propose to learn spatio-temporal features that explain three related tasks - fine-grained action recognition, commentary g…

    Submitted 14 June, 2019; v1 submitted 8 April, 2019; originally announced April 2019.

    Comments: CVPR 2019. Dataset temporarily made available at https://github.com/ParitoshParmar/MTL-AQA

  11. arXiv:1812.06367

    cs.CV

    Action Quality Assessment Across Multiple Actions

    Authors: Paritosh Parmar, Brendan Tran Morris

    Abstract: Can learning to measure the quality of an action help in measuring the quality of other actions? If so, can consolidated samples from multiple actions help improve the performance of current approaches? In this paper, we carry out experiments to see if knowledge transfer is possible in the action quality assessment (AQA) setting. Experiments are carried out on our newly released AQA dataset (http:…

    Submitted 8 April, 2019; v1 submitted 15 December, 2018; originally announced December 2018.

    Comments: WACV 2019

  12. arXiv:1611.05125

    cs.CV

    Learning To Score Olympic Events

    Authors: Paritosh Parmar, Brendan Tran Morris

    Abstract: Estimating action quality, the process of assigning a "score" to the execution of an action, is crucial in areas such as sports and health care. Unlike action recognition, which has millions of examples to learn from, the action quality datasets that are currently available are small -- typically comprised of only a few hundred samples. This work presents three frameworks for evaluating Olympic sp…

    Submitted 18 May, 2017; v1 submitted 15 November, 2016; originally announced November 2016.

    Comments: CVPR 2017 - CVSports Workshop

  13. arXiv:1608.09005

    cs.CV

    Measuring the Quality of Exercises

    Authors: Paritosh Parmar, Brendan Tran Morris

    Abstract: This work explores the problem of exercise quality measurement, since it is essential for effective management of diseases like cerebral palsy (CP). It examines the automated assessment of the quality of large amplitude movement (LAM) exercises designed to treat CP. Exercise data was collected by trained participants to generate ideal examples to use as positive samples for machi…

    Submitted 31 August, 2016; originally announced August 2016.

    Comments: EMBC'16 (The 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society)

  14. Use of Computer Vision to Detect Tangles in Tangled Objects

    Authors: Paritosh Parmar

    Abstract: Untangling of structures like ropes and wires by autonomous robots can be useful in areas such as personal robotics, industry, and electrical wiring and repair by robots. This problem can be tackled by using a computer vision system in the robot. This paper proposes a computer-vision-based method for analyzing visual data acquired from a camera to perceive the overlap of wires, ropes, and hoses, i.e., dete…

    Submitted 11 October, 2014; v1 submitted 19 May, 2014; originally announced May 2014.

    Comments: IEEE International Conference on Image Information Processing; untangle; untangling; computer vision; robotic vision; untangling by robot; Tangled-100 dataset; tangled linear deformable objects; personal robotics; image processing