Showing 1–24 of 24 results for author: Mondal, A K

Searching in archive cs.
  1. arXiv:2407.03216

    cs.CV cs.AI

    Learning Disentangled Representation in Object-Centric Models for Visual Dynamics Prediction via Transformers

    Authors: Sanket Gandhi, Atul, Samanyu Mahajan, Vishal Sharma, Rushil Gupta, Arnab Kumar Mondal, Parag Singla

    Abstract: Recent work has shown that object-centric representations can greatly help improve the accuracy of learning dynamics while also bringing interpretability. In this work, we take this idea one step further and ask the following question: "can learning disentangled representation further improve the accuracy of visual dynamics prediction in object-centric models?" While there has been some attempt to le…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2405.14089

    cs.LG

    Improved Canonicalization for Model Agnostic Equivariance

    Authors: Siba Smarak Panigrahi, Arnab Kumar Mondal

    Abstract: This work introduces a novel approach to achieving architecture-agnostic equivariance in deep learning, particularly addressing the limitations of traditional equivariant architectures and the inefficiencies of the existing architecture-agnostic methods. Building equivariant models using traditional methods requires designing equivariant versions of existing models and training them from scratch,…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Accepted to EquiVision workshop, CVPR 2024. 7 pages, 1 figure

  3. arXiv:2404.10880

    cs.CV cs.AI

    HumMUSS: Human Motion Understanding using State Space Models

    Authors: Arnab Kumar Mondal, Stefano Alletto, Denis Tome

    Abstract: Understanding human motion from video is essential for a range of applications, including pose estimation, mesh recovery and action recognition. While state-of-the-art methods predominantly rely on transformer-based architectures, these approaches have limitations in practical scenarios. Transformers are slower when sequentially predicting on a continuous stream of frames in real-time, and do not…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: CVPR 24

  4. arXiv:2310.01647

    cs.LG

    Equivariant Adaptation of Large Pretrained Models

    Authors: Arnab Kumar Mondal, Siba Smarak Panigrahi, Sékou-Oumar Kaba, Sai Rajeswar, Siamak Ravanbakhsh

    Abstract: Equivariant networks are specifically designed to ensure consistent behavior with respect to a set of input transformations, leading to higher sample efficiency and more accurate and robust predictions. However, redesigning each component of prevalent deep neural network architectures to achieve chosen equivariance is a difficult problem and can result in a computationally expensive network during…

    Submitted 29 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 17 pages, 6 figures. Accepted to NeurIPS 2023

  5. arXiv:2306.11941

    cs.LG cs.AI

    Efficient Dynamics Modeling in Interactive Environments with Koopman Theory

    Authors: Arnab Kumar Mondal, Siba Smarak Panigrahi, Sai Rajeswar, Kaleem Siddiqi, Siamak Ravanbakhsh

    Abstract: The accurate modeling of dynamics in interactive environments is critical for successful long-range prediction. Such a capability could advance Reinforcement Learning (RL) and Planning algorithms, but achieving it is challenging. Inaccuracies in model estimates can compound, resulting in increased errors over long horizons. We approach this problem from the lens of Koopman theory, where the nonlin…

    Submitted 12 May, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Accepted to ICLR 2024 and EWRL 2023

  6. arXiv:2305.14410

    cs.CV cs.AI cs.CL

    Image Manipulation via Multi-Hop Instructions -- A New Dataset and Weakly-Supervised Neuro-Symbolic Approach

    Authors: Harman Singh, Poorva Garg, Mohit Gupta, Kevin Shah, Ashish Goswami, Satyam Modi, Arnab Kumar Mondal, Dinesh Khandelwal, Dinesh Garg, Parag Singla

    Abstract: We are interested in image manipulation via natural language text -- a task that is useful for multiple AI applications but requires complex reasoning over multi-modal spaces. We extend recently proposed Neuro Symbolic Concept Learning (NSCL), which has been quite effective for the task of Visual Question Answering (VQA), for the task of image manipulation. Our system referred to as NeuroSIM can p…

    Submitted 24 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 (long paper, main conference)

  7. arXiv:2211.06489

    cs.LG cs.AI

    Equivariance with Learned Canonicalization Functions

    Authors: Sékou-Oumar Kaba, Arnab Kumar Mondal, Yan Zhang, Yoshua Bengio, Siamak Ravanbakhsh

    Abstract: Symmetry-based neural networks often constrain the architecture in order to achieve invariance or equivariance to a group of transformations. In this paper, we propose an alternative that avoids this architectural constraint by learning to produce canonical representations of the data. These canonicalization functions can readily be plugged into non-equivariant backbone architectures. We offer exp…

    Submitted 7 July, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: 21 pages, 5 figures

  8. ImAiR: Airwriting Recognition framework using Image Representation of IMU Signals

    Authors: Ayush Tripathi, Arnab Kumar Mondal, Lalan Kumar, Prathosh A. P

    Abstract: The problem of Airwriting Recognition is focused on identifying letters written by movement of finger in free space. It is a type of gesture recognition where the dictionary corresponds to letters in a specific language. In particular, airwriting recognition using sensor data from wrist-worn devices can be used as a medium of user input for applications in Human-Computer Interaction (HCI). Recogni…

    Submitted 8 September, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

  9. arXiv:2202.10930

    cs.LG cs.AI

    Transformation Coding: Simple Objectives for Equivariant Representations

    Authors: Mehran Shakerinava, Arnab Kumar Mondal, Siamak Ravanbakhsh

    Abstract: We present a simple non-generative approach to deep representation learning that seeks equivariant deep embedding through simple objectives. In contrast to existing equivariant networks, our transformation coding approach does not constrain the choice of the feed-forward layer or the architecture and allows for an unknown group action on the input space. We introduce several such transformation co…

    Submitted 18 February, 2022; originally announced February 2022.

  10. arXiv:2202.05808

    cs.LG cs.AI q-bio.NC

    Investigating Power laws in Deep Representation Learning

    Authors: Arna Ghosh, Arnab Kumar Mondal, Kumar Krishna Agrawal, Blake Richards

    Abstract: Representation learning that leverages large-scale labelled datasets is central to recent progress in machine learning. Access to task relevant labels at scale is often scarce or expensive, motivating the need to learn from unlabelled datasets with self-supervised learning (SSL). Such large unlabelled datasets (with data augmentations) often provide a good coverage of the underlying input distrib…

    Submitted 11 February, 2022; originally announced February 2022.

  11. SCLAiR : Supervised Contrastive Learning for User and Device Independent Airwriting Recognition

    Authors: Ayush Tripathi, Arnab Kumar Mondal, Lalan Kumar, Prathosh A. P

    Abstract: Airwriting Recognition is the problem of identifying letters written in free space with finger movement. It is essentially a specialized case of gesture recognition, wherein the vocabulary of gestures corresponds to letters as in a particular language. With the wide adoption of smart wearables in the general population, airwriting recognition using motion sensors from a smart-band can be used as a…

    Submitted 29 December, 2021; v1 submitted 25 November, 2021; originally announced November 2021.

  12. arXiv:2109.00659

    cs.SE

    Semantic Slicing of Architectural Change Commits: Towards Semantic Design Review

    Authors: Amit Kumar Mondal, Chanchal K. Roy, Kevin A. Schneider, Banani Roy, Sristy Sumana Nath

    Abstract: Software architectural changes involve more than one module or component and are complex to analyze compared to local code changes. Development teams aiming to review architectural aspects (design) of a change commit consider many essential scenarios such as access rules and restrictions on usage of program entities across modules. Moreover, design review is essential when proper architectural for…

    Submitted 1 September, 2021; originally announced September 2021.

  13. arXiv:2107.07709

    cs.LG

    ScRAE: Deterministic Regularized Autoencoders with Flexible Priors for Clustering Single-cell Gene Expression Data

    Authors: Arnab Kumar Mondal, Himanshu Asnani, Parag Singla, Prathosh AP

    Abstract: Clustering single-cell RNA sequence (scRNA-seq) data poses statistical and computational challenges due to their high-dimensionality and data-sparsity, also known as `dropout' events. Recently, Regularized Auto-Encoder (RAE) based deep neural network models have achieved remarkable success in learning robust low-dimensional representations. The basic idea in RAEs is to learn a non-linear mapping f…

    Submitted 16 July, 2021; originally announced July 2021.

    Comments: IEEE/ACM Transactions on Computational Biology and Bioinformatics

  14. arXiv:2105.03237

    cs.CV cs.AI

    Mini-batch graphs for robust image classification

    Authors: Arnab Kumar Mondal, Vineet Jain, Kaleem Siddiqi

    Abstract: Current deep learning models for classification tasks in computer vision are trained using mini-batches. In the present article, we take advantage of the relationships between samples in a mini-batch, using graph neural networks to aggregate information from similar images. This helps mitigate the adverse effects of alterations to the input images on classification performance. Diverse experiments…

    Submitted 21 April, 2021; originally announced May 2021.

  15. arXiv:2008.09466

    cs.LG eess.IV stat.ML

    RespVAD: Voice Activity Detection via Video-Extracted Respiration Patterns

    Authors: Arnab Kumar Mondal, Prathosh A. P

    Abstract: Voice Activity Detection (VAD) refers to the task of identification of regions of human speech in digital signals such as audio and video. While VAD is a necessary first step in many speech processing systems, it poses challenges when there are high levels of ambient noise during the audio recording. To improve the performance of VAD in such conditions, several methods utilizing the visual informa…

    Submitted 21 August, 2020; originally announced August 2020.

    Comments: Accepted in IEEE Sensor Letters

  16. arXiv:2007.03437

    cs.LG cs.AI stat.ML

    Group Equivariant Deep Reinforcement Learning

    Authors: Arnab Kumar Mondal, Pratheeksha Nair, Kaleem Siddiqi

    Abstract: In Reinforcement Learning (RL), Convolutional Neural Networks (CNNs) have been successfully applied as function approximators in Deep Q-Learning algorithms, which seek to learn action-value functions and policies in various environments. However, to date, there has been little work on the learning of symmetry-transformation equivariant representations of the input environment state. In this paper,…

    Submitted 30 June, 2020; originally announced July 2020.

    Comments: Presented at the ICML 2020 Workshop on Inductive Biases, Invariances and Generalization in RL

  17. arXiv:2006.05838

    cs.LG cs.CV stat.ML

    To Regularize or Not To Regularize? The Bias Variance Trade-off in Regularized AEs

    Authors: Arnab Kumar Mondal, Himanshu Asnani, Parag Singla, Prathosh AP

    Abstract: Regularized Auto-Encoders (RAEs) form a rich class of neural generative models. They effectively model the joint-distribution between the data and the latent space using an Encoder-Decoder combination, with regularization imposed in terms of a prior over the latent space. Despite their advantages, such as stability in training, the performance of AE based models has not reached the superior standa…

    Submitted 19 September, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

  18. arXiv:2005.08226

    cs.LG cs.IT stat.ML

    C-MI-GAN : Estimation of Conditional Mutual Information using MinMax formulation

    Authors: Arnab Kumar Mondal, Arnab Bhattacharya, Sudipto Mukherjee, Prathosh AP, Sreeram Kannan, Himanshu Asnani

    Abstract: Estimation of information theoretic quantities such as mutual information and its conditional variant has drawn interest in recent times owing to their multifaceted applications. Newly proposed neural estimators for these quantities have overcome severe drawbacks of classical $k$NN-based estimators in high dimensions. In this work, we focus on conditional mutual information (CMI) estimation by uti…

    Submitted 23 July, 2020; v1 submitted 17 May, 2020; originally announced May 2020.

    Comments: Updated for UAI, 2020 camera-ready version

  19. arXiv:2001.06921   

    cs.AI

    A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

    Authors: Amit Kumar Mondal

    Abstract: Reinforcement learning is one of the core components in designing an artificial intelligent system emphasizing real-time response. Reinforcement learning influences the system to take actions within an arbitrary environment either having previous knowledge about the environment model or not. In this paper, we present a comprehensive study on Reinforcement Learning focusing on various dimensions in…

    Submitted 27 January, 2020; v1 submitted 19 January, 2020; originally announced January 2020.

    Comments: This submission has been withdrawn by arXiv administrators as the second author was added without their knowledge or consent

  20. arXiv:1912.04564

    cs.CV cs.LG

    MaskAAE: Latent space optimization for Adversarial Auto-Encoders

    Authors: Arnab Kumar Mondal, Sankalan Pal Chowdhury, Aravind Jayendran, Parag Singla, Himanshu Asnani, Prathosh AP

    Abstract: The field of neural generative models is dominated by the highly successful Generative Adversarial Networks (GANs) despite their challenges, such as training instability and mode collapse. Auto-Encoders (AE) with regularized latent space provide an alternative framework for generative models, albeit their performance levels have not reached that of GANs. In this work, we hypothesise that the dimen…

    Submitted 17 May, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: To be presented at UAI 2020

  21. arXiv:1910.11125

    cs.DC cs.SE

    Micro-level Modularity of Computaion-intensive Programs in Big Data Platforms: A Case Study with Image Data

    Authors: Amit Kumar Mondal, Banani Roy, Chanchal K. Roy, Kevin A. Schneider

    Abstract: With the rapid advancement of Big Data platforms such as Hadoop, Spark, and Dataflow, many tools are being developed that are intended to provide end users with an interactive environment for large-scale data analysis (e.g., IQmulus). However, there are challenges using these platforms. For example, developers find it difficult to use these platforms when developing interactive and reusable data a…

    Submitted 19 October, 2019; originally announced October 2019.

  22. arXiv:1908.11569

    cs.CV

    Revisiting CycleGAN for semi-supervised segmentation

    Authors: Arnab Kumar Mondal, Aniket Agarwal, Jose Dolz, Christian Desrosiers

    Abstract: In this work, we study the problem of training deep networks for semantic image segmentation using only a fraction of annotated images, which may significantly reduce human annotation efforts. Particularly, we propose a strategy that exploits the unpaired image style transfer capabilities of CycleGAN in semi-supervised segmentation. Unlike recent works using adversarial learning for semi-supervise…

    Submitted 30 August, 2019; originally announced August 2019.

  23. arXiv:1810.12241

    cs.CV

    Few-shot 3D Multi-modal Medical Image Segmentation using Generative Adversarial Learning

    Authors: Arnab Kumar Mondal, Jose Dolz, Christian Desrosiers

    Abstract: We address the problem of segmenting 3D multi-modal medical images in scenarios where very few labeled examples are available for training. Leveraging the recent success of adversarial learning for semi-supervised segmentation, we propose a novel method based on Generative Adversarial Networks (GANs) to train a segmentation model with both labeled and unlabeled images. The proposed method prevents…

    Submitted 29 October, 2018; originally announced October 2018.

    Comments: submitted to Medical Image Analysis for review

  24. arXiv:1608.01024

    cs.CV

    Incremental Real-Time Multibody VSLAM with Trajectory Optimization Using Stereo Camera

    Authors: N Dinesh Reddy, Iman Abbasnejad, Sheetal Reddy, Amit Kumar Mondal, Vindhya Devalla

    Abstract: Real time outdoor navigation in highly dynamic environments is a crucial problem. The recent literature on real time static SLAM does not scale up to dynamic outdoor environments. Most of these methods assume moving objects as outliers or discard the information provided by them. We propose an algorithm to jointly infer the camera trajectory and the moving object trajectory simultaneously. In this p…

    Submitted 2 August, 2016; originally announced August 2016.

    Comments: Available on IROS