Showing 1–13 of 13 results for author: Uppal, S

  1. arXiv:2405.07991  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    SPIN: Simultaneous Perception, Interaction and Navigation

    Authors: Shagun Uppal, Ananye Agarwal, Haoyu Xiong, Kenneth Shaw, Deepak Pathak

    Abstract: While there has been remarkable progress recently in the fields of manipulation and locomotion, mobile manipulation remains a long-standing challenge. Compared to locomotion or static manipulation, a mobile system must make a diverse range of long-horizon tasks feasible in unstructured and dynamic environments. While the applications are broad and interesting, there are a plethora of challenges in…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: In CVPR 2024. Website at https://spin-robot.github.io/

  2. arXiv:2312.02975  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    Dexterous Functional Grasping

    Authors: Ananye Agarwal, Shagun Uppal, Kenneth Shaw, Deepak Pathak

    Abstract: While there have been significant strides in dexterous manipulation, most of it is limited to benchmark tasks like in-hand reorientation which are of limited utility in the real world. The main benefit of dexterous hands over two-fingered ones is their ability to pick up tools and other objects (including thin ones) and grasp them firmly to apply force. However, this task requires both a complex un…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: In CoRL 2023. Website at https://dexfunc.github.io/

  3. arXiv:2303.11548  [pdf, other]

    cs.CV

    Emotionally Enhanced Talking Face Generation

    Authors: Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Yi Yu, Yifang Yin, Rajiv Ratn Shah

    Abstract: Several works have developed end-to-end pipelines for generating lip-synced talking faces with various real-world applications, such as teaching and language translation in videos. However, these prior works fail to create realistic-looking videos since they focus little on people's expressions and emotions. Moreover, these methods' effectiveness largely depends on the faces in the training datase…

    Submitted 26 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

  4. arXiv:2205.15870  [pdf, other]

    cs.CV cs.AI

    FaIRCoP: Facial Image Retrieval using Contrastive Personalization

    Authors: Devansh Gupta, Aditya Saini, Drishti Bhasin, Sarthak Bhagat, Shagun Uppal, Rishi Raj Jain, Ponnurangam Kumaraguru, Rajiv Ratn Shah

    Abstract: Retrieving facial images from attributes plays a vital role in various systems such as face recognition and suspect identification. Compared to other image retrieval tasks, facial image retrieval is more challenging due to the high subjectivity involved in describing a person's facial features. Existing methods do so by comparing specific characteristics from the user's mental image against the su…

    Submitted 28 May, 2022; originally announced May 2022.

  5. arXiv:2111.06383  [pdf, other]

    cs.LG cs.AI cs.RO

    Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation

    Authors: I-Chun Arthur Liu, Shagun Uppal, Gaurav S. Sukhatme, Joseph J. Lim, Peter Englert, Youngwoon Lee

    Abstract: Learning complex manipulation tasks in realistic, obstructed environments is a challenging problem due to hard exploration in the presence of obstacles and high-dimensional visual observations. Prior work tackles the exploration problem by integrating motion planning and reinforcement learning. However, the motion planner augmented policy requires access to state information, which is often not av…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: Published at the Conference on Robot Learning (CoRL) 2021

  6. arXiv:2010.09522  [pdf, other]

    cs.CV cs.CL

    Multimodal Research in Vision and Language: A Review of Current and Emerging Trends

    Authors: Shagun Uppal, Sarthak Bhagat, Devamanyu Hazarika, Navonil Majumdar, Soujanya Poria, Roger Zimmermann, Amir Zadeh

    Abstract: Deep Learning and its applications have cascaded impactful research and development with a diverse range of modalities present in the real-world data. More recently, this has enhanced research interests in the intersection of the Vision and Language arena with its numerous applications and fast-paced growth. In this paper, we present a detailed overview of the latest trends in research pertaining…

    Submitted 21 December, 2020; v1 submitted 19 October, 2020; originally announced October 2020.

  7. arXiv:2006.05895  [pdf, other]

    cs.CV

    DisCont: Self-Supervised Visual Attribute Disentanglement using Context Vectors

    Authors: Sarthak Bhagat, Vishaal Udandarao, Shagun Uppal

    Abstract: Disentangling the underlying feature attributes within an image with no prior supervision is a challenging task. Models that can disentangle attributes well provide greater interpretability and control. In this paper, we propose a self-supervised framework DisCont to disentangle multiple attributes by exploiting the structural inductive biases within images. Motivated by the recent surge in contra…

    Submitted 29 June, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: Published at the 37th International Conference on Machine Learning (ICML 2020) Workshop on ML Interpretability for Scientific Discovery

  8. arXiv:2005.07771  [pdf, other]

    cs.CV

    C3VQG: Category Consistent Cyclic Visual Question Generation

    Authors: Shagun Uppal, Anish Madan, Sarthak Bhagat, Yi Yu, Rajiv Ratn Shah

    Abstract: Visual Question Generation (VQG) is the task of generating natural questions based on an image. Popular methods in the past have explored image-to-sequence architectures trained with maximum likelihood which have demonstrated meaningful generated questions given an image and its associated ground-truth answer. VQG becomes more challenging if the image contains rich contextual information describin…

    Submitted 9 January, 2021; v1 submitted 15 May, 2020; originally announced May 2020.

  9. arXiv:2003.09761  [pdf, other]

    cs.CY cs.LG physics.soc-ph stat.ML

    Smarter Parking: Using AI to Identify Parking Inefficiencies in Vancouver

    Authors: Devon Graham, Satish Kumar Sarraf, Taylor Lundy, Ali MohammadMehr, Sara Uppal, Tae Yoon Lee, Hedayat Zarkoob, Scott Duke Kominers, Kevin Leyton-Brown

    Abstract: On-street parking is convenient, but has many disadvantages: on-street spots come at the expense of other road uses such as traffic lanes, transit lanes, bike lanes, or parklets; drivers looking for parking contribute substantially to traffic congestion and hence to greenhouse gas emissions; safety is reduced both due to the fact that drivers looking for spots are more distracted than other road u…

    Submitted 21 March, 2020; originally announced March 2020.

    Comments: All the authors contributed equally. This paper is an outcome of https://www.cs.ubc.ca/~kevinlb/teaching/cs532l%20-%202018-19/index.html. To be submitted to a journal in transportation or urban planning

  10. arXiv:2001.02408  [pdf, other]

    cs.CV

    Disentangling Multiple Features in Video Sequences using Gaussian Processes in Variational Autoencoders

    Authors: Sarthak Bhagat, Shagun Uppal, Zhuyun Yin, Nengli Lim

    Abstract: We introduce MGP-VAE (Multi-disentangled-features Gaussian Processes Variational AutoEncoder), a variational autoencoder which uses Gaussian processes (GP) to model the latent space for the unsupervised learning of disentangled representations in video sequences. We improve upon previous work by establishing a framework by which multiple features, static or dynamic, can be disentangled. Specifical…

    Submitted 19 July, 2020; v1 submitted 8 January, 2020; originally announced January 2020.

  11. arXiv:1911.01155  [pdf, other]

    cs.LG stat.ML

    Learning based Methods for Code Runtime Complexity Prediction

    Authors: Jagriti Sikka, Kushal Satya, Yaman Kumar, Shagun Uppal, Rajiv Ratn Shah, Roger Zimmermann

    Abstract: Predicting the runtime complexity of a programming code is an arduous task. In fact, even for humans, it requires a subtle analysis and comprehensive knowledge of algorithms to predict time complexity with high fidelity, given any code. As per Turing's Halting problem proof, estimating code complexity is mathematically impossible. Nevertheless, an approximate solution to such a task can help devel…

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: 14 pages, 2 figures, 8 tables

  12. arXiv:1907.09554  [pdf, other]

    cs.CV cs.LG

    Product of Orthogonal Spheres Parameterization for Disentangled Representation Learning

    Authors: Ankita Shukla, Sarthak Bhagat, Shagun Uppal, Saket Anand, Pavan Turaga

    Abstract: Learning representations that can disentangle explanatory attributes underlying the data improves interpretability as well as provides control on data generation. Various learning frameworks such as VAEs, GANs and auto-encoders have been used in the literature to learn such representations. Most often, the latent space is constrained to a partitioned representation or structured by a prior to impos…

    Submitted 22 July, 2019; originally announced July 2019.

    Comments: Accepted at British Machine Vision Conference (BMVC) 2019

  13. arXiv:1902.06964  [pdf, other]

    cs.CV

    Geometry of Deep Generative Models for Disentangled Representations

    Authors: Ankita Shukla, Shagun Uppal, Sarthak Bhagat, Saket Anand, Pavan Turaga

    Abstract: Deep generative models like variational autoencoders approximate the intrinsic geometry of high dimensional data manifolds by learning low-dimensional latent-space variables and an embedding function. The geometric properties of these latent spaces have been studied under the lens of Riemannian geometry, via analysis of the non-linearity of the generator function. In new developments, deep generati…

    Submitted 19 February, 2019; originally announced February 2019.

    Comments: Accepted at ICVGIP, 2018