Showing 1–50 of 199 results for author: ManMohan

  1. arXiv:2405.13779  [pdf, other]

    cs.CV cs.AI cs.LG

    Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data

    Authors: Tarun Kalluri, Jihyeon Lee, Kihyuk Sohn, Sahil Singla, Manmohan Chandraker, Joseph Xu, Jeremiah Liu

    Abstract: We present a simple and efficient method to leverage emerging text-to-image generative models in creating large-scale synthetic supervision for the task of damage assessment from aerial images. While significant recent advances have resulted in improved techniques for damage assessment using aerial or satellite imagery, they still suffer from poor robustness to domains where manual labeled data is…

    Submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2405.06063  [pdf, other]

    cs.LG

    A Minimalist Prompt for Zero-Shot Policy Learning

    Authors: Meng Song, Xuezhi Wang, Tanay Biradar, Yao Qin, Manmohan Chandraker

    Abstract: Transformer-based methods have exhibited significant generalization ability when prompted with target-domain demonstrations or example solutions during inference. Although demonstrations, as a way of task specification, can capture rich information that may be hard to specify by language, it remains unclear what information is extracted from the demonstrations to help generalization. Moreover, ass…

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2405.02781  [pdf, other]

    cs.CV

    Instantaneous Perception of Moving Objects in 3D

    Authors: Di Liu, Bingbing Zhuang, Dimitris N. Metaxas, Manmohan Chandraker

    Abstract: The perception of 3D motion of surrounding traffic participants is crucial for driving safety. While existing works primarily focus on general large motions, we contend that the instantaneous detection and quantification of subtle motions is equally important as they indicate the nuances in driving behavior that may be safety critical, such as behaviors near a stop sign of parking positions. We de…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  4. arXiv:2405.00900  [pdf, other]

    cs.CV

    LidaRF: Delving into Lidar for Neural Radiance Field on Street Scenes

    Authors: Shanlin Sun, Bingbing Zhuang, Ziyu Jiang, Buyu Liu, Xiaohui Xie, Manmohan Chandraker

    Abstract: Photorealistic simulation plays a crucial role in applications such as autonomous driving, where advances in neural radiance fields (NeRFs) may allow better scalability through the automatic creation of digital 3D assets. However, reconstruction quality suffers on street scenes due to largely collinear camera motions and sparser samplings at higher speeds. On the other hand, the application often…

    Submitted 4 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: CVPR2024 Highlights

  5. arXiv:2404.15244  [pdf, other]

    cs.CV cs.LG

    Efficient Transformer Encoders for Mask2Former-style models

    Authors: Manyi Yao, Abhishek Aich, Yumin Suh, Amit Roy-Chowdhury, Christian Shelton, Manmohan Chandraker

    Abstract: Vision transformer based models bring significant improvements for image segmentation tasks. Although these architectures offer powerful capabilities irrespective of specific segmentation tasks, their use of computational resources can be taxing on deployed devices. One way to overcome this challenge is by adapting the computation level to the specific needs of the input image rather than the curr…

    Submitted 23 April, 2024; originally announced April 2024.

  6. arXiv:2404.14657  [pdf, other]

    cs.CV

    Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation

    Authors: Abhishek Aich, Yumin Suh, Samuel Schulter, Manmohan Chandraker

    Abstract: A powerful architecture for universal segmentation relies on transformers that encode multi-scale image features and decode object queries into mask predictions. With efficiency being a high priority for scaling such models, we observed that the state-of-the-art method Mask2Former uses ~50% of its compute only on the transformer encoder. This is due to the retention of a full-length token-level re…

    Submitted 22 April, 2024; originally announced April 2024.

  7. arXiv:2404.12479  [pdf, other]

    math.CA

    Inversion of generalized V-line transforms of vector fields in $\mathbb{R}^2$

    Authors: Rahul Bhardwaj, Rohit Kumar Mishra, Manmohan Vashisth

    Abstract: This article studies the inverse problem of recovering a vector field supported in $\mathbb{D}_R$, the disk of radius $R$ centered at the origin, through a set of generalized broken ray/V-line transforms, namely longitudinal and transverse V-line transforms. Geometrically, we work with broken lines that start from the boundary of a disk and break at a fixed angle after traveling a distance along t…

    Submitted 18 April, 2024; originally announced April 2024.

  8. arXiv:2404.04627  [pdf, other]

    cs.CV

    Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement

    Authors: Zaid Khan, Vijay Kumar BG, Samuel Schulter, Yun Fu, Manmohan Chandraker

    Abstract: Visual program synthesis is a promising approach to exploit the reasoning abilities of large language models for compositional computer vision tasks. Previous work has used few-shot prompting with frozen LLMs to synthesize visual programs. Training an LLM to write better visual programs is an attractive prospect, but it is unclear how to accomplish this. No dataset of visual programs for training…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  9. arXiv:2403.17373  [pdf, other]

    cs.CV cs.AI cs.LG

    AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving

    Authors: Mingfu Liang, Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Shiyu Zhao, Ying Wu, Manmohan Chandraker

    Abstract: Autonomous vehicle (AV) systems rely on robust perception models as a cornerstone of safety assurance. However, objects encountered on the road exhibit a long-tailed distribution, with rare or unseen categories posing challenges to a deployed perception model. This necessitates an expensive process of continuously curating and annotating data with significant human effort. We propose to leverage r…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR-2024

  10. arXiv:2403.05535  [pdf, other]

    cs.CV cs.AI cs.CL

    Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos

    Authors: Tarun Kalluri, Bodhisattwa Prasad Majumder, Manmohan Chandraker

    Abstract: We introduce LaGTran, a novel framework that utilizes text supervision to guide robust transfer of discriminative knowledge from labeled source to unlabeled target data with domain gaps. While unsupervised adaptation methods have been established to address this problem, they show limitations in handling challenging domain shifts due to their exclusive operation within the pixel-space. Motivated b…

    Submitted 5 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: ICML 2024 Camera-Ready. Project Page and Code: https://tarun005.github.io/lagtran/

  11. arXiv:2401.09416  [pdf, other]

    cs.CV cs.GR

    TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion

    Authors: Yu-Ying Yeh, Jia-Bin Huang, Changil Kim, Lei Xiao, Thu Nguyen-Phuoc, Numair Khan, Cheng Zhang, Manmohan Chandraker, Carl S Marshall, Zhao Dong, Zhengqin Li

    Abstract: We present TextureDreamer, a novel image-guided texture synthesis method to transfer relightable textures from a small number of input images (3 to 5) to target 3D shapes across arbitrary categories. Texture creation is a pivotal challenge in vision and graphics. Industrial companies hire experienced artists to manually craft textures for 3D assets. Classical methods require densely sampled views…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: Project page: https://texturedreamer.github.io

  12. arXiv:2401.05980  [pdf, ps, other]

    math.AP

    Boundary determination of coefficients appearing in a perturbed weighted $p$-Laplace equation

    Authors: Nitesh Kumar, Tanmay Sarkar, Manmohan Vashisth

    Abstract: We study an inverse boundary value problem associated with $p$-Laplacian which is further perturbed by a linear second order term, defined on a bounded set $Ω$ in $\R^n, n\geq 2$. We recover the coefficients at the boundary from the boundary measurements which are given by the Dirichlet to Neumann map. Our approach relies on the appropriate asymptotic expansion of the solution and it allows one to…

    Submitted 11 January, 2024; originally announced January 2024.

  13. arXiv:2401.02411  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs

    Authors: Alex Trevithick, Matthew Chan, Towaki Takikawa, Umar Iqbal, Shalini De Mello, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano

    Abstract: 3D-aware Generative Adversarial Networks (GANs) have shown remarkable progress in learning to generate multi-view-consistent images and 3D geometries of scenes from collections of 2D images via neural volume rendering. Yet, the significant memory and computational costs of dense sampling in volume rendering have forced 3D GANs to adopt patch-based training or employ low-resolution rendering with p…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: See our project page: https://research.nvidia.com/labs/nxp/wysiwyg/

  14. arXiv:2401.00391  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Controllable Adversaries

    Authors: Wei-Jer Chang, Francesco Pittaluga, Masayoshi Tomizuka, Wei Zhan, Manmohan Chandraker

    Abstract: Evaluating the performance of autonomous vehicle planning algorithms necessitates simulating long-tail safety-critical traffic scenarios. However, traditional methods for generating such scenarios often fall short in terms of controllability and realism and neglect the dynamics of agent interactions. To mitigate these limitations, we introduce SAFE-SIM, a novel diffusion-based controllable closed-…

    Submitted 15 June, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

    Comments: Under Review

    ACM Class: I.2.9; I.2.6

  15. arXiv:2401.00125  [pdf, other]

    cs.AI cs.CV

    LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning

    Authors: S P Sharan, Francesco Pittaluga, Vijay Kumar B G, Manmohan Chandraker

    Abstract: Although planning is a crucial component of the autonomous driving stack, researchers have yet to develop robust planning algorithms that are capable of safely handling the diverse range of possible driving scenarios. Learning-based planners suffer from overfitting and poor long-tail performance. On the other hand, rule-based planners generalize well, but might fail to handle scenarios that requir…

    Submitted 29 December, 2023; originally announced January 2024.

    Comments: 15 pages, 8 figures, 7 tables

  16. arXiv:2401.00094  [pdf, other]

    cs.CV

    Generating Enhanced Negatives for Training Language-Based Object Detectors

    Authors: Shiyu Zhao, Long Zhao, Vijay Kumar B. G, Yumin Suh, Dimitris N. Metaxas, Manmohan Chandraker, Samuel Schulter

    Abstract: The recent progress in language-based open-vocabulary object detection can be largely attributed to finding better ways of leveraging large-scale data with free-form text annotations. Training such models with a discriminative objective function has proven successful, but requires good positive and negative samples. However, the free-form nature and the open vocabulary of object descriptions make…

    Submitted 12 April, 2024; v1 submitted 29 December, 2023; originally announced January 2024.

    Comments: Accepted to CVPR 2024. The supplementary document included

  17. arXiv:2312.01077  [pdf, other]

    eess.IV

    OpEnCam: Lensless Optical Encryption Camera

    Authors: Salman S. Khan, Xiang Yu, Kaushik Mitra, Manmohan Chandraker, Francesco Pittaluga

    Abstract: Lensless cameras multiplex the incoming light before it is recorded by the sensor. This ability to multiplex the incoming light has led to the development of ultra-thin, high-speed, and single-shot 3D imagers. Recently, there have been various attempts at demonstrating another useful aspect of lensless cameras - their ability to preserve the privacy of a scene by capturing encrypted measurements.…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: 11 pages, 11 figures, 3 tables

  18. arXiv:2310.17050  [pdf, other]

    cs.CV

    Exploring Question Decomposition for Zero-Shot VQA

    Authors: Zaid Khan, Vijay Kumar BG, Samuel Schulter, Manmohan Chandraker, Yun Fu

    Abstract: Visual question answering (VQA) has traditionally been treated as a single-step task where each question receives the same amount of effort, unlike natural human question-answering strategies. We explore a question decomposition strategy for VQA to overcome this limitation. We probe the ability of recently developed large vision-language models to use human-written decompositions and produce their…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023 Camera Ready

  19. arXiv:2310.11152  [pdf, other]

    hep-ph

    Revisiting representations of quark mixing matrix

    Authors: Gurjit Kaur, Aakriti Bagai, Gulsheen Ahuja, Manmohan Gupta

    Abstract: Using unitarity, unlike the approaches available in the literature, we have constructed 9 independent representations of CKM matrix starting with each of the 9 elements of the matrix. The relationship of these independently constructed representations with the already available ones in the literature has been compared and discussed. Further, the implications of these representations have been expl…

    Submitted 24 June, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: 19 pages, 1 figure, accepted for publication in PTEP

  20. arXiv:2310.07361  [pdf, other]

    cs.CV

    Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters

    Authors: Mateusz Michalkiewicz, Masoud Faraki, Xiang Yu, Manmohan Chandraker, Mahsa Baktashmotlagh

    Abstract: Overfitting to the source domain is a common issue in gradient-based training of deep neural networks. To compensate for the over-parameterized models, numerous regularization techniques have been introduced such as those based on dropout. While these methods achieve significant improvements on classical benchmarks such as ImageNet, their performance diminishes with the introduction of domain shif…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Paper was accepted to ICCV 2023

  21. arXiv:2308.11744  [pdf, other]

    cs.CV

    Efficient Controllable Multi-Task Architectures

    Authors: Abhishek Aich, Samuel Schulter, Amit K. Roy-Chowdhury, Manmohan Chandraker, Yumin Suh

    Abstract: We aim to train a multi-task model such that users can adjust the desired compute budget and relative importance of task performances after deployment, without retraining. This enables optimizing performance for dynamically varying user needs, without heavy computational overhead to train and save models for various scenarios. To this end, we propose a multi-task model consisting of a shared encod…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  22. arXiv:2308.09865  [pdf, other]

    cs.CV cs.GR

    A Theory of Topological Derivatives for Inverse Rendering of Geometry

    Authors: Ishit Mehta, Manmohan Chandraker, Ravi Ramamoorthi

    Abstract: We introduce a theoretical framework for differentiable surface evolution that allows discrete topology changes through the use of topological derivatives for variational optimization of image functionals. While prior methods for inverse rendering of geometry rely on silhouette gradients for topology changes, such signals are sparse. In contrast, our theory derives topological derivatives that rel…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: ICCV 23; Project Page at https://ishit.github.io/td/

  23. arXiv:2308.06412  [pdf, other]

    cs.CV

    Taming Self-Training for Open-Vocabulary Object Detection

    Authors: Shiyu Zhao, Samuel Schulter, Long Zhao, Zhixing Zhang, Vijay Kumar B. G, Yumin Suh, Manmohan Chandraker, Dimitris N. Metaxas

    Abstract: Recent studies have shown promising performance in open-vocabulary object detection (OVD) by utilizing pseudo labels (PLs) from pretrained vision and language models (VLMs). However, teacher-student self-training, a powerful and widely used paradigm to leverage PLs, is rarely explored for OVD. This work identifies two challenges of using self-training in OVD: noisy PLs from VLMs and frequent distr…

    Submitted 12 April, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

    Comments: Accepted to CVPR 2024. The supplementary document included

  24. arXiv:2306.03932  [pdf, other]

    cs.CV

    Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!

    Authors: Zaid Khan, Vijay Kumar BG, Samuel Schulter, Xiang Yu, Yun Fu, Manmohan Chandraker

    Abstract: Finetuning a large vision language model (VLM) on a target dataset after large scale pretraining is a dominant paradigm in visual question answering (VQA). Datasets for specialized tasks such as knowledge-based VQA or VQA in non natural-image domains are orders of magnitude smaller than those for general-purpose VQA. While collecting additional labels for specialized tasks or domains can be challe…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: CVPR 2023

  25. arXiv:2305.17763  [pdf, other]

    cs.CV

    NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization

    Authors: Zhixiang Min, Bingbing Zhuang, Samuel Schulter, Buyu Liu, Enrique Dunn, Manmohan Chandraker

    Abstract: Monocular 3D object localization in driving scenes is a crucial task, but challenging due to its ill-posed nature. Estimating 3D coordinates for each pixel on the object surface holds great potential as it provides dense 2D-3D geometric constraints for the underlying PnP problem. However, high-quality ground truth supervision is not available in driving scenes due to sparsity and various artifacts…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: Paper was accepted to CVPR 2023

  26. arXiv:2305.10675  [pdf, other]

    cs.CV

    Tuned Contrastive Learning

    Authors: Chaitanya Animesh, Manmohan Chandraker

    Abstract: In recent times, contrastive learning based loss functions have become increasingly popular for visual self-supervised representation learning owing to their state-of-the-art (SOTA) performance. Most of the modern contrastive learning methods generalize only to one positive and multiple negatives per anchor. A recent state-of-the-art, supervised contrastive (SupCon) loss, extends self-supervised c…

    Submitted 30 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: Preprint Version

  27. arXiv:2305.04374  [pdf, other]

    cs.CV

    Spatiotemporally Consistent HDR Indoor Lighting Estimation

    Authors: Zhengqin Li, Li Yu, Mikhail Okunev, Manmohan Chandraker, Zhao Dong

    Abstract: We propose a physically-motivated deep learning framework to solve a general version of the challenging indoor lighting estimation problem. Given a single LDR image with a depth map, our method predicts spatially consistent lighting at any given image position. Particularly, when the input is an LDR video sequence, our framework not only progressively refines the lighting prediction as it sees mor…

    Submitted 7 May, 2023; originally announced May 2023.

  28. arXiv:2305.02310  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Real-Time Radiance Fields for Single-Image Portrait View Synthesis

    Authors: Alex Trevithick, Matthew Chan, Michael Stengel, Eric R. Chan, Chao Liu, Zhiding Yu, Sameh Khamis, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano

    Abstract: We present a one-shot method to infer and render a photorealistic 3D representation from a single unposed image (e.g., face portrait) in real-time. Given a single RGB input, our image encoder directly predicts a canonical triplane representation of a neural radiance field for 3D-aware novel view synthesis via volume rendering. Our method is fast (24 fps) on consumer hardware, and produces higher q…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: Project page: https://research.nvidia.com/labs/nxp/lp3d/

  29. arXiv:2304.05669  [pdf, other]

    cs.CV cs.GR

    Factorized Inverse Path Tracing for Efficient and Accurate Material-Lighting Estimation

    Authors: Liwen Wu, Rui Zhu, Mustafa B. Yaldiz, Yinhao Zhu, Hong Cai, Janarbek Matai, Fatih Porikli, Tzu-Mao Li, Manmohan Chandraker, Ravi Ramamoorthi

    Abstract: Inverse path tracing has recently been applied to joint material and lighting estimation, given geometry and multi-view HDR observations of an indoor scene. However, it has two major limitations: path tracing is expensive to compute, and ambiguities exist between reflection and emission. Our Factorized Inverse Path Tracing (FIPT) addresses these challenges by using a factored light transport formu…

    Submitted 23 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Updated experiment results; modified real-world sections

  30. arXiv:2303.15443  [pdf, other]

    cs.CV cs.AI cs.LG

    GeoNet: Benchmarking Unsupervised Adaptation across Geographies

    Authors: Tarun Kalluri, Wangdong Xu, Manmohan Chandraker

    Abstract: In recent years, several efforts have been aimed at improving the robustness of vision models to domains and environments unseen during training. An important practical problem pertains to models deployed in a new geography that is under-represented in the training dataset, posing a direct challenge to fair and inclusive computer vision. In this paper, we study the problem of geographic robustness…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: CVPR 2023 Camera Ready. Project Page: https://tarun005.github.io/GeoNet

  31. arXiv:2303.05503  [pdf, other]

    cs.CV cs.AI cs.LG

    Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision

    Authors: Tarun Kalluri, Weiyao Wang, Heng Wang, Manmohan Chandraker, Lorenzo Torresani, Du Tran

    Abstract: Many top-down architectures for instance segmentation achieve significant success when trained and tested on pre-defined closed-world taxonomy. However, when deployed in the open world, they exhibit notable bias towards seen classes and suffer from significant performance drop. In this work, we propose a novel approach for open world instance segmentation called bottom-Up and top-Down Open-world S…

    Submitted 13 May, 2024; v1 submitted 9 March, 2023; originally announced March 2023.

    Comments: L3D-IVU Workshop, CVPR 2024. Project page: https://tarun005.github.io/UDOS

  32. arXiv:2302.02336  [pdf, other]

    cs.LG cs.CV

    Using Intermediate Forward Iterates for Intermediate Generator Optimization

    Authors: Harsh Mishra, Jurijs Nazarovs, Manmohan Dogra, Sathya N. Ravi

    Abstract: Score-based models have recently been introduced as a richer framework to model distributions in high dimensions and are generally more suitable for generative tasks. In score-based models, a generative task is formulated using a parametric model (such as a neural network) to directly learn the gradient of such high dimensional distributions, instead of the density functions themselves, as is done…

    Submitted 5 February, 2023; originally announced February 2023.

  33. arXiv:2210.15908  [pdf, other]

    cs.CV cs.RO

    Long-HOT: A Modular Hierarchical Approach for Long-Horizon Object Transport

    Authors: Sriram Narayanan, Dinesh Jayaraman, Manmohan Chandraker

    Abstract: We address key challenges in long-horizon embodied exploration and navigation by proposing a new object transport task and a novel modular framework for temporally extended navigation. Our first contribution is the design of a novel Long-HOT environment focused on deep exploration and long-horizon planning where the agent is required to efficiently find and pick up target objects to be carried and…

    Submitted 28 October, 2022; originally announced October 2022.

  34. arXiv:2210.12878  [pdf, other]

    cs.CV

    IDD-3D: Indian Driving Dataset for 3D Unstructured Road Scenes

    Authors: Shubham Dokania, A. H. Abdul Hafez, Anbumani Subramanian, Manmohan Chandraker, C. V. Jawahar

    Abstract: Autonomous driving and assistance systems rely on annotated data from traffic and road scenarios to model and learn the various object relations in complex real-world scenarios. Preparation and training of deploy-able deep learning architectures require the models to be suited to different traffic scenarios and adapt to different situations. Currently, existing datasets, while large-scale, lack su…

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: 10 pages, 8 figures, 5 tables, Accepted in Winter Conference on Applications of Computer Vision (WACV 2023)

  35. arXiv:2209.08780  [pdf, ps, other]

    math.AP

    Inverse problem for a time-dependent Convection-diffusion equation in admissible geometries

    Authors: Rohit Kumar Mishra, Anamika Purohit, Manmohan Vashisth

    Abstract: We consider a partial data inverse problem for a time-dependent convection-diffusion equation on an admissible manifold. We prove that the time-dependent convection term and time-dependent density can be recovered uniquely modulo a known gauge invariance. There have been several works on inverse problems related to the steady state convection-diffusion operator in Euclidean as well as in Riemannia…

    Submitted 2 May, 2024; v1 submitted 19 September, 2022; originally announced September 2022.

    MSC Class: 35R30; 35K20; 58J35; 58J65

  36. arXiv:2208.07943  [pdf, other]

    cs.CV

    TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments

    Authors: Shubham Dokania, Anbumani Subramanian, Manmohan Chandraker, C. V. Jawahar

    Abstract: High-quality structured data with rich annotations are critical components in intelligent vehicle systems dealing with road scenes. However, data curation and annotation require intensive investments and yield low-diversity scenarios. The recently growing interest in synthetic data raises questions about the scope of improvement in such systems and the amount of manual work still required to produ…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: 18 pages, 5 figures, Accepted in European Conference on Computer Vision (ECCV 2022)

  37. arXiv:2208.02804  [pdf, other]

    cs.CV cs.LG

    Cluster-to-adapt: Few Shot Domain Adaptation for Semantic Segmentation across Disjoint Labels

    Authors: Tarun Kalluri, Manmohan Chandraker

    Abstract: Domain adaptation for semantic segmentation across datasets consisting of the same categories has seen several recent successes. However, a more general scenario is when the source and target datasets correspond to non-overlapping label spaces. For example, categories in segmentation datasets change vastly depending on the type of environment or application, yet share many valuable semantic relati…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: Accepted to L3D workshop at CVPR 2022

  38. arXiv:2207.13339  [pdf, other]

    cs.CV

    ALBench: A Framework for Evaluating Active Learning in Object Detection

    Authors: Zhanpeng Feng, Shiliang Zhang, Rinyoichi Takezoe, Wenze Hu, Manmohan Chandraker, Li-Jia Li, Vijay K. Narayanan, Xiaoyu Wang

    Abstract: Active learning is an important technology for automated machine learning systems. In contrast to Neural Architecture Search (NAS) which aims at automating neural network architecture design, active learning aims at automating training data selection. It is especially critical for training a long-tailed task, in which positive samples are sparsely distributed. Active learning alleviates the expens…

    Submitted 24 November, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

  39. arXiv:2207.12389  [pdf, other]

    cs.CV cs.AI cs.LG

    MemSAC: Memory Augmented Sample Consistency for Large Scale Unsupervised Domain Adaptation

    Authors: Tarun Kalluri, Astuti Sharma, Manmohan Chandraker

    Abstract: Practical real world datasets with plentiful categories introduce new challenges for unsupervised domain adaptation like small inter-class discriminability, that existing approaches relying on domain invariance alone cannot handle sufficiently well. In this work we propose MemSAC, which exploits sample level similarity across source and target domains to achieve discriminative transfer, along with…

    Submitted 11 October, 2023; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022. Project Webpage: https://tarun005.github.io/MemSAC/

  40. Minimizing the phase structure of quark mass matrices

    Authors: Nikhila Awasthi, Manoj Kumar, Monika Randhawa, Manmohan Gupta

    Abstract: Fritzsch-Xing matrices are a particular class of texture 4 zero hermitian quark mass matrices, known to be successful in accommodating the quark mixing data. In the present work, it is shown that these texture 4-zero matrices with only one phase parameter, unlike the usually considered two phase parameters, are not only consistent with the latest experimental quark mixing data, but also predict th…

    Submitted 1 August, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: 15 pages, 5 figures, to appear in the European Physical Journal C

    Journal ref: Eur. Phys. J. C, 82, 7 (2022), 653

  41. arXiv:2207.08954  [pdf, other]

    cs.CV

    Exploiting Unlabeled Data with Vision and Language Models for Object Detection

    Authors: Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B. G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris Metaxas

    Abstract: Building robust and generic object detection frameworks requires scaling to larger label spaces and bigger training datasets. However, it is prohibitively costly to acquire annotations for thousands of categories at a large scale. We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images, effectiv…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022 (with the supplementary document)

  42. arXiv:2207.00757  [pdf, other]

    cs.CV

    PhotoScene: Photorealistic Material and Lighting Transfer for Indoor Scenes

    Authors: Yu-Ying Yeh, Zhengqin Li, Yannick Hold-Geoffroy, Rui Zhu, Zexiang Xu, Miloš Hašan, Kalyan Sunkavalli, Manmohan Chandraker

    Abstract: Most indoor 3D scene reconstruction methods focus on recovering 3D geometry and scene layout. In this work, we go beyond this to propose PhotoScene, a framework that takes input image(s) of a scene along with approximately aligned CAD geometry (either reconstructed automatically or manually specified) and builds a photorealistic digital twin with high-quality materials and similar lighting. We mod…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: Accepted to CVPR 2022; Code is available at https://github.com/ViLab-UCSD/photoscene

  43. arXiv:2206.12784  [pdf, other]

    cs.RO

    Learning to Rearrange with Physics-Inspired Risk Awareness

    Authors: Meng Song, Yuhan Liu, Zhengqin Li, Manmohan Chandraker

    Abstract: Real-world applications require a robot operating in the physical world with awareness of potential risks besides accomplishing the task. A large part of risky behaviors arises from interacting with objects in ignorance of affordance. To prevent the agent from making unsafe decisions, we propose to train a robotic agent by reinforcement learning to execute tasks with an awareness of physical prope…

    Submitted 26 June, 2022; originally announced June 2022.

    Comments: Accepted to Risk Aware Decision Making Workshop at Robotics: Science and Systems 2022

  44. arXiv:2206.08423  [pdf, other]

    cs.CV

    IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes

    Authors: Rui Zhu, Zhengqin Li, Janarbek Matai, Fatih Porikli, Manmohan Chandraker

    Abstract: Indoor scenes exhibit significant appearance variations due to myriad interactions between arbitrarily diverse object shapes, spatially-changing materials, and complex lighting. Shadows, highlights, and inter-reflections caused by visible and invisible light sources require reasoning about long-range interactions for inverse rendering, which seeks to recover the components of image formation, name…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: CVPR 22 camera ready version with supplementary

  45. arXiv:2205.09343  [pdf, other]

    cs.CV

    Physically-Based Editing of Indoor Scene Lighting from a Single Image

    Authors: Zhengqin Li, Jia Shi, Sai Bi, Rui Zhu, Kalyan Sunkavalli, Miloš Hašan, Zexiang Xu, Ravi Ramamoorthi, Manmohan Chandraker

    Abstract: We present a method to edit complex indoor lighting from a single image with its predicted depth and light source segmentation masks. This is an extremely challenging problem that requires modeling complex light transport, and disentangling HDR lighting from material and geometry with only a partial LDR observation of the scene. We tackle this problem using two novel components: 1) a holistic scen…

    Submitted 23 July, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

  46. arXiv:2204.07159  [pdf, other]

    cs.CV cs.GR cs.LG

    A Level Set Theory for Neural Implicit Evolution under Explicit Flows

    Authors: Ishit Mehta, Manmohan Chandraker, Ravi Ramamoorthi

    Abstract: Coordinate-based neural networks parameterizing implicit surfaces have emerged as efficient representations of geometry. They effectively act as parametric level sets with the zero-level set defining the surface of interest. We present a framework that allows applying deformation operations defined for triangle meshes onto such implicit surfaces. Several of these operations can be viewed as energy…

    Submitted 21 July, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: ECCV 2022 (Oral); Project Page at https://ishit.github.io/nie

  47. arXiv:2203.14949  [pdf, other]

    cs.CV cs.LG

    Controllable Dynamic Multi-Task Architectures

    Authors: Dripta S. Raychaudhuri, Yumin Suh, Samuel Schulter, Xiang Yu, Masoud Faraki, Amit K. Roy-Chowdhury, Manmohan Chandraker

    Abstract: Multi-task learning commonly encounters competition for resources among tasks, specifically when model capacity is limited. This challenge motivates models which allow control over the relative importance of tasks and total compute cost during inference time. In this work, we propose such a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired t…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Accepted at CVPR 2022

  48. arXiv:2203.14395  [pdf, other]

    cs.CV

    Single-Stream Multi-Level Alignment for Vision-Language Pretraining

    Authors: Zaid Khan, Vijay Kumar BG, Xiang Yu, Samuel Schulter, Manmohan Chandraker, Yun Fu

    Abstract: Self-supervised vision-language pretraining from pure images and text with a contrastive loss is effective, but ignores fine-grained alignment due to a dual-stream architecture that aligns image and text representations only on a global level. Earlier, supervised, non-contrastive methods were capable of finer-grained alignment, but required dense annotations that were not scalable. We propose a si…

    Submitted 27 July, 2022; v1 submitted 27 March, 2022; originally announced March 2022.

    Comments: ECCV 2022

  49. arXiv:2203.03970  [pdf, other]

    cs.LG cs.CV

    On Generalizing Beyond Domains in Cross-Domain Continual Learning

    Authors: Christian Simon, Masoud Faraki, Yi-Hsuan Tsai, Xiang Yu, Samuel Schulter, Yumin Suh, Mehrtash Harandi, Manmohan Chandraker

    Abstract: Humans have the ability to accumulate knowledge of new tasks in varying conditions, but deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task. Many recent methods focus on preventing catastrophic forgetting under the assumption of train and test data following similar distributions. In this work, we consider a more realistic scenar…

    Submitted 8 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR 2022

  50. arXiv:2202.14030  [pdf, other]

    cs.CV

    Learning Semantic Segmentation from Multiple Datasets with Label Shifts

    Authors: Dongwan Kim, Yi-Hsuan Tsai, Yumin Suh, Masoud Faraki, Sparsh Garg, Manmohan Chandraker, Bohyung Han

    Abstract: With increasing applications of semantic segmentation, numerous datasets have been proposed in the past few years. Yet labeling remains expensive, thus, it is desirable to jointly train models across aggregations of datasets to enhance data volume and diversity. However, label spaces differ across datasets and may even be in conflict with one another. This paper proposes UniSeg, an effective appro…

    Submitted 28 February, 2022; originally announced February 2022.