
Showing 1–22 of 22 results for author: Bhambri, S

  1. arXiv:2405.20625

    cs.AI

    Robust Planning with LLM-Modulo Framework: Case Study in Travel Planning

    Authors: Atharva Gundawar, Mudit Verma, Lin Guan, Karthik Valmeekam, Siddhant Bhambri, Subbarao Kambhampati

    Abstract: As the applicability of Large Language Models (LLMs) extends beyond traditional text processing tasks, there is a burgeoning interest in their potential to excel in planning and reasoning assignments, realms traditionally reserved for System 2 cognitive competencies. Despite their perceived versatility, the research community is still unraveling effective strategies to harness these models in such…

    Submitted 31 May, 2024; originally announced May 2024.

  2. arXiv:2405.15194

    cs.LG cs.AI

    Efficient Reinforcement Learning via Large Language Model-based Search

    Authors: Siddhant Bhambri, Amrita Bhattacharjee, Huan Liu, Subbarao Kambhampati

    Abstract: Reinforcement Learning (RL) suffers from sample inefficiency in sparse reward domains, and the problem is pronounced if there are stochastic transitions. To improve the sample efficiency, reward shaping is a well-studied approach to introduce intrinsic rewards that can help the RL agent converge to an optimal policy faster. However, designing a useful reward shaping function specific to each probl…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 9 pages + Appendix

  3. arXiv:2405.13966

    cs.AI cs.CL

    On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models

    Authors: Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati

    Abstract: The reasoning abilities of Large Language Models (LLMs) remain a topic of debate. Some methods such as ReAct-based prompting, have gained popularity for claiming to enhance sequential decision-making abilities of agentic LLMs. However, it is unclear what is the source of improvement in LLM reasoning with ReAct based prompting. In this paper we examine these claims of ReAct based prompting in impro…

    Submitted 22 May, 2024; originally announced May 2024.

  4. arXiv:2402.01817

    cs.AI cs.LG

    LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

    Authors: Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Mudit Verma, Kaya Stechly, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

    Abstract: There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the probl…

    Submitted 11 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  5. arXiv:2401.05302

    cs.RO cs.AI cs.HC

    Theory of Mind abilities of Large Language Models in Human-Robot Interaction : An Illusion?

    Authors: Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati

    Abstract: Large Language Models have shown exceptional generative abilities in various natural language and generation tasks. However, possible anthropomorphization and leniency towards failure cases have propelled discussions on emergent abilities of Large Language Models especially on Theory of Mind (ToM) abilities in Large Language Models. While several false-belief tests exist to verify the ability to…

    Submitted 17 January, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: Accepted in alt.HRI 2024

  6. arXiv:2312.14292

    cs.AI cs.LG cs.MA

    Benchmarking Multi-Agent Preference-based Reinforcement Learning for Human-AI Teaming

    Authors: Siddhant Bhambri, Mudit Verma, Anil Murthy, Subbarao Kambhampati

    Abstract: Preference-based Reinforcement Learning (PbRL) is an active area of research, and has made significant strides in single-agent actor and in observer human-in-the-loop scenarios. However, its application within the co-operative multi-agent RL frameworks, where humans actively participate and express preferences for agent behavior, remains largely uncharted. We consider a two-agent (Human-AI) cooper…

    Submitted 21 December, 2023; originally announced December 2023.

  7. arXiv:2311.16294

    cs.CV

    Aligning Non-Causal Factors for Transformer-Based Source-Free Domain Adaptation

    Authors: Sunandini Sanyal, Ashish Ramayee Asokan, Suvaansh Bhambri, Pradyumna YM, Akshay Kulkarni, Jogendra Nath Kundu, R Venkatesh Babu

    Abstract: Conventional domain adaptation algorithms aim to achieve better generalization by aligning only the task-discriminative causal factors between a source and target domain. However, we find that retaining the spurious correlation between causal and non-causal factors plays a vital role in bridging the domain gap and improving target adaptation. Therefore, we propose to build a framework that disenta…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: WACV 2024. Project Page: https://val.cds.iisc.ac.in/C-SFTrans/

  8. arXiv:2308.14023

    cs.CV

    Domain-Specificity Inducing Transformers for Source-Free Domain Adaptation

    Authors: Sunandini Sanyal, Ashish Ramayee Asokan, Suvaansh Bhambri, Akshay Kulkarni, Jogendra Nath Kundu, R. Venkatesh Babu

    Abstract: Conventional Domain Adaptation (DA) methods aim to learn domain-invariant feature representations to improve the target adaptation performance. However, we motivate that domain-specificity is equally important since in-domain trained models hold crucial domain-specific properties that are beneficial for adaptation. Hence, we propose to build a framework that supports disentanglement and learning o…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: ICCV 2023. Project page: http://val.cds.iisc.ac.in/DSiT-SFDA

  9. arXiv:2308.09387

    cs.RO cs.AI

    Multi-Level Compositional Reasoning for Interactive Instruction Following

    Authors: Suvaansh Bhambri, Byeonghwi Kim, Jonghyun Choi

    Abstract: Robotic agents performing domestic chores by natural language directives are required to master the complex job of navigating environment and interacting with objects in the environments. The tasks given to the agents are often composite thus are challenging as completing them require to reason about multiple subtasks, e.g., bring a cup of coffee. To address the challenge, we propose to divide and…

    Submitted 12 March, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: AAAI 2023 (Oral) (Project page: https://bhkim94.github.io/projects/MCR-Agent)

  10. arXiv:2302.08738

    cs.RO cs.AI

    Exploiting Unlabeled Data for Feedback Efficient Human Preference based Reinforcement Learning

    Authors: Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati

    Abstract: Preference Based Reinforcement Learning has shown much promise for utilizing human binary feedback on queried trajectory pairs to recover the underlying reward model of the Human in the Loop (HiL). While works have attempted to better utilize the queries made to the human, in this work we make two observations about the unlabeled trajectories collected by the agent and propose two corresponding lo…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: R2HCAI, AAAI 2023

  11. arXiv:2211.10298

    cs.AI

    Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control Approach

    Authors: Siddhant Bhambri, Amrita Bhattacharjee, Dimitri Bertsekas

    Abstract: In this paper we address the solution of the popular Wordle puzzle, using new reinforcement learning methods, which apply more generally to adaptive control of dynamic systems and to classes of Partially Observable Markov Decision Process (POMDP) problems. These methods are based on approximation in value space and the rollout approach, admit a straightforward implementation, and provide improved…

    Submitted 29 November, 2022; v1 submitted 14 November, 2022; originally announced November 2022.

  12. arXiv:2210.15909

    cs.CV cs.LG

    Subsidiary Prototype Alignment for Universal Domain Adaptation

    Authors: Jogendra Nath Kundu, Suvaansh Bhambri, Akshay Kulkarni, Hiran Sarkar, Varun Jampani, R. Venkatesh Babu

    Abstract: Universal Domain Adaptation (UniDA) deals with the problem of knowledge transfer between two datasets with domain-shift as well as category-shift. The goal is to categorize unlabeled target samples, either into one of the "known" categories or into a single "unknown" category. A major problem in UniDA is negative transfer, i.e. misalignment of "known" and "unknown" classes. To this end, we first u…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022. Project page: https://sites.google.com/view/spa-unida

  13. arXiv:2210.15011

    cs.GT cs.CR

    Using Deception in Markov Game to Understand Adversarial Behaviors through a Capture-The-Flag Environment

    Authors: Siddhant Bhambri, Purv Chauhan, Frederico Araujo, Adam Doupé, Subbarao Kambhampati

    Abstract: Identifying the actual adversarial threat against a system vulnerability has been a long-standing challenge for cybersecurity research. To determine an optimal strategy for the defender, game-theoretic based decision models have been widely used to simulate the real-world attacker-defender scenarios while taking the defender's constraints into consideration. In this work, we focus on understanding…

    Submitted 9 November, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted at GameSec 2022

  14. arXiv:2207.13247

    cs.CV cs.LG

    Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation

    Authors: Jogendra Nath Kundu, Suvaansh Bhambri, Akshay Kulkarni, Hiran Sarkar, Varun Jampani, R. Venkatesh Babu

    Abstract: The prime challenge in unsupervised domain adaptation (DA) is to mitigate the domain shift between the source and target domains. Prior DA works show that pretext tasks could be used to mitigate this domain shift by learning domain invariant representations. However, in practice, we find that most existing pretext tasks are ineffective against other established techniques. Thus, we theoretically a…

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: ECCV 2022. Project page: https://sites.google.com/view/sticker-sfda

  15. arXiv:2206.08009

    cs.CV cs.LG

    Balancing Discriminability and Transferability for Source-Free Domain Adaptation

    Authors: Jogendra Nath Kundu, Akshay Kulkarni, Suvaansh Bhambri, Deepesh Mehta, Shreyas Kulkarni, Varun Jampani, R. Venkatesh Babu

    Abstract: Conventional domain adaptation (DA) techniques aim to improve domain transferability by learning domain-invariant representations; while concurrently preserving the task-discriminability knowledge gathered from the labeled source data. However, the requirement of simultaneous access to labeled source and unlabeled target renders them unsuitable for the challenging source-free DA setting. The trivi…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: ICML 2022. Project page: https://sites.google.com/view/mixup-sfda

  16. Amplitude Spectrum Transformation for Open Compound Domain Adaptive Semantic Segmentation

    Authors: Jogendra Nath Kundu, Akshay Kulkarni, Suvaansh Bhambri, Varun Jampani, R. Venkatesh Babu

    Abstract: Open compound domain adaptation (OCDA) has emerged as a practical adaptation setting which considers a single labeled source domain against a compound of multi-modal unlabeled target data in order to generalize better on novel unseen domains. We hypothesize that an improved disentanglement of domain-related and task-related factors of dense intermediate layer features can greatly aid OCDA. Prior-a…

    Submitted 9 February, 2022; originally announced February 2022.

    Comments: AAAI 2022. Project page: http://sites.google.com/view/ast-ocdaseg

  17. arXiv:2104.00878

    cs.RO cs.AI cs.LG

    Contrastively Learning Visual Attention as Affordance Cues from Demonstrations for Robotic Grasping

    Authors: Yantian Zha, Siddhant Bhambri, Lin Guan

    Abstract: Conventional works that learn grasping affordance from demonstrations need to explicitly predict grasping configurations, such as gripper approaching angles or grasping preshapes. Classic motion planners could then sample trajectories by using such predicted configurations. In this work, our goal is instead to fill the gap between affordance discovery and affordance-based policy learning by integr…

    Submitted 13 August, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

  18. arXiv:2012.03208

    cs.AI cs.CV cs.RO

    Factorizing Perception and Policy for Interactive Instruction Following

    Authors: Kunal Pratap Singh, Suvaansh Bhambri, Byeonghwi Kim, Roozbeh Mottaghi, Jonghyun Choi

    Abstract: Performing simple household tasks based on language directives is very natural to humans, yet it remains an open challenge for AI agents. The 'interactive instruction following' task attempts to make progress towards building agents that jointly navigate, interact, and reason in the environment at every step. To address the multifaceted problem, we propose a model that factorizes the task into int…

    Submitted 2 September, 2021; v1 submitted 6 December, 2020; originally announced December 2020.

    Comments: ICCV 2021

  19. arXiv:2009.13854

    cs.AI eess.SY

    Multi-objective Reinforcement Learning based approach for User-Centric Power Optimization in Smart Home Environments

    Authors: Saurabh Gupta, Siddhant Bhambri, Karan Dhingra, Arun Balaji Buduru, Ponnurangam Kumaraguru

    Abstract: Smart homes require every device inside them to be connected with each other at all times, which leads to a lot of power wastage on a daily basis. As the devices inside a smart home increase, it becomes difficult for the user to control or operate every individual device optimally. Therefore, users generally rely on power management systems for such optimization but often are not satisfied with th…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: 8 pages, 7 figures, Accepted at IEEE SMDS'2020

  20. arXiv:1912.03298

    cs.AI

    Making Smart Homes Smarter: Optimizing Energy Consumption with Human in the Loop

    Authors: Mudit Verma, Siddhant Bhambri, Saurabh Gupta, Arun Balaji Buduru

    Abstract: Rapid advancements in the Internet of Things (IoT) have facilitated more efficient deployment of smart environment solutions for specific user requirement. With the increase in the number of IoT devices, it has become difficult for the user to control or operate every individual smart device into achieving some desired goal like optimized power consumption, scheduled appliance running time, etc. F…

    Submitted 4 May, 2020; v1 submitted 6 December, 2019; originally announced December 2019.

  21. arXiv:1912.01667

    cs.LG cs.CR cs.CV stat.ML

    A Survey of Black-Box Adversarial Attacks on Computer Vision Models

    Authors: Siddhant Bhambri, Sumanyu Muku, Avinash Tulasi, Arun Balaji Buduru

    Abstract: Machine learning has seen tremendous advances in the past few years, which has led to deep learning models being deployed in varied applications of day-to-day life. Attacks on such models using perturbations, particularly in real-life scenarios, pose a severe challenge to their applicability, pushing research into the direction which aims to enhance the robustness of these models. After the intro…

    Submitted 7 February, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: 33 pages

  22. arXiv:1410.6502

    cs.DC

    Quantum Clouds: A future perspective

    Authors: Satish Bhambri

    Abstract: Quantum computing and cloud computing are two giants for futuristic computing. Both technologies complement each other. Quantum clouds, therefore, is deploying the resources of quantum computation in a cloud environment to provide solution to the challenges and problems faced by present model of classical cloud computation. State of the art challenges faced by the cloud such as VM migration, data…

    Submitted 5 October, 2014; originally announced October 2014.

    Comments: 14 pages, 1 figure