Showing 1–50 of 126 results for author: Kambhampati, S

Searching in archive cs.
  1. arXiv:2405.20625

    cs.AI

    Robust Planning with LLM-Modulo Framework: Case Study in Travel Planning

    Authors: Atharva Gundawar, Mudit Verma, Lin Guan, Karthik Valmeekam, Siddhant Bhambri, Subbarao Kambhampati

    Abstract: As the applicability of Large Language Models (LLMs) extends beyond traditional text processing tasks, there is a burgeoning interest in their potential to excel in planning and reasoning assignments, realms traditionally reserved for System 2 cognitive competencies. Despite their perceived versatility, the research community is still unraveling effective strategies to harness these models in such…

    Submitted 31 May, 2024; originally announced May 2024.

  2. arXiv:2405.15804

    cs.AI

    Explainable Human-AI Interaction: A Planning Perspective

    Authors: Sarath Sreedharan, Anagha Kulkarni, Subbarao Kambhampati

    Abstract: From its inception, AI has had a rather ambivalent relationship with humans -- swinging between their augmentation and replacement. Now, as AI technologies enter our everyday lives at an ever increasing pace, there is a greater need for AI systems to work synergistically with humans. One critical requirement for such synergistic human-AI interaction is that the AI systems be explainable to the hum…

    Submitted 19 May, 2024; originally announced May 2024.

  3. arXiv:2405.15194

    cs.LG cs.AI

    Efficient Reinforcement Learning via Large Language Model-based Search

    Authors: Siddhant Bhambri, Amrita Bhattacharjee, Huan Liu, Subbarao Kambhampati

    Abstract: Reinforcement Learning (RL) suffers from sample inefficiency in sparse reward domains, and the problem is pronounced if there are stochastic transitions. To improve the sample efficiency, reward shaping is a well-studied approach to introduce intrinsic rewards that can help the RL agent converge to an optimal policy faster. However, designing a useful reward shaping function specific to each probl…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 9 pages + Appendix

  4. arXiv:2405.13966

    cs.AI cs.CL

    On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models

    Authors: Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati

    Abstract: The reasoning abilities of Large Language Models (LLMs) remain a topic of debate. Some methods, such as ReAct-based prompting, have gained popularity for claiming to enhance sequential decision-making abilities of agentic LLMs. However, it is unclear what the source of improvement in LLM reasoning with ReAct-based prompting is. In this paper we examine these claims of ReAct-based prompting in impro…

    Submitted 22 May, 2024; originally announced May 2024.

  5. arXiv:2405.04776

    cs.AI

    Chain of Thoughtlessness? An Analysis of CoT in Planning

    Authors: Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

    Abstract: Large language model (LLM) performance on reasoning problems typically does not generalize out of distribution. Previous work has claimed that this can be mitigated with chain of thought prompting--a method of demonstrating solution procedures--with the intuition that it is possible to in-context teach an LLM an algorithm for solving the problem. This paper presents a case study of chain of thought…

    Submitted 5 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  6. arXiv:2403.04121

    cs.AI cs.CL cs.LG

    Can Large Language Models Reason and Plan?

    Authors: Subbarao Kambhampati

    Abstract: While humans sometimes do show the capability of correcting their own erroneous guesses with self-critiquing, there seems to be no basis for that assumption in the case of LLMs.

    Submitted 8 March, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.01817 (v2 add creative commons attribution to Figure 2 graphic)

    Journal ref: Annals of The New York Academy of Sciences; March 2024

  7. arXiv:2402.08115

    cs.AI

    On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks

    Authors: Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

    Abstract: There has been considerable divergence of opinion on the reasoning abilities of Large Language Models (LLMs). While the initial optimism that reasoning might emerge automatically with scale has been tempered thanks to a slew of counterexamples--ranging from multiplication to simple planning--there persists a widespread belief that LLMs can self-critique and improve their own solutions in an itera…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.12397

  8. arXiv:2402.04210

    cs.AI cs.RO

    "Task Success" is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors

    Authors: Lin Guan, Yifan Zhou, Denis Liu, Yantian Zha, Heni Ben Amor, Subbarao Kambhampati

    Abstract: Large-scale generative models are shown to be useful for sampling meaningful candidate solutions, yet they often overlook task constraints and user preferences. Their full power is better harnessed when the models are coupled with external verifiers and the final solutions are derived iteratively or progressively according to the verification feedback. In the context of embodied AI, verification o…

    Submitted 6 February, 2024; originally announced February 2024.

  9. arXiv:2402.01817

    cs.AI cs.LG

    LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

    Authors: Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Mudit Verma, Kaya Stechly, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

    Abstract: There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the probl…

    Submitted 11 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  10. arXiv:2401.05302

    cs.RO cs.AI cs.HC

    Theory of Mind abilities of Large Language Models in Human-Robot Interaction : An Illusion?

    Authors: Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati

    Abstract: Large Language Models have shown exceptional generative abilities in various natural language and generation tasks. However, possible anthropomorphization and leniency towards failure cases have propelled discussions on emergent abilities of Large Language Models especially on Theory of Mind (ToM) abilities in Large Language Models. While several false-belief tests exist to verify the ability to…

    Submitted 17 January, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: Accepted in alt.HRI 2024

  11. arXiv:2312.14292

    cs.AI cs.LG cs.MA

    Benchmarking Multi-Agent Preference-based Reinforcement Learning for Human-AI Teaming

    Authors: Siddhant Bhambri, Mudit Verma, Anil Murthy, Subbarao Kambhampati

    Abstract: Preference-based Reinforcement Learning (PbRL) is an active area of research, and has made significant strides in single-agent actor and in observer human-in-the-loop scenarios. However, its application within the co-operative multi-agent RL frameworks, where humans actively participate and express preferences for agent behavior, remains largely uncharted. We consider a two-agent (Human-AI) cooper…

    Submitted 21 December, 2023; originally announced December 2023.

  12. arXiv:2310.12397

    cs.AI

    GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems

    Authors: Kaya Stechly, Matthew Marquez, Subbarao Kambhampati

    Abstract: There has been considerable divergence of opinion on the reasoning abilities of Large Language Models (LLMs). While the initial optimism that reasoning might emerge automatically with scale has been tempered thanks to a slew of counterexamples, a widespread belief in their iterative self-critique capabilities persists. In this paper, we set out to systematically investigate the effectiveness of i…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 18 pages, 3 figures

  13. arXiv:2310.08118

    cs.AI

    Can Large Language Models Really Improve by Self-critiquing Their Own Plans?

    Authors: Karthik Valmeekam, Matthew Marquez, Subbarao Kambhampati

    Abstract: There have been widespread claims about Large Language Models (LLMs) being able to successfully verify or self-critique their candidate solutions in reasoning problems in an iterative mode. Intrigued by those claims, in this paper we set out to investigate the verification/self-critiquing abilities of large language models in the context of planning. We evaluate a planning system that employs LLMs…

    Submitted 12 October, 2023; originally announced October 2023.

  14. arXiv:2305.17077

    cs.CL cs.AI

    Learning and Leveraging Verifiers to Improve Planning Capabilities of Pre-trained Language Models

    Authors: Daman Arora, Subbarao Kambhampati

    Abstract: There have been widespread claims in the literature about the emergent reasoning capabilities of Pretrained Large Language Models. However, recent studies have found that their ability to plan remains questionable. Through our experiments using GPT-2, we empirically demonstrate that the performance of a finetuned baseline remains poor because it violates pre-conditions of actions in the plans th…

    Submitted 26 May, 2023; originally announced May 2023.

  15. arXiv:2305.15771

    cs.AI

    On the Planning Abilities of Large Language Models : A Critical Investigation

    Authors: Karthik Valmeekam, Matthew Marquez, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: Intrigued by the claims of emergent reasoning capabilities in LLMs trained on general web corpora, in this paper, we set out to investigate their planning capabilities. We aim to evaluate (1) the effectiveness of LLMs in generating plans autonomously in commonsense planning tasks and (2) the potential of LLMs in LLM-Modulo settings where they act as a source of heuristic guidance for external plan…

    Submitted 6 November, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 Spotlight. arXiv admin note: substantial text overlap with arXiv:2206.10498

  16. arXiv:2305.14909

    cs.AI

    Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning

    Authors: Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: There is a growing interest in applying pre-trained large language models (LLMs) to planning problems. However, methods that use LLMs directly as planners are currently impractical due to several factors, including limited correctness of plans, strong reliance on feedback from interactions with simulators or even the actual environment, and the inefficiency in utilizing human feedback. In this wor…

    Submitted 1 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  17. arXiv:2302.14208

    cs.AI

    Methods and Mechanisms for Interactive Novelty Handling in Adversarial Environments

    Authors: Tung Thai, Ming Shen, Mayank Garg, Ayush Kalani, Nakul Vaidya, Utkarsh Soni, Mudit Verma, Sriram Gopalakrishnan, Neeraj Varshney, Chitta Baral, Subbarao Kambhampati, Jivko Sinapov, Matthias Scheutz

    Abstract: Learning to detect, characterize and accommodate novelties is a challenge that agents operating in open-world domains need to address to be able to guarantee satisfactory task performance. Certain novelties (e.g., changes in environment dynamics) can interfere with the performance or prevent agents from accomplishing task goals altogether. In this paper, we introduce general methods and architectu…

    Submitted 5 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

  18. arXiv:2302.08738

    cs.RO cs.AI

    Exploiting Unlabeled Data for Feedback Efficient Human Preference based Reinforcement Learning

    Authors: Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati

    Abstract: Preference Based Reinforcement Learning has shown much promise for utilizing human binary feedback on queried trajectory pairs to recover the underlying reward model of the Human in the Loop (HiL). While works have attempted to better utilize the queries made to the human, in this work we make two observations about the unlabeled trajectories collected by the agent and propose two corresponding lo…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: R2HCAI, AAAI 2023

  19. arXiv:2302.08734

    cs.AI cs.LG

    A State Augmentation based approach to Reinforcement Learning from Human Preferences

    Authors: Mudit Verma, Subbarao Kambhampati

    Abstract: Reinforcement Learning has suffered from poor reward specification, and issues of reward hacking even in simple enough domains. Preference Based Reinforcement Learning attempts to solve the issue by utilizing binary feedback on queried trajectory pairs by a human in the loop indicating their preferences about the agent's behavior to learn a reward model. In this work, we present a state augmenta…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: R2HCAI, AAAI 2023

  20. arXiv:2302.08733

    cs.LG cs.AI

    Data Driven Reward Initialization for Preference based Reinforcement Learning

    Authors: Mudit Verma, Subbarao Kambhampati

    Abstract: Preference-based Reinforcement Learning (PbRL) methods utilize binary feedback from the human in the loop (HiL) over queried trajectory pairs to learn a reward model in an attempt to approximate the human's underlying reward function capturing their preferences. In this work, we investigate the issue of a high degree of variability in the initialized reward models which are sensitive to random see…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: R2HCAI, AAAI 2023

  21. arXiv:2302.06706

    cs.AI cs.CL cs.LG

    On the Planning Abilities of Large Language Models (A Critical Investigation with a Proposed Benchmark)

    Authors: Karthik Valmeekam, Sarath Sreedharan, Matthew Marquez, Alberto Olmo, Subbarao Kambhampati

    Abstract: Intrigued by the claims of emergent reasoning capabilities in LLMs trained on general web corpora, in this paper, we set out to investigate their planning capabilities. We aim to evaluate (1) how good LLMs are by themselves in generating and validating simple plans in commonsense planning tasks (of the type that humans are generally quite good at) and (2) how good LLMs are in being a source of heu…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: arXiv admin note: text overlap with arXiv:2206.10498

  22. arXiv:2301.12569

    cs.AI

    A Mental Model Based Theory of Trust

    Authors: Zahra Zahedi, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: Handling trust is one of the core requirements for facilitating effective interaction between the human and the AI agent. Thus, any decision-making framework designed to work with humans must possess the ability to estimate and leverage human trust. In this paper, we propose a mental model based theory of trust that not only can be used to infer trust, thus providing an alternative to psychologica…

    Submitted 29 January, 2023; originally announced January 2023.

  23. arXiv:2210.15906

    cs.AI cs.HC cs.LG

    Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences

    Authors: Lin Guan, Karthik Valmeekam, Subbarao Kambhampati

    Abstract: Generating complex behaviors that satisfy the preferences of non-expert users is a crucial requirement for AI agents. Interactive reward learning from trajectory comparisons (a.k.a. RLHF) is one way to allow non-expert users to convey complex objectives by expressing preferences over short clips of agent behaviors. Even though this parametric method can encode complex tacit knowledge present in th…

    Submitted 27 February, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

    Comments: ICLR 2023 Camera Ready

  24. arXiv:2210.15096

    cs.AI

    Towards customizable reinforcement learning agents: Enabling preference specification through online vocabulary expansion

    Authors: Utkarsh Soni, Nupur Thakur, Sarath Sreedharan, Lin Guan, Mudit Verma, Matthew Marquez, Subbarao Kambhampati

    Abstract: There is a growing interest in developing automated agents that can work alongside humans. In addition to completing the assigned task, such an agent will undoubtedly be expected to behave in a manner that is preferred by the human. This requires the human to communicate their preferences to the agent. To achieve this, the current approaches either require the users to specify the reward function…

    Submitted 31 January, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

  25. arXiv:2210.15011

    cs.GT cs.CR

    Using Deception in Markov Game to Understand Adversarial Behaviors through a Capture-The-Flag Environment

    Authors: Siddhant Bhambri, Purv Chauhan, Frederico Araujo, Adam Doupé, Subbarao Kambhampati

    Abstract: Identifying the actual adversarial threat against a system vulnerability has been a long-standing challenge for cybersecurity research. To determine an optimal strategy for the defender, game-theoretic based decision models have been widely used to simulate the real-world attacker-defender scenarios while taking the defender's constraints into consideration. In this work, we focus on understanding…

    Submitted 9 November, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted at GameSec 2022

  26. arXiv:2210.03455

    cs.AI

    Advice Conformance Verification by Reinforcement Learning agents for Human-in-the-Loop

    Authors: Mudit Verma, Ayush Kharkwal, Subbarao Kambhampati

    Abstract: Human-in-the-loop (HiL) reinforcement learning is gaining traction in domains with large action and state spaces, and sparse rewards by allowing the agent to take advice from HiL. Beyond advice accommodation, a sequential decision-making agent must be able to express the extent to which it was able to utilize the human advice. Subsequently, the agent should provide a means for the HiL to inspect p…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted at IROS-RLCONFORM 2022

  27. arXiv:2206.10498

    cs.CL cs.AI

    PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change

    Authors: Karthik Valmeekam, Matthew Marquez, Alberto Olmo, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: Generating plans of action, and reasoning about change have long been considered a core competence of intelligent agents. It is thus no surprise that evaluating the planning and reasoning capabilities of large language models (LLMs) has become a hot topic of research. Most claims about LLM planning capabilities are however based on common sense tasks--where it becomes hard to tell whether LLMs are…

    Submitted 25 November, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2023 Track on Datasets and Benchmarks

  28. arXiv:2202.09447

    cs.AI cs.HC

    A Mental-Model Centric Landscape of Human-AI Symbiosis

    Authors: Zahra Zahedi, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: There has been significant recent interest in developing AI agents capable of effectively interacting and teaming with humans. While each of these works tries to tackle a problem quite central to the problem of human-AI interaction, they tend to rely on myopic formulations that obscure the possible inter-relatedness and complementarity of many of these works. The human-aware AI framework was a recen…

    Submitted 18 February, 2022; originally announced February 2022.

  29. arXiv:2202.02886

    cs.AI

    Leveraging Approximate Symbolic Models for Reinforcement Learning via Skill Diversity

    Authors: Lin Guan, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: Creating reinforcement learning (RL) agents that are capable of accepting and leveraging task-specific knowledge from humans has been long identified as a possible strategy for developing scalable approaches for solving long-horizon problems. While previous works have looked at the possibility of using symbolic models along with RL approaches, they tend to assume that the high-level action models…

    Submitted 17 June, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

  30. Gradient-Based Mixed Planning with Symbolic and Numeric Action Parameters

    Authors: Kebing Jin, Hankz Hankui Zhuo, Zhanhao Xiao, Hai Wan, Subbarao Kambhampati

    Abstract: Dealing with planning problems with both logical relations and numeric changes in real-world dynamic environments is challenging. Existing numeric planning systems for the problem often discretize numeric variables or impose convex constraints on numeric variables, which harms the performance when solving problems. In this paper, we propose a novel algorithm framework to solve numeric planning pro…

    Submitted 9 October, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: 41 pages, 22 figures. Accepted by Artificial Intelligence

  31. arXiv:2110.05286

    cs.LG

    Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning

    Authors: Yantian Zha, Lin Guan, Subbarao Kambhampati

    Abstract: Our work aims at efficiently leveraging ambiguous demonstrations for the training of a reinforcement learning (RL) agent. An ambiguous demonstration can usually be interpreted in multiple ways, which severely hinders the RL-Agent from learning stably and efficiently. Since an optimal demonstration may also suffer from being ambiguous, previous works that combine RL and learning from demonstration…

    Submitted 7 February, 2024; v1 submitted 11 October, 2021; originally announced October 2021.

  32. arXiv:2109.09904

    cs.AI

    Symbols as a Lingua Franca for Bridging Human-AI Chasm for Explainable and Advisable AI Systems

    Authors: Subbarao Kambhampati, Sarath Sreedharan, Mudit Verma, Yantian Zha, Lin Guan

    Abstract: Despite the surprising power of many modern AI systems that often learn their own representations, there is significant discontent about their inscrutability and the attendant problems in their ability to interact with humans. While alternatives such as neuro-symbolic approaches have been proposed, there is a lack of consensus on what they are about. There are often two independent motivations (i)…

    Submitted 9 December, 2021; v1 submitted 20 September, 2021; originally announced September 2021.

  33. arXiv:2109.07436

    cs.AI

    Computing Policies That Account For The Effects Of Human Agent Uncertainty During Execution In Markov Decision Processes

    Authors: Sriram Gopalakrishnan, Mudit Verma, Subbarao Kambhampati

    Abstract: When humans are given a policy to execute, there can be policy execution errors and deviations in policy if there is uncertainty in identifying a state. This can happen due to the human agent's cognitive limitations and/or perceptual errors. So an algorithm that computes a policy for a human to execute ought to consider these effects in its computations. An optimal Markov Decision Process (MDP) po…

    Submitted 3 March, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

    Comments: 7 page paper, 6 pages supplemental material

  34. arXiv:2107.04303

    cs.AI

    Integrating Planning, Execution and Monitoring in the presence of Open World Novelties: Case Study of an Open World Monopoly Solver

    Authors: Sriram Gopalakrishnan, Utkarsh Soni, Tung Thai, Panagiotis Lymperopoulos, Matthias Scheutz, Subbarao Kambhampati

    Abstract: The game of Monopoly is an adversarial multi-agent domain where there is no fixed goal other than to be the last player solvent. There are useful subgoals like monopolizing sets of properties, and developing them. There is also a lot of randomness from dice rolls, card-draws, and adversaries' strategies. This unpredictability is made worse when unknown novelties are added during gameplay. Given th…

    Submitted 9 August, 2021; v1 submitted 9 July, 2021; originally announced July 2021.

  35. arXiv:2106.12207

    cs.AI cs.HC

    Not all users are the same: Providing personalized explanations for sequential decision making problems

    Authors: Utkarsh Soni, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: There is a growing interest in designing autonomous agents that can work alongside humans. Such agents will undoubtedly be expected to explain their behavior and decisions. While generating explanations is an actively researched topic, most works tend to focus on methods that generate explanations that are one size fits all; that is, the specifics of the user-model are completely ignored. The handful…

    Submitted 23 June, 2021; originally announced June 2021.

  36. arXiv:2106.07131

    cs.CL cs.AI

    GPT3-to-plan: Extracting plans from text using GPT-3

    Authors: Alberto Olmo, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: Operations in many essential industries including finance and banking are often characterized by the need to perform repetitive sequential tasks. Despite their criticality to the business, workflows are rarely fully automated or even formally specified, though there may exist a number of natural language documents describing these procedures for the employees of the company. Plan extraction method…

    Submitted 13 June, 2021; originally announced June 2021.

  37. arXiv:2105.01220

    cs.AI cs.RO

    Trust-Aware Planning: Modeling Trust Evolution in Longitudinal Human-Robot Interaction

    Authors: Zahra Zahedi, Mudit Verma, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: Trust between team members is an essential requirement for any successful cooperation. Thus, engendering and maintaining the fellow team members' trust becomes a central responsibility for any member trying to not only successfully participate in the task but to ensure the team achieves its goals. The problem of trust management is particularly challenging in mixed human-robot teams where the huma…

    Submitted 3 May, 2021; originally announced May 2021.

  38. arXiv:2105.00525

    cs.AI

    Planning for Proactive Assistance in Environments with Partial Observability

    Authors: Anagha Kulkarni, Siddharth Srivastava, Subbarao Kambhampati

    Abstract: This paper addresses the problem of synthesizing the behavior of an AI agent that provides proactive task assistance to a human in settings like factory floors where they may coexist in a common environment. Unlike in the case of requested assistance, the human may not be expecting proactive assistance and hence it is crucial for the agent to ensure that the human is aware of how the assistance af…

    Submitted 4 September, 2021; v1 submitted 2 May, 2021; originally announced May 2021.

  39. arXiv:2104.10743

    cs.AI

    A Unifying Bayesian Formulation of Measures of Interpretability in Human-AI

    Authors: Sarath Sreedharan, Anagha Kulkarni, David E. Smith, Subbarao Kambhampati

    Abstract: Existing approaches for generating human-aware agent behaviors have considered different measures of interpretability in isolation. Further, these measures have been studied under differing assumptions, thus precluding the possibility of designing a single framework that captures these measures under the same assumptions. In this paper, we present a unifying Bayesian framework that models a human…

    Submitted 21 April, 2021; originally announced April 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2011.10920

  40. arXiv:2103.09990

    cs.AI

    Human-AI Symbiosis: A Survey of Current Approaches

    Authors: Zahra Zahedi, Subbarao Kambhampati

    Abstract: In this paper, we aim at providing a comprehensive outline of the different threads of work in human-AI collaboration. By highlighting various aspects of works on the human-AI team such as the flow of complementing, task horizon, model representation, knowledge level, and teaming goal, we make a taxonomy of recent works according to these dimensions. We hope that the survey will provide a more cle…

    Submitted 17 March, 2021; originally announced March 2021.

  41. arXiv:2011.12262

    cs.AI

    Model Elicitation through Direct Questioning

    Authors: Sachin Grover, David Smith, Subbarao Kambhampati

    Abstract: The future will be replete with scenarios where humans and robots will be working together in complex environments. Teammates interact, and the robot's interaction has to be about getting useful information about the human's (teammate's) model. There are many challenges before a robot can interact, such as incorporating the structural differences in the human's model, ensuring simpler responses, e…

    Submitted 24 November, 2020; originally announced November 2020.

  42. arXiv:2011.10920

    cs.AI

    A Bayesian Account of Measures of Interpretability in Human-AI Interaction

    Authors: Sarath Sreedharan, Anagha Kulkarni, Tathagata Chakraborti, David E. Smith, Subbarao Kambhampati

    Abstract: Existing approaches for the design of interpretable agent behavior consider different measures of interpretability in isolation. In this paper we posit that, in the design and deployment of human-aware agents in the real world, notions of interpretability are just some among many considerations; and the techniques developed in isolation lack two key properties to be useful when considered together…

    Submitted 21 November, 2020; originally announced November 2020.

  43. arXiv:2011.09644

    cs.AI

    RADAR-X: An Interactive Mixed Initiative Planning Interface Pairing Contrastive Explanations and Revised Plan Suggestions

    Authors: Karthik Valmeekam, Sarath Sreedharan, Sailik Sengupta, Subbarao Kambhampati

    Abstract: Decision support systems seek to enable informed decision-making. In recent years, automated planning techniques have been leveraged to empower such systems to better aid the human-in-the-loop. The central idea for such decision support systems is to augment the capabilities of the human-in-the-loop with automated planning techniques and enhance the quality of decision-making. In addition to p…

    Submitted 3 June, 2022; v1 submitted 18 November, 2020; originally announced November 2020.

    Comments: Accepted at ICAPS 2022

  44. arXiv:2010.15255  [pdf, other]

    cs.AI

    Minimizing Robot Navigation-Graph For Position-Based Predictability By Humans

    Authors: Sriram Gopalakrishnan, Subbarao Kambhampati

    Abstract: In situations where humans and robots are moving in the same space whilst performing their own tasks, predictable paths taken by mobile robots can not only make the environment feel safer, but humans can also help with the navigation in the space by avoiding path conflicts or not blocking the way. So predictable paths become vital. The cognitive effort for the human to predict the robot's path bec… ▽ More

    Submitted 11 January, 2022; v1 submitted 28 October, 2020; originally announced October 2020.

    Comments: 8 pages, 6 pages supplemental material. Accepted as an extended abstract in the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS2022)

  45. arXiv:2010.03713  [pdf, other]

    cs.GT

    Moving Target Defense for Robust Monitoring of Electric Grid Transformers in Adversarial Environments

    Authors: Sailik Sengupta, Kaustav Basu, Arunabha Sen, Subbarao Kambhampati

    Abstract: Electric power grid components, such as high voltage transformers (HVTs), generating stations, substations, etc. are expensive to maintain and, in the event of failure, replace. Thus, regularly monitoring the behavior of such components is of utmost importance. Furthermore, the recent increase in the number of cyberattacks on such systems demands that such monitoring strategies should be robust. I… ▽ More

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted to the Conference on Decision and Game Theory for Security (GameSec), 2020

  46. arXiv:2007.10457  [pdf, other]

    cs.GT cs.AI cs.CR cs.LG

    Multi-agent Reinforcement Learning in Bayesian Stackelberg Markov Games for Adaptive Moving Target Defense

    Authors: Sailik Sengupta, Subbarao Kambhampati

    Abstract: The field of cybersecurity has mostly been a cat-and-mouse game with the discovery of new attacks leading the way. To take away an attacker's advantage of reconnaissance, researchers have proposed proactive defense methods such as Moving Target Defense (MTD). To find good movement strategies, researchers have modeled MTD as leader-follower games between the defender and a cyber-adversary. We argue… ▽ More

    Submitted 20 July, 2020; originally announced July 2020.

  47. arXiv:2007.00820  [pdf, other]

    cs.AI

    Designing Environments Conducive to Interpretable Robot Behavior

    Authors: Anagha Kulkarni, Sarath Sreedharan, Sarah Keren, Tathagata Chakraborti, David Smith, Subbarao Kambhampati

    Abstract: Designing robots capable of generating interpretable behavior is a prerequisite for achieving effective human-robot collaboration. This means that the robots need to be capable of generating behavior that aligns with human expectations and, when required, provide explanations to the humans in the loop. However, exhibiting such behavior in arbitrary environments could be quite expensive for robots,… ▽ More

    Submitted 2 August, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

  48. arXiv:2006.14841  [pdf, other]

    cs.LG cs.CV stat.ML

    Not all Failure Modes are Created Equal: Training Deep Neural Networks for Explicable (Mis)Classification

    Authors: Alberto Olmo, Sailik Sengupta, Subbarao Kambhampati

    Abstract: Deep Neural Networks are often brittle on image classification tasks and known to misclassify inputs. While these misclassifications may be inevitable, all failure modes cannot be considered equal. Certain misclassifications (e.g., classifying the image of a dog as an airplane) can perplex humans and result in the loss of human trust in the system. Even worse, these errors (e.g., a person misclassifie… ▽ More

    Submitted 1 November, 2021; v1 submitted 26 June, 2020; originally announced June 2020.

  49. arXiv:2006.14804  [pdf, other]

    cs.AI

    Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation

    Authors: Lin Guan, Mudit Verma, Sihang Guo, Ruohan Zhang, Subbarao Kambhampati

    Abstract: Human explanation (e.g., in terms of feature importance) has been recently used to extend the communication channel between human and agent in interactive machine learning. Under this setting, human trainers provide not only the ground truth but also some form of explanation. However, this kind of human guidance was only investigated in supervised learning tasks, and it remains unclear how to best… ▽ More

    Submitted 26 October, 2021; v1 submitted 26 June, 2020; originally announced June 2020.

  50. arXiv:2002.11697  [pdf, other]

    cs.AI cs.HC

    The Emerging Landscape of Explainable AI Planning and Decision Making

    Authors: Tathagata Chakraborti, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: In this paper, we provide a comprehensive outline of the different threads of work in Explainable AI Planning (XAIP) that has emerged as a focus area in the last couple of years and contrast that with earlier efforts in the field in terms of techniques, target users, and delivery mechanisms. We hope that the survey will provide guidance to new researchers in automated planning towards the role of… ▽ More

    Submitted 26 February, 2020; originally announced February 2020.