arXiv:2012.13037 [pdf, other]

SPOTTER: Extending Symbolic Planning Operators through Targeted Reinforcement Learning

Authors: Vasanth Sarathy, Daniel Kasenberg, Shivam Goel, Jivko Sinapov, Matthias Scheutz

Abstract: Symbolic planning models allow decision-making agents to sequence actions in arbitrary ways to achieve a variety of goals in dynamic domains. However, they are typically handcrafted and tend to require precise formulations that are not robust to human error. Reinforcement learning (RL) approaches do not require such models, and instead learn domain dynamics by exploring the environment and collecting rewards. However, RL approaches tend to require millions of episodes of experience and often learn policies that are not easily transferable to other tasks. In this paper, we address one aspect of the open problem of integrating these approaches: how can decision-making agents resolve discrepancies in their symbolic planning models while attempting to accomplish goals? We propose an integrated framework named SPOTTER that uses RL to augment and support ("spot") a planning agent by discovering new operators needed by the agent to accomplish goals that are initially unreachable for the agent. SPOTTER outperforms pure-RL approaches while also discovering transferable symbolic knowledge and does not require supervision, successful plan traces or any a priori knowledge about the missing planning operator.

Submitted 23 December, 2020; originally announced December 2020.

Comments: Accepted to AAMAS 2021

arXiv:1911.00229 [pdf, ps, other]

Engaging in Dialogue about an Agent's Norms and Behaviors

Authors: Daniel Kasenberg, Antonio Roque, Ravenna Thielstrom, Matthias Scheutz

Abstract: We present a set of capabilities allowing an agent planning with moral and social norms represented in temporal logic to respond to queries about its norms and behaviors in natural language, and for the human user to add and remove norms directly in natural language. The user may also pose hypothetical modifications to the agent's norms and inquire about their effects.

Submitted 1 November, 2019; originally announced November 2019.

Comments: Accepted to the 1st Workshop on Interactive Natural Language Technology for Explainable Artificial Intelligence (NL4XAI)

arXiv:1911.00226 [pdf, ps, other]

Generating Justifications for Norm-Related Agent Decisions

Authors: Daniel Kasenberg, Antonio Roque, Ravenna Thielstrom, Meia Chita-Tegmark, Matthias Scheutz

Abstract: We present an approach to generating natural language justifications of decisions derived from norm-based reasoning. Assuming an agent which maximally satisfies a set of rules specified in an object-oriented temporal logic, the user can ask factual questions (about the agent's rules, actions, and the extent to which the agent violated the rules) as well as "why" questions that require the agent comparing actual behavior to counterfactual trajectories with respect to these rules. To produce natural-sounding explanations, we focus on the subproblem of producing natural language clauses from statements in a fragment of temporal logic, and then describe how to embed these clauses into explanatory sentences. We use a human judgment evaluation on a testbed task to compare our approach to variants in terms of intelligibility, mental model and perceived trust.

Submitted 1 November, 2019; originally announced November 2019.

Comments: Accepted to the Proceedings of the 12th International Conference on Natural Language Generation (INLG 2019)

arXiv:1807.02572 [pdf, ps, other]

Quasi-Dilemmas for Artificial Moral Agents

Authors: Daniel Kasenberg, Vasanth Sarathy, Thomas Arnold, Matthias Scheutz, Tom Williams

Abstract: In this paper we describe moral quasi-dilemmas (MQDs): situations similar to moral dilemmas, but in which an agent is unsure whether exploring the plan space or the world may reveal a course of action that satisfies all moral requirements. We argue that artificial moral agents (AMAs) should be built to handle MQDs (in particular, by exploring the plan space rather than immediately accepting the inevitability of the moral dilemma), and that MQDs may be useful for evaluating AMA architectures.

Submitted 6 July, 2018; originally announced July 2018.

Comments: Accepted to the International Conference on Robot Ethics and Standards (ICRES), 2018

arXiv:1710.10532 [pdf, ps, other]

Interpretable Apprenticeship Learning with Temporal Logic Specifications

Authors: Daniel Kasenberg, Matthias Scheutz

Abstract: Recent work has addressed using formulas in linear temporal logic (LTL) as specifications for agents planning in Markov Decision Processes (MDPs). We consider the inverse problem: inferring an LTL specification from demonstrated behavior trajectories in MDPs. We formulate this as a multiobjective optimization problem, and describe state-based ("what actually happened") and action-based ("what the agent expected to happen") objective functions based on a notion of "violation cost". We demonstrate the efficacy of the approach by employing genetic programming to solve this problem in two simple domains.

Submitted 28 October, 2017; originally announced October 2017.

Comments: Accepted to the 56th IEEE Conference on Decision and Control (CDC 2017)

arXiv:1706.07448 [pdf, ps, other]

Norm Conflict Resolution in Stochastic Domains

Authors: Daniel Kasenberg, Matthias Scheutz

Abstract: Artificial agents will need to be aware of human moral and social norms, and able to use them in decision-making. In particular, artificial agents will need a principled approach to managing conflicting norms, which are common in human social interactions. Existing logic-based approaches suffer from normative explosion and are typically designed for deterministic environments; reward-based approaches lack principled ways of determining which normative alternatives exist in a given environment. We propose a hybrid approach, using Linear Temporal Logic (LTL) representations in Markov Decision Processes (MDPs), that manages norm conflicts in a systematic manner while accommodating domain stochasticity. We provide a proof-of-concept implementation in a simulated vacuum cleaning domain.

Submitted 18 November, 2017; v1 submitted 22 June, 2017; originally announced June 2017.

Comments: New version of paper - new evaluations, accepted to AAAI 2018

Showing 1–6 of 6 results for author: Kasenberg, D