Showing 1–10 of 10 results for author: Devidze, R

  1. arXiv:2402.07019  [pdf, other]

    cs.LG

    Informativeness of Reward Functions in Reinforcement Learning

    Authors: Rati Devidze, Parameswaran Kamalaruban, Adish Singla

    Abstract: Reward functions are central in specifying the task we want a reinforcement learning agent to perform. Given a task and desired optimal behavior, we study the problem of designing informative reward functions so that the designed rewards speed up the agent's convergence. In particular, we consider expert-driven reward design settings where an expert or teacher seeks to provide informative and inte…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: Longer version of the AAMAS'24 paper

  2. arXiv:2302.10720  [pdf, other]

    cs.LG

    Learning to Play Text-based Adventure Games with Maximum Entropy Reinforcement Learning

    Authors: Weichen Li, Rati Devidze, Sophie Fellenz

    Abstract: Text-based games are a popular testbed for language-based reinforcement learning (RL). In previous work, deep Q-learning is commonly used as the learning agent. Q-learning algorithms are challenging to apply to complex real-world domains due to, for example, their instability in training. Therefore, in this paper, we adapt the soft-actor-critic (SAC) algorithm to the text-based environment. To dea…

    Submitted 27 June, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

  3. arXiv:2106.04696  [pdf, other]

    cs.LG cs.AI

    Curriculum Design for Teaching via Demonstrations: Theory and Applications

    Authors: Gaurav Yengera, Rati Devidze, Parameswaran Kamalaruban, Adish Singla

    Abstract: We consider the problem of teaching via demonstrations in sequential decision-making settings. In particular, we study how to design a personalized curriculum over demonstrations to speed up the learner's convergence. We provide a unified curriculum strategy for two popular learner models: Maximum Causal Entropy Inverse Reinforcement Learning (MaxEnt-IRL) and Cross-Entropy Behavioral Cloning (Cros…

    Submitted 15 December, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  4. arXiv:2011.10824  [pdf, other]

    cs.LG cs.AI cs.CR

    Policy Teaching in Reinforcement Learning via Environment Poisoning Attacks

    Authors: Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla

    Abstract: We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker. As a victim, we consider RL agents whose objective is to find a policy that maximizes reward in infinite-horizon problem settings. The attacker can manipulate the rewards and the transition dynamics in the learning environ…

    Submitted 21 November, 2020; originally announced November 2020.

    Comments: Journal version of ICML'20 paper. New theoretical results for jointly poisoning rewards and transitions

  5. arXiv:2006.13160  [pdf, other]

    cs.LG stat.ML

    Environment Shaping in Reinforcement Learning using State Abstraction

    Authors: Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla

    Abstract: One of the central challenges faced by a reinforcement learning (RL) agent is to effectively learn a (near-)optimal policy in environments with large state spaces having sparse and noisy feedback signals. In real-world applications, an expert with additional domain knowledge can help in speeding up the learning process via shaping the environment, i.e., making the environment more learner-f…

    Submitted 23 June, 2020; originally announced June 2020.

  6. arXiv:2003.12909  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning

    Authors: Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla

    Abstract: We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker. As a victim, we consider RL agents whose objective is to find a policy that maximizes average reward in undiscounted infinite-horizon problem settings. The attacker can manipulate the rewards or the transition dynamics in…

    Submitted 18 August, 2020; v1 submitted 28 March, 2020; originally announced March 2020.

    Comments: ICML 2020

  7. arXiv:2003.09712  [pdf, other]

    cs.LG cs.AI stat.ML

    Understanding the Power and Limitations of Teaching with Imperfect Knowledge

    Authors: Rati Devidze, Farnam Mansouri, Luis Haug, Yuxin Chen, Adish Singla

    Abstract: Machine teaching studies the interaction between a teacher and a student/learner where the teacher selects training examples for the learner to learn a specific task. The typical assumption is that the teacher has perfect knowledge of the task; this knowledge comprises knowing the desired learning target, having the exact task representation used by the learner, and knowing the parameters capturi…

    Submitted 21 March, 2020; originally announced March 2020.

  8. arXiv:1906.00429  [pdf, other]

    cs.LG cs.AI stat.ML

    Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints

    Authors: Sebastian Tschiatschek, Ahana Ghosh, Luis Haug, Rati Devidze, Adish Singla

    Abstract: Inverse reinforcement learning (IRL) enables an agent to learn complex behavior by observing demonstrations from a (near-)optimal policy. The typical assumption is that the learner's goal is to match the teacher's demonstrated behavior. In this paper, we consider the setting where the learner has its own preferences that it additionally takes into consideration. These preferences can for example c…

    Submitted 29 October, 2019; v1 submitted 2 June, 2019; originally announced June 2019.

  9. arXiv:1905.11867  [pdf, other]

    cs.LG cs.AI stat.ML

    Interactive Teaching Algorithms for Inverse Reinforcement Learning

    Authors: Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla

    Abstract: We study the problem of inverse reinforcement learning (IRL) with the added twist that the learner is assisted by a helpful teacher. More formally, we tackle the following algorithmic question: How could a teacher provide an informative sequence of demonstrations to an IRL learner to speed up the learning process? We present an interactive teaching framework where a teacher adaptively chooses the…

    Submitted 5 June, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: IJCAI'19 paper (extended version)

  10. arXiv:1901.08029  [pdf, ps, other]

    cs.LG stat.ML

    Learning to Collaborate in Markov Decision Processes

    Authors: Goran Radanovic, Rati Devidze, David C. Parkes, Adish Singla

    Abstract: We consider a two-agent MDP framework where agents repeatedly solve a task in a collaborative setting. We study the problem of designing a learning algorithm for the first agent (A1) that facilitates a successful collaboration even in cases when the second agent (A2) is adapting its policy in an unknown way. The key challenge in our setting is that the first agent faces non-stationarity in rewards…

    Submitted 19 June, 2019; v1 submitted 23 January, 2019; originally announced January 2019.