
Boltzmann State-Dependent Rationality

Abstract: This paper expands on existing learned models of human behavior via a measured step in structured irrationality. Specifically, by replacing the suboptimality constant $β$ in a Boltzmann rationality model with a function over states $β(s)$, we gain natural expressivity in a computationally tractable manner. This paper discusses relevant mathematical theory, sets up several experimental designs, presents limited preliminary results, and proposes future investigations.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.17668

Precise Object Placement Using Force-Torque Feedback

Authors: Osher Lerner, Zachary Tam, Michael Equi

Abstract: Precise object manipulation and placement is a common problem for household robots, surgery robots, and robots working on in-situ construction. Prior work using computer vision, depth sensors, and reinforcement learning lacks the ability to reactively recover from planning errors, execution errors, or sensor noise. This work introduces a method that uses force-torque sensing to robustly place objects in stable poses, even in adversarial environments. On 46 trials, our method finds success rates of 100% for basic stacking, and 17% for cases requiring adjustment.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:1803.07482

Natural Gradient Deep Q-learning

Authors: Ethan Knight, Osher Lerner

Abstract: We present a novel algorithm to train a deep Q-learning agent using natural-gradient techniques. We compare the original deep Q-network (DQN) algorithm to its natural-gradient counterpart, which we refer to as NGDQN, on a collection of classic control domains. Without employing target networks, NGDQN significantly outperforms DQN without target networks, and performs no worse than DQN with target networks, suggesting that NGDQN stabilizes training and can help reduce the need for additional hyperparameter tuning. We also find that NGDQN is less sensitive to hyperparameter optimization relative to DQN. Together these results suggest that natural-gradient techniques can improve value-function optimization in deep reinforcement learning.

Submitted 13 November, 2018; v1 submitted 20 March, 2018; originally announced March 2018.

Showing 1–3 of 3 results for author: Lerner, O