Showing 1–11 of 11 results for author: Uchibe, E

  1. arXiv:2406.00727  [pdf, ps, other]

    cs.RO

    Unsupervised Neural Motion Retargeting for Humanoid Teleoperation

    Authors: Satoshi Yagi, Mitsunori Tada, Eiji Uchibe, Suguru Kanoga, Takamitsu Matsubara, Jun Morimoto

    Abstract: This study proposes an approach to human-to-humanoid teleoperation using GAN-based online motion retargeting, which obviates the need for the construction of pairwise datasets to identify the relationship between the human and the humanoid kinematics. Consequently, it can be anticipated that our proposed teleoperation system will reduce the complexity and setup requirements typically associated wi…

    Submitted 2 June, 2024; originally announced June 2024.

  2. arXiv:2405.11784  [pdf, other]

    cs.LG cs.AI cs.RO

    Reward-Punishment Reinforcement Learning with Maximum Entropy

    Authors: Jiexin Wang, Eiji Uchibe

    Abstract: We introduce the "soft Deep MaxPain" (softDMP) algorithm, which integrates the optimization of long-term policy entropy into reward-punishment reinforcement learning objectives. Our motivation is to facilitate a smoother variation of operators utilized in the updating of action values beyond traditional "max" and "min" operators, where the goal is enhancing sample efficiency and robustness.…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: IJCNN2024

  3. arXiv:2207.01840  [pdf, other]

    cs.RO cs.LG

    Randomized-to-Canonical Model Predictive Control for Real-world Visual Robotic Manipulation

    Authors: Tomoya Yamanokuchi, Yuhwan Kwon, Yoshihisa Tsurumine, Eiji Uchibe, Jun Morimoto, Takamitsu Matsubara

    Abstract: Many works have recently explored sim-to-real transferable visual model predictive control (MPC). However, such works are limited to one-shot transfer, where real-world data must be collected once to perform the sim-to-real transfer, which remains a significant human effort in transferring the models learned in simulations to new domains in the real world. To alleviate this problem, we first propo…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: 8 pages, Accepted by Robotics and Automation Letters

  4. Model-Based Imitation Learning Using Entropy Regularization of Model and Policy

    Authors: Eiji Uchibe

    Abstract: Approaches based on generative adversarial networks for imitation learning are promising because they are sample efficient in terms of expert demonstrations. However, training a generator requires many interactions with the actual environment because model-free reinforcement learning is adopted to update a policy. To improve the sample efficiency using model-based reinforcement learning, we propos…

    Submitted 31 August, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: This is a preprint version of the paper to appear at IEEE Robotics and Automation Letters (RA-L). The final journal version is downloadable from https://doi.org/10.1109/LRA.2022.3196139

    Journal ref: IEEE Robotics and Automation Letters, Volume 7, Issue 4, pages 10922-10929, October, 2022

  5. arXiv:2205.07885  [pdf, other]

    cs.LG cs.AI

    Enforcing KL Regularization in General Tsallis Entropy Reinforcement Learning via Advantage Learning

    Authors: Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara

    Abstract: The maximum Tsallis entropy (MTE) framework in reinforcement learning has gained popularity recently by virtue of its flexible modeling choices including the widely used Shannon entropy and sparse entropy. However, non-Shannon entropies suffer from approximation error and subsequent underperformance either due to their sensitivity or the lack of closed-form policy expression. To improve the tradeoff bet…

    Submitted 16 May, 2022; originally announced May 2022.

  6. arXiv:2205.07467  [pdf, other]

    cs.LG cs.AI

    $q$-Munchausen Reinforcement Learning

    Authors: Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara

    Abstract: The recently successful Munchausen Reinforcement Learning (M-RL) features implicit Kullback-Leibler (KL) regularization by augmenting the reward function with the logarithm of the current stochastic policy. Though significant improvement has been shown with the Boltzmann softmax policy, when the Tsallis sparsemax policy is considered, the augmentation leads to a flat learning curve for almost every pr…

    Submitted 16 May, 2022; originally announced May 2022.

  7. Forward and inverse reinforcement learning sharing network weights and hyperparameters

    Authors: Eiji Uchibe, Kenji Doya

    Abstract: This paper proposes a model-free imitation learning method named Entropy-Regularized Imitation Learning (ERIL) that minimizes the reverse Kullback-Leibler (KL) divergence. ERIL combines forward and inverse reinforcement learning (RL) under the framework of an entropy-regularized Markov decision process. An inverse RL step computes the log-ratio between two distributions by evaluating two binary discriminat…

    Submitted 31 May, 2022; v1 submitted 17 August, 2020; originally announced August 2020.

    Comments: Accepted for publication in Neural Networks

    Journal ref: Neural Networks, December 2021, Pages 138-153

  8. arXiv:1807.09443  [pdf, ps, other]

    cs.LG stat.ML

    Unbounded Output Networks for Classification

    Authors: Stefan Elfwing, Eiji Uchibe, Kenji Doya

    Abstract: We proposed the expected energy-based restricted Boltzmann machine (EE-RBM) as a discriminative RBM method for classification. Two characteristics of the EE-RBM are that the output is unbounded and that the target value of correct classification is set to a value much greater than one. In this study, by adopting features of the EE-RBM approach to feed-forward neural networks, we propose the UnBoun…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: 8 pages, 7 figures

  9. arXiv:1710.10866  [pdf, other]

    stat.ML cs.LG

    Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming

    Authors: Tadashi Kozuno, Eiji Uchibe, Kenji Doya

    Abstract: Approximate dynamic programming algorithms, such as approximate value iteration, have been successfully applied to many complex reinforcement learning tasks, and a better approximate dynamic programming algorithm is expected to further extend the applicability of reinforcement learning to various tasks. In this paper we propose a new, robust dynamic programming algorithm that unifies value iterati…

    Submitted 30 October, 2017; originally announced October 2017.

  10. arXiv:1702.07490  [pdf, ps, other]

    cs.LG

    Online Meta-learning by Parallel Algorithm Competition

    Authors: Stefan Elfwing, Eiji Uchibe, Kenji Doya

    Abstract: The efficiency of reinforcement learning algorithms depends critically on a few meta-parameters that modulate the learning updates and the trade-off between exploration and exploitation. The adaptation of the meta-parameters is an open question in reinforcement learning, which arguably has become more of an issue recently with the success of deep reinforcement learning in high-dimensional state s…

    Submitted 24 February, 2017; originally announced February 2017.

    Comments: 15 pages, 10 figures. arXiv admin note: text overlap with arXiv:1702.03118

  11. arXiv:1702.03118  [pdf, ps, other]

    cs.LG

    Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning

    Authors: Stefan Elfwing, Eiji Uchibe, Kenji Doya

    Abstract: In recent years, neural networks have enjoyed a renaissance as function approximators in reinforcement learning. Two decades after Tesauro's TD-Gammon achieved near top-level human performance in backgammon, the deep reinforcement learning algorithm DQN achieved human-level performance in many Atari 2600 games. The purpose of this study is twofold. First, we propose two activation functions for ne…

    Submitted 1 November, 2017; v1 submitted 10 February, 2017; originally announced February 2017.

    Comments: 18 pages, 22 figures; added deep RL results for SZ-Tetris