
Showing 1–1 of 1 results for author: Peer, O

  1. arXiv:2103.00445 [pdf, other]

    cs.LG cs.AI stat.ML

    Ensemble Bootstrapping for Q-Learning

    Authors: Oren Peer, Chen Tessler, Nadav Merlis, Ron Meir

    Abstract: Q-learning (QL), a common reinforcement learning algorithm, suffers from over-estimation bias due to the maximization term in the optimal Bellman operator. This bias may lead to sub-optimal behavior. Double-Q-learning tackles this issue by utilizing two estimators, yet results in an under-estimation bias. Similar to over-estimation in Q-learning, in certain scenarios, the under-estimation bias may…

    Submitted 20 April, 2021; v1 submitted 28 February, 2021; originally announced March 2021.
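
The two biases named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method or experiment. It only compares a single-estimator maximum (as in Q-learning) with a two-estimator selection and evaluation (as in Double Q-learning) on noisy value estimates. The function name `estimate_bias`, the true action values, and the Gaussian noise model are all assumptions made for illustration.

```python
import random


def estimate_bias(true_values, noise_std=1.0, n_trials=20_000, seed=0):
    """Return (single_bias, double_bias) relative to max(true_values).

    Hypothetical helper: every action's estimate is its true value plus
    independent Gaussian noise, which is an assumed noise model.
    """
    rng = random.Random(seed)
    target = max(true_values)
    single_sum = 0.0
    double_sum = 0.0
    for _ in range(n_trials):
        # Two independent noisy estimates of the same action values.
        q_a = [v + rng.gauss(0.0, noise_std) for v in true_values]
        q_b = [v + rng.gauss(0.0, noise_std) for v in true_values]
        # Single estimator: select and evaluate with the same estimate.
        single_sum += max(q_a)
        # Double estimator: select with q_a, evaluate with q_b.
        best = max(range(len(q_a)), key=q_a.__getitem__)
        double_sum += q_b[best]
    return single_sum / n_trials - target, double_sum / n_trials - target


single_bias, double_bias = estimate_bias([1.0, 0.8, 0.6, 0.4])
print(f"single-estimator bias: {single_bias:+.3f}")
print(f"double-estimator bias: {double_bias:+.3f}")
```

Under these assumptions the single-estimator bias comes out positive and the double-estimator bias negative. The exact magnitudes depend on the chosen values, noise level, and seed.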