
Showing 1–5 of 5 results for author: Wurman, P R

  1. arXiv:2406.12563  [pdf, other]

    cs.LG cs.CV cs.RO

    A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo

    Authors: Miguel Vasco, Takuma Seno, Kenta Kawamoto, Kaushik Subramanian, Peter R. Wurman, Peter Stone

    Abstract: Racing autonomous cars faster than the best human drivers has been a longstanding grand challenge for the fields of Artificial Intelligence and robotics. Recently, an end-to-end deep reinforcement learning agent met this challenge in a high-fidelity racing simulator, Gran Turismo. However, this agent relied on global features that require instrumentation external to the car. This paper introduces,…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted at the Reinforcement Learning Conference (RLC) 2024

  2. arXiv:2306.07372  [pdf, other]

    cs.LG cs.AI cs.GT

    Composing Efficient, Robust Tests for Policy Selection

    Authors: Dustin Morrill, Thomas J. Walsh, Daniel Hernandez, Peter R. Wurman, Peter Stone

    Abstract: Modern reinforcement learning systems produce many high-quality policies throughout the learning process. However, to choose which policy to actually deploy in the real world, they must be tested under an intractable number of environmental conditions. We introduce RPOSST, an algorithm to select a small set of test cases from a larger pool based on a relatively small number of sample evaluations.…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 26 pages, 13 figures. To appear in Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2023)

    ACM Class: B.8.1; I.2.6

  3. arXiv:2206.13901  [pdf, other]

    cs.LG cs.AI

    Value Function Decomposition for Iterative Design of Reinforcement Learning Agents

    Authors: James MacGlashan, Evan Archer, Alisa Devlic, Takuma Seno, Craig Sherstan, Peter R. Wurman, Peter Stone

    Abstract: Designing reinforcement learning (RL) agents is typically a difficult process that requires numerous design iterations. Learning can fail for a multitude of reasons, and standard RL methods provide too few tools to provide insight into the exact cause. In this paper, we show how to integrate value decomposition into a broad class of actor-critic algorithms and use it to assist in the iterative age…

    Submitted 20 October, 2022; v1 submitted 24 June, 2022; originally announced June 2022.

    Comments: 10 content pages, 12 Appendix pages, 19 figures

  4. arXiv:1601.05484  [pdf, other]

    cs.RO

    Analysis and Observations from the First Amazon Picking Challenge

    Authors: Nikolaus Correll, Kostas E. Bekris, Dmitry Berenson, Oliver Brock, Albert Causo, Kris Hauser, Kei Okada, Alberto Rodriguez, Joseph M. Romano, Peter R. Wurman

    Abstract: This paper presents an overview of the inaugural Amazon Picking Challenge along with a summary of a survey conducted among the 26 participating teams. The challenge goal was to design an autonomous robot to pick items from a warehouse shelf. This task is currently performed by human workers, and there is hope that robots can someday help increase efficiency and throughput while lowering cost. We re…

    Submitted 22 September, 2017; v1 submitted 20 January, 2016; originally announced January 2016.

  5. arXiv:1302.3611  [pdf]

    cs.AI

    Optimal Factory Scheduling using Stochastic Dominance A*

    Authors: Peter R. Wurman, Michael P. Wellman

    Abstract: We examine a standard factory scheduling problem with stochastic processing and setup times, minimizing the expectation of the weighted number of tardy jobs. Because the costs of operators in the schedule are stochastic and sequence dependent, standard dynamic programming algorithms such as A* may fail to find the optimal schedule. The SDA* (Stochastic Dominance A*) algorithm remedies this diffi…

    Submitted 13 February, 2013; originally announced February 2013.

    Comments: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI 1996)

    Report number: UAI-P-1996-PG-554-563