Showing 1–12 of 12 results for author: Martin, J D

  1. arXiv:2406.19561  [pdf, other]

    cs.LG cs.AI

    Meta-Gradient Search Control: A Method for Improving the Efficiency of Dyna-style Planning

    Authors: Bradley Burega, John D. Martin, Luke Kapeluck, Michael Bowling

    Abstract: We study how a Reinforcement Learning (RL) system can remain sample-efficient when learning from an imperfect model of the environment. This is particularly challenging when the learning system is resource-constrained and in continual settings, where the environment dynamics change. To address these challenges, our paper introduces an online, meta-gradient algorithm that tunes a probability with w…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2401.03306  [pdf, other]

    cs.LG cs.AI cs.RO

    MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning

    Authors: Rafael Rafailov, Kyle Hatch, Victor Kolev, John D. Martin, Mariano Phielipp, Chelsea Finn

    Abstract: We study the problem of offline pre-training and online fine-tuning for reinforcement learning from high-dimensional observations in the context of realistic robot tasks. Recent offline model-free approaches successfully use online fine-tuning to either improve the performance of the agent over the data collection policy or adapt to novel tasks. At the same time, model-based RL algorithms have ach…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: This is an updated version of a manuscript that originally appeared at CoRL 2023. The project website is https://sites.google.com/view/mo2o

    Journal ref: Proceedings of The 7th Conference on Robot Learning, PMLR 229:3654-3671, 2023

  3. arXiv:2304.09996  [pdf, other]

    cs.RO

    Robust Route Planning with Distributional Reinforcement Learning in a Stochastic Road Network Environment

    Authors: Xi Lin, Paul Szenher, John D. Martin, Brendan Englot

    Abstract: Route planning is essential to mobile robot navigation problems. In recent years, deep reinforcement learning (DRL) has been applied to learning optimal planning policies in stochastic environments without prior knowledge. However, existing works focus on learning policies that maximize the expected return, the performance of which can vary greatly when the level of stochasticity in the environmen…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: The 20th International Conference on Ubiquitous Robots (UR 2023)

  4. arXiv:2212.10420  [pdf, other]

    cs.AI cs.LG math.ST

    Settling the Reward Hypothesis

    Authors: Michael Bowling, John D. Martin, David Abel, Will Dabney

    Abstract: The reward hypothesis posits that, "all of what we mean by goals and purposes can be well thought of as maximization of the expected value of the cumulative sum of a received scalar signal (reward)." We aim to fully settle this hypothesis. This will not conclude with a simple affirmation or refutation, but rather specify completely the implicit requirements on goals and purposes under which the hy…

    Submitted 16 September, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  5. arXiv:2205.10736  [pdf, other]

    cs.LG cs.AI stat.ML

    Should Models Be Accurate?

    Authors: Esra'a Saleh, John D. Martin, Anna Koop, Arash Pourzarabi, Michael Bowling

    Abstract: Model-based Reinforcement Learning (MBRL) holds promise for data-efficiency by planning with model-generated experience in addition to learning with experience from the environment. However, in complex or changing environments, models in MBRL will inevitably be imperfect, and their detrimental effects on learning can be difficult to mitigate. In this work, we question whether the objective of thes…

    Submitted 22 May, 2022; originally announced May 2022.

    Comments: The 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2022)

  6. arXiv:2106.09776  [pdf, other]

    cs.LG cs.AI

    Adapting the Function Approximation Architecture in Online Reinforcement Learning

    Authors: John D. Martin, Joseph Modayil

    Abstract: The performance of a reinforcement learning (RL) system depends on the computational architecture used to approximate a value function. Deep learning methods provide both optimization techniques and architectures for approximating nonlinear functions from noisy, high-dimensional observations. However, prevailing optimization techniques are not designed for strictly-incremental online updates. Nor…

    Submitted 17 June, 2021; originally announced June 2021.

  7. arXiv:2008.00504  [pdf, other]

    cs.RO stat.ML

    Variational Filtering with Copula Models for SLAM

    Authors: John D. Martin, Kevin Doherty, Caralyn Cyr, Brendan Englot, John Leonard

    Abstract: The ability to infer map variables and estimate pose is crucial to the operation of autonomous mobile robots. In most cases the shared dependency between these variables is modeled through a multivariate Gaussian distribution, but there are many situations where that assumption is unrealistic. Our paper shows how it is possible to relax this assumption and perform simultaneous localization and map…

    Submitted 2 August, 2020; originally announced August 2020.

    Comments: Published at the 2020 International Conference on Intelligent Robots and Systems (IROS)

  8. arXiv:2007.12640  [pdf, other]

    cs.RO cs.LG

    Autonomous Exploration Under Uncertainty via Deep Reinforcement Learning on Graphs

    Authors: Fanfei Chen, John D. Martin, Yewei Huang, Jinkun Wang, Brendan Englot

    Abstract: We consider an autonomous exploration problem in which a range-sensing mobile robot is tasked with accurately mapping the landmarks in an a priori unknown environment efficiently in real-time; it must choose sensing actions that both curb localization uncertainty and achieve information gain. For this problem, belief space planning methods that forward-simulate robot sensing and estimation may oft…

    Submitted 24 July, 2020; originally announced July 2020.

  9. arXiv:2007.10407  [pdf, other]

    cs.RO

    Fusing Concurrent Orthogonal Wide-aperture Sonar Images for Dense Underwater 3D Reconstruction

    Authors: John McConnell, John D. Martin, Brendan Englot

    Abstract: We propose a novel approach to handling the ambiguity in elevation angle associated with the observations of a forward looking multi-beam imaging sonar, and the challenges it poses for performing an accurate 3D reconstruction. We utilize a pair of sonars with orthogonal axes of uncertainty to independently observe the same points in the environment from two different perspectives, and associate th…

    Submitted 20 July, 2020; originally announced July 2020.

    Comments: Preprint - to appear at IROS 2020

  10. arXiv:2002.12499  [pdf, other]

    cs.LG cs.AI stat.ML

    On Catastrophic Interference in Atari 2600 Games

    Authors: William Fedus, Dibya Ghosh, John D. Martin, Marc G. Bellemare, Yoshua Bengio, Hugo Larochelle

    Abstract: Model-free deep reinforcement learning is sample inefficient. One hypothesis -- speculated, but not confirmed -- is that catastrophic interference within an environment inhibits learning. We test this hypothesis through a large-scale empirical study in the Arcade Learning Environment (ALE) and, indeed, find supporting evidence. We show that interference causes performance to plateau; the network c…

    Submitted 9 June, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: First two authors contributed equally. Code available to reproduce experiments at https://github.com/google-research/google-research/tree/master/memento

  11. arXiv:1905.07318  [pdf, other]

    cs.LG stat.ML

    Stochastically Dominant Distributional Reinforcement Learning

    Authors: John D. Martin, Michal Lyskawinski, Xiaohu Li, Brendan Englot

    Abstract: We describe a new approach for managing aleatoric uncertainty in the Reinforcement Learning (RL) paradigm. Instead of selecting actions according to a single statistic, we propose a distributional method based on the second-order stochastic dominance (SSD) relation. This compares the inherent dispersion of random returns induced by actions, producing a more comprehensive and robust evaluation of t…

    Submitted 7 October, 2020; v1 submitted 17 May, 2019; originally announced May 2019.

    Comments: Accepted to the 2020 International Conference on Machine Learning

  12. arXiv:1802.09791  [pdf, ps, other]

    cs.LG q-bio.QM stat.ML

    Bioinformatics and Medicine in the Era of Deep Learning

    Authors: Davide Bacciu, Paulo J. G. Lisboa, José D. Martín, Ruxandra Stoean, Alfredo Vellido

    Abstract: Many of the current scientific advances in the life sciences have their origin in the intensive use of data for knowledge discovery. In no area this is so clear as in bioinformatics, led by technological breakthroughs in data acquisition technologies. It has been argued that bioinformatics could quickly become the field of research generating the largest data repositories, beating other data-inten…

    Submitted 27 February, 2018; originally announced February 2018.