Showing 1–10 of 10 results for author: Jiang, D R

  1. arXiv:2310.18803

    cs.LG

    Weakly Coupled Deep Q-Networks

    Authors: Ibrahim El Shar, Daniel R. Jiang

    Abstract: We propose weakly coupled deep Q-networks (WCDQN), a novel deep reinforcement learning algorithm that enhances performance in a class of structured problems called weakly coupled Markov decision processes (WCMDP). WCMDPs consist of multiple independent subproblems connected by an action space constraint, which is a structural property that frequently emerges in practice. Despite this appealing str…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: To appear in proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  2. arXiv:2301.00922

    cs.AI cs.LG eess.SY math.OC

    Faster Approximate Dynamic Programming by Freezing Slow States

    Authors: Yijia Wang, Daniel R. Jiang

    Abstract: We consider infinite horizon Markov decision processes (MDPs) with fast-slow structure, meaning that certain parts of the state space move "fast" (and in a sense, are more influential) while other parts transition more "slowly." Such structure is common in real-world problems where sequential decisions need to be made at high frequencies, yet information that varies at a slower timescale also infl…

    Submitted 2 January, 2023; originally announced January 2023.

    Comments: 69 pages, 9 figures

  3. arXiv:2111.06537

    cs.LG math.OC stat.ML

    Multi-Step Budgeted Bayesian Optimization with Unknown Evaluation Costs

    Authors: Raul Astudillo, Daniel R. Jiang, Maximilian Balandat, Eytan Bakshy, Peter I. Frazier

    Abstract: Bayesian optimization (BO) is a sample-efficient approach to optimizing costly-to-evaluate black-box functions. Most BO methods ignore how evaluation costs may vary over the optimization domain. However, these costs can be highly heterogeneous and are often unknown in advance. This occurs in many practical settings, such as hyperparameter tuning of machine learning algorithms or physics-based simu…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: In Advances in Neural Information Processing Systems, 2021

  4. arXiv:2006.15779

    cs.LG math.NA stat.ML

    Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees

    Authors: Shali Jiang, Daniel R. Jiang, Maximilian Balandat, Brian Karrer, Jacob R. Gardner, Roman Garnett

    Abstract: Bayesian optimization is a sequential decision making framework for optimizing expensive-to-evaluate black-box functions. Computing a full lookahead policy amounts to solving a highly intractable stochastic dynamic program. Myopic approaches, such as expected improvement, are often adopted in practice, but they ignore the long-term impact of the immediate decision. Existing nonmyopic approaches ar…

    Submitted 28 June, 2020; originally announced June 2020.

  5. arXiv:2006.15690

    cs.LG math.OC stat.ML

    Lookahead-Bounded Q-Learning

    Authors: Ibrahim El Shar, Daniel R. Jiang

    Abstract: We introduce the lookahead-bounded Q-learning (LBQL) algorithm, a new, provably convergent variant of Q-learning that seeks to improve the performance of standard Q-learning in stochastic environments through the use of "lookahead" upper and lower bounds. To do this, LBQL employs previously collected experience and each iteration's state-action values as dual feasible penalties to construct a se…

    Submitted 28 June, 2020; originally announced June 2020.

    Comments: To appear in proceedings of the 37th International Conference on Machine Learning

  6. arXiv:1910.09143

    math.OC cs.LG

    Dynamic Subgoal-based Exploration via Bayesian Optimization

    Authors: Yijia Wang, Matthias Poloczek, Daniel R. Jiang

    Abstract: Reinforcement learning in sparse-reward navigation environments with expensive and limited interactions is challenging and poses a need for effective exploration. Motivated by complex navigation tasks that require real-world training (when cheap simulators are not available), we consider an agent that faces an unknown distribution of environments and must decide on an exploration strategy. It may…

    Submitted 12 October, 2023; v1 submitted 21 October, 2019; originally announced October 2019.

    Journal ref: Transactions on Machine Learning Research (09/2023)

  7. arXiv:1910.06403

    cs.LG cs.DC math.OC stat.ML

    BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization

    Authors: Maximilian Balandat, Brian Karrer, Daniel R. Jiang, Samuel Daulton, Benjamin Letham, Andrew Gordon Wilson, Eytan Bakshy

    Abstract: Bayesian optimization provides sample-efficient global optimization for a broad range of applications, including automatic machine learning, engineering, physics, and experimental design. We introduce BoTorch, a modern programming framework for Bayesian optimization that combines Monte-Carlo (MC) acquisition functions, a novel sample average approximation optimization approach, auto-differentiatio…

    Submitted 8 December, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

    Journal ref: Advances in Neural Information Processing Systems 33, 2020

  8. arXiv:1805.05935

    cs.AI cs.LG math.OC

    Feedback-Based Tree Search for Reinforcement Learning

    Authors: Daniel R. Jiang, Emmanuel Ekwedike, Han Liu

    Abstract: Inspired by recent successes of Monte-Carlo tree search (MCTS) in a number of artificial intelligence (AI) application domains, we propose a model-based reinforcement learning (RL) technique that iteratively applies MCTS on batches of small, finite-horizon versions of the original infinite-horizon Markov decision process. The terminal condition of the finite-horizon problems, or the leaf-node eval…

    Submitted 15 May, 2018; originally announced May 2018.

    Comments: 19 pages, to be presented at ICML 2018

  9. arXiv:1704.05963

    math.OC cs.AI cs.LG

    Monte Carlo Tree Search with Sampled Information Relaxation Dual Bounds

    Authors: Daniel R. Jiang, Lina Al-Kanj, Warren B. Powell

    Abstract: Monte Carlo Tree Search (MCTS), most famously used in game-play artificial intelligence (e.g., the game of Go), is a well-known strategy for constructing approximate solutions to sequential decision problems. Its primary innovation is the use of a heuristic, known as a default policy, to obtain Monte Carlo estimates of downstream values for states in a decision tree. This information is used to it…

    Submitted 19 April, 2017; originally announced April 2017.

    Comments: 33 pages, 6 figures

  10. arXiv:1509.01920

    math.OC cs.AI

    Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures

    Authors: Daniel R. Jiang, Warren B. Powell

    Abstract: In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk measur…

    Submitted 8 May, 2017; v1 submitted 7 September, 2015; originally announced September 2015.

    Comments: 39 pages, 7 figures