Showing 1–11 of 11 results for author: Powell, W B

  1. arXiv:2201.00258  [pdf, other]

    math.OC cs.AI

    The Parametric Cost Function Approximation: A new approach for multistage stochastic programming

    Authors: Warren B Powell, Saeed Ghadimi

    Abstract: The most common approaches for solving multistage stochastic programming problems in the research literature have been to either use value functions ("dynamic programming") or scenario trees ("stochastic programming") to approximate the impact of a decision now on the future. By contrast, common industry practice is to use a deterministic approximation of the future which is easier to understand a…

    Submitted 1 January, 2022; originally announced January 2022.

    Comments: 3 figures

    MSC Class: 68 ACM Class: F.2; I.2

  2. arXiv:2004.05417  [pdf, other]

    cs.LG cs.AI

    Optimal Learning for Sequential Decisions in Laboratory Experimentation

    Authors: Kristopher Reyes, Warren B Powell

    Abstract: The process of discovery in the physical, biological and medical sciences can be painstakingly slow. Most experiments fail, and the time from initiation of research until a new advance reaches commercial production can span 20 years. This tutorial aims to provide experimental scientists with a foundation in the science of making decisions. Using numerical examples drawn from the experiences of…

    Submitted 13 April, 2020; v1 submitted 11 April, 2020; originally announced April 2020.

  3. arXiv:2002.06238  [pdf, other]

    cs.LG cs.AI stat.ML

    On State Variables, Bandit Problems and POMDPs

    Authors: Warren B Powell

    Abstract: State variables are easily the most subtle dimension of sequential decision problems. This is especially true in the context of active learning problems ("bandit problems") where decisions affect what we observe and learn. We describe our canonical framework that models any sequential decision problem, and present our definition of state variables that allows us to claim: Any properly modeled…

    Submitted 14 February, 2020; originally announced February 2020.

  4. arXiv:1912.09484  [pdf, other]

    math.OC cs.LG eess.SP eess.SY math.PR stat.ML

    Zeroth-order Stochastic Compositional Algorithms for Risk-Aware Learning

    Authors: Dionysios S. Kalogerias, Warren B. Powell

    Abstract: We present $\textit{Free-MESSAGE}^{p}$, the first zeroth-order algorithm for (weakly-)convex mean-semideviation-based risk-aware learning, which is also the first three-level zeroth-order compositional stochastic optimization algorithm whatsoever. Using a non-trivial extension of Nesterov's classical results on Gaussian smoothing, we develop the $\textit{Free-MESSAGE}^{p}$ algorithm from first pri…

    Submitted 13 December, 2021; v1 submitted 19 December, 2019; originally announced December 2019.

    Comments: 31 pages, major revision of the first version

  5. arXiv:1912.03513  [pdf, other]

    cs.AI cs.LG eess.SY stat.ML

    From Reinforcement Learning to Optimal Control: A unified framework for sequential decisions

    Authors: Warren B Powell

    Abstract: There are over 15 distinct communities that work in the general area of sequential decisions and information, often referred to as decisions under uncertainty or stochastic optimization. We focus on two of the most important fields: stochastic optimal control, with its roots in deterministic optimal control, and reinforcement learning, with its roots in Markov decision processes. Building on prior…

    Submitted 18 December, 2019; v1 submitted 7 December, 2019; originally announced December 2019.

    Comments: 47 pages, 6 figures

  6. arXiv:1810.08124  [pdf, ps, other]

    cs.AI eess.SY

    Approximate Dynamic Programming for Planning a Ride-Sharing System using Autonomous Fleets of Electric Vehicles

    Authors: Lina Al-Kanj, Juliana Nascimento, Warren B. Powell

    Abstract: Within a decade, almost every major auto company, along with fleet operators such as Uber, has announced plans to put autonomous vehicles on the road. At the same time, electric vehicles are quickly emerging as a next-generation technology that is cost effective, in addition to offering the benefits of reducing the carbon footprint. The combination of a centrally managed fleet of driverless vehic…

    Submitted 11 December, 2018; v1 submitted 18 October, 2018; originally announced October 2018.

  7. arXiv:1704.05963  [pdf, other]

    math.OC cs.AI cs.LG

    Monte Carlo Tree Search with Sampled Information Relaxation Dual Bounds

    Authors: Daniel R. Jiang, Lina Al-Kanj, Warren B. Powell

    Abstract: Monte Carlo Tree Search (MCTS), most famously used in game-play artificial intelligence (e.g., the game of Go), is a well-known strategy for constructing approximate solutions to sequential decision problems. Its primary innovation is the use of a heuristic, known as a default policy, to obtain Monte Carlo estimates of downstream values for states in a decision tree. This information is used to it…

    Submitted 19 April, 2017; originally announced April 2017.

    Comments: 33 pages, 6 figures

  8. arXiv:1605.05711  [pdf, ps, other]

    math.OC cs.AI eess.SY

    The Information-Collecting Vehicle Routing Problem: Stochastic Optimization for Emergency Storm Response

    Authors: Lina Al-Kanj, Warren B. Powell, Belgacem Bouzaiene-Ayari

    Abstract: Utilities face the challenge of responding to power outages due to storms and ice damage, but most power grids are not equipped with sensors to pinpoint the precise location of the faults causing the outage. Instead, utilities have to depend primarily on phone calls (trouble calls) from customers who have lost power to guide the dispatching of utility trucks. In this paper, we develop a policy tha…

    Submitted 18 May, 2016; originally announced May 2016.

  9. arXiv:1509.01920  [pdf, other]

    math.OC cs.AI

    Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures

    Authors: Daniel R. Jiang, Warren B. Powell

    Abstract: In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk measur…

    Submitted 8 May, 2017; v1 submitted 7 September, 2015; originally announced September 2015.

    Comments: 39 pages, 7 figures

  10. arXiv:1407.2676  [pdf, other]

    math.OC cs.AI cs.LG eess.SY stat.ML

    A New Optimal Stepsize For Approximate Dynamic Programming

    Authors: Ilya O. Ryzhov, Peter I. Frazier, Warren B. Powell

    Abstract: Approximate dynamic programming (ADP) has proven itself in a wide range of applications spanning large-scale transportation problems, health care, revenue management, and energy systems. The design of effective ADP algorithms has many dimensions, but one crucial factor is the stepsize rule used to update a value function approximation. Many operations research applications are computationally inte…

    Submitted 13 July, 2014; v1 submitted 9 July, 2014; originally announced July 2014.

    Comments: Matlab files are included with the paper source

  11. arXiv:1401.0843  [pdf, other]

    math.OC cs.LG

    Least Squares Policy Iteration with Instrumental Variables vs. Direct Policy Search: Comparison Against Optimal Benchmarks Using Energy Storage

    Authors: Warren R. Scott, Warren B. Powell, Somayeh Moazeni

    Abstract: This paper studies approximate policy iteration (API) methods which use least-squares Bellman error minimization for policy evaluation. We address several of its enhancements, namely, Bellman error minimization using instrumental variables, least-squares projected Bellman error minimization, and projected Bellman error minimization using instrumental variables. We prove that for a general discrete…

    Submitted 4 January, 2014; originally announced January 2014.

    Comments: 37 pages, 9 figures