Showing 1–8 of 8 results for author: van Heeswijk, W

  1. arXiv:2312.00140  [pdf, other]

    math.OC cs.AI

    The Stochastic Dynamic Post-Disaster Inventory Allocation Problem with Trucks and UAVs

    Authors: Robert van Steenbergen, Wouter van Heeswijk, Martijn Mes

    Abstract: Humanitarian logistics operations face increasing difficulties due to rising demands for aid in disaster areas. This paper investigates the dynamic allocation of scarce relief supplies across multiple affected districts over time. It introduces a novel stochastic dynamic post-disaster inventory allocation problem with trucks and unmanned aerial vehicles delivering relief goods under uncertain supp…

    Submitted 30 November, 2023; originally announced December 2023.

    MSC Class: 93E35 (Primary) 90B50; 68T05; 68T20 (Secondary) ACM Class: I.2.8

  2. arXiv:2305.19891  [pdf, other]

    cs.LG cs.AI

    Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces

    Authors: Fabian Akkerman, Julius Luy, Wouter van Heeswijk, Maximilian Schiffer

    Abstract: Large discrete action spaces (LDAS) remain a central challenge in reinforcement learning. Existing solution approaches can handle unstructured LDAS with up to a few million actions. However, many real-world applications in logistics, production, and transportation systems have combinatorial action spaces, whose size grows well beyond millions of actions, even on small instances. Fortunately, such…

    Submitted 27 February, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: ICLR 2024 camera-ready version. https://openreview.net/forum?id=80wh3jjCZf

  3. arXiv:2209.01820  [pdf, other]

    cs.LG math.OC

    Natural Policy Gradients In Reinforcement Learning Explained

    Authors: W. J. A. van Heeswijk

    Abstract: Traditional policy gradient methods are fundamentally flawed. Natural gradients converge quicker and better, forming the foundation of contemporary Reinforcement Learning such as Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO). This lecture note aims to clarify the intuition behind natural policy gradients, focusing on the thought process and the key mathematical con…

    Submitted 5 September, 2022; originally announced September 2022.

    Comments: 14 pages, 3 figures

  4. arXiv:2102.09253  [pdf, other]

    cs.LG

    Strategic bidding in freight transport using deep reinforcement learning

    Authors: Wouter van Heeswijk

    Abstract: This paper presents a multi-agent reinforcement learning algorithm to represent strategic bidding behavior in freight transport markets. Using this algorithm, we investigate whether feasible market equilibriums arise without any central control or communication between agents. Studying behavior in such environments may serve as a stepping stone towards self-organizing logistics systems like the Ph…

    Submitted 18 February, 2021; originally announced February 2021.

  5. arXiv:2008.02622  [pdf, other]

    cs.LG cs.AI

    A Gentle Lecture Note on Filtrations in Reinforcement Learning

    Authors: W. J. A. van Heeswijk

    Abstract: This note aims to provide a basic intuition on the concept of filtrations as used in the context of reinforcement learning (RL). Filtrations are often used to formally define RL problems, yet their implications might not be eminent for those without a background in measure theory. Essentially, a filtration is a construct that captures partial knowledge up to time $t$, without revealing any future…

    Submitted 6 August, 2020; originally announced August 2020.

  6. arXiv:2005.00565  [pdf, other]

    cs.AI cs.LG cs.MA

    Smart Containers With Bidding Capacity: A Policy Gradient Algorithm for Semi-Cooperative Learning

    Authors: Wouter van Heeswijk

    Abstract: Smart modular freight containers -- as propagated in the Physical Internet paradigm -- are equipped with sensors, data storage capability and intelligence that enable them to route themselves from origin to destination without manual intervention or central governance. In this self-organizing setting, containers can autonomously place bids on transport services in a spot market setting. However, f…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: 15 pages

  7. arXiv:2001.04513  [pdf, other]

    cs.CY

    Donald Duck Holiday Game: A numerical analysis of a Game of the Goose role-playing variant

    Authors: W. J. A. van Heeswijk

    Abstract: The 1996 Donald Duck Holiday Game is a role-playing variant of the historical Game of the Goose, involving characters with unique attributes, event squares, and random event cards. The objective of the game is to reach the camping before any other player does. We develop a Monte Carlo simulation model that automatically plays the game and enables analyzing its key characteristics. We assess the ga…

    Submitted 13 January, 2020; originally announced January 2020.

  8. arXiv:1902.09855  [pdf, other]

    cs.LG stat.ML

    Approximate Dynamic Programming with Neural Networks in Linear Discrete Action Spaces

    Authors: Wouter van Heeswijk, Han La Poutré

    Abstract: Real-world problems of operations research are typically high-dimensional and combinatorial. Linear programs are generally used to formulate and efficiently solve these large decision problems. However, in multi-period decision problems, we must often compute expected downstream values corresponding to current decisions. When applying stochastic methods to approximate these values, linear programs…

    Submitted 26 February, 2019; originally announced February 2019.