Search | arXiv e-print repository

Where the Action is: Let's make Reinforcement Learning for Stochastic Dynamic Vehicle Routing Problems work!

Authors: Florentin D Hildebrandt, Barrett Thomas, Marlin W Ulmer

Abstract: There has been a paradigm-shift in urban logistic services in the last years; demand for real-time, instant mobility and delivery services grows. This poses new challenges to logistic service providers as the underlying stochastic dynamic vehicle routing problems (SDVRPs) require anticipatory real-time routing actions. Searching the combinatorial action space for efficient routing actions is by it… ▽ More There has been a paradigm-shift in urban logistic services in the last years; demand for real-time, instant mobility and delivery services grows. This poses new challenges to logistic service providers as the underlying stochastic dynamic vehicle routing problems (SDVRPs) require anticipatory real-time routing actions. Searching the combinatorial action space for efficient routing actions is by itself a complex task of mixed-integer programming (MIP) well-known by the operations research community. This complexity is now multiplied by the challenge of evaluating such actions with respect to their effectiveness given future dynamism and uncertainty, a potentially ideal case for reinforcement learning (RL) well-known by the computer science community. For solving SDVRPs, joint work of both communities is needed, but as we show, essentially non-existing. Both communities focus on their individual strengths leaving potential for improvement. Our survey paper highlights this potential in research originating from both communities. We point out current obstacles in SDVRPs and guide towards joint approaches to overcome them. △ Less

Submitted 28 February, 2021; originally announced March 2021.

arXiv:2007.09541 [pdf, other]

Same-Day Delivery with Fairness

Authors: Xinwei Chen, Tong Wang, Barrett W. Thomas, Marlin W. Ulmer

Abstract: The demand for same-day delivery (SDD) has increased rapidly in the last few years and has particularly boomed during the COVID-19 pandemic. The fast growth is not without its challenge. In 2016, due to low concentrations of memberships and far distance from the depot, certain minority neighborhoods were excluded from receiving Amazon's SDD service, raising concerns about fairness. In this paper,… ▽ More The demand for same-day delivery (SDD) has increased rapidly in the last few years and has particularly boomed during the COVID-19 pandemic. The fast growth is not without its challenge. In 2016, due to low concentrations of memberships and far distance from the depot, certain minority neighborhoods were excluded from receiving Amazon's SDD service, raising concerns about fairness. In this paper, we study the problem of offering fair SDD-service to customers. The service area is partitioned into different regions. Over the course of a day, customers request for SDD service, and the timing of requests and delivery locations are not known in advance. The dispatcher dynamically assigns vehicles to make deliveries to accepted customers before their delivery deadline. In addition to the overall service rate (utility), we maximize the minimal regional service rate across all regions (fairness). We model the problem as a multi-objective Markov decision process and develop a deep Q-learning solution approach. We introduce a novel transformation of learning from rates to actual services, which creates a stable and efficient learning process. Computational results demonstrate the effectiveness of our approach in alleviating unfairness both spatially and temporally in different customer geographies. We also show this effectiveness is valid with different depot locations, providing businesses with an opportunity to achieve better fairness from any location. Further, we consider the impact of ignoring fairness in service, and results show that our policies eventually outperform the utility-driven baseline when customers have a high expectation on service level. △ Less

Submitted 22 December, 2021; v1 submitted 18 July, 2020; originally announced July 2020.

arXiv:1910.11901 [pdf, other]

doi 10.1016/j.ejor.2021.06.021

Deep Q-Learning for Same-Day Delivery with Vehicles and Drones

Authors: Xinwei Chen, Marlin W. Ulmer, Barrett W. Thomas

Abstract: In this paper, we consider same-day delivery with vehicles and drones. Customers make delivery requests over the course of the day, and the dispatcher dynamically dispatches vehicles and drones to deliver the goods to customers before their delivery deadline. Vehicles can deliver multiple packages in one route but travel relatively slowly due to the urban traffic. Drones travel faster, but they ha… ▽ More In this paper, we consider same-day delivery with vehicles and drones. Customers make delivery requests over the course of the day, and the dispatcher dynamically dispatches vehicles and drones to deliver the goods to customers before their delivery deadline. Vehicles can deliver multiple packages in one route but travel relatively slowly due to the urban traffic. Drones travel faster, but they have limited capacity and require charging or battery swaps. To exploit the different strengths of the fleets, we propose a deep Q-learning approach. Our method learns the value of assigning a new customer to either drones or vehicles as well as the option to not offer service at all. In a systematic computational analysis, we show the superiority of our policy compared to benchmark policies and the effectiveness of our deep Q-learning approach. We also show that our policy can maintain effectiveness when the fleet size changes moderately. Experiments on data drawn from varied spatial/temporal distributions demonstrate that our trained policies can cope with changes in the input data. △ Less

Submitted 7 March, 2021; v1 submitted 25 October, 2019; originally announced October 2019.

Showing 1–3 of 3 results for author: Ulmer, M W