Distributional constrained reinforcement learning for supply chain optimization

Bermúdez, Jaime Sabal; Chanona, Antonio del Rio; Tsay, Calvin

arXiv:2302.01727 (cs)

[Submitted on 3 Feb 2023]

Abstract: This work studies reinforcement learning (RL) in the context of multi-period supply chains subject to constraints, e.g., on production and inventory. We introduce Distributional Constrained Policy Optimization (DCPO), a novel approach for reliable constraint satisfaction in RL. Our approach is based on Constrained Policy Optimization (CPO), which is subject to approximation errors that in practice lead it to converge to infeasible policies. We address this issue by incorporating aspects of distributional RL into DCPO. Specifically, we represent the return and cost value functions using neural networks that output discrete distributions, and we reshape costs based on the associated confidence. Using a supply chain case study, we show that DCPO improves the rate at which the RL policy converges and ensures reliable constraint satisfaction by the end of training. The proposed method also improves predictability, greatly reducing the variance of returns between runs; this result is significant in the context of policy gradient methods, which intrinsically introduce significant variance during training.
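
As a rough illustration of what a value function that outputs a discrete distribution can look like, the sketch below turns a vector of logits into probabilities over a fixed set of cost values and reads off both the mean and an upper quantile. Everything specific here is an assumption made for illustration: the abstract does not state the number of atoms, the support range, or the rule used to reshape costs, and the helper names (`softmax`, `expected_value`, `upper_quantile`) are hypothetical. It is not the DCPO algorithm itself.

```python
import math


def softmax(logits):
    """Convert raw scores (e.g., a network's output layer) to probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_value(atoms, probs):
    """Mean of a discrete distribution defined on fixed support points."""
    return sum(a * p for a, p in zip(atoms, probs))


def upper_quantile(atoms, probs, confidence):
    """Smallest atom whose cumulative probability reaches `confidence`.

    Assumes `atoms` is sorted in increasing order.
    """
    cumulative = 0.0
    for atom, p in zip(atoms, probs):
        cumulative += p
        if cumulative >= confidence:
            return atom
    return atoms[-1]


# Hypothetical setup: 11 evenly spaced cost atoms on [0, 10] and
# all-zero logits, which give a uniform distribution over the atoms.
atoms = [float(k) for k in range(11)]
probs = softmax([0.0] * len(atoms))

mean_cost = expected_value(atoms, probs)
# A confidence-aware (more conservative) cost estimate; using a quantile
# for this purpose is an assumption, not a detail taken from the abstract.
conservative_cost = upper_quantile(atoms, probs, confidence=0.9)
```

Using an upper quantile instead of the mean makes the cost estimate more pessimistic when the predicted distribution is wide, which is one plausible way to tie constraint handling to confidence.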

Comments:	6 pages, 4 figures
Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2302.01727 [cs.LG]
	(or arXiv:2302.01727v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2302.01727

Submission history

From: Calvin Tsay
[v1] Fri, 3 Feb 2023 13:43:02 UTC (812 KB)
