-
Safe and Efficient Manoeuvring for Emergency Vehicles in Autonomous Traffic using Multi-Agent Proximal Policy Optimisation
Authors:
Leandro Parada,
Eduardo Candela,
Luis Marques,
Panagiotis Angeloudis
Abstract:
Manoeuvring in the presence of emergency vehicles is still a major issue for vehicle autonomy systems. Most studies that address this topic are based on rule-based methods, which cannot cover all possible scenarios that can take place in autonomous traffic. Multi-Agent Proximal Policy Optimisation (MAPPO) has recently emerged as a powerful method for autonomous systems because it allows for traini…
▽ More
Manoeuvring in the presence of emergency vehicles is still a major issue for vehicle autonomy systems. Most studies that address this topic are based on rule-based methods, which cannot cover all possible scenarios that can take place in autonomous traffic. Multi-Agent Proximal Policy Optimisation (MAPPO) has recently emerged as a powerful method for autonomous systems because it allows for training in thousands of different situations. In this study, we present an approach based on MAPPO to guarantee the safe and efficient manoeuvring of autonomous vehicles in the presence of an emergency vehicle. We introduce a risk metric that summarises the potential risk of collision in a single index. The proposed method generates cooperative policies allowing the emergency vehicle to go at $15 \%$ higher average speed while maintaining high safety distances. Moreover, we explore the trade-off between safety and traffic efficiency and assess the performance in a competitive scenario.
△ Less
Submitted 31 October, 2022;
originally announced October 2022.
-
A Decentralised Multi-Agent Reinforcement Learning Approach for the Same-Day Delivery Problem
Authors:
Elvin Ngu,
Leandro Parada,
Jose Javier Escribano Macias,
Panagiotis Angeloudis
Abstract:
Same-Day Delivery services are becoming increasingly popular in recent years. These have been usually modelled by previous studies as a certain class of Dynamic Vehicle Routing Problem (DVRP) where goods must be delivered from a depot to a set of customers in the same day that the orders were placed. Adaptive exact solution methods for DVRPs can become intractable even for small problem instances.…
▽ More
Same-Day Delivery services are becoming increasingly popular in recent years. These have been usually modelled by previous studies as a certain class of Dynamic Vehicle Routing Problem (DVRP) where goods must be delivered from a depot to a set of customers in the same day that the orders were placed. Adaptive exact solution methods for DVRPs can become intractable even for small problem instances. In this paper, we formulate the SDDP as a Markov Decision Process (MDP) and solve it using a parameter-sharing Deep Q-Network, which corresponds to a decentralised Multi-Agent Reinforcement Learning (MARL) approach. For this, we create a multi-agent grid-based SDD environment, consisting of multiple vehicles, a central depot and dynamic order generation. In addition, we introduce zone-specific order generation and reward probabilities. We compare the performance of our proposed MARL approach against a Mixed Inter Programming (MIP) solution. Results show that our proposed MARL framework performs on par with MIP-based policy when the number of orders is relatively low. For problem instances with higher order arrival rates, computational results show that the MARL approach underperforms the MIP by up to 30%. The performance gap between both methods becomes smaller when zone-specific parameters are employed. The gap is reduced from 30% to 3% for a 5x5 grid scenario with 30 orders. Execution time results indicate that the MARL approach is, on average, 65 times faster than the MIP-based policy, and therefore may be more advantageous for real-time control, at least for small-sized instances.
△ Less
Submitted 22 March, 2022;
originally announced March 2022.
-
Transferring Multi-Agent Reinforcement Learning Policies for Autonomous Driving using Sim-to-Real
Authors:
Eduardo Candela,
Leandro Parada,
Luis Marques,
Tiberiu-Andrei Georgescu,
Yiannis Demiris,
Panagiotis Angeloudis
Abstract:
Autonomous Driving requires high levels of coordination and collaboration between agents. Achieving effective coordination in multi-agent systems is a difficult task that remains largely unresolved. Multi-Agent Reinforcement Learning has arisen as a powerful method to accomplish this task because it considers the interaction between agents and also allows for decentralized training -- which makes…
▽ More
Autonomous Driving requires high levels of coordination and collaboration between agents. Achieving effective coordination in multi-agent systems is a difficult task that remains largely unresolved. Multi-Agent Reinforcement Learning has arisen as a powerful method to accomplish this task because it considers the interaction between agents and also allows for decentralized training -- which makes it highly scalable. However, transferring policies from simulation to the real world is a big challenge, even for single-agent applications. Multi-agent systems add additional complexities to the Sim-to-Real gap due to agent collaboration and environment synchronization. In this paper, we propose a method to transfer multi-agent autonomous driving policies to the real world. For this, we create a multi-agent environment that imitates the dynamics of the Duckietown multi-robot testbed, and train multi-agent policies using the MAPPO algorithm with different levels of domain randomization. We then transfer the trained policies to the Duckietown testbed and compare the use of the MAPPO algorithm against a traditional rule-based method. We show that the rewards of the transferred policies with MAPPO and domain randomization are, on average, 1.85 times superior to the rule-based method. Moreover, we show that different levels of parameter randomization have a substantial impact on the Sim-to-Real gap.
△ Less
Submitted 22 March, 2022;
originally announced March 2022.