A GPU-accelerated Large-scale Simulator for Transportation System Optimization Benchmarking

Jun Zhang, Wenxuan Ao, Junbo Yan, Depeng **, Yong Li
Department of Electronic Engineering, BNRist, Tsinghua University
These authors contributed equally to this work.
Abstract

With the development of artificial intelligence techniques, transportation system optimization is evolving from traditional methods that rely on expert experience to simulation- and learning-based decision optimization methods. Learning-based optimization methods require extensive interaction with highly realistic microscopic traffic simulators. However, existing microscopic traffic simulators are computationally inefficient in large-scale scenarios, which significantly slows the data sampling process of optimization algorithms. In addition, the optimization scenarios supported by existing simulators are limited and focus mainly on traffic signal control. To address these challenges and limitations, we propose the first open-source GPU-accelerated large-scale microscopic simulator for transportation system simulation. In the large-scale scenario with more than a million vehicles, the simulator iterates at 84.09Hz, an 88.92-fold computational acceleration over the best baseline. Based on the simulator, we implement a set of microscopic and macroscopic controllable objects and metrics to support most typical transportation system optimization scenarios. These controllable objects and metrics are all exposed through a Python API for ease of use. We choose five important and representative transportation system optimization scenarios and benchmark classical rule-based algorithms, reinforcement learning, and black-box optimization in four cities. The code is available at https://github.com/tsinghua-fib-lab/moss-benchmark under the MIT License.

1 Introduction

With increasing urbanization and residents’ travel demand, the urban transportation system faces heavier traffic pressure, which brings higher commuting costs, environmental pollution, and other social problems that affect the sustainable development of cities [13, 33, 30]. To alleviate these problems, governments usually build more transportation infrastructure and optimize the existing infrastructure to enhance the system’s capacity. Such transportation system optimization methods include traffic signal control, congestion pricing, etc. However, the traditional transportation system optimization process is highly dependent on the experience of experts, which is labor-intensive and often sub-optimal [22]. With the development of reinforcement learning [23, 27] and black-box optimization [10, 5], these optimization methods have great potential for improving the transportation system. But since all of these optimization methods rely on extensive interaction with the environment for feedback, the environment must model the transportation system as realistically as possible and provide feedback quickly. In the field of transportation, simulators that simulate the motion of individual vehicles to provide realistic results are referred to as microscopic simulators.

At present, there are several available microscopic simulators that can evaluate the efficiency of the transportation system, including SUMO [1], CityFlow [39], and CBLab [19]. However, these simulators face the following two key challenges:

  • Computational inefficiency in large-scale scenarios. Since the urban transportation system is a complex system with strong spatial and temporal correlation between different regions, a traffic improvement in one area may lead to congestion in another. Therefore, the effect of transportation system optimization should be evaluated from a global, city-level perspective, which requires large-scale microscopic simulation. However, existing simulators typically use CPUs for computation. The most popular open-source simulator, SUMO, still uses a single-threaded computing architecture, which significantly reduces the efficiency of the data sampling process of optimization algorithms such as reinforcement learning. Even though CityFlow and CBLab use multi-threading techniques, they still take more than 100 seconds to simulate 1 hour in a scenario of about 100,000 vehicles. Because optimization methods, especially learning-based methods, require a large number of environment interactions, adopting them for transportation system optimization requires simulators that can run large-scale scenarios at a frequency of at least 10Hz.

  • Limited supported optimization scenarios. In order to improve the efficiency of the transportation system, traffic management authorities usually apply a variety of transportation system optimization methods, including traffic signal control optimization, intersection lane turn assignment, tidal lanes, congestion pricing, etc. If these methods can be used jointly, transportation system efficiency can be improved further. However, existing simulators and related optimization studies usually focus on only a few of these scenarios, such as the traffic signal control optimization problem [35, 20, 37], and ignore the others. This situation prevents traffic managers from fully evaluating and comparing the effectiveness of various transportation system optimization methods, ranging from microscopic ones such as traffic signal control [42, 35, 38, 40] to macroscopic ones such as congestion pricing [2, 4, 25]. To improve on this, the simulator should support most common transportation system optimization methods and scenarios.

To address the above challenges, we propose the first open-source GPU-accelerated large-scale microscopic simulator for transportation system simulation, motivated by two observations: the independent per-individual computation in microscopic simulation matches the GPU architecture, and GPUs offer massive computational power compared to CPUs. This simulator adopts a parallel-friendly design of computational flow and data partitioning, and uses an efficient index for sensing between vehicles. Based on these, we implement microscopic traffic simulation in CUDA and substantially improve the scale and efficiency of simulation. In the largest scenario with more than a million vehicles, this simulator iterates at 84.09Hz, which is 88.92 times faster than the best baseline. To support the optimization of various scenarios, the simulator also implements a set of microscopic and macroscopic controllable objects and metrics, and provides a Python application programming interface (API) via pybind11 (https://github.com/pybind/pybind11). By combining controllable objects and metrics, we implement five typical transportation system optimization scenarios for benchmarking, namely traffic signal control, dynamic lane assignment within junctions, tidal lane control, congestion pricing, and road planning, and evaluate the performance of classical rule-based algorithms, reinforcement learning algorithms, and black-box optimization algorithms for these scenarios in four large cities: Beijing, Shanghai, Paris, and New York.

Table 1: Comparison of microscopic simulators for transportation system. The Scale field indicates the approximate number of vehicles that can be computed by this simulator at a simulation computation frequency of 10Hz.
Simulator SUMO [1] CityFlow [39] CBLab [19] Ours
Scale (10Hz) <10,000 ~130,000 ~150,000 >10,000,000
Controllable Objects Traffic Signal
Lane/Road Max Speed ×
Lane Function × × ×
Vehicle Route
Metrics Lane Queue Length
Road Traveling Time × ×
Average Traveling Time
Throughput × ×

In short, our contributions are two-fold. First, we propose a high-performance large-scale microscopic simulator for transportation system simulation on GPU and implement microscopic and macroscopic controllable objects and metrics to support transportation system optimization. Second, we choose and implement five typical transportation system optimization scenarios and benchmark common optimization algorithms in four cities to show the usability of our proposed simulator.

2 Related Works

2.1 Existing Simulators for Transportation System

Existing simulators for transportation systems can be divided into three categories based on the level of simplification of the simulation models: microscopic simulators, mesoscopic simulators, and macroscopic simulators. Macroscopic simulators [21, 9] typically do not model individual vehicles, but rather treat the vehicles as a fluid described by velocity and density. Mesoscopic simulators often speed up the simulation by simplifying the vehicle motion models. For instance, MATSim [32] uses a uniform motion model with intersection waiting queues [8] to model vehicles and does not consider acceleration and deceleration. Since macroscopic and mesoscopic simulators oversimplify vehicle motion, they are not usually used for AI-based transportation system optimization. Among the microscopic simulators, SUMO [1], CityFlow [39], and CBLab [19] are popular choices for transportation system optimization. SUMO offers a rich set of controllable objects and metrics. However, due to its software architecture, SUMO can use almost exclusively one CPU core for computation, which leads to the small simulation scale shown in Table 1. CityFlow and CBLab both use a multi-threaded architecture for computational acceleration, which improves computational speed by about 20~30 times on 64-threaded CPUs relative to SUMO. But in city-scale simulations of at least 100,000 vehicles, they still take minutes to simulate an hour, which constrains how fast reinforcement learning algorithms can learn by interacting with the environment. Besides, in terms of controllable objects, CityFlow only provides interfaces for setting traffic signal phases and vehicle routes, while CBLab adds the setting of road speed limits as an additional feature. In terms of metrics, both CityFlow and CBLab provide lane queue length and average traveling time (ATT) directly. Most of these controllable objects and metrics are designed for traffic signal optimization, so other optimization scenarios cannot be implemented directly with them. Overall, there is a lack of simulators that can simulate efficiently and provide rich controllable objects and metrics to support transportation system optimization problems in large-scale scenarios.

2.2 Existing Transportation System Optimization Methods

Existing methods for optimizing transportation systems can be classified into rule-based and learning-based methods. Rule-based methods use expert experience to design and improve rules, relying on these rules for control and optimization, e.g. the maximum pressure algorithm [31] in traffic signal control and the Δ-tolling algorithm [28] in congestion pricing. Such methods are difficult to adapt to complex and changing traffic conditions and only consider local optimization. Learning-based methods usually use reinforcement learning [23, 27] to search for a globally optimal solution through a large number of trials in a simulation environment. The traffic signal control problem is the most extensively studied problem in the field of transportation system optimization, with both rule-based methods [31] and learning-based methods [42, 35, 38, 40, 36, 24]. Reinforcement learning has also been adopted for the congestion pricing problem [2, 4, 25, 26, 34]. By comparison, other transportation system optimization scenarios such as dynamic lane assignment [16, 43, 11] and tidal lane control [18, 41, 17] do not seem to have received much attention from researchers, and existing works focus only on small-scale problems. This is most likely due to the lack of simulators that support multiple scenarios simultaneously, including those mentioned above.

3 The Simulator

Figure 1: The framework and pipeline of the proposed simulator. (best view in color)

In this section, we give an overview of the design of our proposed simulator for efficient microscopic traffic simulation in large-scale urban scenarios and of its user interface, as shown in Figure 1.

3.1 System Design

Microscopic traffic simulation is the process of modeling each vehicle in the transportation system and simulating it in discrete time. One simulation step usually represents 1 second of real-world time. In large-scale scenarios with hundreds of thousands of vehicles, the large number of vehicle model calculations consumes a lot of computational power, resulting in a low running speed.

The development of modern computational acceleration hardware provides the basis for a solution to this problem. Single instruction multiple data (SIMD), as the basic computational model of hardware acceleration cards such as GPUs, trades off instruction flexibility for the ability to parallelize a large number of homogeneous tasks and has been used with great success in areas such as matrix arithmetic acceleration and 3D image rendering. In microscopic traffic simulation, the simulation models of individual vehicles are also highly homogeneous and therefore highly compatible with the SIMD computational model.

However, before we can simply write vehicle simulation models as CUDA code, we need to address two problems posed by the need for vehicles to sense each other. First, in an iteration, each vehicle needs to read the position, speed, and other attributes of other vehicles as inputs to the simulation model in order to compute appropriate driving behaviors such as accelerating, decelerating, and changing lanes. Thereafter, the vehicle also modifies its own attributes, such as position and speed, based on the chosen driving behavior. This leads to read/write conflicts on vehicle data, which affect the correctness of the simulation results. Second, the sensing behavior of a vehicle is spatially localized. Specifically, the range that the vehicle needs to sense includes only the front vehicle in the current lane and the front and rear vehicles in the adjacent lanes. Thus, implementing a SIMD-friendly vehicle sensing index for this retrieval task is the key to fully utilizing the massive arithmetic power of modern computational acceleration cards. Our proposed simulator uses a two-phase parallel process for read/write separation and a linked-list-based vehicle sensing index to solve these two problems respectively. The details are described below.

Two-phase Parallel Process for Read/Write Separation. In order to ensure that vehicles always correctly read the previous step’s attributes of other vehicles in the computation and to avoid interfering with the vehicle’s computation and attribute updating process, we divide the vehicle’s attributes into two partitions: snapshot and runtime. The snapshot is a read-only data partition that always saves the public attributes of the previous step for other vehicles to access. The runtime is a private and read-write partition, the attributes of which are changed after the vehicle completes its simulation calculations. In order to implement the data replication from the runtime partition to the snapshot partition, we also divide each iteration into two sequential phases, the prepare phase and the update phase. The prepare phase is used to perform vehicle data replication in parallel and update the vehicle sensing indices based on the new snapshot data. In the update phase, the vehicle performs sensing to obtain the attributes of the snapshot partitions of other vehicles and performs the car-following model [29] and the lane-changing model [12, 7] calculations to update its own runtime partition attributes. The above process effectively avoids the read/write conflict of vehicle data, which on one hand ensures the correctness of the calculation results, and on the other hand makes the calculation flow more suitable for the SIMD calculation model due to the mutex-free structure and the highly homogeneous calculation procedures.
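The prepare/update split can be illustrated with a minimal single-lane sketch. This is not the simulator's CUDA code: the function names, the dictionary layout, and the toy car-following rule (drive as fast as the gap allows, capped at `v_max`) are all assumptions made for illustration, and the Python loops stand in for what would be one GPU thread per vehicle.

```python
def prepare(runtime, snapshot):
    """Prepare phase: publish runtime state as the new read-only snapshot
    and rebuild the sensing index (here: each vehicle's leader on one lane)."""
    snapshot["s"] = list(runtime["s"])  # position copy (parallel on a GPU)
    snapshot["v"] = list(runtime["v"])  # speed copy
    order = sorted(range(len(snapshot["s"])), key=snapshot["s"].__getitem__)
    leader = [-1] * len(order)          # -1 means "no vehicle ahead"
    for behind, ahead in zip(order, order[1:]):
        leader[behind] = ahead
    return leader


def update(runtime, snapshot, leader, dt=1.0, v_max=15.0, min_gap=5.0):
    """Update phase: read other vehicles only from the snapshot and write
    only to the vehicle's own runtime slot, so no locking is needed."""
    for i in range(len(leader)):        # iterations are mutually independent
        j = leader[i]
        gap = snapshot["s"][j] - snapshot["s"][i] if j >= 0 else float("inf")
        # Toy car-following rule (assumption, not the model cited in the text).
        v = min(v_max, max(0.0, (gap - min_gap) / dt))
        runtime["v"][i] = v
        runtime["s"][i] = snapshot["s"][i] + v * dt


runtime = {"s": [0.0, 10.0, 100.0], "v": [0.0, 0.0, 0.0]}
snapshot = {"s": [], "v": []}
leader = prepare(runtime, snapshot)
update(runtime, snapshot, leader)
print(runtime["s"])   # positions after one step
print(snapshot["s"])  # snapshot still holds the previous step
```

Because every vehicle reads only the snapshot, the result does not depend on the order in which vehicles are processed, which is the property that makes the update phase safe to run in parallel.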

Linked-list-based Vehicle Sensing Index. Given the need for SIMD-friendly indexing of relative spatial positions, the intuitive choice of a binary tree search is not appropriate. This is because tree searching leads to control flow divergence, which significantly reduces efficiency under the SIMD computation model. Therefore, we choose a bidirectional ordered linked list as the data structure for the index. One linked list records all vehicles in one lane in order of spatial location. Each node in the linked list additionally contains two sets of pointers to the front and rear vehicles in the left lane and the right lane, respectively. With such an index structure, vehicle sensing always requires only one pointer operation, avoiding the control flow divergence problem. Since usually only a small number of vehicles enter or leave a lane at each step, and the order of the existing vehicles on the lane is basically unchanged, the number of operations such as adding nodes, deleting nodes, and reordering the linked list during the index update is relatively small, so the impact on computational performance is acceptable. With this design, we address the second problem by providing a SIMD-friendly vehicle sensing index with low update cost for vehicle simulation model computation.
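A minimal sketch of such an index is given below. The class and function names are hypothetical, and the index is rebuilt from scratch with a sort and a binary search for brevity, whereas the text describes incremental updates of an existing list. The point is the read path: once built, finding any neighbor is a single attribute access.

```python
import bisect


class Node:
    """One vehicle in a lane's ordered doubly linked list."""
    __slots__ = ("vid", "s", "front", "back",
                 "left_front", "left_back", "right_front", "right_back")

    def __init__(self, vid, s):
        self.vid, self.s = vid, s
        self.front = self.back = None
        self.left_front = self.left_back = None
        self.right_front = self.right_back = None


def build_lane(vehicles):
    """vehicles: iterable of (vehicle_id, position). Returns nodes sorted by
    position, linked to their front and back neighbors in the same lane."""
    nodes = [Node(vid, s) for vid, s in sorted(vehicles, key=lambda x: x[1])]
    for behind, ahead in zip(nodes, nodes[1:]):
        behind.front, ahead.back = ahead, behind
    return nodes


def link_side(lane, neighbor, side):
    """Fill the '<side>_front' / '<side>_back' pointers of every node in
    `lane` with the nearest vehicles in the adjacent lane `neighbor`."""
    positions = [n.s for n in neighbor]
    for node in lane:
        i = bisect.bisect_left(positions, node.s)
        setattr(node, side + "_front", neighbor[i] if i < len(neighbor) else None)
        setattr(node, side + "_back", neighbor[i - 1] if i > 0 else None)


lane0 = build_lane([(1, 10.0), (2, 30.0)])
lane1 = build_lane([(3, 20.0)])
link_side(lane0, lane1, "right")
print(lane0[0].front.vid, lane0[0].right_front.vid, lane0[1].right_back.vid)
```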

To help make the simulator user-friendly, we also provide a toolchain for building simulator inputs and the simulator’s Python API.

Simulator Inputs and Toolchain. Following the usual microscopic traffic simulation setup, the simulator inputs are map data and travel demand. The map data describes the geospatial attributes and topological relationships of road networks and the candidate traffic signal phases of junctions. Travel demand describes each vehicle’s origin, destination, departure time, and chosen route. These inputs are stored in a binary format defined by Protobuf (https://protobuf.dev/). To facilitate the construction of simulator inputs, we have developed a toolchain available at https://github.com/tsinghua-fib-lab/mosstool. The toolchain mainly provides map building based on OpenStreetMap (https://openstreetmap.org/) and real travel demand generation based on globally available public data such as satellite imagery. By using this toolchain, users can quickly build maps, generate travel demands, and subsequently begin simulation and optimization.

Python API. The simulator exposes the C interfaces as a Python API via pybind11. The Python API consists of a series of initialization functions, getter functions, setter functions, and the next_step function that controls the progress of the simulation. Setter functions usually also come in batch versions with the _batch suffix to minimize the data transfer overhead of large numbers of calls. This Python API does not directly provide gymnasium-style reinforcement learning environments, but rather requires users to build the environment by combining the above functions according to the needs of each scenario.
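As a rough illustration of how such an environment might be assembled, the sketch below wraps an engine in a reset/step interface for a traffic signal task. `FakeEngine` is a hypothetical stand-in so the example runs without the simulator; only the method names `next_step`, `set_tl_phase`, and `get_lane_waiting_at_end_vehicle_counts` come from the text, and their argument and return types here are assumptions. The reward (negative total queue length) is likewise just one possible choice.

```python
class FakeEngine:
    """Hypothetical stand-in for the simulator instance (assumption)."""

    def __init__(self, n_lanes=4):
        self.n_lanes = n_lanes
        self.step_count = 0
        self.phase = {}

    def next_step(self):
        self.step_count += 1

    def set_tl_phase(self, junction_id, phase_index):
        self.phase[junction_id] = phase_index

    def get_lane_waiting_at_end_vehicle_counts(self):
        # Synthetic queue lengths; the real getter's return type is not
        # specified in the text.
        return [(self.step_count + i) % 3 for i in range(self.n_lanes)]


class SignalEnv:
    """Gymnasium-like reset/step wrapper built from getters and setters."""

    def __init__(self, make_engine, junction_ids, steps_per_action=30, horizon=3600):
        self.make_engine = make_engine
        self.junction_ids = junction_ids
        self.steps_per_action = steps_per_action
        self.horizon = horizon

    def _obs(self):
        return list(self.engine.get_lane_waiting_at_end_vehicle_counts())

    def reset(self):
        self.engine = self.make_engine()
        self.t = 0
        return self._obs()

    def step(self, action):
        # One phase index per controlled junction.
        for jid, phase in zip(self.junction_ids, action):
            self.engine.set_tl_phase(jid, phase)
        for _ in range(self.steps_per_action):
            self.engine.next_step()
        self.t += self.steps_per_action
        obs = self._obs()
        reward = -float(sum(obs))  # fewer waiting vehicles is better
        done = self.t >= self.horizon
        return obs, reward, done, {}


env = SignalEnv(FakeEngine, junction_ids=[0], steps_per_action=2, horizon=4)
obs = env.reset()
obs, reward, done, info = env.step([1])
print(obs, reward, done)
```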

3.2 Controllable Objects

To support the major transportation system optimization scenarios, we set up the following APIs for the controllable objects of the transportation infrastructure and traffic participants, where the simulator instance in Python is always labeled with engine.

Traffic Signal. The simulator allows the user to set the traffic signal control policies for given junctions via engine.set_tl_policy(id, policy). The policy enumeration includes MANUAL, FIXED_TIME, MAX_PRESSURE and NONE. Under the MANUAL policy, the user can change the current phase and duration of the signal via engine.set_tl_phase(id, phase_index) and engine.set_tl_duration(id, duration). The FIXED_TIME policy indicates that the fixed phase procedure built into the map data is used. The MAX_PRESSURE policy indicates that the adaptive maximum pressure algorithm [31] is used. The NONE policy indicates that there is no signaling.
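For intuition about what a pressure-based policy computes, the sketch below implements a simplified form of the maximum pressure rule: each phase is scored by the summed difference between upstream and downstream queue lengths over the movements it releases, and the highest-scoring phase is selected. This is a textbook-style simplification, not the simulator's built-in MAX_PRESSURE implementation, and the data layout (lane names, movement tuples) is hypothetical.

```python
def max_pressure_phase(phases, queue):
    """Return the index of the phase with the largest pressure.

    phases: list of phases, each a list of (in_lane, out_lane) movements
    queue:  dict mapping lane id to its current number of queued vehicles
    """
    def pressure(phase):
        return sum(queue[i] - queue[o] for i, o in phase)

    return max(range(len(phases)), key=lambda k: pressure(phases[k]))


phases = [
    [("north_in", "south_out")],  # phase 0 releases north-to-south traffic
    [("east_in", "west_out")],    # phase 1 releases east-to-west traffic
]
queue = {"north_in": 3, "south_out": 1, "east_in": 8, "west_out": 2}
print(max_pressure_phase(phases, queue))
```

Under the MANUAL policy, an index chosen this way could then be applied with engine.set_tl_phase(id, phase_index).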

Lane. Lanes in the simulator include both clearly marked lanes on the roadway and "virtual" lanes within junctions that connect the two roadways. For lanes, the user can first set their maximum speed via engine.set_lane_max_speed(id, max_speed). Secondly, the user can set whether the lane is restricted from passing via engine.set_lane_restriction(id, flag).

Road. To support dynamic changes in lane function combinations, the roadway is pre-configured with multiple lane function combination plans. Here, a lane function refers to whether a lane is used for going straight, turning left, or turning right. The user can set the road’s lane function plan via engine.set_road_lane_plan(id, plan_index).

Vehicle. The user can change a vehicle’s route and its destination position on the end lane via engine.set_vehicle_route(vehicle_id, route, end_lane, end_s).

In addition to these controllable objects, the user can also change the map before simulating to build optimization scenarios.

3.3 Metrics

To make it easier for users to calculate common microscopic and macroscopic metrics, we also provide the following metric APIs.

Lane Queue Length. Lane queue length is used to count the number of vehicles waiting to be released at the end of the lane, which is a microscopic metric often used as an input to traffic signal control algorithms. The metric is provided via engine.get_lane_waiting_at_end_vehicle_counts().

Road Traveling Time. Road traveling time indicates the time taken by vehicles to pass through the road under the current traffic flow on the road, which is a microscopic metric that directly shows how congested the road is. The metric is provided via engine.get_road_average_vehicle_speed().

Average Traveling Time. Average traveling time (ATT) is the average time taken by all vehicles to complete a trip. It is a commonly used macroscopic metric that directly reflects the overall efficiency of the transportation system. The metric is provided via engine.get_finished_vehicle_average_traveling_time().

Throughput. Throughput (TP) is used to indicate how many vehicles complete a trip in a given time period. It is also commonly used as a macroscopic metric for assessing the efficiency and capacity of a transportation system. The metric is provided via engine.get_finished_vehicle_count().
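Two small post-processing helpers show how raw getter values could be turned into the metrics described above. Both rest on assumptions that the text does not confirm: that the finished-vehicle count is cumulative over the simulation, and that the road speed getter yields an average speed in metres per second from which travel time follows as length divided by speed. The function names and the `min_speed` guard are hypothetical.

```python
def window_throughput(cumulative_counts, window):
    """Vehicles finished within each sliding window of `window` samples,
    given a series of cumulative finished-vehicle counts (assumption)."""
    return [
        cumulative_counts[i] - cumulative_counts[i - window]
        for i in range(window, len(cumulative_counts))
    ]


def road_travel_time(length_m, avg_speed_mps, min_speed=0.1):
    """Estimated time in seconds to traverse a road at its current average
    speed; `min_speed` avoids division by zero on a fully blocked road."""
    return length_m / max(avg_speed_mps, min_speed)


print(window_throughput([0, 2, 5, 9], window=2))
print(road_travel_time(500.0, 10.0))
```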

4 Transportation System Optimization Scenarios

Figure 2: The overview of the five transportation system optimization scenarios. (best view in color)

As shown in Figure 2, we choose three microscopic optimization scenarios and two macroscopic ones for benchmarking. The former focus on both junction-level and roadway-level transportation infrastructure control. The latter cover the pre-construction planning phase as well as the post-construction management phase.

Traffic Signal Control. Traffic signal control is the most convenient approach to optimize the transportation system, which is also the scenario where the AI optimization methods are most widely used in the research field of transportation. The approach adjusts the phase and duration of traffic signals at junctions to control the number of vehicles passing in different directions, making full use of road resources to reduce the time spent by vehicles in the transportation system. Therefore, the appropriate setting of signal phasing and timing taking into account the interactions between junctions will substantially affect the efficiency of the transportation system.

Dynamic Lane Assignment within Junctions. Dynamic lane assignment within junctions refers to the adaptive reallocation of lane functions, such as going straight, turning left, or turning right, across all lanes at the junctions based on real-time traffic conditions. For example, when the number of left-turning vehicles in a particular direction at a junction increases, the method will increase the number of lanes on the corresponding roadway used for left turns and decrease the number of lanes used for going straight, thereby decreasing the waiting time for vehicles at the junction. How to make the correct dynamic lane assignment based on the current situation and predictions of the future is an important transportation system optimization problem.

Tidal Lane Control. Tidal lanes are a classical traffic management strategy to manage the increased traffic pressure that is predominantly in one direction during morning and evening rush hours. This method increases roadway capacity and reduces congestion by redirecting lane usage. For instance, during the morning rush hour, more lanes might be designated for inbound traffic, while in the evening, the direction is reversed to accommodate outbound traffic. Thus, optimization of the timing and direction of tidal lane adjustments can improve commuting efficiency throughout the city.

Congestion Pricing. Congestion pricing is a macroscopic traffic management strategy that uses congestion charges for vehicles driving into specific areas or roads to control and reduce traffic flow, thereby improving traffic conditions. Through such pricing tactics, vehicles will change to routes with lower costs. From a global perspective, a good pricing strategy will balance the traffic flow and traffic pressure in different areas, and thus improve the overall traffic congestion situation.

Road Planning. Building new roads is the most direct way to increase the carrying capacity of the transportation system. Properly planning the location of new roads and their relationship to existing roads is a prerequisite for maximizing the return on investment. In this scenario, we consider a large set of potential new road candidates and use optimization approaches to identify the road combinations that best improve the efficiency of the overall transportation system under specific constraints such as total distance, total investment, and number of roads.

5 Experiments

Simulator Performance. To illustrate the computational performance of our proposed simulator, we compared the computational efficiency of SUMO, CityFlow, CBLab, and our proposed simulator for different road network scales and numbers of vehicles.

Figure 3: The performance comparison with the number of vehicles. (best view in color)

We adopt the datasets from CBLab [19], released under the MIT License, which include 6 real-world city datasets and 9 synthesized datasets. We simulate 3600 steps for all the datasets and record the total running time as the performance of each simulator. All simulations are conducted in the same hardware environment with an Intel(R) Xeon(R) Platinum 8462Y CPU (64 threads) and an NVIDIA GeForce RTX 4090 GPU. As shown in Figure 3, the results indicate that our proposed simulator achieves a large performance improvement over existing simulators. On the largest dataset, the running time of ours is 42.81s and that of the best baseline (CityFlow) is 3806.7s, a relative performance improvement of 88.92 times.
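The iteration frequency and speedup quoted in the abstract can be cross-checked from the step count and the two running times on the largest dataset:

```python
steps = 3600           # simulated steps per run
ours_s = 42.81         # running time of the proposed simulator, in seconds
baseline_s = 3806.7    # running time of the best baseline (CityFlow), in seconds

frequency_hz = steps / ours_s
speedup = baseline_s / ours_s
print(f"{frequency_hz:.2f} Hz, {speedup:.2f}x")
```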


To benchmark the optimization algorithms for the five transportation system optimization scenarios described above, we chose Beijing, Shanghai, Paris, and New York as test cities. The road networks of these cities are built using our mosstool toolchain. The real origin-destination (OD) matrices of these cities are also generated by mosstool using generative AI methods. For these synthetic datasets, we kept only the morning and evening peaks of vehicle departure times to challenge the optimization algorithms in each scenario. The total number of vehicles was scaled based on the generated real OD matrix to construct travel demand data for three congestion levels: smooth (marked as City-S), normal (marked as City-N), and congested (marked as City-C). More information on the synthetic datasets is provided in Appendix A. We evaluate the optimization effectiveness of different algorithms under the above cities and congestion levels, using ATT and TP as global metrics for comparison. In the following text, we report the comparison of the optimization algorithms used in all five scenarios and their performance under normal congestion. Due to the page limit, the detailed experimental settings and the complete results are presented in Appendix B and Appendix C respectively.

Traffic Signal Control Benchmark. In this scenario, the task is to choose the best traffic signal phase from the list of available phases for each junction. We compared rule-based algorithms, including the fixed-time algorithm [15] and the maximum pressure algorithm [31], with reinforcement learning-based algorithms, including FRAP [42], MPLight [3], CoLight [35], Efficient-MPLight [38], Advanced-MPLight and Advanced-CoLight [40], as well as a pressure-based model trained with PPO [27]. The related algorithms are trained and tested in the morning rush hour scenario, from 7:00 to 10:00. The results are presented in Table 2.

Table 2: The benchmark results for the traffic signal control scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
FixedTime 4843.05 120049 4324.07 131020 4245.86 60020 4682.07 72725
MaxPressure 4580.41 132055 4045.75 144640 3984.20 64405 4309.91 83927
FRAP 5105.22 112321 4671.52 121896 4404.23 58481 5002.48 68196
MPLight 4790.12 124991 4674.93 122198 3980.07 64921 4196.37 85331
CoLight 5108.88 112672 4640.50 124565 4413.91 58513 4989.17 68184
Efficient-MPLight 5101.14 113272 4480.28 132995 4364.89 60428 5019.92 67303
Advanced-MPLight 5049.47 117568 4603.57 127911 4281.72 61470 5031.15 67322
Advanced-CoLight 5107.23 112778 4661.07 123336 4408.14 58929 5014.66 68292
PPO 4452.05 136630 4143.19 141768 4017.61 64277 4254.92 84411

Dynamic Lane Assignment within Junctions Benchmark. In this scenario, the task is to assign the direction, e.g. left or straight, for the in-going lanes of each junction. We compared the following methods: 1) NoChange, where we do not change the direction of the lane and leave it as it is, 2) Random, where we randomly change the direction in every period, 3) Rule, where we estimate the number of vehicles going in each direction and choose the direction with the maximum number of vehicles, 4) PPO, where we use a PPO-trained model to estimate the number of vehicles. The above algorithms are trained and tested in the morning rush hour scenario, from 7:00 to 10:00. The results are presented in Table 3.

Table 3: The benchmark results for the dynamic lane assignment scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
NoChange 4846.70 119890 4322.21 131145 4245.84 60020 4674.08 72870
Random 4839.71 120338 4324.96 131216 4176.70 61810 4636.11 74055
Rule 4761.33 123673 4258.22 133346 4155.11 62366 4615.01 74254
PPO 4769.98 122929 4256.51 133379 4160.89 61792 4614.52 73907

Tidal Lane Control Benchmark. In this scenario, the task is to switch the direction of the tidal lane to be forward or backward. We compared the following methods: 1) NoChange, where we disable the tidal lane, 2) Random, where we randomly change the direction in every period, 3) Rule, where we count the number of vehicles going in each direction and choose the direction with the maximum number of vehicles, 4) PPO, where we use a PPO-trained model to estimate the number of vehicles. The above algorithms are trained and tested in the morning rush hour scenario, from 7:00 to 10:00. The results are presented in Table 4.
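The Rule baseline amounts to a per-period comparison of directional counts. A minimal sketch follows; the function name, the string labels, and the tie-breaking behavior (keep the current direction when counts are equal) are assumptions, since the text does not specify them.

```python
def tidal_direction(forward_count, backward_count, current="forward"):
    """Pick the tidal lane direction serving the heavier traffic flow.
    On a tie, keep the current direction to avoid needless switching."""
    if forward_count > backward_count:
        return "forward"
    if backward_count > forward_count:
        return "backward"
    return current


print(tidal_direction(120, 45))
print(tidal_direction(30, 30, current="backward"))
```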

Table 4: The benchmark results for the tidal lane control scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
NoChange 4844.57 120105 4334.80 130604 4224.94 60794 4675.10 72834
Random 4827.75 120778 4338.40 130283 4216.17 60946 4665.83 73335
Rule 4823.40 120901 4313.48 131315 4192.85 61636 4638.03 74284
PPO 4820.48 120936 4304.29 132167 4187.37 61738 4628.47 74756

Congestion Pricing Benchmark. In this scenario, each driver has three candidate routes and the task is to set the price of each road to motivate drivers to choose the route that avoids congested areas. We compared Δ-toll [28] and EBGtoll [26] with two baselines: 1) NoChange, where we do not set the prices, 2) Random, where the drivers randomly choose a route. The above algorithms are trained and tested in the morning rush hour scenario, from 7:00 to 10:00. The results are presented in Table 5.

Table 5: The benchmark results for the congestion pricing scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
NoChange 4840.18 120207 4328.34 130621 4239.23 60100 4681.75 72765
Random 5190.37 105640 4422.00 131346 4144.47 62284 4830.86 68976
Δ-toll 4667.77 131747 4096.62 147533 4040.68 65024 4549.26 78182
EBGtoll 5637.30 80476 4637.64 116624 4240.82 60362 5096.60 59154
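
When reading the table, it can help to express each method as a relative change against the NoChange baseline. The short sketch below does this arithmetic for the Beijing-N column of Table 5. The helper name `relative_change` and the dictionary layout are ours, chosen for illustration, and are not part of the released code.

```python
def relative_change(value: float, baseline: float) -> float:
    """Signed relative change of `value` with respect to `baseline`."""
    return (value - baseline) / baseline


# Beijing-N values copied from Table 5 as (ATT, TP) pairs.
beijing_n = {
    "NoChange": (4840.18, 120207),
    "Random": (5190.37, 105640),
    "Delta-toll": (4667.77, 131747),
    "EBGtoll": (5637.30, 80476),
}

base_att, base_tp = beijing_n["NoChange"]
for method, (att, tp) in beijing_n.items():
    d_att = 100 * relative_change(att, base_att)
    d_tp = 100 * relative_change(tp, base_tp)
    # A negative ATT change and a positive TP change are both improvements.
    print(f"{method:>10}: ATT {d_att:+.2f}%  TP {d_tp:+.2f}%")
```

For this column, Δ-toll lowers ATT by roughly 3.6% and raises TP by roughly 9.6% relative to NoChange.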

Road Planning Benchmark. In this scenario, the algorithms are asked to select at most 30 roads from 50 candidates for construction to minimize post-construction ATT. We compared five methods: 1) NoChange, where none of the 50 roads are built, 2) Random, where we select random roads to build, 3) Rule-based, where we build the 30 roads with the highest vehicle counts, 4) simulated annealing (SA) [14], 5) Bayesian optimization (GeneralBO) [6]. The above algorithms are tested on both the morning peak from 6:00 to 12:00 and the evening peak from 17:00 to 23:00, and the metrics are averaged over the two periods. The results are presented in Table 6.

Table 6: The benchmark results for the road planning scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
No-Change 8439.44 161722 6699.66 161722 7193.90 76327 7892.50 105533
Random 8304.71 163148 6567.12 181063 7106.19 75839 7967.24 102996
Rule 8235.49 164247 6570.72 181504 7177.88 76176 7956.08 102951
SA 8332.44 163660 6590.66 180954 7154.16 76164 7871.23 105507
GeneralBO 8242.19 164182 6721.78 178790 7161.74 75979 7759.88 106883
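
To make the road planning search concrete, the following is a minimal sketch of simulated annealing over subsets of at most 30 of the 50 candidate roads. It is not the released implementation: the neighbourhood move, the geometric cooling schedule, the default parameters, and the toy objective `toy_att` are all assumptions made for illustration. In the benchmark, evaluating a selection would require a full simulation run that returns the ATT.

```python
import math
import random
from typing import Callable, FrozenSet, Sequence


def anneal_road_selection(
    candidates: Sequence[int],
    evaluate_att: Callable[[FrozenSet[int]], float],
    max_roads: int = 30,
    steps: int = 500,
    t_start: float = 50.0,
    t_end: float = 0.5,
    seed: int = 0,
) -> FrozenSet[int]:
    """Search for a set of at most `max_roads` roads with low ATT."""
    rng = random.Random(seed)
    pool = list(candidates)
    current = frozenset(rng.sample(pool, min(max_roads, len(pool))))
    current_att = evaluate_att(current)
    best, best_att = current, current_att
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        road = rng.choice(pool)
        if road in current:
            proposal = current - {road}              # drop a selected road
        elif len(current) < max_roads:
            proposal = current | {road}              # add a road if there is room
        else:
            dropped = rng.choice(sorted(current))    # otherwise swap one out
            proposal = (current - {dropped}) | {road}
        att = evaluate_att(proposal)
        delta = att - current_att
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_att = proposal, att
            if att < best_att:
                best, best_att = proposal, att
    return best


# Toy stand-in for the simulator: each road removes a fixed number of seconds.
savings = {road: float(road % 7) for road in range(50)}


def toy_att(selected: FrozenSet[int]) -> float:
    return 8000.0 - sum(savings[r] for r in selected)


best = anneal_road_selection(range(50), toy_att)
print(len(best), toy_att(best))
```

Because each evaluation is expensive in the real setting, the number of steps is the main cost driver, which is one reason sample-efficient methods such as Bayesian optimization are included as a comparison.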

6 Conclusion

In this paper, we propose a high-performance large-scale microscopic simulator powered by GPU for transportation system simulation and optimization. We also benchmarked the effect of different optimization algorithms on five transportation system optimization scenarios with different traffic flows in four cities. Interested researchers can use the same pipeline to benchmark most cities around the world with our open-source simulator and toolchain. We believe that the proposed simulator will encourage more researchers to work on urban transportation system optimization. We hope that this will not only support more research on transportation system optimization scenarios, but also promote the development of urban transportation systems towards AI-driven intelligent transportation systems.

References

  • [1] Michael Behrisch, Laura Bieker, Jakob Erdmann, and Daniel Krajzewicz. Sumo–simulation of urban mobility: an overview. In Proceedings of SIMUL 2011, The Third International Conference on Advances in System Simulation. ThinkMind, 2011.
  • [2] Hamid Mirzaei Buini, Guni Sharon, Stephen D. Boyles, Tony Givargis, and Peter Stone. Enhanced delta-tolling: Traffic optimization via policy gradient reinforcement learning. 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pages 47–52, 2018.
  • [3] C. Chen, H. Wei, N. Xu, G. Zheng, M. Yang, Y. Xiong, K. Xu, and Z. Li. Toward a thousand lights: Decentralized deep reinforcement learning for large-scale traffic signal control. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 3414–3421, 2020.
  • [4] Haipeng Chen, Bo An, Guni Sharon, Josiah P. Hanna, Peter Stone, Chunyan Miao, and Yeng Chai Soh. Dyetc: Dynamic electronic toll collection for traffic congestion alleviation. In AAAI Conference on Artificial Intelligence, 2018.
  • [5] Alberto Costa and Giacomo Nannicini. Rbfopt: an open-source library for black-box optimization with costly function evaluations. Mathematical Programming Computation, 10:597–629, 2018.
  • [6] Alexander I Cowen-Rivers, Wenlong Lyu, Zhi Wang, Rasul Tutunov, Hao Jianye, Jun Wang, and Haitham Bou Ammar. Hebo: Heteroscedastic evolutionary bayesian optimisation. arXiv preprint arXiv:2012.03826, page 7, 2020.
  • [7] Shuo Feng, Xintao Yan, Haowei Sun, Yiheng Feng, and Henry X Liu. Intelligent driving intelligence test for autonomous vehicles with naturalistic and adversarial environment. Nature communications, 12(1):748, 2021.
  • [8] Christian Gawron. An iterative algorithm to determine the dynamic user equilibrium in a traffic simulation model. International Journal of Modern Physics C, 9(03):393–407, 1998.
  • [9] PTV Group. Transport Planning Software | PTV Visum. https://www.ptvgroup.com/en/products/ptv-visum. Accessed: 2024-06-03.
  • [10] Nikolaus Hansen, Anne Auger, Raymond Ros, Steffen Finck, and Petr Pošík. Comparing results of 31 algorithms from the black-box optimization benchmarking bbob-2009. In Proceedings of the 12th annual conference companion on Genetic and evolutionary computation, pages 1689–1696, 2010.
  • [11] Qize Jiang, Jingze Li, Weiwei Sun, and Baihua Zheng. Dynamic lane traffic signal control with group attention and multi-timescale reinforcement learning. In International Joint Conference on Artificial Intelligence, 2021.
  • [12] Arne Kesting, Martin Treiber, and Dirk Helbing. General lane-changing model mobil for car-following models. Transportation Research Record, 1999(1):86–94, 2007.
  • [13] Leonard Kirago, Michael J Gatari, Örjan Gustafsson, and August Andersson. Black carbon emissions from traffic contribute substantially to air pollution in nairobi, kenya. Communications Earth & Environment, 3(1):74, 2022.
  • [14] Scott Kirkpatrick, C Daniel Gelatt Jr, and Mario P Vecchi. Optimization by simulated annealing. science, 220(4598):671–680, 1983.
  • [15] P. Koonce and L. Rodegerdts. Traffic signal timing manual. Technical report, United States. Federal Highway Administration, 2008.
  • [16] Lili Li, Zhao wei Qu, Xian min Song, and Dianhai Wang. Research on variable lane signalized control method. 2009 International Conference on Measuring Technology and Mechatronics Automation, 3:575–578, 2009.
  • [17] Tao Li, Nengmin Wang, Meng Zhang, and Zheng wen He. Dynamic reversible lane optimization in autonomous driving environments: Balancing efficiency and safety. Journal of Industrial and Management Optimization, 2023.
  • [18] Xu Li, Jun-Hua Chen, and Hao Wang. Study on flow direction changing method of reversible lanes on urban arterial roadways in china. Procedia - Social and Behavioral Sciences, 96:807–816, 2013.
  • [19] Chumeng Liang, Zherui Huang, Yicheng Liu, Zhanyu Liu, Guanjie Zheng, Hanyuan Shi, Kan Wu, Yuhao Du, Fuliang Li, and Zhenhui Jessie Li. Cblab: Supporting the training of large-scale traffic control policies with scalable traffic simulation. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 4449–4460, 2023.
  • [20] Yiling Liu, Guiyang Luo, Quan Yuan, Jinglin Li, Lei Jin, Bo Chen, and Rui Pan. Gplight: grouped multi-agent reinforcement learning for large-scale traffic signal control. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pages 199–207, 2023.
  • [21] HS Mahmassani. Dynamic traffic assignment and simulation for advanced network informatics (dynasmart). In the 2nd International Seminar on Urban Traffic Networks, 1992, 1992.
  • [22] Michael G McNally. The four-step model. In Handbook of transport modelling, volume 1, pages 35–53. Emerald Group Publishing Limited, 2007.
  • [23] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013.
  • [24] Afshin Oroojlooy, M. Nazari, Davood Hajinezhad, and Jorge Silva. Attendlight: Universal attention-based reinforcement learning model for traffic signal control. Advances in Neural Information Processing Systems, 2020.
  • [25] Venktesh Pandey and Stephen D. Boyles. Multiagent reinforcement learning algorithm for distributed dynamic pricing of managed lanes. 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pages 2346–2351, 2018.
  • [26] Wei Qiu, Haipeng Chen, and Bo An. Dynamic electronic toll collection via multi-agent deep reinforcement learning with edge-based graph convolutional networks. In IJCAI, pages 4568–4574, 2019.
  • [27] John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
  • [28] Guni Sharon, Michael W Levin, Josiah P Hanna, Tarun Rambha, Stephen D Boyles, and Peter Stone. Network-wide adaptive tolling for connected and automated vehicles. Transportation Research Part C: Emerging Technologies, 84:142–157, 2017.
  • [29] Martin Treiber, Ansgar Hennecke, and Dirk Helbing. Congested traffic states in empirical observations and microscopic simulations. Physical review E, 62(2):1805, 2000.
  • [30] Martin Treiber, Arne Kesting, and Christian Thiemann. How much does traffic congestion increase fuel consumption and emissions? applying a fuel consumption model to the ngsim trajectory data. In 87th Annual Meeting of the Transportation Research Board, Washington, DC, volume 71, pages 1–18, 2008.
  • [31] Pravin Varaiya. Max pressure control of a network of signalized intersections. Transportation Research Part C: Emerging Technologies, 36:177–195, 2013.
  • [32] Kay W Axhausen, Andreas Horni, and Kai Nagel. The multi-agent transport simulation MATSim. Ubiquity Press, 2016.
  • [33] Qi Wang, Haixia Feng, Haiying Feng, Yue Yu, Jian Li, and Erwei Ning. The impacts of road traffic on urban air quality in Jinan based gwr and remote sensing. Scientific reports, 11(1):15512, 2021.
  • [34] Yiheng Wang, Hexi Jin, and Guanjie Zheng. Ctrl: Cooperative traffic tolling via reinforcement learning. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022.
  • [35] Hua Wei, Nan Xu, Huichu Zhang, Guanjie Zheng, Xinshi Zang, Chacha Chen, Weinan Zhang, Yanmin Zhu, Kai Xu, and Zhenhui Li. Colight: Learning network-level cooperation for traffic signal control. In Proceedings of the 28th ACM international conference on information and knowledge management, pages 1913–1922, 2019.
  • [36] Qiang Wu, Ming Li, Jun Shen, Linyuan Lü, Bo Du, and Kecheng Zhang. Transformerlight: A novel sequence modeling based traffic signaling mechanism via gated transformer. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023.
  • [37] Qiang Wu, Mingyuan Li, Jun Shen, Linyuan Lü, Bo Du, and Ke Zhang. Transformerlight: A novel sequence modeling based traffic signaling mechanism via gated transformer. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 2639–2647, 2023.
  • [38] Qiang Wu, Liang Zhang, Jun Shen, Linyuan Lü, Bo Du, and Jianqing Wu. Efficient pressure: Improving efficiency for signalized intersections. arXiv preprint arXiv:2112.02336, 2021.
  • [39] Huichu Zhang, Siyuan Feng, Chang Liu, Yaoyao Ding, Yichen Zhu, Zihan Zhou, Weinan Zhang, Yong Yu, Haiming Jin, and Zhenhui Li. Cityflow: A multi-agent reinforcement learning environment for large scale city traffic scenario. In The world wide web conference, pages 3620–3624, 2019.
  • [40] Liang Zhang, Qiang Wu, Jun Shen, Linyuan Lü, Bo Du, and Jianqing Wu. Expression might be enough: representing pressure and demand for reinforcement learning based traffic signal control. In International Conference on Machine Learning, pages 26645–26654. PMLR, 2022.
  • [41] Zuoting Zhang and Suhua Tang. Enhancing urban road network by combining route planning and dynamic lane reversal. 2021 Thirteenth International Conference on Mobile Computing and Ubiquitous Network (ICMU), pages 1–6, 2021.
  • [42] Guanjie Zheng, Yuanhao Xiong, Xinshi Zang, Jie Feng, Hua Wei, Huichu Zhang, Yong Li, Kai Xu, and Zhenhui Li. Learning phase competition for traffic signal control. In Proceedings of the 28th ACM international conference on information and knowledge management, pages 1963–1972, 2019.
  • [43] Lihua Zhou, Juanjuan Li, and Kangkang Ding. Research on variable lane control method based on traffic priority. In International Conferences on Artificial Intelligence, Information Processing and Cloud Computing, 2019.

Checklist

  1. For all authors…
    (a) Do the main claims made in the abstract and introduction accurately reflect the paper’s contributions and scope? [Yes]
    (b) Did you describe the limitations of your work? [No]
    (c) Did you discuss any potential negative societal impacts of your work? [No] We do not think that the traffic simulator will have any negative societal impacts.
    (d) Have you read the ethics review guidelines and ensured that your paper conforms to them? [Yes]

  2. If you are including theoretical results…
    (a) Did you state the full set of assumptions of all theoretical results? [N/A] There is no theoretical result in the paper.
    (b) Did you include complete proofs of all theoretical results? [N/A] There is no theoretical result in the paper.

  3. If you ran experiments (e.g. for benchmarks)…
    (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Provided in our codes URL.
    (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] Provided in Appendix A and Appendix B.
    (c) Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)? [No]
    (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No]

  4. If you are using existing assets (e.g., code, data, models) or curating/releasing new assets…
    (a) If your work uses existing assets, did you cite the creators? [Yes] We use CBLab datasets for performance evaluation and already cite it in the paper.
    (b) Did you mention the license of the assets? [Yes]
    (c) Did you include any new assets either in the supplemental material or as a URL? [Yes] We provide all the scenario codes and our synthetic datasets in the Github URL.
    (d) Did you discuss whether and how consent was obtained from people whose data you’re using/curating? [N/A] The data are generated from public data sources.
    (e) Did you discuss whether the data you are using/curating contains personally identifiable information or offensive content? [N/A] The data are generated without any personally identifiable information or offensive content.

  5. If you used crowdsourcing or conducted research with human subjects…
    (a) Did you include the full text of instructions given to participants and screenshots, if applicable? [N/A] No crowdsourcing was used and no human subjects were included.
    (b) Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable? [N/A] No crowdsourcing was used and no human subjects were included.
    (c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation? [N/A] No crowdsourcing was used and no human subjects were included.

Supplementary material

We present the following items in the supplementary material section:

  1. Datasets for transportation system optimization benchmarking. (Section A)
  2. The settings of transportation system optimization scenarios. (Section B)
  3. Complete benchmark results. (Section C)

Appendix A Datasets for Transportation System Optimization Benchmarking

In this section, we describe in detail the process of constructing the datasets used for transportation system optimization benchmarking. We chose four representative large cities around the world, Beijing, Shanghai, Paris and New York, as our targets. We use OpenStreetMap (OSM, https://openstreetmap.org/) as the data source for map construction, and a diffusion model that takes publicly available data such as satellite imagery as input to generate realistic travel origin-destination (OD) matrices as travel demands.

Map Building. First, based on mosstool, we selected the bounding boxes shown in Table 7 for each city to build the map.

Table 7: Geometry bounding boxes of four cities.
Bounding Box maximum latitude minimum latitude maximum longitude minimum longitude
Beijing 40.131 39.771 116.626 116.158
Shanghai 31.389 31.100 121.676 121.313
Paris 48.949 48.745 2.514 2.131
New York 40.941 40.567 -73.697 -74.058

Specifically, we first use code like https://github.com/tsinghua-fib-lab/mosstool/blob/main/examples/map_osm2geojson.py to convert OSM data within the bounding boxes to GeoJSON format, and then use code like https://github.com/tsinghua-fib-lab/mosstool/blob/main/examples/build_map.py to build maps from the GeoJSON data. The statistics of the maps of the four cities are shown in Table 8.

Table 8: Statistics of the four maps.
Statistics # of roads # of junctions
Beijing 25945 11953
Shanghai 14837 6270
Paris 14411 6588
New York 19046 8339

Realistic OD Matrix Generation. In order to generate realistic travel demands of the four cities, we perform OD matrix generation based on a diffusion model that has been pre-trained in several regions around the world. The diffusion model is also provided in mosstool.

Obtaining Travel Demand of Different Congestion Levels. In order not to introduce too many variables, we assume that all trips are made by car. In addition, to better represent commuting traffic, we assume that the departure times of all vehicles are limited to the morning and evening peaks, and we scale the total traffic volume to obtain the travel demand under different congestion levels. For the morning peak, we draw each vehicle’s arrival time from a uniform distribution between 8 o’clock and 9 o’clock to create the morning peak flows. We then subtract the estimated travel time from the arrival time to get the departure time. The estimated travel time is calculated by dividing the route length by the vehicle speed, which is set to 60 km/h in our experiments.

Similarly, we create an evening peak group by exchanging the origin and destination of individuals from the morning peak flows. Their departure times are set to be uniformly distributed between 17 o’clock and 18 o’clock.
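
The departure-time construction above can be sketched as follows. This is a minimal illustration under the stated assumptions (uniform arrival between 8 and 9 o'clock, constant 60 km/h speed, evening trips obtained by swapping origin and destination). The function names and the use of seconds since midnight are our own conventions, not those of the released toolchain.

```python
import random

SPEED_KMH = 60.0  # constant speed used for the travel time estimate


def estimated_travel_time_s(route_length_m: float) -> float:
    """Route length divided by speed, in seconds."""
    return route_length_m / (SPEED_KMH / 3.6)


def morning_departure_s(route_length_m: float, rng: random.Random) -> float:
    """Sample an arrival time in [8:00, 9:00] and back off by the travel time."""
    arrival_s = rng.uniform(8 * 3600, 9 * 3600)
    return arrival_s - estimated_travel_time_s(route_length_m)


def evening_trip(origin: str, destination: str, rng: random.Random):
    """Reverse a morning trip and sample its departure in [17:00, 18:00]."""
    departure_s = rng.uniform(17 * 3600, 18 * 3600)
    return destination, origin, departure_s


rng = random.Random(42)
print(estimated_travel_time_s(10_000))    # a 10 km route takes about 600 s
print(morning_departure_s(10_000, rng))
print(evening_trip("home", "work", rng))
```

Note the asymmetry in the description: the morning peak fixes the arrival time and derives the departure, while the evening peak samples the departure time directly.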

After completing the above steps, we assign each vehicle the shortest-time route and remove the vehicles that are unable to reach their destinations. For each city, we scale the number of vehicles and observe the arrival rate, i.e. the fraction of vehicles that successfully reach their destinations, to construct datasets with different congestion levels.

The arrival rate of the dataset is determined as the minimum rate between the morning and evening peak periods. Specifically, an arrival rate of 80% is considered congested, 90% is considered normal, and 95% is considered smooth. Based on the above rates, we construct the travel demand datasets under different congestion levels in the four cities, and the relevant statistics are shown in Table 9.

Table 9: # of trips of datasets.
Congestion Level Smooth Normal Congested
Beijing 350838 439280 571412
Shanghai 348880 436888 612160
Paris 154276 202664 251236
New York 218712 262706 306078
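
The congestion-level construction can be illustrated with a small sketch. The arrival-rate definition and the 80%/90%/95% targets come from the text above. The bisection search over the demand scale, the assumption that the arrival rate decreases as demand grows, and the closed-form `toy_rate` that stands in for a simulation run are our own assumptions, since the text does not state how the scale is searched.

```python
from typing import Callable

TARGET_RATES = {"congested": 0.80, "normal": 0.90, "smooth": 0.95}


def dataset_arrival_rate(
    morning_arrived: int, morning_total: int,
    evening_arrived: int, evening_total: int,
) -> float:
    """Minimum of the morning and evening arrival rates."""
    return min(morning_arrived / morning_total, evening_arrived / evening_total)


def nearest_level(rate: float) -> str:
    """Congestion level whose target rate is closest to `rate`."""
    return min(TARGET_RATES, key=lambda lvl: abs(TARGET_RATES[lvl] - rate))


def find_scale(
    arrival_rate_at: Callable[[float], float],
    target: float,
    lo: float = 0.1,
    hi: float = 2.0,
    iters: int = 20,
) -> float:
    """Bisection on the demand scale, assuming the rate falls as demand grows."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if arrival_rate_at(mid) > target:
            lo = mid   # network still has slack: add vehicles
        else:
            hi = mid   # too congested: remove vehicles
    return (lo + hi) / 2


def toy_rate(scale: float) -> float:
    """Hypothetical stand-in for running the simulator at a given scale."""
    return 1.0 / (1.0 + 0.25 * scale ** 2)


print(dataset_arrival_rate(90, 100, 85, 100))
print(nearest_level(0.91))
print(find_scale(toy_rate, TARGET_RATES["normal"]))
```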

Appendix B The Settings of Transportation System Optimization Scenarios

All experiments are conducted in the same hardware environment with an Intel(R) Xeon(R) Platinum 8462Y CPU (64 threads) and an NVIDIA GeForce RTX 4090 GPU. The training time varies across different scenarios. The optimal hyper-parameters are grid-searched and hard-coded into the released code. Please refer to the release files for detailed hyper-parameter settings.

B.1 Traffic Signal Control.

Scenario. There are multiple junctions in the road network with traffic signals to be controlled. Each junction has a list of available traffic signal phases predefined according to the geometry of the junction, like the number and direction of the incoming and outgoing lanes. Every T = 30 seconds, the agent has to choose one phase from the list to be applied in the next period.

Observation. The observation includes the geometry of the junction and the number of (all/waiting) vehicles on each lane.

Action. Choose one phase from the given list.

Reward. Opposite of the average number of waiting vehicles on the incoming lanes.

Training. The learning-based methods are all trained for 4 hours.
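
As a rough illustration of the interface described above, the sketch below computes the reward from per-lane waiting counts and picks a phase index from a predefined list. The reward follows the definition in the text. The greedy phase selector is only an illustrative policy and is not one of the benchmarked methods, and representing a phase as a list of lane identifiers is an assumption of this sketch.

```python
from typing import Mapping, Sequence

CONTROL_PERIOD_S = 30  # T in the scenario description


def signal_reward(waiting_per_incoming_lane: Sequence[int]) -> float:
    """Negative of the average number of waiting vehicles on incoming lanes."""
    if not waiting_per_incoming_lane:
        return 0.0
    return -sum(waiting_per_incoming_lane) / len(waiting_per_incoming_lane)


def greedy_phase(
    phases: Sequence[Sequence[str]], waiting: Mapping[str, int]
) -> int:
    """Index of the phase whose green lanes hold the most waiting vehicles."""
    return max(
        range(len(phases)),
        key=lambda i: sum(waiting.get(lane, 0) for lane in phases[i]),
    )


# Hypothetical junction with two phases: north-south green or east-west green.
phases = [["n_in", "s_in"], ["e_in", "w_in"]]
waiting = {"n_in": 1, "s_in": 2, "e_in": 5, "w_in": 0}
print(greedy_phase(phases, waiting))           # action applied for the next 30 s
print(signal_reward(list(waiting.values())))   # reward observed for the period
```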

B.2 Dynamic Lane Assignment within Junctions.

Scenario. There are multiple roads in the road network with dynamic lanes at the end where the roads connect to junctions. Each road has exactly one dynamic lane whose direction can be either LEFT or STRAIGHT. Every T = 30 seconds, the agent has to assign the direction of the dynamic lane.

Observation. The observation includes the geometry of the junction and the number of (all/waiting) vehicles on each lane.

Action. Choose one of the two directions.

Reward. Opposite of the average number of waiting vehicles on the lanes of the road.

Training. The learning-based methods are all trained for 3 hours.
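
The Rule baseline for this scenario, described in the main text as choosing the direction with the larger estimated number of vehicles, reduces to a comparison of two counts. The sketch below assumes the per-direction demand estimates are already available. Keeping the current direction on a tie is our own choice, as the text does not specify tie-breaking.

```python
def assign_dynamic_lane(
    left_demand: int, straight_demand: int, current: str = "STRAIGHT"
) -> str:
    """Pick LEFT or STRAIGHT for the dynamic lane from per-direction demand."""
    if left_demand > straight_demand:
        return "LEFT"
    if straight_demand > left_demand:
        return "STRAIGHT"
    return current  # tie: keep the current direction (an assumption)


print(assign_dynamic_lane(left_demand=12, straight_demand=7))
print(assign_dynamic_lane(left_demand=4, straight_demand=4, current="LEFT"))
```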

B.3 Tidal Lane Control.

Scenario. There are multiple road pairs in the road network with tidal lanes. Each road pair has exactly one tidal lane in the center whose direction can be either FORWARD or BACKWARD. Every T = 180 seconds, the agent has to choose the direction of the tidal lane.

Observation. The observation includes the geometry of the road and the number of (all/waiting) vehicles on each lane.

Action. Choose one of the two directions.

Reward. Opposite of the average number of waiting vehicles on the lanes of the road.

Training. The learning-based methods are all trained for 3 hours.
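
To give a sense of the decision cadence, the sketch below applies the counting rule from the main text once per control period and computes how many decisions fall into the 7:00 to 10:00 evaluation window. The input format (one pair of forward and backward vehicle counts per period) and the tie handling are assumptions of this sketch.

```python
from typing import Iterable, List, Tuple

PERIOD_S = 180  # T in the scenario description


def tidal_schedule(
    counts: Iterable[Tuple[int, int]], initial: str = "FORWARD"
) -> List[str]:
    """Direction chosen in each period from (forward, backward) vehicle counts."""
    direction = initial
    schedule = []
    for forward, backward in counts:
        if forward > backward:
            direction = "FORWARD"
        elif backward > forward:
            direction = "BACKWARD"
        # On a tie the previous direction is kept (an assumption).
        schedule.append(direction)
    return schedule


def decisions_in_window(start_h: int, end_h: int, period_s: int = PERIOD_S) -> int:
    """Number of control decisions between two whole hours."""
    return (end_h - start_h) * 3600 // period_s


print(tidal_schedule([(5, 2), (3, 3), (1, 4)]))
print(decisions_in_window(7, 10))
```

With T = 180 seconds, a three-hour window yields 60 decisions per tidal lane, compared with 360 for the scenarios that act every 30 seconds.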

B.4 Congestion Pricing.

Scenario. All the roads in the road network can be set with a congestion price for vehicles traveling through it. Every T = 20 seconds, the agent can change the prices according to the traffic condition.

Observation. The observation includes the geometry of the road network and the number of (all/waiting) vehicles on each lane.

Action. Set the prices for each road.

Reward. The number of finished vehicles in the past period.

Training. The learning-based methods are all trained for 3 hours.
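
As an illustration of a pricing action, the sketch below shows a delay-proportional toll update in the spirit of Δ-tolling [28], as we understand that rule: the toll on a road tracks the gap between its current and free-flow travel time, smoothed over successive updates. The parameter values, the function names, and the driver model that picks the cheapest of its candidate routes by time plus toll are assumptions made for this example and are not taken from the released code.

```python
from typing import Sequence


def delta_toll_update(
    prev_toll: float,
    travel_time_s: float,
    free_flow_time_s: float,
    beta: float = 1.0,        # illustrative proportionality constant
    smoothing: float = 0.5,   # illustrative weight on the new observation
) -> float:
    """Blend the previous toll with a toll proportional to the current delay."""
    delay = max(travel_time_s - free_flow_time_s, 0.0)
    return smoothing * beta * delay + (1.0 - smoothing) * prev_toll


def choose_route(
    route_times_s: Sequence[float],
    route_tolls: Sequence[float],
    value_of_time: float = 1.0,  # hypothetical conversion from seconds to cost
) -> int:
    """Index of the candidate route with the lowest time-plus-toll cost."""
    costs = [t * value_of_time + p for t, p in zip(route_times_s, route_tolls)]
    return min(range(len(costs)), key=costs.__getitem__)


toll = delta_toll_update(prev_toll=0.0, travel_time_s=120.0, free_flow_time_s=100.0)
print(toll)
# Three candidate routes: the fastest one is congested and therefore tolled.
print(choose_route([100.0, 110.0, 130.0], [30.0, 5.0, 0.0]))
```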

B.5 Road Planning.

Scenario. The road network contains multiple roads that were newly constructed during the past five years. Each of these roads has one of two statuses, KEEP or REMOVE. The algorithms observe the ATT and are asked to minimize it by setting each road’s status to KEEP or REMOVE.

Candidate Roads Identification. For each city, we extract the driving roads from the 2019 OSM data and match every road in our map to this 2019 road network. Each match is evaluated on three aspects: the distance between the two roads, the distance from the middle point of the road in our map, and the difference in highway level between the two roads. Any road that cannot be matched with any road in 2019 is identified as a newly constructed road and regarded as a candidate road. The spatial and length statistics of candidate roads are shown in Figure 4. We select the 50 roads with the highest number of vehicles from each candidate set as the optimization set for the algorithm.

Figure 4: The spatial and length distribution of all candidate roads.
Table 10: # of candidate roads.
Basic Statistics # of roads
Beijing 263
Shanghai 136
Paris 156
New York 612

Appendix C Complete Benchmark Results

C.1 Traffic Signal Control.

Table 11: The benchmark results for the traffic signal control scenario with smooth traffic conditions.
Method Beijing-S Shanghai-S Paris-S New York-S
ATT TP ATT TP ATT TP ATT TP
FixedTime 4426.51 109588 3878.57 120449 3749.65 52459 4369.44 67046
MaxPressure 4099.14 122737 3523.45 132028 3477.37 55123 3951.76 77018
FRAP 4741.04 102556 4261.76 113613 3920.69 51301 4725.86 63572
MPLight 4086.96 121620 3474.95 133044 3486.59 55946 3840.82 78125
CoLight 4757.27 102433 4221.91 115244 3922.72 51351 4714.33 63553
Efficient-MPLight 4529.54 110199 4013.16 121301 3953.45 51236 4390.86 69279
Advanced-MPLight 4750.34 102649 3997.82 120529 3786.51 53982 4378.73 70352
Advanced-CoLight 4740.79 103172 4245.06 114927 3945.78 51164 4748.60 63463
PPO 4005.74 124001 3636.27 130485 3485.31 55915 3914.77 77007
Table 12: The benchmark results for the traffic signal control scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
FixedTime 4843.05 120049 4324.07 131020 4245.86 60020 4682.07 72725
MaxPressure 4580.41 132055 4045.75 144640 3984.20 64405 4309.91 83927
FRAP 5105.22 112321 4671.52 121896 4404.23 58481 5002.48 68196
MPLight 4790.12 124991 4674.93 122198 3980.07 64921 4196.37 85331
CoLight 5108.88 112672 4640.50 124565 4413.91 58513 4989.17 68184
Efficient-MPLight 5101.14 113272 4480.28 132995 4364.89 60428 5019.92 67303
Advanced-MPLight 5049.47 117568 4603.57 127911 4281.72 61470 5031.15 67322
Advanced-CoLight 5107.23 112778 4661.07 123336 4408.14 58929 5014.66 68292
PPO 4452.05 136630 4143.19 141768 4017.61 64277 4254.92 84411
Table 13: The benchmark results for the traffic signal control scenario with congested traffic conditions.
Method Beijing-C Shanghai-C Paris-C New York-C
ATT TP ATT TP ATT TP ATT TP
FixedTime 5274.90 130419 4928.58 145308 4587.84 66608 4923.74 76165
MaxPressure 5106.47 142654 4790.71 158804 4371.45 71874 4596.29 87897
FRAP 5490.86 122061 5205.55 133039 4743.12 64711 5219.06 71448
MPLight 5496.34 121452 4740.95 160551 4378.05 72643 4468.59 90632
CoLight 5484.53 122508 5185.19 135131 4750.70 64463 5214.91 72091
Efficient-MPLight 5375.55 130151 5104.42 142562 4718.07 65795 5225.18 71556
Advanced-MPLight 5358.05 129512 5129.86 138727 4650.76 67593 5227.87 71713
Advanced-CoLight 5481.23 122564 5207.07 133709 4756.00 64517 5222.79 71697
PPO 4975.47 149589 4814.59 155279 4394.63 71491 4535.97 90111

C.2 Dynamic Lane Assignment within Junctions.

Table 14: The benchmark results for the dynamic lane assignment scenario with smooth traffic conditions.
Method Beijing-S Shanghai-S Paris-S New York-S
ATT TP ATT TP ATT TP ATT TP
NoChange 4426.43 109588 3878.56 120449 3749.64 52459 4369.47 67046
Random 4435.44 109450 3875.35 120896 3681.61 53474 4344.24 67271
Rule 4340.58 112462 3807.56 122688 3675.52 53244 4310.52 67737
PPO 4345.00 111391 3804.21 122692 3663.37 53629 4309.67 67626
Table 15: The benchmark results for the dynamic lane assignment scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
NoChange 4846.70 119890 4322.21 131145 4245.84 60020 4674.08 72870
Random 4839.71 120338 4324.96 131216 4176.70 61810 4636.11 74055
Rule 4761.33 123673 4258.22 133346 4155.11 62366 4615.01 74254
PPO 4769.98 122929 4256.51 133379 4160.89 61792 4614.52 73907
Table 16: The benchmark results for the dynamic lane assignment scenario with congested traffic conditions.
Method Beijing-C Shanghai-C Paris-C New York-C
ATT TP ATT TP ATT TP ATT TP
NoChange 5275.00 130419 4928.51 145308 4587.81 66608 4923.79 76165
Random 5294.29 129976 4932.60 144774 4539.27 68410 4907.50 77460
Rule 5221.05 134661 4875.04 148393 4521.80 68795 4863.84 78469
PPO 5240.12 131854 4874.53 148228 4522.36 68413 4854.74 78857

C.3 Tidal Lane Control.

Table 17: The benchmark results for the tidal lane control scenario with smooth traffic conditions.
Method Beijing-S Shanghai-S Paris-S New York-S
ATT TP ATT TP ATT TP ATT TP
NoChange 4422.68 109912 3884.78 120156 3723.22 52992 4388.48 66269
Random 4418.86 110150 3887.78 119940 3711.32 53313 4360.21 67295
Rule 4416.80 109997 3870.30 120366 3687.63 53626 4341.27 68113
PPO 4411.00 110161 3857.94 121344 3674.53 53922 4325.07 68775
Table 18: The benchmark results for the tidal lane control scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
NoChange 4844.57 120105 4334.80 130604 4224.94 60794 4675.10 72834
Random 4827.75 120778 4338.40 130283 4216.17 60946 4665.83 73335
Rule 4823.40 120901 4313.48 131315 4192.85 61636 4638.03 74284
PPO 4820.48 120936 4304.29 132167 4187.37 61738 4628.47 74756
Table 19: The benchmark results for the tidal lane control scenario with congested traffic conditions.
Method Beijing-C Shanghai-C Paris-C New York-C
ATT TP ATT TP ATT TP ATT TP
NoChange 5278.35 130133 4937.88 144546 4575.58 67201 4923.74 76253
Random 5274.19 130456 4937.78 143921 4568.53 67661 4903.82 77236
Rule 5259.98 131304 4908.70 146094 4555.47 68025 4877.93 79096
PPO 5258.72 131709 4909.53 146145 4544.70 68274 4860.92 79731

C.4 Congestion Pricing.

Table 20: The benchmark results for the congestion pricing scenario with smooth traffic conditions.
Method Beijing-S Shanghai-S Paris-S New York-S
ATT TP ATT TP ATT TP ATT TP
NoChange 4433.37 109604 3875.91 120444 3743.18 52611 4368.11 66984
Random 4865.59 96231 3969.04 121137 3705.23 53904 4571.66 63670
Δ-toll 4267.05 118077 3630.38 133056 3611.00 54762 4246.56 72188
EBGtoll 5348.10 75078 4230.05 107683 3759.70 52424 4837.93 55155
Table 21: The benchmark results for the congestion pricing scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
NoChange 4840.18 120207 4328.34 130621 4239.23 60100 4681.75 72765
Random 5190.37 105640 4422.00 131346 4144.47 62284 4830.86 68976
Δ-toll 4667.77 131747 4096.62 147533 4040.68 65024 4549.26 78182
EBGtoll 5637.30 80476 4637.64 116624 4240.82 60362 5096.60 59154
Table 22: The benchmark results for the congestion pricing scenario with congested traffic conditions.
Method Beijing-C Shanghai-C Paris-C New York-C
ATT TP ATT TP ATT TP ATT TP
NoChange 5281.92 129904 4927.87 145154 4588.83 66603 4928.00 76140
Random 5558.59 115184 5027.27 143400 4488.19 69075 5066.64 72916
Δ-toll 5141.39 143518 4767.74 162829 4399.95 72309 4801.01 82602
EBGtoll 5912.51 86945 5182.92 128301 4571.64 66971 5275.42 63052

C.5 Road Planning.

Table 23: The benchmark results for the road planning scenario with smooth traffic conditions.
Method Beijing-S Shanghai-S Paris-S New York-S
ATT TP ATT TP ATT TP ATT TP
No-Change 7411.53 137793 5432.11 151689 5997.46 62016 6954.14 92756
Random 7216.43 139070 5325.55 151828 5899.89 62070 7176.53 88947
Rule 7240.73 138795 5352.50 152124 5907.31 62155 6942.72 92463
SA 7272.01 139326 5358.86 151983 5772.82 62990 6921.12 92754
GeneralBO 7102.82 139983 5297.74 152046 5755.44 62478 6808.24 93858
Table 24: The benchmark results for the road planning scenario with normal traffic conditions.
Method Beijing-N Shanghai-N Paris-N New York-N
ATT TP ATT TP ATT TP ATT TP
No-Change 8439.44 161722 6699.66 161722 7193.90 76327 7892.50 105533
Random 8304.71 163148 6567.12 181063 7106.19 75839 7967.24 102996
Rule 8235.49 164247 6570.72 181504 7177.88 76176 7956.08 102951
SA 8332.44 163660 6590.66 180954 7154.16 76164 7871.23 105507
GeneralBO 8242.19 164182 6721.78 178790 7161.74 75979 7759.88 106883
Table 25: The benchmark results for the road planning scenario with congested traffic conditions.
Method Beijing-C Shanghai-C Paris-C New York-C
ATT TP ATT TP ATT TP ATT TP
No-Change 9684.37 191786 8793.09 216625 8310.09 87786 8624.63 117606
Random 9496.80 195463 8651.96 219199 8230.79 87704 8663.12 115278
Rule 9489.22 195427 8656.58 218214 8250.89 87872 8682.73 114650
SA 9487.05 195918 8600.00 219711 8188.43 88650 8589.73 118036
GeneralBO 9369.14 198446 8508.84 221049 8161.15 88803 8542.19 118579