Large-Scale Urban Traffic Management Using Zero-Shot Knowledge Transfer in Multi-Agent Reinforcement Learning for Intersection Patterns
Figure 2. Examples of different configurations covered by the 3-way and 4-way intersection patterns.
Figure 3. Overview structure of the proposed method for traffic control in road networks, consisting of three major modules.
Figure 4. Traffic control in a 4-way intersection pattern using the proposed MARL scheme. Every road-agent ($RA_i$) is responsible for safely guiding vehicles through its designated road segment, while cooperatively coordinating with the other agents (in our case, three).
Figure 5. Matching the network's intersections with the two default intersection patterns. The road network contains four and two copies of the default "4-way" and "3-way" intersection patterns, respectively. The light-colored thin strip corresponds to a route that a vehicle may follow within this network.
Figure 6. Learning curves of the 3-way and 4-way intersection patterns in terms of the average velocity, duration, and collisions per epoch, computed with a rolling window of fifty (50) episodes.
Figure 7. Dynamic evolution of both the average velocity and the frequency of vehicles served per second, obtained by running the learned multi-agent policies on intersection patterns in a designated test scenario. Traffic-state colored zones are also shown.
Figure 8. Four artificial road networks of increasing complexity, generated for evaluating the knowledge transfer process. Every intersection is a noisy copy of either the default 3-way or 4-way intersection pattern.
Abstract
1. Introduction
- We introduce an efficient multi-agent reinforcement learning scheme for modeling traffic flow management at intersections. We consider the roads incoming to an intersection as agents, effectively mitigating the escalating complexity and constraints of conventional methods that model agents at the vehicle level. Another novelty is the introduction of a priority feature within the state space that ensures seamless coordination among agents at intersections. By incorporating this centralized vision, we aim to enhance the overall efficiency and safety of traffic flow, offering active solutions that can dynamically respond to changing traffic conditions.
- We aim to provide a compact strategy for addressing urban traffic congestion by dividing the road network into homogeneous structures based on intersection patterns. This offers the advantage of independence from traffic fluctuations and immunity to changes in traffic density. Moreover, partitioning the road network enhances coordination among traffic patterns at a higher level, leading to improved traffic flow and a more consistent and reliable traffic management system.
- We promote an advantageous zero-shot knowledge transfer process that associates intersections in road networks with learned multi-agent policies. Our approach creates adaptable driving profiles for existing intersection patterns with exceptional generalization capabilities, facilitating seamless deployment of a traffic management system to address urban congestion effectively and rapidly.
2. Related Work
3. Proposed Method
3.1. Overall Description of the Method
1. Establishment of a well-organized and compact system for managing traffic flow and optimizing transportation networks.
2. Introduction of traffic and congestion patterns for modeling specific types of intersections.
3.2. Multi-Agent Reinforcement Learning for Traffic Management in Intersection Patterns
3.2.1. MDP Formulation
- $\mathcal{N} = \{1, \ldots, N\}$ is a society of agents;
- $S$ denotes the state space of the environment. Each agent, $i$, observes its local state, $s_i \in S$;
- $A$ denotes the set of possible actions of any agent. Each agent, $i$, takes a local action, $a_i \in A$;
- $P(s' \mid s, a)$ specifies the probability of the agent, $i$, transitioning to a new state, $s' \in S$;
- $R(s, a)$ is the reward function for a state-action pair $(s, a)$;
- $\gamma \in [0, 1)$ is the discount factor across time for future rewards.
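Putting these elements together, the decision problem forms a Markov game. The display below is a standard compact form consistent with the definitions above (a sketch of the formulation, not a verbatim reproduction of the paper's equations):

$$
\mathcal{M} = \big\langle\, \mathcal{N},\, S,\, A,\, P,\, R,\, \gamma \,\big\rangle,
\qquad
J_i(\pi_i) = \mathbb{E}_{\pi_i}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t}\, R\big(s^{i}_{t}, a^{i}_{t}\big) \right],
$$

where each road-agent $i$ seeks a local policy $\pi_i$ that maximizes its expected discounted return $J_i$.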
3.2.2. State and Action Representations
- $v$: velocity (in m/s) of the ego vehicle, normalized by a maximum allowed velocity, $v_{\max}$ (fixed to a constant value in our simulation studies);
- $p$: relative position of the ego vehicle on the road it traverses, normalized with respect to the road's length;
- $v_f$: normalized velocity of the front vehicle. We consider the front vehicle to be the closest vehicle in front of the ego vehicle within a visible range (in our experiments, 100 m). If there is no vehicle within this range, this variable is set to −1;
- $d_f$: normalized distance from the front vehicle, calculated as the real distance between the two vehicles divided by the visible range. Similarly, in the absence of a leading vehicle, this variable is set to −1;
- priority: a term that takes three possible values (−1, 0, and 1). This centralized feature acts as a factor of coordination and cooperation among road-agents, and is calculated from the distances of incoming vehicles to the intersection's center. To make this variable reliable, we extended the designated area surrounding the intersection, guaranteeing precise identification of each vehicle's position; this enhances the agents' ability to take proactive measures and prevent potential collisions. Vehicles currently inside the intersection area have their value set to −1, all other vehicles have their value set to 0, and the vehicle closest to the intersection's center is assigned the value 1, indicating it has the highest priority to cross. A minimal construction of this observation vector is sketched below.
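As a minimal sketch of how such an observation vector could be assembled, assuming simple vehicle objects: the function name, the `V_MAX` value, and the attribute names are our assumptions; only the 100 m visible range comes from the text above.

```python
import numpy as np

VISIBLE_RANGE = 100.0   # m, per the experiments above
V_MAX = 20.0            # m/s; illustrative placeholder for the paper's fixed v_max

def build_state(ego, front, priority):
    """Assemble one vehicle's local observation.

    ego      : object with .speed (m/s), .position (m along its road), .road_length (m)
    front    : closest leading vehicle within VISIBLE_RANGE, or None
    priority : -1 (inside the intersection), 0 (no priority),
               +1 (closest to the intersection's center, crosses next)
    """
    v = ego.speed / V_MAX                      # normalized ego velocity
    p = ego.position / ego.road_length         # normalized position on the road
    if front is None:                          # sentinel when no leader is visible
        v_f, d_f = -1.0, -1.0
    else:
        v_f = front.speed / V_MAX
        d_f = (front.position - ego.position) / VISIBLE_RANGE
    return np.array([v, p, v_f, d_f, float(priority)], dtype=np.float32)
```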
3.2.3. Reward Function
- The vehicle is currently inside the intersection sector (priority = −1). Then:
  - if its previous state was of similar (−1) or highest (1) priority, a positive reward is received to encourage the maintenance of its priority status;
  - if it forcefully entered the intersection (previous priority 0), a penalty is given as a deterrent against "stealing" priority.
- The vehicle currently holds maximum priority (priority = 1) and prepares to cross the intersection. Then, it is positively rewarded in a manner proportional to its normalized velocity.
- The vehicle currently does not have any priority (priority = 0). The reward depends on the distance to the vehicle in front ($d_f$) and whether it exceeds a safety threshold; a sketch of this case logic follows the list.
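The exact reward values are specified by the paper's equations; the sketch below mirrors only the case logic above, and all numeric constants are illustrative assumptions, not the paper's coefficients.

```python
D_SAFE = 0.25                  # assumed normalized safety threshold
R_KEEP, R_STEAL = 1.0, -5.0    # assumed reward/penalty magnitudes

def reward(priority, prev_priority, v_norm, d_front):
    """Mirror of the three reward cases of Section 3.2.3 (constants assumed)."""
    if priority == -1:                         # currently inside the intersection
        if prev_priority in (-1, 1):           # kept or legitimately took priority
            return R_KEEP                      # encourage maintaining priority status
        return R_STEAL                         # entered from priority 0: "stole" priority
    if priority == 1:                          # highest priority, about to cross
        return v_norm                          # proportional to normalized velocity
    # priority == 0: shape behavior by the headway to the front vehicle
    if d_front == -1.0 or d_front > D_SAFE:    # no visible leader, or safe gap
        return 0.1 * v_norm                    # assumed mild progress reward
    return -0.5                                # assumed penalty for unsafe following
```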
3.2.4. Training
- The road-agents, which enable the simultaneous control of multiple vehicles on the same road in order to construct optimal intersection-traversing driving policies.
- The priority term that promotes coordination among vehicles from different road-agents, aiming to prevent collisions inside the intersection.
Algorithm 1. Multi-agent PPO for traffic control in an intersection pattern.
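As a minimal, non-authoritative sketch of the multi-agent PPO training loop named in Algorithm 1: the `PPOAgent` interface (`act`/`update`) and the dict-based environment API below are our assumptions, not the paper's.

```python
def train_intersection_pattern(env, agents, epochs=500, horizon=2048):
    """One PPO update per road-agent per epoch.

    env    : intersection-pattern simulator; reset()/step() exchange
             dicts keyed by road-agent id
    agents : dict road_id -> PPOAgent (actor-critic with act() and update())
    """
    for epoch in range(epochs):
        buffers = {rid: [] for rid in agents}        # per-agent rollout storage
        obs = env.reset()
        for _ in range(horizon):
            actions = {rid: agents[rid].act(obs[rid]) for rid in agents}
            next_obs, rewards, done = env.step(actions)
            for rid in agents:
                buffers[rid].append((obs[rid], actions[rid], rewards[rid]))
            obs = env.reset() if done else next_obs
        for rid in agents:                           # clipped-surrogate PPO update
            agents[rid].update(buffers[rid])
```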
3.3. Zero-Shot Transfer of Learned MARL Policies for Automated Traffic Control
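Concretely, once the 3-way and 4-way pattern policies have been learned, zero-shot deployment on a new road network reduces to matching each intersection to its default pattern (cf. Figure 5) and attaching the corresponding learned multi-agent policy. A minimal sketch of this matching step, assuming a network object exposing its intersections and their incoming roads (all identifiers are ours):

```python
def assign_policies(network, policy_3way, policy_4way):
    """Zero-shot deployment: map every intersection of a road network to the
    multi-agent policy learned on its matching default intersection pattern."""
    assignment = {}
    for node in network.intersections:
        n_legs = len(node.incoming_roads)
        if n_legs == 3:
            assignment[node.id] = policy_3way    # reuse the learned 3-way policy
        elif n_legs == 4:
            assignment[node.id] = policy_4way    # reuse the learned 4-way policy
        else:
            raise ValueError(f"no learned pattern for a {n_legs}-way intersection")
    return assignment
```

Because every intersection in the generated networks is a noisy copy of one of the two default patterns, no retraining or fine-tuning is needed at deployment time.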
4. Experimental Results
- Efficiently solving individual intersection patterns: the proposed multi-agent reinforcement learning platform should demonstrate its capability to effectively resolve congestion in different intersection patterns.
- Optimizing the learning strategy for intersection patterns’ traffic management: we seek to develop an efficient learning strategy that enables the developed multi-agent systems to acquire a diverse set of generalized capabilities. These capabilities should allow the use of their learned driving policies directly and effectively across various scenarios and traffic network configurations.
- Ensuring efficient knowledge transfer: one of our key goals is to establish a robust and scalable knowledge transfer process that enables its application to complex road network structures and traffic scenarios satisfactorily.
4.1. Simulation Environment
4.2. Experimental Design and Implementation Issues
- Average velocity (V) (in m/s) of all vehicles across their routes;
- Average duration (T) (in s) of completion of the vehicle routes;
- Number of collisions during the execution of the scenario (a sketch aggregating these metrics follows this list).
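These metrics can be aggregated from completed-trip logs. The following is a minimal sketch assuming each record holds the route length, the trip duration, and a collision flag (the record format is our assumption):

```python
def evaluate_scenario(trips):
    """Compute the three evaluation metrics from completed-trip records.

    trips : iterable of (route_length_m, duration_s, collided) tuples
    """
    records = list(trips)
    avg_velocity = sum(l / d for l, d, _ in records) / len(records)  # V, in m/s
    avg_duration = sum(d for _, d, _ in records) / len(records)      # T, in s
    collisions = sum(1 for *_, c in records if c)
    return avg_velocity, avg_duration, collisions
```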
4.3. Results
4.3.1. Performance of Multi-Agent RL for Traffic Control on Intersection Patterns
4.3.2. Performance of Zero-Shot Transfer Knowledge to Large Road Networks
- Scenario 1: uniform vehicle arrival rate over the scenario horizon; a total of around 2100 vehicles (about 18 vehicles per minute);
- Scenario 2: uniform vehicle arrival rate over the scenario horizon; a total of around 2400 vehicles (about 20 vehicles per minute);
- Scenario 3: uniform vehicle arrival rate over the scenario horizon; a total of around 2900 vehicles (about 24 vehicles per minute);
- Scenario 4: uniform vehicle arrival rate over the scenario horizon; a total of around 3600 vehicles (about 30 vehicles per minute). A sketch generating such arrival schedules follows this list.
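For intuition, the sketch below draws a uniform arrival schedule at the quoted rates; the 7000 s horizon is our assumption, chosen only so that 18 vehicles per minute yields roughly the 2100 vehicles of the first scenario.

```python
import random

def uniform_arrivals(rate_per_min, horizon_s=7000.0, seed=0):
    """Draw vehicle arrival times uniformly over the scenario horizon.

    rate_per_min : 18, 20, 24, or 30, matching the four scenarios
    horizon_s    : scenario length in seconds (assumed value)
    """
    rng = random.Random(seed)
    n = round(rate_per_min * horizon_s / 60.0)   # expected total vehicle count
    return sorted(rng.uniform(0.0, horizon_s) for _ in range(n))

# e.g. len(uniform_arrivals(18)) -> 2100 vehicles, as in the first scenario
```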
5. Conclusions and Future Work
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Qian, B.; Zhou, H.; Lyu, F.; Li, J.; Ma, T.; Hou, F. Toward Collision-Free and Efficient Coordination for Automated Vehicles at Unsignalized Intersection. IEEE Internet Things J. 2019, 6, 10408–10420.
- Wei, L.; Li, Z.; Gong, J.; Gong, C.; Li, J. Autonomous Driving Strategies at Intersections: Scenarios, State-of-the-Art, and Future Outlooks. In Proceedings of the International Intelligent Transportation Systems Conference, Indianapolis, IN, USA, 19–22 September 2021; pp. 44–51.
- Dresner, K.; Stone, P. A multiagent approach to autonomous intersection management. J. Artif. Intell. Res. 2008, 31, 591–656.
- Lee, J.; Park, B. Development and evaluation of a cooperative vehicle intersection control algorithm under the connected vehicles environment. IEEE Trans. Intell. Transp. Syst. 2012, 13, 81–90.
- Sutton, R.; Barto, A. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
- Haydari, A.; Yılmaz, Y. Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey. IEEE Trans. Intell. Transp. Syst. 2022, 23, 11–32.
- Zhuang, F.; Qi, Z.; Duan, K.; Xi, D.; Zhu, Y.; Zhu, H.; Xiong, H.; He, Q. A Comprehensive Survey on Transfer Learning. Proc. IEEE 2021, 109, 43–76.
- Spatharis, C.; Blekas, K. Multiagent reinforcement learning for autonomous driving in traffic zones with unsignalized intersections. J. Intell. Transp. Syst. 2022, 28, 103–119.
- Camponogara, E.; Kraus, W. Distributed Learning Agents in Urban Traffic Control. In Proceedings of the Portuguese Conference on Artificial Intelligence, Beja, Portugal, 4–7 December 2003.
- Salkham, A.; Cunningham, R.; Garg, A.; Cahill, V. A collaborative reinforcement learning approach to urban traffic control optimization. In Proceedings of the International Conference on Web Intelligence and Intelligent Agent Technology, Sydney, Australia, 9–12 December 2008; pp. 560–566.
- Arel, I.; Liu, C.; Urbanik, T.; Kohls, A. Reinforcement learning based multi-agent system for network traffic signal control. IET Intell. Transp. Syst. 2010, 4, 128–135.
- Chen, C.; Wei, H.; Xu, N.; Zheng, G.; Yang, M.; Xiong, Y.; Xu, K.; Li, Z. Toward A Thousand Lights: Decentralized Deep Reinforcement Learning for Large-Scale Traffic Signal Control. Proc. AAAI Conf. Artif. Intell. 2020, 34, 3414–3421.
- Rasheed, F.; Yau, K.L.A.; Noor, R.; Wu, C.; Low, Y.C. Deep Reinforcement Learning for Traffic Signal Control: A Review. IEEE Access 2020, 8, 208016–208044.
- Bálint, K.; Tamás, T.; Tamás, B. Deep Reinforcement Learning based approach for Traffic Signal Control. Transp. Res. Procedia 2022, 62, 278–285.
- Lee, D. A Theory of Visual Control of Braking Based on Information about Time-to-Collision. Perception 1976, 5, 437–459.
- Zohdy, I.; Rakha, H. Optimizing driverless vehicles at intersections. In Proceedings of the 19th ITS World Congress, Vienna, Austria, 22–26 October 2012.
- Ji, J.; Khajepour, A.; Melek, W.W.; Huang, Y. Path Planning and Tracking for Vehicle Collision Avoidance Based on Model Predictive Control with Multiconstraints. IEEE Trans. Veh. Technol. 2017, 66, 952–964.
- Rodrigues de Campos, G.; Falcone, P.; Hult, R.; Wymeersch, H.; Sjöberg, J. Traffic coordination at road intersections: Autonomous decision-making algorithms using model-based heuristics. IEEE Intell. Transp. Syst. Mag. 2017, 9, 8–21.
- Pan, Y.; Lin, Q.; Shah, H.; Dolan, J. Safe Planning for Self-Driving Via Adaptive Constrained ILQR. In Proceedings of the International Conference on Intelligent Robots and Systems, Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 2377–2383.
- Carlino, D.; Boyles, S.D.; Stone, P. Auction-based autonomous intersection management. In Proceedings of the 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013), The Hague, The Netherlands, 6–9 October 2013; pp. 529–534.
- Wang, S.; Mahlberg, A.; Levin, M.W. Optimal Control of Automated Vehicles for Autonomous Intersection Management with Design Specifications. Transp. Res. Rec. 2022, 2677, 1643–1658.
- Levin, M.W.; Rey, D. Conflict-point formulation of intersection control for autonomous vehicles. Transp. Res. Part C Emerg. Technol. 2017, 85, 528–547.
- Li, J.; Hoang, T.A.; Lin, E.; Vu, H.L.; Koenig, S. Intersection Coordination with Priority-Based Search for Autonomous Vehicles. Proc. AAAI Conf. Artif. Intell. 2023, 37, 11578–11585.
- Lu, G.; Shen, Z.; Liu, X.; Nie, Y.M.; Xiong, Z. Are autonomous vehicles better off without signals at intersections? A comparative computational study. Transp. Res. Part B Methodol. 2022, 155, 26–46.
- Codevilla, F.; Müller, M.; López, A.; Koltun, V.; Dosovitskiy, A. End-to-End Driving Via Conditional Imitation Learning. In Proceedings of the International Conference on Robotics and Automation, Brisbane, QLD, Australia, 21–25 May 2018; IEEE Press: New York, NY, USA, 2018; pp. 1–9.
- Menda, K.; Driggs-Campbell, K.; Kochenderfer, M. EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning. In Proceedings of the International Conference on Intelligent Robots and Systems, Macau, China, 3–8 November 2019; pp. 5041–5048.
- Bouton, M.; Cosgun, A.; Kochenderfer, M. Belief State Planning for Autonomously Navigating Urban Intersections. In Proceedings of the IEEE Intelligent Vehicles Symposium, Los Angeles, CA, USA, 11–14 June 2017; pp. 825–830.
- Tram, T.; Jansson, A.; Grönberg, R.; Ali, M.; Sjöberg, J. Learning negotiating behavior between cars in intersections using deep Q-learning. In Proceedings of the International Conference on Intelligent Transportation Systems, Maui, HI, USA, 4–7 November 2018; pp. 3169–3174.
- Tram, T.; Batkovic, I.; Ali, M.; Sjöberg, J. Learning When to Drive in Intersections by Combining Reinforcement Learning and Model Predictive Control. In Proceedings of the Intelligent Transportation Systems Conference, Auckland, New Zealand, 27–30 October 2019; pp. 3263–3268.
- Isele, D.; Cosgun, A.; Fujimura, K. Analyzing Knowledge Transfer in Deep Q-Networks for Autonomously Handling Multiple Intersections. arXiv 2017, arXiv:1705.01197.
- Isele, D.; Rahimi, R.; Cosgun, A.; Subramanian, K.; Fujimura, K. Navigating Occluded Intersections with Autonomous Vehicles Using Deep Reinforcement Learning. In Proceedings of the International Conference on Robotics and Automation, Brisbane, Australia, 21–25 May 2018; pp. 2034–2039.
- Li, C.; Czarnecki, K. Urban Driving with Multi-Objective Deep Reinforcement Learning. In Proceedings of the International Conference on Autonomous Agents and MultiAgent Systems, Montreal, QC, Canada, 13–17 May 2019; pp. 359–367.
- Shao, C.; Cheng, F.; Xiao, J.; Zhang, K. Vehicular intelligent collaborative intersection driving decision algorithm in Internet of Vehicles. Future Gener. Comput. Syst. 2023, 145, 384–395.
- Akhauri, S.; Zheng, L.; Lin, M. Enhanced Transfer Learning for Autonomous Driving with Systematic Accident Simulation. In Proceedings of the International Conference on Intelligent Robots and Systems, Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 5986–5993.
- Chiba, S.; Sasaoka, H. Basic Study for Transfer Learning for Autonomous Driving in Car Race of Model Car. In Proceedings of the International Conference on Business and Industrial Research, Bangkok, Thailand, 20–21 May 2021; pp. 138–141.
- Shu, H.; Liu, T.; Mu, X.; Cao, D. Driving Tasks Transfer Using Deep Reinforcement Learning for Decision-Making of Autonomous Vehicles in Unsignalized Intersection. IEEE Trans. Veh. Technol. 2022, 71, 41–52.
- Xu, Z.; Tang, C.; Tomizuka, M. Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control. In Proceedings of the 21st International Conference on Intelligent Transportation Systems, Maui, HI, USA, 4–7 November 2018; pp. 2865–2871.
- Kirk, R.; Zhang, A.; Grefenstette, E.; Rocktäschel, T. A Survey of Zero-shot Generalisation in Deep Reinforcement Learning Systems. J. Artif. Intell. Res. 2023, 76, 201–264.
- Qiao, Z.; Muelling, K.; Dolan, J.; Palanisamy, P.; Mudalige, P. Automatically Generated Curriculum based Reinforcement Learning for Autonomous Vehicles in Urban Environment. In Proceedings of the IEEE Intelligent Vehicles Symposium, Changshu, China, 26–30 June 2018; pp. 1233–1238.
- Anzalone, L.; Barra, S.; Nappi, M. Reinforced Curriculum Learning for Autonomous Driving in Carla. In Proceedings of the International Conference on Image Processing, Anchorage, AK, USA, 19–22 September 2021; pp. 3318–3322.
- Jin, H.; Peng, Y.; Yang, W.; Wang, S.; Zhang, Z. Federated Reinforcement Learning with Environment Heterogeneity. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, Virtual, 28–30 March 2022; pp. 18–37.
- Fan, F.X.; Ma, Y.; Dai, Z.; Tan, C.; Low, B.K.H. FedHQL: Federated Heterogeneous Q-Learning. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, London, UK, 29 May–2 June 2023; pp. 2810–2812.
- Liang, X.; Liu, Y.; Chen, T.; Liu, M.; Yang, Q. Federated Transfer Reinforcement Learning for Autonomous Driving. In Federated and Transfer Learning; Springer: Cham, Switzerland, 2023; pp. 357–371.
- Da Silva, F.L.; Taylor, M.; Reali Costa, A.H. Autonomously Reusing Knowledge in Multiagent Reinforcement Learning. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; pp. 5487–5493.
- Da Silva, F.L.; Reali Costa, A.H. A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems. J. Artif. Intell. Res. 2019, 64, 645–703.
- Zhou, Z.; Liu, G.; Tang, Y. Multi-Agent Reinforcement Learning: Methods, Applications, Visionary Prospects, and Challenges. arXiv 2023, arXiv:2305.10091.
- Candela, E.; Parada, L.; Marques, L.; Georgescu, T.; Demiris, Y.; Angeloudis, P. Transferring Multi-Agent Reinforcement Learning Policies for Autonomous Driving using Sim-to-Real. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Kyoto, Japan, 23–27 October 2022; pp. 8814–8820.
- Jang, K.; Vinitsky, E.; Chalaki, B.; Remer, B.; Beaver, L.; Malikopoulos, A.; Bayen, A. Simulation to scaled city: Zero-shot policy transfer for traffic control via autonomous vehicles. In Proceedings of the ACM/IEEE International Conference on Cyber-Physical Systems, Montreal, QC, Canada, 16–18 April 2019; pp. 291–300.
- Kochenderfer, M. Decision Making under Uncertainty: Theory and Application; MIT Press: Cambridge, MA, USA, 2015.
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal Policy Optimization Algorithms. arXiv 2017, arXiv:1707.06347.
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. In Proceedings of the 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2–4 May 2016.
- Haarnoja, T.; Zhou, A.; Abbeel, P.; Levine, S. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; Volume 80, pp. 1856–1865.
- Haarnoja, T.; Zhou, A.; Hartikainen, K.; Tucker, G.; Ha, S.; Tan, J.; Kumar, V.; Zhu, H.; Gupta, A.; Abbeel, P.; et al. Soft Actor-Critic Algorithms and Applications. arXiv 2018, arXiv:1812.05905.
- Fujimoto, S.; van Hoof, H.; Meger, D. Addressing Function Approximation Error in Actor-Critic Methods. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; Dy, J.G., Krause, A., Eds.; Volume 80, pp. 1582–1591.
- Lopez, P.A.; Behrisch, M.; Bieker-Walz, L.; Erdmann, J.; Flötteröd, Y.P.; Hilbrich, R.; Lücken, L.; Rummel, J.; Wagner, P.; Wießner, E. Microscopic Traffic Simulation using SUMO. In Proceedings of the 21st IEEE International Conference on Intelligent Transportation Systems, Maui, HI, USA, 4–7 November 2018.
- Krauss, S.; Wagner, P.; Gawron, C. Metastable states in a microscopic model of traffic flow. Phys. Rev. E 1997, 55, 5597–5602.
- Afrin, T.; Yodo, N. A Survey of Road Traffic Congestion Measures towards a Sustainable and Resilient Transportation System. Sustainability 2020, 12, 4660.
| Environment | Average Velocity (m/s) | Average Duration (s) |
|---|---|---|
| 3-way intersection | | |
| (best) | | |
| 4-way intersection | | |
| (best) | | |
| Road Network | # Roads | # 3-Way Intersections | # 4-Way Intersections |
|---|---|---|---|
| Network 1 | 32 | 2 | 4 |
| Network 2 | 54 | 9 | 4 |
| Network 3 | 230 | 35 | 21 |
| Network 4 | 334 | 43 | 35 |
Network 1:

| Policy | Scenario 1: V (m/s) / T (s) | Scenario 2: V (m/s) / T (s) | Scenario 3: V (m/s) / T (s) | Scenario 4: V (m/s) / T (s) |
|---|---|---|---|---|
| | 17.0 / 49.1 | 16.5 / 50.7 | 15.4 / 54.5 | 11.9 / 70.4 |
| | 18.0 / 46.1 | 17.5 / 47.6 | 16.5 / 51.0 | 12.6 / 67.1 |
| | 17.4 / 47.9 | 16.9 / 49.5 | 15.8 / 53.3 | 12.0 / 70.2 |
| | 17.9 / 46.5 | 17.4 / 47.8 | 16.4 / 51.2 | 12.5 / 67.9 |
| | 17.9 / 46.5 | 17.4 / 47.9 | 16.4 / 51.2 | 12.5 / 67.7 |
| | 18.0 / 46.3 | 17.5 / 47.7 | 16.4 / 51.2 | 12.5 / 68.1 |
| Krauss | 13.8 / 58.7 (17 collisions) | 13.7 / 59.1 (29 collisions) | 13.6 / 59.8 (34 collisions) | 13.4 / 60.4 (51 collisions) |
Network 2:

| Policy | Scenario 1: V (m/s) / T (s) | Scenario 2: V (m/s) / T (s) | Scenario 3: V (m/s) / T (s) | Scenario 4: V (m/s) / T (s) |
|---|---|---|---|---|
| | 17.8 / 78.2 | 17.6 / 79.3 | 17.1 / 81.4 | 15.5 / 88.9 |
| | 18.5 / 75.1 | 18.3 / 76.2 | 17.8 / 78.1 | 16.5 / 84.9 |
| | 18.1 / 77.1 | 17.8 / 78.3 | 17.0 / 80.9 | 15.7 / 88.1 |
| | 18.2 / 76.6 | 17.9 / 77.7 | 17.5 / 79.6 | 16.3 / 86.1 |
| | 18.1 / 77.0 | 17.8 / 78.1 | 17.4 / 79.9 | 16.2 / 86.4 |
| | 18.6 / 74.9 | 18.3 / 75.9 | 17.9 / 77.8 | 16.5 / 84.7 |
| Krauss | 14.3 / 96.2 (17 collisions) | 14.2 / 96.7 (17 collisions) | 14.1 / 97.1 (28 collisions) | 14.0 / 98.0 (40 collisions) |
Network 3:

| Policy | Scenario 1: V (m/s) / T (s) | Scenario 2: V (m/s) / T (s) | Scenario 3: V (m/s) / T (s) | Scenario 4: V (m/s) / T (s) |
|---|---|---|---|---|
| | 18.5 / 169.3 | 18.4 / 170.0 | 18.3 / 171.4 | 18.1 / 172.9 |
| | 19.4 / 161.3 | 19.3 / 161.9 | 19.2 / 163.1 | 19.1 / 164.3 |
| | 18.4 / 170.0 | 18.3 / 170.7 | 18.1 / 172.0 | 17.9 / 174.0 |
| | 18.9 / 165.2 | 18.9 / 165.8 | 18.8 / 167.0 | 18.6 / 168.2 |
| | 18.8 / 166.7 | 18.7 / 167.2 | 18.6 / 168.5 | 18.5 / 169.6 |
| | 19.5 / 160.7 | 19.4 / 161.3 | 19.3 / 162.5 | 19.2 / 163.7 |
| Krauss | 14.6 / 213.8 (9 collisions) | 14.6 / 214.0 (11 collisions) | 14.5 / 215.5 (14 collisions) | 14.4 / 215.5 (20 collisions) |
Network 4:

| Policy | Scenario 1: V (m/s) / T (s) | Scenario 2: V (m/s) / T (s) | Scenario 3: V (m/s) / T (s) | Scenario 4: V (m/s) / T (s) |
|---|---|---|---|---|
| | 18.5 / 183.5 | 18.4 / 183.6 | 18.3 / 184.9 | 18.2 / 185.4 |
| | 19.4 / 174.6 | 19.4 / 174.5 | 19.3 / 175.7 | 19.2 / 176.0 |
| | 18.7 / 180.9 | 18.7 / 181.0 | 18.6 / 182.4 | 18.4 / 183.1 |
| | 19.0 / 178.5 | 18.9 / 178.3 | 18.9 / 179.5 | 18.8 / 179.8 |
| | 18.8 / 180.1 | 18.8 / 179.9 | 18.7 / 181.2 | 18.6 / 181.5 |
| | 19.5 / 174.0 | 19.4 / 173.9 | 19.3 / 175.1 | 19.2 / 175.4 |
| Krauss | 14.6 / 231.2 (11 collisions) | 14.6 / 231.0 (10 collisions) | 14.5 / 231.9 (17 collisions) | 14.5 / 231.3 (23 collisions) |