Abstract
Multi-agent encirclement with collision avoidance is a common challenge in the multi-agent confrontation domain, where the focus lies in developing cooperative strategies among agents. Previous studies have struggled with the dynamic encirclement of a faster prey in environments with obstacles. This paper introduces a novel multi-agent deep reinforcement learning approach based on prior knowledge, dedicated to the multi-agent encirclement-with-collision-avoidance task in which multiple slower pursuers collaboratively encircle a faster prey in an environment with obstacles. First, the classic Apollonius circle theory serves as prior knowledge to guide agent action selection, narrowing the exploratory action space and accelerating strategy learning. Second, a variance descriptor restricts the motion directions of the pursuers, ensuring that they continuously tighten the encirclement until the prey is successfully encircled. Finally, experiments in an environment with obstacles validate the proposed method. The results indicate that our method acquires an effective encirclement strategy, with an encirclement success rate exceeding that of previous methods by more than 10%, and simulation results demonstrate its effectiveness and practicability.
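To make the Apollonius-circle prior concrete, the sketch below shows how a guidance heading can be derived for a single slower pursuer. This is a minimal illustration under our own assumptions (2D kinematics, a prey that holds its current course, and a hypothetical function name and interface), not the implementation used in the paper.

```python
import numpy as np

def apollonius_heading(p_pos, e_pos, e_vel, v_p, v_e):
    """Illustrative Apollonius-circle guidance for one slower pursuer.

    p_pos, e_pos : 2D positions of the pursuer and the prey (np.ndarray)
    e_vel        : current prey velocity (np.ndarray, nonzero)
    v_p, v_e     : pursuer and prey speeds, with v_p < v_e
    Returns a unit heading vector for the pursuer.
    """
    k = v_p / v_e                                # speed ratio, k < 1
    d = np.linalg.norm(p_pos - e_pos)            # pursuer-prey distance
    u = (p_pos - e_pos) / d                      # unit vector, prey -> pursuer

    # Apollonius circle of points X with |PX| / |EX| = k: the pursuer can
    # reach any point on or inside it no later than the prey can.
    center = e_pos + u * d / (1.0 - k ** 2)
    radius = k * d / (1.0 - k ** 2)

    # Intersect the prey's current course E + t * h (t >= 0) with the circle.
    h = e_vel / np.linalg.norm(e_vel)
    f = e_pos - center
    b = np.dot(f, h)
    disc = b ** 2 - (np.dot(f, f) - radius ** 2)
    if disc >= 0.0:
        t = -b - np.sqrt(disc)                   # nearest crossing point
        if t >= 0.0:
            target = e_pos + t * h               # predicted interception point
            return (target - p_pos) / np.linalg.norm(target - p_pos)

    # The prey is not heading into this pursuer's reachable set: fall back
    # to pure pursuit and rely on the other pursuers to block escape routes.
    return (e_pos - p_pos) / np.linalg.norm(e_pos - p_pos)
```

A learned policy can then be biased toward, or have its exploration restricted around, such a prior heading, which is one common way Apollonius-style priors are combined with reinforcement learning.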
Data availability
For data access requests, interested researchers are encouraged to contact the corresponding author. In addition to data access, we provide detailed information about the experimental setup and configurations to aid in result replication. We conducted experiments using Python 3.8 on a Linux-based server with the following dependencies: OpenAI Gym (0.10.5), TensorFlow (2.4), and NumPy (1.14.5); the multi-agent particle environment is available at https://github.com/openai/multiagent-particle-envs. We are committed to fostering collaboration and transparency in research, and we encourage fellow researchers to reach out with any inquiries regarding data access or the experimental setup.
Ethics declarations
Conflict of interest
The authors declare that they have no financial, professional, or personal conflicts of interest that could influence the impartiality or integrity of this study.
Appendix A Number of agents
Based on the mathematical principles of the Apollonius circle and the encirclement task defined in Section 3.1, when the pursuers encircle the prey, the \(360^{\circ }\) range around the prey is covered by the pursuers’ capture angles, as shown in Fig. 11. Consequently, we can establish the relationship between the number of pursuers and the prey in an encirclement task.
It is evident that the required number of pursuers in the encirclement task depends on the speed ratio between the pursuers and the prey.
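This relationship can be made explicit under standard Apollonius geometry: each pursuer’s capture region subtends an angle of \(\alpha = 2\arcsin (v_P/v_E)\) at the prey, where \(v_P/v_E < 1\) is the pursuer-to-prey speed ratio. Covering the full \(360^{\circ }\) around the prey then requires

\[ n \cdot 2\arcsin \!\left( \frac{v_P}{v_E}\right) \ge 2\pi \quad \Longrightarrow \quad n \ge \left\lceil \frac{\pi }{\arcsin (v_P/v_E)} \right\rceil . \]

For example, with \(v_P/v_E = 0.5\), \(\arcsin (0.5) = \pi /6\), so at least \(n = 6\) pursuers are needed to close the encirclement. This bound is our reconstruction from the coverage condition stated above, not an equation quoted from the original appendix.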
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, T., Shi, D., Wang, Z. et al. Learning cooperative strategies in multi-agent encirclement games with faster prey using prior knowledge. Neural Comput & Applic 36, 15829–15842 (2024). https://doi.org/10.1007/s00521-024-09727-6