research-article

A survey on multi-agent deep reinforcement learning: from the perspective of challenges and applications

Published: 01 June 2021

Abstract

Over the last several years, deep reinforcement learning has proved fruitful across a variety of tasks in artificial intelligence. Recent work has extended deep reinforcement learning beyond single-agent scenarios to multi-agent settings. The main goal of this paper is to provide a detailed and systematic overview of multi-agent deep reinforcement learning methods from the perspective of challenges and applications. Specifically, preliminary knowledge is introduced first to give a better understanding of the field. Then, a taxonomy of challenges is proposed, and the corresponding structures and representative methods are introduced. Finally, some applications and interesting future opportunities for multi-agent deep reinforcement learning are discussed.

Published In

Artificial Intelligence Review, Volume 54, Issue 5
Jun 2021
795 pages

Publisher

Kluwer Academic Publishers
United States

Author Tags

1. Deep reinforcement learning
2. Multi-agent
3. Game theory
4. Centralized training and decentralized execution
5. Communication learning
6. Agent modeling

Qualifiers

• Research-article

Funding Sources

• the National Natural Science Foundations of China

Cited By

• (2024) Explaining Sequences of Actions in Multi-agent Deep Reinforcement Learning Models. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 2537-2539. DOI: 10.5555/3635637.3663219. Online publication date: 6-May-2024
• (2024) GOV-REK: Governed Reward Engineering Kernels for Designing Robust Multi-Agent Reinforcement Learning Systems. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 2429-2431. DOI: 10.5555/3635637.3663183. Online publication date: 6-May-2024
• (2024) Online Surveillance of IoT Agents in Smart Cities Using Deep Reinforcement Learning. International Journal of Intelligent Information Technologies, 20:1, pp. 1-15. DOI: 10.4018/IJIIT.349942. Online publication date: 17-Sep-2024
• (2024) Chain-of-Thought in Neural Code Generation: From and for Lightweight Language Models. IEEE Transactions on Software Engineering, 50:9, pp. 2437-2457. DOI: 10.1109/TSE.2024.3440503. Online publication date: 12-Aug-2024
• (2024) Intersec2vec-TSC: Intersection Representation Learning for Large-Scale Traffic Signal Control. IEEE Transactions on Intelligent Transportation Systems, 25:7, pp. 7044-7056. DOI: 10.1109/TITS.2023.3340153. Online publication date: 1-Jul-2024
• (2024) Multi-agent deep reinforcement learning based real-time planning approach for responsive customized bus routes. Computers and Industrial Engineering, 188:C. DOI: 10.1016/j.cie.2023.109840. Online publication date: 17-Apr-2024
• (2024) Common challenges of deep reinforcement learning applications development: an empirical study. Empirical Software Engineering, 29:4. DOI: 10.1007/s10664-024-10500-5. Online publication date: 14-Jun-2024
• (2024) Improving multi-UAV cooperative path-finding through multiagent experience learning. Applied Intelligence, 54:21, pp. 11103-11119. DOI: 10.1007/s10489-024-05771-w. Online publication date: 1-Nov-2024
• (2024) QvQ-IL: quantity versus quality in incremental learning. Neural Computing and Applications, 36:6, pp. 2767-2796. DOI: 10.1007/s00521-023-09129-0. Online publication date: 1-Feb-2024
• (2024) Modeling and Simulation Technologies for Effective Multi-agent Research. Virtual, Augmented and Mixed Reality, pp. 86-104. DOI: 10.1007/978-3-031-61044-8_7. Online publication date: 29-Jun-2024
