research-article

A survey on multi-agent deep reinforcement learning: from the perspective of challenges and applications

Authors:

Shifei Ding

Artificial Intelligence Review, Volume 54, Issue 5

Pages 3215 - 3238

https://doi.org/10.1007/s10462-020-09938-y

Published: 01 June 2021

Abstract

Deep reinforcement learning has proved to be a fruitful method for various tasks in the field of artificial intelligence over the last several years. Recent works have moved beyond single-agent scenarios to give more consideration to multi-agent settings. The main goal of this paper is to provide a detailed and systematic overview of multi-agent deep reinforcement learning methods from the perspective of challenges and applications. Specifically, preliminary knowledge is introduced first for a better understanding of the field. Then, a taxonomy of challenges is proposed, and the corresponding structures and representative methods are introduced. Finally, some applications and interesting future opportunities for multi-agent deep reinforcement learning are given.

Index Terms

A survey on multi-agent deep reinforcement learning: from the perspective of challenges and applications
1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

Artificial Intelligence Review Volume 54, Issue 5

Jun 2021

795 pages

ISSN:0269-2821

© Springer Nature B.V. 2020.

Publisher

Kluwer Academic Publishers

United States

Funding Sources

National Natural Science Foundation of China
