Abstract
In large-scale multi-agent environments with homogeneous agents, most existing works rely on approximation methods to simplify the interactions among agents. In this work, we propose a new approximation, termed mix-attention approximation, to enhance multi-agent reinforcement learning. The approximation is produced by a mix-attention module, which is used to form a consistent consensus among agents in partially observable environments. We leverage hard attention to compress the perception of each agent into a smaller set of partial regions. Each of these partial regions can attract the attention of several agents at the same time, and the correlation among the regions is generated by a soft-attention module. We give the training method for the mix-attention mechanism and discuss the consistency between the mix-attention module and the policy network. We then analyze the feasibility of this mix-attention-based approximation and explore how our method can be integrated with other approximation methods. In large-scale multi-agent environments, the proposed module can be embedded into most reinforcement learning methods, and extensive experiments on multi-agent scenarios demonstrate the effectiveness of the proposed approach.
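To make the two attention stages described above more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that hard attention can be approximated by a top-k selection over per-region scores and that soft attention can be approximated by scaled dot-product softmax weights among the selected regions. The function names, the toy scores, and the feature vectors are all hypothetical.

```python
import math


def hard_attention(scores, k):
    """Assumed stand-in for hard attention: keep the indices of the k highest-scoring regions."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def soft_attention(features):
    """Assumed stand-in for soft attention: scaled dot-product softmax weights between regions."""
    dim = len(features[0])
    weights = []
    for query in features:
        logits = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim) for key in features]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Hypothetical toy data: four candidate regions with a score and a 2-d feature each.
region_scores = [0.1, 0.9, 0.3, 0.7]
region_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

selected = hard_attention(region_scores, k=2)
correlation = soft_attention([region_features[i] for i in selected])
```

Here `selected` holds the indices of the regions that survive the discrete selection, and each row of `correlation` is a probability distribution describing how strongly one selected region attends to the others. In a trained system the scores and features would come from learned networks rather than fixed lists.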
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grants 62076202 and 61976178, the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2018AAA0102900), the Open Research Projects of Zhejiang Lab (No. 2022NB0AB07), the Shaanxi Province Key Research and Development Program of China under Grant 2022GY-090, the CAAI-Huawei MindSpore Open Fund (No. CAAIXSJLJJ-2021-041A), and the Doctor’s Scientific Research and Innovation Foundation of Northwestern Polytechnical University (No. CX2022016).
Ethics declarations
Conflict of interest
We declare that we have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Shike, Y., Jingchen, L. & Haobin, S. Mix-attention approximation for homogeneous large-scale multi-agent reinforcement learning. Neural Comput & Applic 35, 3143–3154 (2023). https://doi.org/10.1007/s00521-022-07880-4
DOI: https://doi.org/10.1007/s00521-022-07880-4