Mix-attention approximation for homogeneous large-scale multi-agent reinforcement learning

Abstract

In large-scale multi-agent environments with homogeneous agents, most existing methods rely on approximations that simplify the interactions among agents. In this work, we propose a new approximation, termed mix-attention approximation, to enhance multi-agent reinforcement learning. The approximation is produced by a mix-attention module, which forms a consistent consensus among agents in partially observable environments. We use hard attention to compress each agent's perception into several narrower regions. A single region can attract the attention of several agents at the same time, and the correlations among these regions are generated by a soft-attention module. We describe the training method for the mix-attention mechanism and discuss the consistency between the mix-attention module and the policy network. We then analyze the feasibility of this mix-attention-based approximation and explore how our method can be integrated with other approximation methods. The proposed module can be embedded into most reinforcement learning methods for large-scale multi-agent environments, and extensive experiments on multi-agent scenarios demonstrate its effectiveness.
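The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (hard_select, soft_correlation, mix_attention), the dot-product scoring, the argmax selection, and the toy data are all assumptions made for illustration, and the training procedure described in the paper is omitted. Each agent first picks one region with hard attention, and the regions that were picked are then related to one another with soft (softmax-weighted) attention.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def hard_select(query, region_feats):
    """Hard attention (assumed form): index of the single best-scoring region."""
    scores = [dot(query, f) for f in region_feats]
    return max(range(len(scores)), key=scores.__getitem__)


def soft_correlation(feats):
    """Soft attention (assumed form): scaled dot-product weights among regions."""
    scale = math.sqrt(len(feats[0]))
    return [softmax([dot(fi, fj) / scale for fj in feats]) for fi in feats]


def mix_attention(queries, region_feats):
    # Stage 1: every agent compresses its perception to one region.
    choice = [hard_select(q, region_feats) for q in queries]
    # Several agents may share a region, so keep each selected region once.
    regions = sorted(set(choice))
    feats = [region_feats[r] for r in regions]
    # Stage 2: correlate the selected regions with soft attention.
    weights = soft_correlation(feats)
    dim = len(feats[0])
    context = {
        r: [sum(w * f[k] for w, f in zip(row, feats)) for k in range(dim)]
        for r, row in zip(regions, weights)
    }
    return choice, regions, weights, context


# Toy data (hypothetical): three candidate regions, four homogeneous agents.
region_feats = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
queries = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 2.0]]
choice, regions, weights, context = mix_attention(queries, region_feats)
```

In this toy setup, agents that select the same region read the same context vector, which is one simple way to picture the consensus the abstract refers to. A learned or stochastic hard-attention policy would replace the argmax in practice.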

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grants 62076202 and 61976178, the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2018AAA0102900), the Open Research Projects of Zhejiang Lab (No. 2022NB0AB07), the Shaanxi Province Key Research and Development Program of China under Grant 2022GY-090, the CAAI-Huawei MindSpore Open Fund (No. CAAIXSJLJJ-2021-041A), and the Doctor’s Scientific Research and Innovation Foundation of Northwestern Polytechnical University (CX2022016).

Author information

Yang Shike and Li Jingchen contributed equally to this work.

Authors and Affiliations

School of Cybersecurity, Northwestern Polytechnical University, Youyi Western Street, Xi’an, 710072, Shaanxi, China
Yang Shike
School of Computer Science, Northwestern Polytechnical University, Youyi Western Street, Xi’an, 710072, Shaanxi, China
Li Jingchen & Shi Haobin
Twentieth Research Institute, China Electronic Technology Group Corporation, Zhangjiabao Street, Xi’an, 710018, Shaanxi, China
Yang Shike

Corresponding author

Correspondence to Shi Haobin.

Ethics declarations

Conflict of interest

We declare that we have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shike, Y., Jingchen, L. & Haobin, S. Mix-attention approximation for homogeneous large-scale multi-agent reinforcement learning. Neural Comput & Applic 35, 3143–3154 (2023). https://doi.org/10.1007/s00521-022-07880-4

Received: 21 February 2022
Accepted: 21 September 2022
Published: 07 October 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s00521-022-07880-4
