Mix-attention approximation for homogeneous large-scale multi-agent reinforcement learning

  • Original Article
Neural Computing and Applications

Abstract

In large-scale multi-agent environments with homogeneous agents, most existing works rely on approximation methods to simplify the interactions among agents. In this work, we propose a new approximation, termed mix-attention approximation, to enhance multi-agent reinforcement learning. The approximation is produced by a mix-attention module, which forms a consistent consensus among agents in partially observable environments. We use hard attention to compress each agent's perception into a small number of partial regions. A partial region can attract the attention of several agents at the same time, and the correlations among these regions are generated by a soft-attention module. We give the training method for the mix-attention mechanism and discuss the consistency between the mix-attention module and the policy network. We then analyze the feasibility of the mix-attention-based approximation and attempt to integrate our method with other approximation methods. The proposal can be embedded into most reinforcement learning methods for large-scale multi-agent environments, and extensive experiments on multi-agent scenarios demonstrate its effectiveness.
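As a rough illustration of the two stages described in the abstract, the following Python sketch first keeps only a few regions (a stand-in for hard attention) and then relates the kept regions to one another with scaled dot-product soft attention. It is not the authors' implementation. The function names, the top-k selection rule, the pre-computed region scores, and the use of identity projections in place of learned query, key, and value weights are all assumptions made for this example.

```python
import math


def hard_attention(scores, k):
    """Return the indices of the k highest-scoring regions.

    Assumption: a deterministic top-k rule stands in for a learned
    hard-attention module, and the scores are given rather than learned.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features):
    """Scaled dot-product self-attention over region feature vectors.

    Assumption: queries, keys, and values are the raw features
    (identity projections) to keep the sketch short.
    """
    dim = len(features[0])
    mixed = []
    for query in features:
        logits = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights = softmax(logits)
        mixed.append([
            sum(w * feat[c] for w, feat in zip(weights, features))
            for c in range(dim)
        ])
    return mixed


def mix_attention(region_features, region_scores, k):
    """Keep k regions with hard attention, then correlate them softly."""
    kept = hard_attention(region_scores, k)
    selected = [region_features[i] for i in kept]
    return kept, soft_attention(selected)


# Hypothetical toy input: four regions with two-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
scores = [0.1, 0.9, 0.7, 0.2]
kept, mixed = mix_attention(regions, scores, k=2)
```

In this toy setting, each agent's input shrinks from four regions to two, and every output vector is a weighted average of the kept regions' features. A learned version would replace the fixed scores and identity projections with trainable components.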

Data availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grants 62076202 and 61976178, the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2018AAA0102900), the Open Research Projects of Zhejiang Lab (No. 2022NB0AB07), the Shaanxi Province Key Research and Development Program of China under Grant 2022GY-090, the CAAI-Huawei MindSpore Open Fund (No. CAAIXSJLJJ-2021-041A), and the Doctor’s Scientific Research and Innovation Foundation of Northwestern Polytechnical University (CX2022016).

Author information

Corresponding author

Correspondence to Shi Haobin.

Ethics declarations

Conflict of interest

We declare that we have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shike, Y., Jingchen, L. & Haobin, S. Mix-attention approximation for homogeneous large-scale multi-agent reinforcement learning. Neural Comput & Applic 35, 3143–3154 (2023). https://doi.org/10.1007/s00521-022-07880-4

  • DOI: https://doi.org/10.1007/s00521-022-07880-4
