DOI: 10.5555/3545946.3598669

Model-based Sparse Communication in Multi-agent Reinforcement Learning

Published: 30 May 2023

Abstract

Learning to communicate efficiently is central to multi-agent reinforcement learning (MARL). Existing methods often require agents to exchange messages intensively, which overloads communication channels and incurs high communication overhead. Only a few methods target learning sparse communication, and those allow only limited information to be shared, which hampers the efficiency of policy learning. In this work, we propose model-based communication (MBC), a learning framework with a decentralized communication scheduling process that enables multiple agents to make decisions with sparse communication. In particular, MBC introduces a model-based message estimator that estimates up-to-date global messages from past local data. A decentralized message scheduling mechanism then determines, based on the estimation, whether a message should be sent. We evaluated our method in a variety of mixed cooperative-competitive environments. The experimental results show that MBC achieves better performance and lower channel overhead than state-of-the-art baselines.
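The paper's estimator and scheduler are learned end to end; the sketch below is only a minimal, hypothetical illustration of the scheduling idea described in the abstract. It assumes a toy linear message estimator, a tanh message encoder, and a fixed divergence threshold (none of these come from the paper): an agent broadcasts its true message only when it differs from what a shared estimator would have predicted, so on all other steps peers fall back on the local estimate and the channel stays sparse.

```python
import numpy as np


class MessageEstimator:
    """Toy stand-in for the learned message model: predicts an agent's
    next message from the last message it actually broadcast. A fixed
    random linear map here; the paper learns this model from past data."""

    def __init__(self, dim, rng):
        self.W = rng.normal(scale=0.1, size=(dim, dim))

    def predict(self, last_sent):
        return self.W @ last_sent


class SparseCommAgent:
    """Hypothetical agent with a decentralized scheduling rule: send only
    when the shared estimator's prediction drifts too far from the truth."""

    def __init__(self, dim, threshold=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.estimator = MessageEstimator(dim, rng)
        self.last_sent = np.zeros(dim)
        self.threshold = threshold  # illustrative constant, not tuned

    def encode(self, obs):
        # Placeholder encoder; in practice a layer of the policy network.
        return np.tanh(obs)

    def step(self, obs):
        true_msg = self.encode(obs)
        predicted = self.estimator.predict(self.last_sent)
        # Communicate only when peers could not have estimated the message.
        if np.linalg.norm(true_msg - predicted) > self.threshold:
            self.last_sent = true_msg
            return true_msg, True   # broadcast over the channel
        return predicted, False     # peers reuse the local estimate


if __name__ == "__main__":
    agent = SparseCommAgent(dim=4)
    rng = np.random.default_rng(1)
    sent = sum(agent.step(rng.normal(size=4))[1] for _ in range(20))
    print(f"broadcast on {sent} of 20 steps")
```

Raising or lowering `threshold` in this sketch trades channel overhead against the staleness peers must tolerate, which is the trade-off the abstract describes between sparse communication and efficient policy learning.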


Cited By

  • (2024) Towards Generalizability of Multi-Agent Reinforcement Learning in Graphs with Recurrent Message Passing. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 1919-1927. DOI: 10.5555/3635637.3663055. Online publication date: 6 May 2024.
  • (2024) Context-aware Communication for Multi-agent Reinforcement Learning. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 1156-1164. DOI: 10.5555/3635637.3662972. Online publication date: 6 May 2024.


    Published In

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems
May 2023, 3131 pages
ISBN: 9781450394321
General Chairs: Noa Agmon, Bo An
Program Chairs: Alessandro Ricci, William Yeoh

    Publisher

    International Foundation for Autonomous Agents and Multiagent Systems

    Richland, SC

    Publication History

    Published: 30 May 2023

    Author Tags

    1. communication learning
    2. message scheduling
    3. multi-agent reinforcement learning
    4. multi-agent system

    Qualifiers

    • Research-article

    Funding Sources

    • China Scholarship Council

    Conference

    AAMAS '23

    Acceptance Rates

Overall Acceptance Rate: 1,155 of 5,036 submissions, 23%

