DOI: 10.5555/3495724.3497173
Research article · Free access

Succinct and robust multi-agent communication with temporal message control

Published: 06 December 2020

Abstract

Recent studies have shown that introducing communication between agents can significantly improve overall performance in cooperative multi-agent reinforcement learning (MARL). However, existing communication schemes often require agents to exchange an excessive number of messages at run-time under a reliable communication channel, which hinders their practicality in many real-world situations. In this paper, we present Temporal Message Control (TMC), a simple yet effective approach for achieving succinct and robust communication in MARL. TMC applies a temporal smoothing technique to drastically reduce the amount of information exchanged between agents. Experiments show that TMC can significantly reduce inter-agent communication overhead without impacting accuracy. Furthermore, TMC demonstrates much better robustness against transmission loss than existing approaches in lossy networking environments.
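The abstract describes the core idea (temporal smoothing to suppress redundant inter-agent messages) but gives no implementation details. The following is a minimal, hypothetical sketch of one way such a gate could look: each agent keeps a smoothed running estimate of its previously transmitted messages and only broadcasts a new message when the current one deviates from that estimate by more than a threshold, while receivers simply reuse the last received message otherwise. All names here (TemporalMessageGate, alpha, threshold) are illustrative assumptions, not the paper's actual API.

```python
import numpy as np


class TemporalMessageGate:
    """Hypothetical sketch of temporal message smoothing/gating.

    The sender tracks an exponentially smoothed copy of its transmitted
    messages and suppresses any new message that stays close to that
    estimate. Receivers fall back to the last received message, which
    also gives some tolerance to occasional transmission loss.
    """

    def __init__(self, dim: int, alpha: float = 0.9, threshold: float = 0.1):
        self.alpha = alpha             # smoothing factor for the running estimate
        self.threshold = threshold     # minimum deviation required to transmit
        self.smoothed = np.zeros(dim)  # smoothed estimate of sent messages
        self.initialized = False

    def step(self, message: np.ndarray):
        """Return (send_flag, payload); payload is None when nothing is sent."""
        if not self.initialized:
            self.initialized = True
            self.smoothed = message.copy()
            return True, message
        deviation = np.linalg.norm(message - self.smoothed)
        if deviation > self.threshold:
            # Message changed enough: update the running estimate and transmit.
            self.smoothed = self.alpha * self.smoothed + (1 - self.alpha) * message
            return True, message
        return False, None  # suppress the redundant message


# Usage sketch: gate each agent's outgoing message before the channel.
gate = TemporalMessageGate(dim=8)
for t in range(100):
    msg = np.random.randn(8) * 0.01  # e.g. output of the agent's message head
    send, payload = gate.step(msg)
    if send:
        pass  # broadcast `payload` to the other agents; otherwise send nothing
```

Under this assumed scheme, communication volume falls whenever messages are temporally redundant, and a dropped packet only costs the receiver a slightly stale message rather than a missing one, which is consistent with the robustness behavior the abstract reports.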



        Information & Contributors

        Information

        Published In

        cover image Guide Proceedings
        NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems
        December 2020
        22651 pages
ISBN: 9781713829546

        Publisher

        Curran Associates Inc.

        Red Hook, NY, United States


        Qualifiers

        • Research-article
        • Research
        • Refereed limited
