
DOI: 10.5555/3463952.3464045 · AAMAS Conference Proceedings · Research article

Structured Diversification Emergence via Reinforced Organization Control and Hierarchical Consensus Learning

Published: 03 May 2021

Abstract

When solving a complex task, humans spontaneously form teams, with each team completing a different part of the whole task; cooperation between teammates then improves efficiency. In current cooperative MARL methods, however, the cooperating teams are constructed either through heuristics or through end-to-end black-box optimization. To improve the efficiency of cooperation and exploration, we propose Rochico, a structured diversification emergence MARL framework based on reinforced organization control and hierarchical consensus learning. Rochico first learns an adaptive grouping policy through an organization control module built on independent multi-agent reinforcement learning. After team formation, a hierarchical consensus module imposes a consensus constraint on the agents' hierarchical intentions. Combining this consensus module with a decision module enhanced by a self-supervised intrinsic reward, Rochico outputs the final diversified multi-agent cooperative policy.
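The three-stage pipeline the abstract describes (grouping, then consensus, then reward-enhanced decision) can be sketched as toy code. This is an illustrative outline only, not the paper's implementation: all names (`organization_control`, `hierarchical_consensus`, `decide`) and the round-robin grouping rule are hypothetical stand-ins for the learned modules.

```python
# Hypothetical sketch of the Rochico pipeline from the abstract.
# Each learned module is replaced by a deterministic toy stand-in.

def organization_control(agents, n_teams):
    """Stand-in for the reinforced grouping policy: assigns each
    agent to a team (round-robin here, for determinism)."""
    return {agent: i % n_teams for i, agent in enumerate(agents)}

def hierarchical_consensus(teams):
    """Stand-in for the consensus module: agents on the same team
    share one high-level intention, enforcing the consensus constraint."""
    return {agent: f"intent-{team}" for agent, team in teams.items()}

def decide(agent, intention, intrinsic_bonus=0.1):
    """Stand-in for the decision module: environment reward plus a
    self-supervised-style intrinsic bonus encouraging exploration."""
    base_reward = 1.0  # placeholder environment reward
    return base_reward + intrinsic_bonus

agents = ["a0", "a1", "a2", "a3"]
teams = organization_control(agents, n_teams=2)
intents = hierarchical_consensus(teams)
rewards = {a: decide(a, intents[a]) for a in agents}
```

The point of the sketch is the data flow: team assignments feed the consensus step, and the shared intentions condition each agent's decision, which mixes extrinsic and intrinsic reward.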



Published In

AAMAS '21: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, May 2021, 1899 pages. ISBN: 9781450383073.

Publisher: International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC.


Author Tags

1. cooperative MARL
2. diversification
3. organization control

Conference

AAMAS '21. Overall acceptance rate: 1,155 of 5,036 submissions, 23%.
