Article

Reinforcement Learning for Modeling and Capturing the Effect of Partner Selection Strategies on the Emergence of Cooperation

Authors:

Somayeh Koohborfardhaghighi,

Eric Pauwels

Economics of Grids, Clouds, Systems, and Services: 18th International Conference, GECON 2021, Virtual Event, September 21–23, 2021, Proceedings

Pages 52 - 65

https://doi.org/10.1007/978-3-030-92916-9_5

Published: 21 September 2021

Abstract

In this research we study the statistical mechanics of cooperation through a simple case of aspiration-driven dynamics in structured populations with mixed strategies. In contrast to the existing literature, we define a pool of possible behaviors for the agents based on bandit learning algorithms, and we highlight settings of the Iterated Prisoner’s Dilemma game that may positively influence the emergence of cooperation, from the perspective of both the entire population and the individual players. We report the level of cooperation and its variation in terms of the median (M) and the interquartile range (IQR), in relation to the observed topological characteristics of the network structures and the partner selection strategies. Our experimental results show that, regardless of the underlying network structure, it is difficult to maintain a fully cooperative society under the Random and Epsilon Greedy partner selection strategies: the reported median values are the lowest, and the IQRs show no sharp increase or decrease under either strategy. By contrast, a fully cooperative population can be reached, and in a shorter time, through the UCB, Epsilon First, and Epsilon Decreasing strategies. Our observations across different network structures also show that a certain level of heterogeneity, in terms of both distance to others and clustering coefficient, is more conducive to the spread of cooperative behavior in a networked population.
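The five partner selection strategies named in the abstract are standard multi-armed bandit policies, where each neighbor plays the role of an arm. The following is a minimal Python sketch of how such policies might be written, together with a median/IQR summary of cooperation levels. It is not the authors' implementation: the function names (`select_partner`, `update`, `median_and_iqr`), the default `epsilon=0.1`, the `1 / (1 + t)` decay schedule, and the use of the UCB1 bonus are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random
import statistics


def select_partner(strategy, counts, values, t, horizon, rng, epsilon=0.1):
    """Return the index of the neighbor to play against in round t.

    counts[i] is how often neighbor i has been chosen so far and values[i]
    is the running mean payoff obtained with it. The neighbor list is
    assumed to be non-empty. All parameter choices here are hypothetical.
    """
    n = len(counts)
    explore = lambda: rng.randrange(n)                     # uniform random neighbor
    exploit = lambda: max(range(n), key=lambda i: values[i])  # best mean payoff so far

    if strategy == "random":
        return explore()
    if strategy == "epsilon_greedy":
        # Explore with fixed probability epsilon, otherwise exploit.
        return explore() if rng.random() < epsilon else exploit()
    if strategy == "epsilon_first":
        # Explore only during the first epsilon * horizon rounds.
        return explore() if t < epsilon * horizon else exploit()
    if strategy == "epsilon_decreasing":
        # Exploration probability shrinks over time (assumed 1 / (1 + t) schedule).
        return explore() if rng.random() < 1.0 / (1 + t) else exploit()
    if strategy == "ucb":
        # UCB1: try every neighbor once, then maximize mean + confidence bonus.
        for i in range(n):
            if counts[i] == 0:
                return i
        total = sum(counts)
        return max(
            range(n),
            key=lambda i: values[i] + math.sqrt(2 * math.log(total) / counts[i]),
        )
    raise ValueError(f"unknown strategy: {strategy}")


def update(counts, values, i, payoff):
    """Incrementally update the running mean payoff for neighbor i."""
    counts[i] += 1
    values[i] += (payoff - values[i]) / counts[i]


def median_and_iqr(levels):
    """Summarize cooperation levels by their median and interquartile range."""
    q1, med, q3 = statistics.quantiles(levels, n=4)
    return med, q3 - q1
```

In a simulation loop, an agent would call `select_partner` once per round with a shared `random.Random` instance, play the Iterated Prisoner's Dilemma against the chosen neighbor, and pass the resulting payoff to `update`. The per-run cooperation levels could then be summarized with `median_and_iqr`.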

Index Terms

Reinforcement Learning for Modeling and Capturing the Effect of Partner Selection Strategies on the Emergence of Cooperation
1. Computing methodologies
  1. Artificial intelligence
  2. Machine learning

Index terms have been assigned to the content through auto-classification.

Published In

Economics of Grids, Clouds, Systems, and Services: 18th International Conference, GECON 2021, Virtual Event, September 21–23, 2021, Proceedings

Sep 2021

235 pages

ISBN: 978-3-030-92915-2

DOI: 10.1007/978-3-030-92916-9

Editors:

Konstantinos Tserpes (Harokopio University, Athens, Greece),

Jörn Altmann (Seoul National University, Seoul, Korea (Republic of)),

José Ángel Bañares (University of Zaragoza, Zaragoza, Spain),

Orna Agmon Ben-Yehuda (Caesarea Rothschild Institute, University of Haifa, Haifa, Israel),

Karim Djemame (University of Leeds, Leeds, UK),

Vlado Stankovski (University of Ljubljana, Ljubljana, Slovenia),

Bruno Tuffin (Inria Rennes - Bretagne Atlantique Research Centre, Rennes, France)

© Springer Nature Switzerland AG 2021.

Publisher

Springer-Verlag

Berlin, Heidelberg
