DOI: 10.1109/IROS51168.2021.9636046
Research article

Learning to Play Soccer From Scratch: Sample-Efficient Emergent Coordination Through Curriculum-Learning and Competition

Published: 27 September 2021

Abstract

This work proposes a scheme for learning complex multi-agent behaviors in a sample-efficient manner, applied to 2v2 soccer. The problem is formulated as a Markov game and solved using deep reinforcement learning. We propose a basic multi-agent extension of TD3 for learning each player's policy in a decentralized manner. To ease learning, the task of 2v2 soccer is divided into three stages: 1v0, 1v1, and 2v2. Learning in the multi-agent stages (1v1 and 2v2) uses agents trained in a previous stage as fixed opponents. In addition, we propose experience sharing, a method that reuses the experience of a fixed opponent trained in a previous stage to train the agent currently learning, together with a form of frame-skipping, to raise performance significantly. Our results show that high-quality soccer play can be obtained with our approach in just under 40M interactions. A summarized video of the resulting gameplay can be found at https://youtu.be/pScrKNqfELE.
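The curriculum described above can be sketched in code. The following is a minimal illustration, not the paper's implementation: the actual TD3 updates, the soccer environment, and the replay machinery are replaced by stubs, and all names (`train_stage`, `FRAME_SKIP`, the dictionary "policy") are hypothetical. It only shows the structure of the three-stage curriculum, the frozen-opponent handoff between stages, experience sharing from the fixed opponent, and frame-skipping.

```python
import random

FRAME_SKIP = 2  # assumed skip length: each action is held for k environment steps


def train_stage(stage, opponent_policy, steps=1000):
    """Stand-in for decentralized TD3 training of one curriculum stage.

    `opponent_policy` is the frozen product of the previous stage (None in 1v0).
    Returns a trivial 'policy' after `steps` environment interactions.
    """
    shared_buffer = []  # experience sharing: the fixed opponent's transitions
    policy = {"stage": stage, "score": 0.0}
    t = 0
    while t < steps:
        action = random.random()        # learner picks an action
        for _ in range(FRAME_SKIP):     # frame-skipping: action repeated k frames
            t += 1
            reward = action - 0.5       # placeholder reward signal
            policy["score"] += reward
            if opponent_policy is not None:
                # the opponent's experience is also stored for the learner
                shared_buffer.append((opponent_policy["stage"], reward))
    return policy


# Curriculum: 1v0 -> 1v1 -> 2v2; each stage's result is frozen
# and used as the fixed opponent of the next stage.
opponent = None
for stage in ["1v0", "1v1", "2v2"]:
    opponent = train_stage(stage, opponent)
```

After the loop, `opponent` holds the final-stage policy; in the paper this would be the 2v2 agent trained against frozen 1v1 opponents.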



Published In

2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Sep 2021, 7915 pages
Publisher: IEEE Press
