DOI: 10.1109/IROS51168.2021.9636046
Research article

Learning to Play Soccer From Scratch: Sample-Efficient Emergent Coordination Through Curriculum-Learning and Competition

Published: 27 September 2021

Abstract

This work proposes a scheme for learning complex multi-agent behaviors in a sample-efficient manner, applied to 2v2 soccer. The problem is formulated as a Markov game and solved using deep reinforcement learning. We propose a basic multi-agent extension of TD3 for learning each player's policy in a decentralized manner. To ease learning, the task of 2v2 soccer is divided into three stages: 1v0, 1v1, and 2v2. Learning in the multi-agent stages (1v1 and 2v2) uses agents trained in a previous stage as fixed opponents. In addition, we propose experience sharing, a method that reuses the experience of a fixed opponent trained in a previous stage to train the agent currently learning, together with a form of frame-skipping, to raise performance significantly. Our results show that high-quality soccer play can be obtained with our approach in just under 40M interactions. A summarized video of the resulting gameplay can be found at https://youtu.be/pScrKNqfELE.
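The curriculum described above can be sketched in code. The following is a minimal illustration, not the paper's implementation: the actual TD3 updates, the soccer environment, and the replay machinery are replaced by stubs, and all names (`train_stage`, `FRAME_SKIP`, the dictionary "policy") are hypothetical. It only shows the structure of the three-stage curriculum, the frozen-opponent handoff between stages, experience sharing from the fixed opponent, and frame-skipping.

```python
import random

FRAME_SKIP = 2  # assumed skip length: each action is held for k environment steps


def train_stage(stage, opponent_policy, steps=1000):
    """Stand-in for decentralized TD3 training of one curriculum stage.

    `opponent_policy` is the frozen product of the previous stage (None in 1v0).
    Returns a trivial 'policy' after `steps` environment interactions.
    """
    shared_buffer = []  # experience sharing: the fixed opponent's transitions
    policy = {"stage": stage, "score": 0.0}
    t = 0
    while t < steps:
        action = random.random()        # learner picks an action
        for _ in range(FRAME_SKIP):     # frame-skipping: action repeated k frames
            t += 1
            reward = action - 0.5       # placeholder reward signal
            policy["score"] += reward
            if opponent_policy is not None:
                # the opponent's experience is also stored for the learner
                shared_buffer.append((opponent_policy["stage"], reward))
    return policy


# Curriculum: 1v0 -> 1v1 -> 2v2; each stage's result is frozen
# and used as the fixed opponent of the next stage.
opponent = None
for stage in ["1v0", "1v1", "2v2"]:
    opponent = train_stage(stage, opponent)
```

After the loop, `opponent` holds the final-stage policy; in the paper this would be the 2v2 agent trained against frozen 1v1 opponents.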



Published In

2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Sep 2021, 7915 pages
Publisher: IEEE Press
