Abstract
This paper investigates the use of Deep Reinforcement Learning (DRL) in the humanoid robot soccer domain, where a robot must learn skills ranging from basic to complex while interacting with the environment through images received by its own camera. To this end, the Dueling Double DQN algorithm is used: it receives images from the robot’s camera and decides which discrete action to perform, such as walking forward, turning left, or kicking the ball. The first experiments were performed in a robotic simulator, in which the robot learned three different tasks with DRL: walking towards the ball, acting as a penalty taker, and acting as a goalkeeper. In the second experiment, the policy learned for walking towards the ball was transferred to a real humanoid robot, and similar behavior was observed, even though the environment was not exactly the same after the change of domain. Results showed that DRL can be used to learn tasks related to the roles of a humanoid robot-soccer player, such as goalkeeper and penalty taker.
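The two ingredients named in the abstract can be illustrated compactly. In a dueling network, the Q-value is assembled from a state-value stream and an advantage stream, Q(s, a) = V(s) + A(s, a) − mean_a A(s, a); in Double DQN, the online network selects the next action and the target network evaluates it. The sketch below is illustrative only, not the authors' implementation: the action set and the stream outputs are hypothetical placeholders standing in for the outputs of the convolutional network described in the paper.

```python
import numpy as np

# Hypothetical discrete action set, standing in for the paper's actions.
ACTIONS = ["walk_forward", "turn_left", "turn_right", "kick"]

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    return value + advantages - advantages.mean()

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online net picks the next action,
    the target net supplies its value estimate."""
    if done:
        return reward
    a_star = int(np.argmax(q_online_next))          # action selection (online net)
    return reward + gamma * q_target_next[a_star]   # action evaluation (target net)

# Example: an advantage stream that favors kicking the ball.
q = dueling_q(value=1.0, advantages=np.array([0.0, -0.5, -0.5, 1.0]))
best = ACTIONS[int(np.argmax(q))]  # -> "kick"
```

Subtracting the mean advantage keeps V and A identifiable (otherwise a constant could shift between the two streams), and the decoupled selection/evaluation in the target reduces the overestimation bias of plain Q-learning.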
Availability of data and materials
The scripts utilized in the experiments presented in this paper are available at https://github.com/Isaac25silva/DRL_goalkeeper and https://github.com/Isaac25silva/DRLtransfer.git.
Acknowledgments
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
Author information
Contributions
– Conceptualization: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi
– Methodology: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi
– Software: I. J. da Silva
– Investigation: I. J. da Silva
– Formal analysis: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi
– Validation: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi
– Data curation: I. J. da Silva
– Writing – original draft: I. J. da Silva; D. H. Perico; T. P. D. Homem
– Writing – review & editing: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi
– Visualization: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi
– Resources: I. J. da Silva
– Funding acquisition: I. J. da Silva
– Project administration: R. A. C. Bianchi
– Supervision: R. A. C. Bianchi
Ethics declarations
Competing interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
da Silva, I.J., Perico, D.H., Homem, T.P.D. et al. Deep Reinforcement Learning for a Humanoid Robot Soccer Player. J Intell Robot Syst 102, 69 (2021). https://doi.org/10.1007/s10846-021-01333-1