
Deep Reinforcement Learning for a Humanoid Robot Soccer Player

Journal of Intelligent & Robotic Systems (2021)

Abstract

This paper investigates the use of Deep Reinforcement Learning (DRL) in the humanoid robot soccer environment, where a robot must learn skills ranging from basic to complex while interacting with the environment through images received by its own camera. To do so, the Dueling Double DQN algorithm is used: it receives the images from the robot’s camera and decides which discrete action to perform, such as walking forward, turning left, or kicking the ball. The first experiments were performed in a robotic simulator, in which the robot learned three different tasks with DRL: walking towards the ball, acting as a penalty taker, and acting as a goalkeeper. In the second experiment, the policy learned for walking towards the ball was transferred to a real humanoid robot, and similar behavior was observed, even though the environment was not exactly the same after the change of domain. The results show that DRL can be used to learn tasks related to the role of a humanoid robot soccer player, such as goalkeeper and penalty taker.
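As a concrete illustration of the decision pipeline described above, the sketch below builds a dueling Q-network that maps stacked camera frames to one Q-value per discrete action, and computes Double DQN learning targets. It is a minimal sketch in Python with TensorFlow/Keras, assuming 84x84 greyscale frames stacked four deep and a hypothetical five-action set; the network shapes, hyperparameters, and helper names are illustrative and are not taken from the paper.

    # Minimal sketch of the Dueling Double DQN pieces described in the
    # abstract. The frame shape (84x84x4) and the five-action set are
    # illustrative assumptions, not the paper's actual configuration.
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers, Model

    N_ACTIONS = 5  # e.g. walk forward, turn left, turn right, kick, stop

    def build_dueling_q_network():
        """CNN mapping a stack of camera frames to one Q-value per action,
        split into a state-value stream V(s) and an advantage stream A(s, a)."""
        frames = layers.Input(shape=(84, 84, 4))
        x = layers.Conv2D(32, 8, strides=4, activation="relu")(frames)
        x = layers.Conv2D(64, 4, strides=2, activation="relu")(x)
        x = layers.Conv2D(64, 3, strides=1, activation="relu")(x)
        x = layers.Flatten()(x)
        value = layers.Dense(512, activation="relu")(x)
        value = layers.Dense(1)(value)                 # V(s)
        adv = layers.Dense(512, activation="relu")(x)
        adv = layers.Dense(N_ACTIONS)(adv)             # A(s, a)
        # Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
        q = layers.Lambda(
            lambda t: t[0] + t[1] - tf.reduce_mean(t[1], axis=1, keepdims=True)
        )([value, adv])
        return Model(frames, q)

    online_net = build_dueling_q_network()
    target_net = build_dueling_q_network()

    def double_dqn_targets(rewards, next_frames, dones, gamma=0.99):
        """Double DQN target: the online net selects the next action and the
        target net evaluates it, reducing plain DQN's overestimation bias."""
        next_actions = np.argmax(online_net.predict(next_frames, verbose=0), axis=1)
        next_q = target_net.predict(next_frames, verbose=0)
        chosen_q = next_q[np.arange(len(next_actions)), next_actions]
        return rewards + gamma * (1.0 - dones) * chosen_q

Subtracting the mean advantage in the aggregation step keeps the value and advantage streams identifiable; the greedy action is then the argmax over the aggregated Q-values, matching the discrete walk/turn/kick action selection described above.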


Availability of data and materials

The scripts used in the experiments presented in this paper are available at https://github.com/Isaac25silva/DRL_goalkeeper and https://github.com/Isaac25silva/DRLtransfer.git.


Acknowledgments

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.


Author information


Contributions

– Conceptualization: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Methodology: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Software: I. J. da Silva

– Investigation: I. J. da Silva

– Formal analysis: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Validation: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Data curation: I. J. da Silva

– Writing – original draft: I. J. da Silva; D. H. Perico; T. P. D. Homem

– Writing – review & editing: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Visualization: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Resources: I. J. da Silva

– Funding acquisition: I. J. da Silva

– Project administration: R. A. C. Bianchi

– Supervision: R. A. C. Bianchi

Corresponding author

Correspondence to Isaac Jesus da Silva.

Ethics declarations

Competing interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

da Silva, I.J., Perico, D.H., Homem, T.P.D. et al. Deep Reinforcement Learning for a Humanoid Robot Soccer Player. J Intell Robot Syst 102, 69 (2021). https://doi.org/10.1007/s10846-021-01333-1

