
Deep Reinforcement Learning for a Humanoid Robot Soccer Player

Journal of Intelligent & Robotic Systems (2021)

Abstract

This paper investigates the use of Deep Reinforcement Learning (DRL) in the humanoid robot soccer environment, where a robot must learn skills ranging from basic to complex while interacting with the environment through images received by its own camera. To do so, the Dueling Double DQN algorithm is used: it receives the images from the robot’s camera and decides which discrete action to perform, such as walking forward, turning left, or kicking the ball. The first experiments were performed in a robotic simulator, in which the robot learned three different tasks with DRL: walking towards the ball, acting as a penalty taker, and acting as a goalkeeper. In the second experiment, the policy learned for walking towards the ball was transferred to a real humanoid robot, and similar behavior was observed, even though the environment was not exactly the same after the change of domain. The results show that DRL can be used to learn tasks related to the role of a humanoid robot soccer player, such as goalkeeper and penalty taker.
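As a concrete illustration of the decision pipeline described above, the sketch below builds a dueling Q-network that maps stacked camera frames to one Q-value per discrete action, and computes Double DQN learning targets. It is a minimal sketch in Python with TensorFlow/Keras, assuming 84x84 greyscale frames stacked four deep and a hypothetical five-action set; the network shapes, hyperparameters, and helper names are illustrative and are not taken from the paper.

    # Minimal sketch of the Dueling Double DQN pieces described in the
    # abstract. The frame shape (84x84x4) and the five-action set are
    # illustrative assumptions, not the paper's actual configuration.
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers, Model

    N_ACTIONS = 5  # e.g. walk forward, turn left, turn right, kick, stop

    def build_dueling_q_network():
        """CNN mapping a stack of camera frames to one Q-value per action,
        split into a state-value stream V(s) and an advantage stream A(s, a)."""
        frames = layers.Input(shape=(84, 84, 4))
        x = layers.Conv2D(32, 8, strides=4, activation="relu")(frames)
        x = layers.Conv2D(64, 4, strides=2, activation="relu")(x)
        x = layers.Conv2D(64, 3, strides=1, activation="relu")(x)
        x = layers.Flatten()(x)
        value = layers.Dense(512, activation="relu")(x)
        value = layers.Dense(1)(value)                 # V(s)
        adv = layers.Dense(512, activation="relu")(x)
        adv = layers.Dense(N_ACTIONS)(adv)             # A(s, a)
        # Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
        q = layers.Lambda(
            lambda t: t[0] + t[1] - tf.reduce_mean(t[1], axis=1, keepdims=True)
        )([value, adv])
        return Model(frames, q)

    online_net = build_dueling_q_network()
    target_net = build_dueling_q_network()

    def double_dqn_targets(rewards, next_frames, dones, gamma=0.99):
        """Double DQN target: the online net selects the next action and the
        target net evaluates it, reducing plain DQN's overestimation bias."""
        next_actions = np.argmax(online_net.predict(next_frames, verbose=0), axis=1)
        next_q = target_net.predict(next_frames, verbose=0)
        chosen_q = next_q[np.arange(len(next_actions)), next_actions]
        return rewards + gamma * (1.0 - dones) * chosen_q

Subtracting the mean advantage in the aggregation step keeps the value and advantage streams identifiable; the greedy action is then the argmax over the aggregated Q-values, matching the discrete walk/turn/kick action selection described above.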


Availability of data and materials

The scripts used in the experiments presented in this paper are available at https://github.com/Isaac25silva/DRL_goalkeeper and https://github.com/Isaac25silva/DRLtransfer.git.


Acknowledgments

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.


Author information


Contributions

– Conceptualization: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Methodology: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Software: I. J. da Silva

– Investigation: I. J. da Silva

– Formal analysis: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Validation: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Data curation: I. J. da Silva

– Writing – original draft: I. J. da Silva; D. H. Perico; T. P. D. Homem

– Writing – review & editing: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Visualization: I. J. da Silva; D. H. Perico; T. P. D. Homem; R. A. C. Bianchi

– Resources: I. J. da Silva

– Funding acquisition: I. J. da Silva

– Project administration: R. A. C. Bianchi

– Supervision: R. A. C. Bianchi

Corresponding author

Correspondence to Isaac Jesus da Silva.

Ethics declarations

Competing interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

da Silva, I.J., Perico, D.H., Homem, T.P.D. et al. Deep Reinforcement Learning for a Humanoid Robot Soccer Player. J Intell Robot Syst 102, 69 (2021). https://doi.org/10.1007/s10846-021-01333-1

