Avoid Overfitting in Deep Reinforcement Learning: Increasing Robustness Through Decentralized Control

  • Conference paper
  • First Online:
Artificial Neural Networks and Machine Learning – ICANN 2021 (ICANN 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12894)

Included in the following conference series: International Conference on Artificial Neural Networks (ICANN)

Abstract

Deep Reinforcement Learning (DRL) is known to be brittle with respect to the choice of hyperparameters. In particular, the structure of the employed Deep Neural Networks has been shown to be important, and overfitting has become a common problem for DRL approaches. This study first analyzes how severe overfitting is in DRL on standard continuous control problems. Second, we argue that this may be partly due to the centralized perspective on control, in which a single holistic controller has to learn from all available sensory inputs. As this input space is usually high-dimensional, it appears natural that a high-capacity Neural Network controller starts to pick up spurious correlations when trained on limited data. As a consequence, large Neural Network controllers start to base their decisions on unimportant inputs. In contrast, we offer a decentralized perspective in which control is distributed over local modules that act on local sensory inputs of much lower dimensionality. Such a prior of local inputs is biologically inspired; it leads, on the one hand, to much faster learning and, on the other hand, is more robust against overfitting. Last, as decentralization changes input (and output) dimensionality, we evaluated different common Neural Network initialization schemes and found that Glorot initialization provides the most robust results.
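
To make the decentralized perspective concrete, the sketch below contrasts a single holistic controller that maps the full observation vector to all actions with small per-leg modules that each act only on a low-dimensional local slice of the observations, all layers using Glorot (Xavier) initialization. This is a minimal illustration in PyTorch under assumed dimensions (a four-legged walker with a 44-dimensional state split into 11-dimensional per-leg inputs); these numbers and names are hypothetical and do not reproduce the paper's experimental setup.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for a four-legged walker; chosen for illustration
# only, not taken from the paper's experiments.
FULL_OBS_DIM = 44       # full proprioceptive state seen by a holistic controller
LOCAL_OBS_DIM = 11      # inputs local to one leg (joint angles, velocities, ...)
ACTIONS_PER_LEG = 2     # torques for the joints of one leg
NUM_LEGS = 4


def glorot_mlp(in_dim: int, out_dim: int, hidden: int = 64) -> nn.Sequential:
    """Small two-hidden-layer policy network with Glorot (Xavier) initialization."""
    net = nn.Sequential(
        nn.Linear(in_dim, hidden), nn.Tanh(),
        nn.Linear(hidden, hidden), nn.Tanh(),
        nn.Linear(hidden, out_dim),
    )
    for layer in net:
        if isinstance(layer, nn.Linear):
            nn.init.xavier_uniform_(layer.weight)  # Glorot initialization
            nn.init.zeros_(layer.bias)
    return net


# Centralized perspective: one controller learns from all sensory inputs.
central_policy = glorot_mlp(FULL_OBS_DIM, ACTIONS_PER_LEG * NUM_LEGS)

# Decentralized perspective: one small module per leg, each restricted to the
# low-dimensional observation slice that is local to that leg.
leg_policies = nn.ModuleList(
    glorot_mlp(LOCAL_OBS_DIM, ACTIONS_PER_LEG) for _ in range(NUM_LEGS)
)


def decentralized_action(full_obs: torch.Tensor) -> torch.Tensor:
    """Split the full observation into per-leg slices and concatenate the
    actions produced by the local modules."""
    local_obs = full_obs.split(LOCAL_OBS_DIM, dim=-1)
    return torch.cat([pi(o) for pi, o in zip(leg_policies, local_obs)], dim=-1)


obs = torch.randn(1, FULL_OBS_DIM)
print(central_policy(obs).shape)        # torch.Size([1, 8])
print(decentralized_action(obs).shape)  # torch.Size([1, 8])
```

Each decentralized module thus maps an 11-dimensional input to 2 outputs instead of 44 to 8, which is the lower-dimensional, local input prior the abstract refers to; in a learning setting each module would be trained (e.g. with PPO) on its own local observations.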

References

  1. Andrychowicz, M., et al.: What matters in on-policy reinforcement learning? A large-scale empirical study. arXiv preprint arXiv:2006.05990 (2020)

  2. Arulkumaran, K., Deisenroth, M.P., Brundage, M., Bharath, A.A.: A brief survey of deep reinforcement learning. IEEE Sig. Process. Mag. 34(6), 26–38 (2017)

  3. Bidaye, S.S., Bockemühl, T., Büschges, A.: Six-legged walking in insects: how CPGs, peripheral feedback, and descending signals generate coordinated and adaptive motor rhythms. J. Neurophysiol. 119(2), 459–475 (2018)

  4. Clary, K., Tosch, E., Foley, J., Jensen, D.: Let’s play again: variability of deep reinforcement learning agents in Atari environments. arXiv preprint arXiv:1904.06312 (2019)

  5. Clune, J., Mouret, J.B., Lipson, H.: The evolutionary origins of modularity. Proc. R. Soc. B Biol. Sci. 280(1755), 20122863 (2013). https://doi.org/10.1098/rspb.2012.2863

  6. Dickinson, M.H., Farley, C.T., Full, R.J., Koehl, M.R., Kram, R., Lehman, S.: How animals move: an integrative view. Science 288(5463), 100–106 (2000)

  7. Dürr, V., et al.: Integrative biomimetics of autonomous hexapedal locomotion. Front. Neurorobot. 13, 88 (2019)

  8. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, vol. 9, pp. 249–256 (2010)

  9. Heess, N., et al.: Emergence of locomotion behaviours in rich environments. arXiv preprint arXiv:1707.02286 (2017)

  10. Hsu, C.C.Y., Mendler-Dünner, C., Hardt, M.: Revisiting design choices in proximal policy optimization. arXiv preprint arXiv:2009.10897 (2020)

  11. Hwangbo, J., et al.: Learning agile and dynamic motor skills for legged robots. Sci. Robot. 4(26), eaau5872 (2019). https://doi.org/10.1126/scirobotics.aau5872

  12. Khadka, S., et al.: Collaborative evolutionary reinforcement learning. arXiv preprint arXiv:1905.00976 (2019)

  13. Liang, E., et al.: RLlib: abstractions for distributed reinforcement learning. arXiv preprint arXiv:1712.09381 (2018)

  14. Reda, D., Tao, T., van de Panne, M.: Learning to locomote: understanding how environment design matters for deep reinforcement learning. In: Proceedings of the ACM SIGGRAPH Conference on Motion, Interaction and Games (2020)

  15. Schilling, M., Cruse, H.: Decentralized control of insect walking: a simple neural network explains a wide range of behavioral and neurophysiological results. PLOS Comput. Biol. 16(4), e1007804 (2020)

  16. Schilling, M., Hoinville, T., Schmitz, J., Cruse, H.: Walknet, a bio-inspired controller for hexapod walking. Biol. Cybern. 107(4), 397–419 (2013)

  17. Schilling, M., Konen, K., Ohl, F.W., Korthals, T.: Decentralized deep reinforcement learning for a distributed and adaptive locomotion controller of a hexapod robot. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2020)

  18. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)

  19. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn. The MIT Press (2018). http://incompleteideas.net/book/the-book-2nd.html

  20. Todorov, E., Erez, T., Tassa, Y.: MuJoCo: a physics engine for model-based control. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026–5033 (2012)

  21. Zhang, C., Vinyals, O., Munos, R., Bengio, S.: A study on overfitting in deep reinforcement learning. arXiv preprint arXiv:1804.06893 (2018)

Acknowledgements

This research was supported by the research training group “Dataninja” (Trustworthy AI for Seamless Problem Solving: Next Generation Intelligence Joins Robust Data Analysis) funded by the German federal state of North Rhine-Westphalia.

Author information

Corresponding author

Correspondence to Malte Schilling.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Schilling, M. (2021). Avoid Overfitting in Deep Reinforcement Learning: Increasing Robustness Through Decentralized Control. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science, vol 12894. Springer, Cham. https://doi.org/10.1007/978-3-030-86380-7_52

  • DOI: https://doi.org/10.1007/978-3-030-86380-7_52

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86379-1

  • Online ISBN: 978-3-030-86380-7

  • eBook Packages: Computer Science, Computer Science (R0)
