Abstract
Deep Reinforcement Learning is known to be brittle with respect to the choice of hyperparameters. In particular, the structure of the employed Deep Neural Networks has been shown to be important, and overfitting has become a common problem in DRL approaches. This study, first, analyzes how severe overfitting is in DRL on standard continuous control problems. Second, we argue that this may be partially due to the centralized perspective on control, in which a single holistic controller has to learn from all possible sensory inputs. As this input space is usually high dimensional, it appears natural that a high-capacity Neural Network controller trained on limited data starts to pick up spurious correlations. As a consequence, large Neural Network controllers begin to base their decisions on unimportant inputs. In contrast, we offer a decentralized perspective in which control is distributed to local modules that act on local sensory inputs of much lower dimensionality. Such a prior of local inputs is biologically inspired; it leads, on the one hand, to much faster learning and, on the other hand, to greater robustness against overfitting. Last, as decentralization changes input (and output) dimensionality, we evaluated several common Neural Network initialization schemes and found that Glorot initialization provides the most robust results.
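To make the decentralized perspective concrete, the following is a minimal PyTorch sketch, not the paper's actual architecture: each leg of a hexapod walker gets its own small Glorot-initialized policy network that sees only that leg's local sensory inputs, instead of one holistic network over the full observation. The per-leg dimensions and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative assumptions, not taken from the paper:
LOCAL_OBS_DIM = 5    # e.g. joint angles/velocities of a single leg
LOCAL_ACT_DIM = 3    # e.g. torques for one leg's three joints
N_LEGS = 6

def make_local_policy():
    """Small per-leg controller with Glorot (Xavier) initialization."""
    net = nn.Sequential(
        nn.Linear(LOCAL_OBS_DIM, 32),
        nn.Tanh(),
        nn.Linear(32, LOCAL_ACT_DIM),
    )
    for m in net:
        if isinstance(m, nn.Linear):
            nn.init.xavier_uniform_(m.weight)  # Glorot initialization
            nn.init.zeros_(m.bias)
    return net

class DecentralizedController(nn.Module):
    """Six independent local policies; their actions are concatenated."""
    def __init__(self):
        super().__init__()
        self.legs = nn.ModuleList(make_local_policy() for _ in range(N_LEGS))

    def forward(self, obs):
        # obs: (batch, N_LEGS * LOCAL_OBS_DIM); each module only ever
        # sees its own low-dimensional slice of the observation.
        local_obs = obs.split(LOCAL_OBS_DIM, dim=-1)
        return torch.cat(
            [leg(o) for leg, o in zip(self.legs, local_obs)], dim=-1
        )

controller = DecentralizedController()
actions = controller(torch.randn(1, N_LEGS * LOCAL_OBS_DIM))
print(actions.shape)  # torch.Size([1, 18])
```

Each module can then be trained with a standard on-policy method such as PPO; the point of the sketch is that every network's input dimensionality, and hence its opportunity to latch onto spurious correlations, is much smaller than that of a single holistic controller.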
References
Andrychowicz, M., et al.: What matters in on-policy reinforcement learning? A large-scale empirical study. arXiv:2006.05990 (2020)
Arulkumaran, K., Deisenroth, M.P., Brundage, M., Bharath, A.A.: A brief survey of deep reinforcement learning. IEEE Sig. Process. Mag. 34(6), 26–38 (2017)
Bidaye, S.S., Bockemühl, T., Büschges, A.: Six-legged walking in insects: how CPGs, peripheral feedback, and descending signals generate coordinated and adaptive motor rhythms. J. Neurophysiol. 119(2), 459–475 (2018)
Clary, K., Tosch, E., Foley, J., Jensen, D.: Let’s play again: variability of deep reinforcement learning agents in Atari environments. arXiv:1904.06312 (2019)
Clune, J., Mouret, J.B., Lipson, H.: The evolutionary origins of modularity. Proc. R. Soc. B Biol. Sci. 280(1755), 20122863 (2013). https://doi.org/10.1098/rspb.2012.2863
Dickinson, M.H., Farley, C.T., Full, R.J., Koehl, M.R., Kram, R., Lehman, S.: How animals move: an integrative view. Science 288(5463), 100–106 (2000)
Dürr, V., et al.: Integrative biomimetics of autonomous hexapedal locomotion. Front. Neurorobot. 13, 88 (2019)
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, vol. 9, pp. 249–256 (2010)
Heess, N., et al.: Emergence of locomotion behaviours in rich environments. arXiv:1707.02286 (2017)
Hsu, C.C.Y., Mendler-Dünner, C., Hardt, M.: Revisiting design choices in proximal policy optimization. arXiv:2009.10897 (2020)
Hwangbo, J., et al.: Learning agile and dynamic motor skills for legged robots. Sci. Robot. 4(26), eaau5872 (2019). https://doi.org/10.1126/scirobotics.aau5872
Khadka, S., et al.: Collaborative evolutionary reinforcement learning. arXiv:1905.00976 (2019)
Liang, E., et al.: RLlib: abstractions for distributed reinforcement learning. arXiv:1712.09381 (2018)
Reda, D., Tao, T., van de Panne, M.: Learning to locomote: Understanding how environment design matters for deep reinforcement learning. In: Proceedings of the ACM SIGGRAPH Conference on Motion, Interaction and Games (2020)
Schilling, M., Cruse, H.: Decentralized control of insect walking: a simple neural network explains a wide range of behavioral and neurophysiological results. PLOS Comput. Biol. 16(4), e1007804 (2020)
Schilling, M., Hoinville, T., Schmitz, J., Cruse, H.: Walknet, a bio-inspired controller for hexapod walking. Biol. Cybern. 107(4), 397–419 (2013)
Schilling, M., Konen, K., Ohl, F.W., Korthals, T.: Decentralized deep reinforcement learning for a distributed and adaptive locomotion controller of a hexapod robot. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2020)
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv:1707.06347 (2017)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn. The MIT Press (2018). http://incompleteideas.net/book/the-book-2nd.html
Todorov, E., Erez, T., Tassa, Y.: MuJoCo: a physics engine for model-based control. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5026–5033 (2012)
Zhang, C., Vinyals, O., Munos, R., Bengio, S.: A study on overfitting in deep reinforcement learning. arXiv:1804.06893 (2018)
Acknowledgements
This research was supported by the research training group “Dataninja” (Trustworthy AI for Seamless Problem Solving: Next Generation Intelligence Joins Robust Data Analysis) funded by the German federal state of North Rhine-Westphalia.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Schilling, M. (2021). Avoid Overfitting in Deep Reinforcement Learning: Increasing Robustness Through Decentralized Control. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science(), vol 12894. Springer, Cham. https://doi.org/10.1007/978-3-030-86380-7_52
DOI: https://doi.org/10.1007/978-3-030-86380-7_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86379-1
Online ISBN: 978-3-030-86380-7