Abstract
We present an emotion-based hierarchical reinforcement learning (HRL) algorithm for environments with multiple sources of reward. The architecture of the system is inspired by the neurobiology of the brain, in particular the areas responsible for emotion, decision making, and behaviour execution: the amygdala, the orbitofrontal cortex, and the basal ganglia, respectively. The learning problem is decomposed according to sources of reward, with each reward source serving as the goal of a subtask. Each subtask is assigned an artificial emotion indication (AEI) that predicts the reward component associated with that subtask. The AEIs are learned simultaneously with the top-level policy and are used to interrupt subtask execution whenever they change significantly. The algorithm is tested in a simulated gridworld that has two sources of reward and is partially observable. Experiments compare the emotion-based algorithm with other HRL algorithms under the same learning conditions. The biologically inspired architecture significantly accelerates learning and achieves higher long-term reward than a human-designed policy and a restricted form of the MAXQ algorithm.
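The AEI-driven interruption mechanism described above lends itself to a short sketch. The following is a minimal illustration of the idea, not the authors' implementation: since only the abstract is available here, the tabular TD(0) update, the per-subtask value table, and the fixed interruption threshold are all assumptions.

```python
import numpy as np

class ArtificialEmotionIndication:
    """Sketch of an AEI: predicts the reward component tied to one
    subtask via a tabular TD(0) update (update rule assumed)."""

    def __init__(self, n_states, alpha=0.1, gamma=0.9):
        self.v = np.zeros(n_states)  # predicted reward component per state
        self.alpha = alpha           # learning rate (assumed value)
        self.gamma = gamma           # discount factor (assumed value)

    def update(self, s, r_component, s_next):
        # TD(0) update on this subtask's reward component only.
        td_error = r_component + self.gamma * self.v[s_next] - self.v[s]
        self.v[s] += self.alpha * td_error
        return td_error

def should_interrupt(aei, s_prev, s_now, threshold=0.5):
    # Interrupt the running subtask when its AEI changes significantly,
    # so the top-level policy can reconsider which goal to pursue.
    # The fixed threshold is an assumption for illustration.
    return abs(aei.v[s_now] - aei.v[s_prev]) > threshold
```

In use, the top level would maintain one such AEI per reward source, update each with its own reward component at every step, and poll `should_interrupt` to decide whether to terminate the current subtask early.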
References
Dietterich, T.G.: Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research 13, 227–303 (2000)
Sutton, R.S., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112(1–2), 181–211 (1999)
Hengst, B.: Discovering hierarchy in reinforcement learning with HEXQ. In: Sammut, C., Hoffmann, A. (eds.) Proceedings of the Nineteenth International Conference on Machine Learning, Sydney, Australia, pp. 243–250 (2002)
Shelton, C.R.: Balancing multiple sources of reward in reinforcement learning. In: Advances in Neural Information Processing Systems, vol. 13, pp. 1082–1088. MIT Press, Cambridge (2001)
Rolls, E.T.: The brain and emotion. Oxford University Press, Oxford (1999)
Schultz, W., Tremblay, L., Hollerman, J.R.: Reward processing in primate orbitofrontal cortex and basal ganglia. Cerebral Cortex 10, 272–283 (2000)
Zhou, W., Coggins, R.: A biologically inspired hierarchical reinforcement learning system. Cybernetics and Systems (2004) (to appear)
LeDoux, J.E.: Brain mechanisms of emotion and emotional learning. Current Opinion in Neurobiology 2, 191–197 (1992)
Barto, A.G.: Adaptive critics and the basal ganglia. In: Houk, J.C., Davis, J.L., Beiser, D.G. (eds.) Models of Information Processing in the Basal Ganglia, pp. 215–232. MIT Press, Cambridge (1995)
Sutton, R.S.: Learning to predict by the methods of temporal differences. Machine Learning 3, 9–44 (1988)
Bradtke, S.J., Duff, M.O.: Reinforcement learning methods for continuous-time Markov decision problems. In: Advances in Neural Information Processing Systems, vol. 7, pp. 393–400. MIT Press, Cambridge (1995)
Gadanho, S.C., Hallam, J.: Emotion-triggered learning in autonomous robot control. Cybernetics and Systems 32(5), 531–559 (2001)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Zhou, W., Coggins, R. (2004). Biologically Inspired Reinforcement Learning: Reward-Based Decomposition for Multi-goal Environments. In: Ijspeert, A.J., Murata, M., Wakamiya, N. (eds) Biologically Inspired Approaches to Advanced Information Technology. BioADIT 2004. Lecture Notes in Computer Science, vol 3141. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27835-1_7