A Brain-Inspired Decision Making Model Based on Top-Down Biasing of Prefrontal Cortex to Basal Ganglia and Its Application in Autonomous UAV Explorations

Feifei Zhao^1,3,
Yi Zeng^1,2,3,
Guixiang Wang^1,
Jun Bai^1 &
…
Bo Xu ORCID: orcid.org/0000-0002-9595-9091^1,2,3

Abstract

Decision making is a fundamental ability for intelligent agents (e.g., humanoid robots and unmanned aerial vehicles). During the decision-making process, agents can improve their strategy for interacting with a dynamic environment through reinforcement learning. Many state-of-the-art reinforcement learning models, such as Q-learning and Actor-Critic algorithms, deal with a relatively small number of state-action pairs and work best when the states are discrete. In practice, however, the states in many scenarios are continuous and hard to discretize properly. Better autonomous decision-making methods are needed to handle these problems. Inspired by the mechanism of decision making in the human brain, we propose a general computational model, named the prefrontal cortex-basal ganglia (PFC-BG) algorithm. The proposed model is inspired by the biological reinforcement learning pathway and mechanisms from the following perspectives: (1) Dopamine signals continuously update reward-relevant information for both the basal ganglia and working memory in the prefrontal cortex. (2) Contextual reward information is maintained in working memory, which has a top-down biasing effect on reinforcement learning in the basal ganglia. The proposed model separates the continuous states into smaller distinguishable states and introduces a continuous reward function for each state to obtain reward information at different times. To verify the performance of our model, we apply it to many UAV decision-making experiments, such as avoiding obstacles and flying through a window and a door, and the experiments support the effectiveness of the model. Compared with traditional Q-learning and Actor-Critic algorithms, the proposed model is more biologically inspired and makes decisions more accurately and faster.
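
As a rough illustration of the setting the abstract describes, the following Python sketch runs tabular Q-learning (the baseline named above) on a one-dimensional continuous state that is split into a small number of distinguishable bins, with a reward that varies continuously inside each bin. It is not the PFC-BG algorithm and is not taken from the paper: the toy environment, the function names (`discretize`, `continuous_reward`, `train`), the goal position, and all hyperparameters are assumptions made for illustration only.

```python
import random


def discretize(x, low, high, n_bins):
    """Map a continuous value x in [low, high] to a bin index in 0..n_bins-1."""
    if x <= low:
        return 0
    if x >= high:
        return n_bins - 1
    return int((x - low) / (high - low) * n_bins)


def continuous_reward(x, goal):
    """Hypothetical shaped reward: less negative the closer x is to the goal."""
    return -abs(x - goal)


def train(episodes=500, n_bins=10, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over the binned state; returns the Q-table."""
    rng = random.Random(seed)
    low, high, goal, step = 0.0, 1.0, 0.9, 0.05  # assumed toy environment
    actions = (-step, +step)                      # move left or move right
    q = [[0.0, 0.0] for _ in range(n_bins)]       # q[state_bin][action]
    for _ in range(episodes):
        x = rng.uniform(low, high)
        for _ in range(50):
            s = discretize(x, low, high, n_bins)
            # Epsilon-greedy action selection (ties go to "move right").
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            x_next = min(high, max(low, x + actions[a]))
            # The reward depends on the continuous position, not only on the bin.
            r = continuous_reward(x_next, goal)
            s_next = discretize(x_next, low, high, n_bins)
            # Standard Q-learning temporal-difference update.
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            x = x_next
    return q


if __name__ == "__main__":
    for state_bin, (q_left, q_right) in enumerate(train()):
        print(state_bin, round(q_left, 3), round(q_right, 3))
```

The sketch makes the discretization problem concrete: every position inside a bin shares one row of the Q-table, so a coarse binning hides differences that the continuous reward still distinguishes.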

Acknowledgments

This study was funded by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB02060007) and the Beijing Municipal Commission of Science and Technology (Z161100000216124). We would like to thank the anonymous reviewers for their constructive comments, which helped to improve this paper considerably.

Author information

Authors and Affiliations

Institute of Automation, Chinese Academy of Sciences, Beijing, China
Feifei Zhao, Yi Zeng, Guixiang Wang, Jun Bai & Bo Xu
Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
Yi Zeng & Bo Xu
University of Chinese Academy of Sciences, Beijing, China
Feifei Zhao, Yi Zeng & Bo Xu

Corresponding author

Correspondence to Yi Zeng.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Feifei Zhao and Yi Zeng contributed equally to this work and should be regarded as co-first authors.

About this article

Cite this article

Zhao, F., Zeng, Y., Wang, G. et al. A Brain-Inspired Decision Making Model Based on Top-Down Biasing of Prefrontal Cortex to Basal Ganglia and Its Application in Autonomous UAV Explorations. Cogn Comput 10, 296–306 (2018). https://doi.org/10.1007/s12559-017-9511-3

Received: 09 April 2017
Accepted: 13 September 2017
Published: 25 September 2017
Issue Date: April 2018
DOI: https://doi.org/10.1007/s12559-017-9511-3
