DOI: 10.1145/1102351.1102402

A causal approach to hierarchical decomposition of factored MDPs

Published: 07 August 2005

Abstract

We present Variable Influence Structure Analysis, an algorithm that dynamically performs hierarchical decomposition of factored Markov decision processes. Our algorithm determines causal relationships between state variables and introduces temporally-extended actions that cause the values of state variables to change. Each temporally-extended action corresponds to a subtask that is significantly easier to solve than the overall task. Results from experiments show great promise in scaling to larger tasks.
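The abstract's core idea is that causal relationships between state variables, read off a factored (DBN) model, induce a hierarchy: variables that influence others sit higher, and changing a variable's value becomes a subtask. The sketch below is not the paper's VISA implementation; it is a minimal illustration of that decomposition step, assuming a hypothetical `dbn_parents` mapping (each variable to the variables its next value depends on) and a toy taxi-like domain invented for the example. Variables are grouped into strongly connected components of the causal graph, which come out in reverse topological order: leaf components (influenced but not influencing) first.

```python
# Illustrative sketch only -- not the paper's VISA algorithm.
# Derive a causal graph over state variables from hypothetical DBN
# dependencies, then group variables into strongly connected components;
# reversing the component order yields the influence hierarchy.
from collections import defaultdict


def causal_graph(dbn_parents):
    """dbn_parents maps each variable to the variables its next value
    depends on. Returns directed edges parent -> child, ignoring
    self-loops (a variable depending on its own previous value)."""
    edges = defaultdict(set)
    for child, parents in dbn_parents.items():
        for p in parents:
            if p != child:
                edges[p].add(child)
    return edges


def sccs(nodes, edges):
    """Tarjan's algorithm: strongly connected components, emitted in
    reverse topological order (sink components first)."""
    index, low = {}, {}
    stack, on_stack = [], set()
    comps, counter = [], [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in edges.get(v, ()):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(frozenset(comp))

    for v in nodes:
        if v not in index:
            strongconnect(v)
    return comps


# Toy taxi-like domain (made up for illustration): the passenger's
# location depends on the taxi's position, and reaching the destination
# depends on the passenger's location.
parents = {
    "taxi_pos": {"taxi_pos"},
    "passenger": {"taxi_pos", "passenger"},
    "at_dest": {"passenger", "at_dest"},
}
g = causal_graph(parents)
layers = sccs(list(parents), g)  # leaves first: at_dest, passenger, taxi_pos
```

Reversing `layers` gives the influence order `taxi_pos -> passenger -> at_dest`: in VISA's terms, a temporally-extended action that changes `taxi_pos` is a natural subtask for changing `passenger`, which in turn serves `at_dest`.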



Published In

ICML '05: Proceedings of the 22nd international conference on Machine learning
August 2005, 1113 pages
ISBN: 1595931805
DOI: 10.1145/1102351

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
