Abstract
Most work on planning and learning, e.g., planning via (model-based) reinforcement learning, rests on two main assumptions: (i) the set of states of the planning domain is fixed; (ii) the mapping between observations from the real world and states is implicitly assumed and is not part of the planning domain. Consequently, the focus is on learning the transitions between states. Current approaches address neither the problem of learning new states of the planning domain, nor the problem of representing and updating the mapping between real-world perceptions and states. In this paper, we drop these assumptions. We provide a formal framework in which (i) the agent can dynamically learn new states of the planning domain; (ii) the mapping between abstract states and perceptions from the real world, represented by continuous variables, is part of the planning domain; (iii) this mapping is learned and updated throughout the “life” of the agent. We define and develop an algorithm that interleaves planning, acting, and learning. We provide a first experimental evaluation showing that this novel framework can effectively learn coherent abstract planning models.
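To make the interleaving concrete, the following is a minimal sketch of a plan-act-learn loop in the spirit of the abstract: an abstract domain whose state set can grow, a perception function that maps continuous observations to abstract states, and a loop that creates new states for unrecognised perceptions and refines the transition model from executed actions. All names (AbstractDomain, PerceptionFunction, the nearest-prototype classifier, the env interface) are illustrative assumptions, not the authors' PAL implementation.

```python
# Hedged sketch of an interleaved planning/acting/learning loop (not the paper's PAL code).
from collections import defaultdict
import numpy as np

class AbstractDomain:
    """Abstract planning domain whose set of states can grow over time."""
    def __init__(self):
        self.states = set()
        self.transitions = defaultdict(dict)   # transitions[s][a] = s'

    def add_state(self, s):
        self.states.add(s)

    def update_transition(self, s, a, s_next):
        self.transitions[s][a] = s_next

class PerceptionFunction:
    """Maps continuous observations to abstract states via nearest prototype;
    prototypes are learned and updated during the agent's life."""
    def __init__(self, threshold=1.0):
        self.prototypes = {}                    # abstract state -> mean observation
        self.threshold = threshold

    def classify(self, obs):
        if not self.prototypes:
            return None
        s, proto = min(self.prototypes.items(),
                       key=lambda kv: np.linalg.norm(obs - kv[1]))
        return s if np.linalg.norm(obs - proto) <= self.threshold else None

    def update(self, s, obs, lr=0.1):
        if s in self.prototypes:
            self.prototypes[s] += lr * (obs - self.prototypes[s])
        else:
            self.prototypes[s] = obs.astype(float).copy()

def plan_act_learn(domain, perceive, env, goal_state, planner, max_steps=100):
    """Interleave planning, acting, and learning: unseen observations create
    new abstract states; executed actions refine the transition model.
    `env` is assumed to expose reset()/step(a), each returning a continuous
    observation; `planner` plans on the current abstract model."""
    obs = env.reset()
    for _ in range(max_steps):
        s = perceive.classify(obs)
        if s is None:                            # unknown perception: new abstract state
            s = f"s{len(domain.states)}"
            domain.add_state(s)
        perceive.update(s, obs)
        if s == goal_state:
            return True
        action = planner(domain, s, goal_state)  # plan on the abstract model
        obs_next = env.step(action)
        s_next = perceive.classify(obs_next) or s  # simplified fallback if unknown
        domain.update_transition(s, action, s_next)
        obs = obs_next
    return False
```

The nearest-prototype classifier here is only a stand-in for whatever continuous-to-abstract mapping is actually learned; any classifier that can both recognise known states and flag unrecognised observations would fit the same loop.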
Notes
- 1.
The transition system can be either deterministic, nondeterministic, or stochastic.
- 2.
- 3.
We assume that the sequential plan returned by the planning algorithm can be transformed into a policy \(\pi \). Since here we plan for reachability goals, sequences of actions can be mapped into policies (a minimal sketch of this transformation is given after these notes).
- 4.
The code is available in the additional material.
- 5.
Readers interested in graphically following the computation of PAL on this simple example with different parameters can download the additional material and run the provided command.
- 6.
A picture of this world is reported in the additional material.
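As a minimal sketch of the plan-to-policy transformation mentioned in note 3: simulating the sequential plan from the initial state and recording, for each visited state, the action taken there yields a (partial) policy. The nested-dict deterministic transition model below is an illustrative assumption, not the paper's representation.

```python
# Hedged sketch of note 3: turning a sequential plan into a policy for a reachability goal.
def plan_to_policy(plan, s0, transitions):
    """Map each state visited while simulating `plan` from `s0` to the action
    taken there; the resulting dict is a (partial) policy pi."""
    policy, s = {}, s0
    for a in plan:
        policy[s] = a
        s = transitions[s][a]        # deterministic successor of s under a
    return policy

# Example: the plan ["right", "right", "up"] from s0 induces
# pi(s0) = "right", pi(s1) = "right", pi(s2) = "up".
```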
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Serafini, L., Traverso, P. (2019). Learning Abstract Planning Domains and Mappings to Real World Perceptions. In: Alviano, M., Greco, G., Scarcello, F. (eds) AI*IA 2019 – Advances in Artificial Intelligence. AI*IA 2019. Lecture Notes in Computer Science(), vol 11946. Springer, Cham. https://doi.org/10.1007/978-3-030-35166-3_33
DOI: https://doi.org/10.1007/978-3-030-35166-3_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35165-6
Online ISBN: 978-3-030-35166-3
eBook Packages: Computer Science (R0)