Abstract
Automated planning provides a powerful general problem solving tool, however, its need for a model creates a bottleneck that can be an obstacle to using automated planning algorithms in real-world settings. In this work, we propose to use cellular simultaneous recurrent networks (CSRN), to process a planning problem and provide a heuristic value estimate that can more efficiently steer the automated planning algorithms to a solution. Using this particular architecture provides us with a scale-free solution that can be used on any problem domain represented by a planar grid. We train the CSRN architecture on two benchmark domains and provide an analysis of its generalizing and scaling abilities. We also integrate the trained network into a planner and compare its performance against commonly used heuristic functions.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Data availability
All data and codes will be available at https://github.com/urbanm30/CSRN_for_heuristic_computation within the next week. We are currently working on the repository’s content.
Notes
A dead-end state in planning is a state from which no goal state is reachable. An example of a dead-end in Sokoban is a state, where a box is in a corner, which is not a goal.
Experimental codes were based on https://github.com/urbanm30/nn-planning.
References
Aeronautiques C, Howe A, Knoblock C, McDermott ID, Ram A, Veloso M, Weld D, SRI DW, Barrett A, Christianson D, et al. Pddl the planning domain definition language. Technical Report; 1998.
Pommerening F, Helmert M. Incremental lm-cut. In: Twenty-Third International Conference on Automated Planning and Scheduling; 2013.
Seipp J, Pommerening F, Helmert M. New optimization functions for potential heuristics. In: Brafman RI, Domshlak C, Haslum P, Zilberstein S. (eds) Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling, ICAPS 2015, Jerusalem, 7-11 June 2015; 2015. p. 193–201. http://www.aaai.org/ocs/index.php/ICAPS/ICAPS15/paper/view/10603.
Bonet B, Geffner H. Planning as heuristic search. Artif Intell. 2001;129(1–2):5–33.
Culberson J, Yang F, Holte R. A general additive search abstraction; 2007.
Geffner H. Model-free, model-based, and general intelligence. arXiv [Preprint] http://arxiv.org/abs/1806.02308; 2018.
Kahneman D. Thinking, Fast and Slow. New York: Farrar, Straus and Giroux; 2011.
Shen W, Trevizan F, Thiébaux S. Learning domain-independent planning heuristics with hypergraph networks. Proc Int Conf Autom Plan Sched. 2020;30:574–84.
Toyer S, Thiébaux S, Trevizan FW, Xie L. Asnets: Deep learning for generalised planning. J Artif Intell Res. 2020;68:1–68. https://doi.org/10.1613/jair.1.11633.
Ståhlberg S, Bonet B, Geffner H. Learning general optimal policies with graph neural networks: Expressive power, transparency, and limits. In: Kumar A, Thiébaux S, Varakantham P, Yeoh W. (Eds) Proceedings of the Thirty-Second International Conference on Automated Planning and Scheduling, ICAPS 2022, 13-24 June 2022, AAAI Press; 2022. p. 629–637. https://ojs.aaai.org/index.php/ICAPS/article/view/19851.
Arfaee SJ, Zilles S, Holte RC. Bootstrap learning of heuristic functions. In: Felner A, Sturtevant NR. (Eds) Proceedings of the Third Annual Symposium on Combinatorial Search, SOCS 2010, Stone Mountain, Atlanta, 8-10 July 2010. AAAI Press; 2010. http://aaai.org/ocs/index.php/SOCS/SOCS10/paper/view/2071.
Groshev E, Tamar A, Goldstein M, Srivastava S, Abbeel P. Learning generalized reactive policies using deep neural networks. In: 2018 AAAI Spring Symposium Series; 2018.
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. Adv Neural Inf Process Syst. 2017;30:5998–6008.
Wu Y, Ma Y, Wan S. Multi-scale relation reasoning for multi-modal visual question answering. Signal Process Image Commun. 2021;96:116319. https://doi.org/10.1016/j.image.2021.116319.
Chrestien L, Pevný T, Komenda A, Edelkamp S. Heuristic search planning with deep neural networks using imitation, attention and curriculum learning. CoRR arXiv2112.01918; 2021.
Urbanovská M, Komenda A. Neural networks for model-free and scale-free automated planning. Knowl Inf Syst. 2021;63(12):3103–38. https://doi.org/10.1007/s10115-021-01619-8.
Urbanovská M, Bím J, Chrestien L, Komenda A, Pevný T. Model-free automated planning using neural networks. In: Proceedings of the 1st Workshop on Bridging the Gap Between AI Planning and Reinforcement Learning (PRL) of the International Conference on Automated Planning and Scheduling; 2020. p. 7–15.
Asai M, Fukunaga A. Classical planning in deep latent space: From unlabeled images to PDDL (and back). In: Besold TR, d’Avila Garcez AS, Noble I (Eds) Proceedings of the Twelfth International Workshop on Neural-Symbolic Learning and Reasoning, NeSy 2017, London, 17-18 July 2017. CEUR Workshop Proceedings, vol. 2003; 2017. http://ceur-ws.org/Vol-2003/NeSy17_paper3.pdf.
Asai M, Fukunaga A. Classical planning in deep latent space: Bridging the subsymbolic-symbolic boundary. In: Thirty-Second AAAI Conference on Artificial Intelligence; 2018.
Ilin R, Kozma R, Werbos PJ. Beyond feedforward models trained by backpropagation: A practical training tool for a more efficient universal approximator. IEEE Trans Neural Netw. 2008;19(6):929–37.
Ilin R, Kozma R, Werbos PJ. Cellular SRN trained by extended Kalman filter shows promise for ADP. In: 2006 IEEE Int Joint Conf Neural Netw Proc. New york: IEEE; 2006. p. 506–10.
White WE, Iftekharuddin KM, Bouzerdoum A. Improved learning in grid-to-grid neural network via clustering. In: International Joint Conference on Neural Networks, IJCNN 2010, Barcelona, 18-23 July 2010; 2010. p. 1–7. https://doi.org/10.1109/IJCNN.2010.5596518.
Bundy A, Wallen L. Breadth-first search. In: Catalogue of Artificial Intelligence Tools. Heidelberg: Springer; 1984. p. 13.
Hart P, Nilsson N, Raphael B. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans Syst Sci Cybern. 1968;4(2):100–7. https://doi.org/10.1109/tssc.1968.300136.
Hoffmann J. Ff: The fast-forward planning system. AI Mag. 2001;22(3):57.
Richter S, Westphal M. The lama planner: Guiding cost-based anytime planning with landmarks. J Artif Intell Res. 2010;39:127–77.
Torralba A, Alcazar V, Borrajo D, Kissmann P, Edelkamp S. Symba: A symbolic bidirectional a planner. Int Plan Compet. 2014;27:105–9.
Siegelmann HT, Sontag ED. On the computational power of neural nets. In: Proceedings of the Fifth Annual Workshop on Computational Learning Theory; 1992. p. 440–449
Culberson J. Sokoban is pspace-complete; 1997.
Werbos PJ, Pang X. Generalized maze navigation: Srn critics solve what feedforward or hebbian nets cannot. In: 1996 IEEE International Conference on Systems, Man and Cybernetics. Information Intelligence and Systems (Cat. No. 96CH35929), vol. 3, IEEE; 1996. p. 1764–1769.
Kingma DP, Ba J. Adam: A method for stochastic optimization. In: Bengio Y, LeCun Y. (Eds) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, May 7-9, 2015, Conference Track Proceedings; 2015. http://arxiv.org/abs/1412.6980.
Guez A, Mirza M, Gregor K, Kabra R, Racaniere S, Weber T, Raposo D, Santoro A, Orseau L, Eccles T, Wayne G, Silver D, Lillicrap T, Valdes V. An investigation of Model-free planning: boxoban levels. 2018. https://github.com/deepmind/boxoban-levels/
Acknowledgements
The work of Michaela Urbanovská was supported by the EU OP RDE funded project Research Center for Informatics; reg. No.: CZ.02.1.01/0.0./0.0./16_019/0000765 and the work of Antonín Komenda was supported by the Czech Science Foundation (grant no. 22-30043 S).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Advances on Agents and Artificial Intelligence” guest edited by Jaap van den Herik, Ana Paula Rocha and Luc Steels.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Urbanovská, M., Komenda, A. Analysis of Learning Heuristic Estimates for Grid Planning with Cellular Simultaneous Recurrent Networks. SN COMPUT. SCI. 4, 744 (2023). https://doi.org/10.1007/s42979-023-02174-5
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s42979-023-02174-5