Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Analysis of Learning Heuristic Estimates for Grid Planning with Cellular Simultaneous Recurrent Networks

  • Original Research
  • Published:
SN Computer Science Aims and scope Submit manuscript

Abstract

Automated planning provides a powerful general problem solving tool, however, its need for a model creates a bottleneck that can be an obstacle to using automated planning algorithms in real-world settings. In this work, we propose to use cellular simultaneous recurrent networks (CSRN), to process a planning problem and provide a heuristic value estimate that can more efficiently steer the automated planning algorithms to a solution. Using this particular architecture provides us with a scale-free solution that can be used on any problem domain represented by a planar grid. We train the CSRN architecture on two benchmark domains and provide an analysis of its generalizing and scaling abilities. We also integrate the trained network into a planner and compare its performance against commonly used heuristic functions.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15

Similar content being viewed by others

Explore related subjects

Discover the latest articles, news and stories from top researchers in related subjects.

Data availability

All data and codes will be available at https://github.com/urbanm30/CSRN_for_heuristic_computation within the next week. We are currently working on the repository’s content.

Notes

  1. A dead-end state in planning is a state from which no goal state is reachable. An example of a dead-end in Sokoban is a state, where a box is in a corner, which is not a goal.

  2. Experimental codes were based on https://github.com/urbanm30/nn-planning.

References

  1. Aeronautiques C, Howe A, Knoblock C, McDermott ID, Ram A, Veloso M, Weld D, SRI DW, Barrett A, Christianson D, et al. Pddl the planning domain definition language. Technical Report; 1998.

  2. Pommerening F, Helmert M. Incremental lm-cut. In: Twenty-Third International Conference on Automated Planning and Scheduling; 2013.

  3. Seipp J, Pommerening F, Helmert M. New optimization functions for potential heuristics. In: Brafman RI, Domshlak C, Haslum P, Zilberstein S. (eds) Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling, ICAPS 2015, Jerusalem, 7-11 June 2015; 2015. p. 193–201. http://www.aaai.org/ocs/index.php/ICAPS/ICAPS15/paper/view/10603.

  4. Bonet B, Geffner H. Planning as heuristic search. Artif Intell. 2001;129(1–2):5–33.

    Article  MathSciNet  MATH  Google Scholar 

  5. Culberson J, Yang F, Holte R. A general additive search abstraction; 2007.

  6. Geffner H. Model-free, model-based, and general intelligence. arXiv [Preprint] http://arxiv.org/abs/1806.02308; 2018.

  7. Kahneman D. Thinking, Fast and Slow. New York: Farrar, Straus and Giroux; 2011.

    Google Scholar 

  8. Shen W, Trevizan F, Thiébaux S. Learning domain-independent planning heuristics with hypergraph networks. Proc Int Conf Autom Plan Sched. 2020;30:574–84.

    Google Scholar 

  9. Toyer S, Thiébaux S, Trevizan FW, Xie L. Asnets: Deep learning for generalised planning. J Artif Intell Res. 2020;68:1–68. https://doi.org/10.1613/jair.1.11633.

    Article  MathSciNet  Google Scholar 

  10. Ståhlberg S, Bonet B, Geffner H. Learning general optimal policies with graph neural networks: Expressive power, transparency, and limits. In: Kumar A, Thiébaux S, Varakantham P, Yeoh W. (Eds) Proceedings of the Thirty-Second International Conference on Automated Planning and Scheduling, ICAPS 2022, 13-24 June 2022, AAAI Press; 2022. p. 629–637. https://ojs.aaai.org/index.php/ICAPS/article/view/19851.

  11. Arfaee SJ, Zilles S, Holte RC. Bootstrap learning of heuristic functions. In: Felner A, Sturtevant NR. (Eds) Proceedings of the Third Annual Symposium on Combinatorial Search, SOCS 2010, Stone Mountain, Atlanta, 8-10 July 2010. AAAI Press; 2010. http://aaai.org/ocs/index.php/SOCS/SOCS10/paper/view/2071.

  12. Groshev E, Tamar A, Goldstein M, Srivastava S, Abbeel P. Learning generalized reactive policies using deep neural networks. In: 2018 AAAI Spring Symposium Series; 2018.

  13. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. Adv Neural Inf Process Syst. 2017;30:5998–6008.

    Google Scholar 

  14. Wu Y, Ma Y, Wan S. Multi-scale relation reasoning for multi-modal visual question answering. Signal Process Image Commun. 2021;96:116319. https://doi.org/10.1016/j.image.2021.116319.

    Article  Google Scholar 

  15. Chrestien L, Pevný T, Komenda A, Edelkamp S. Heuristic search planning with deep neural networks using imitation, attention and curriculum learning. CoRR arXiv2112.01918; 2021.

  16. Urbanovská M, Komenda A. Neural networks for model-free and scale-free automated planning. Knowl Inf Syst. 2021;63(12):3103–38. https://doi.org/10.1007/s10115-021-01619-8.

    Article  Google Scholar 

  17. Urbanovská M, Bím J, Chrestien L, Komenda A, Pevný T. Model-free automated planning using neural networks. In: Proceedings of the 1st Workshop on Bridging the Gap Between AI Planning and Reinforcement Learning (PRL) of the International Conference on Automated Planning and Scheduling; 2020. p. 7–15.

  18. Asai M, Fukunaga A. Classical planning in deep latent space: From unlabeled images to PDDL (and back). In: Besold TR, d’Avila Garcez AS, Noble I (Eds) Proceedings of the Twelfth International Workshop on Neural-Symbolic Learning and Reasoning, NeSy 2017, London, 17-18 July 2017. CEUR Workshop Proceedings, vol. 2003; 2017. http://ceur-ws.org/Vol-2003/NeSy17_paper3.pdf.

  19. Asai M, Fukunaga A. Classical planning in deep latent space: Bridging the subsymbolic-symbolic boundary. In: Thirty-Second AAAI Conference on Artificial Intelligence; 2018.

  20. Ilin R, Kozma R, Werbos PJ. Beyond feedforward models trained by backpropagation: A practical training tool for a more efficient universal approximator. IEEE Trans Neural Netw. 2008;19(6):929–37.

    Article  Google Scholar 

  21. Ilin R, Kozma R, Werbos PJ. Cellular SRN trained by extended Kalman filter shows promise for ADP. In: 2006 IEEE Int Joint Conf Neural Netw Proc. New york: IEEE; 2006. p. 506–10.

    Google Scholar 

  22. White WE, Iftekharuddin KM, Bouzerdoum A. Improved learning in grid-to-grid neural network via clustering. In: International Joint Conference on Neural Networks, IJCNN 2010, Barcelona, 18-23 July 2010; 2010. p. 1–7. https://doi.org/10.1109/IJCNN.2010.5596518.

  23. Bundy A, Wallen L. Breadth-first search. In: Catalogue of Artificial Intelligence Tools. Heidelberg: Springer; 1984. p. 13.

    Chapter  MATH  Google Scholar 

  24. Hart P, Nilsson N, Raphael B. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans Syst Sci Cybern. 1968;4(2):100–7. https://doi.org/10.1109/tssc.1968.300136.

    Article  Google Scholar 

  25. Hoffmann J. Ff: The fast-forward planning system. AI Mag. 2001;22(3):57.

    Google Scholar 

  26. Richter S, Westphal M. The lama planner: Guiding cost-based anytime planning with landmarks. J Artif Intell Res. 2010;39:127–77.

    Article  MATH  Google Scholar 

  27. Torralba A, Alcazar V, Borrajo D, Kissmann P, Edelkamp S. Symba: A symbolic bidirectional a planner. Int Plan Compet. 2014;27:105–9.

    Google Scholar 

  28. Siegelmann HT, Sontag ED. On the computational power of neural nets. In: Proceedings of the Fifth Annual Workshop on Computational Learning Theory; 1992. p. 440–449

  29. Culberson J. Sokoban is pspace-complete; 1997.

  30. Werbos PJ, Pang X. Generalized maze navigation: Srn critics solve what feedforward or hebbian nets cannot. In: 1996 IEEE International Conference on Systems, Man and Cybernetics. Information Intelligence and Systems (Cat. No. 96CH35929), vol. 3, IEEE; 1996. p. 1764–1769.

  31. Kingma DP, Ba J. Adam: A method for stochastic optimization. In: Bengio Y, LeCun Y. (Eds) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, May 7-9, 2015, Conference Track Proceedings; 2015. http://arxiv.org/abs/1412.6980.

  32. Guez A, Mirza M, Gregor K, Kabra R, Racaniere S, Weber T, Raposo D, Santoro A, Orseau L, Eccles T, Wayne G, Silver D, Lillicrap T, Valdes V. An investigation of Model-free planning: boxoban levels. 2018. https://github.com/deepmind/boxoban-levels/

Download references

Acknowledgements

The work of Michaela Urbanovská was supported by the EU OP RDE funded project Research Center for Informatics; reg. No.: CZ.02.1.01/0.0./0.0./16_019/0000765 and the work of Antonín Komenda was supported by the Czech Science Foundation (grant no. 22-30043 S).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Michaela Urbanovská.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Advances on Agents and Artificial Intelligence” guest edited by Jaap van den Herik, Ana Paula Rocha and Luc Steels.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Urbanovská, M., Komenda, A. Analysis of Learning Heuristic Estimates for Grid Planning with Cellular Simultaneous Recurrent Networks. SN COMPUT. SCI. 4, 744 (2023). https://doi.org/10.1007/s42979-023-02174-5

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s42979-023-02174-5

Keywords

Navigation