DOI: 10.5555/3600270.3602949
PDSketch: integrated planning domain programming and learning

Published: 03 April 2024

Abstract

This paper studies a model learning and online planning approach towards building flexible and general robots. Specifically, we investigate how to exploit the locality and sparsity structures in the underlying environmental transition model to improve model generalization, data-efficiency, and runtime-efficiency. We present a new domain definition language, named PDSketch. It allows users to flexibly define high-level structures in the transition models, such as object and feature dependencies, in a way similar to how programmers use TensorFlow or PyTorch to specify kernel sizes and hidden dimensions of a convolutional neural network. The details of the transition model will be filled in by trainable neural networks. Based on the defined structures and learned parameters, PDSketch automatically generates domain-independent planning heuristics without additional training. The derived heuristics accelerate the performance-time planning for novel goals.
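The "define the structure, learn the details" idea from the abstract can be illustrated with a small sketch. The names below (`TransitionSketch`, `declare_effect`, `fill`) are illustrative inventions, not the real PDSketch API, and the learnable slots are stand-in Python callables rather than trained neural networks; the point is only to show how a user-declared dependency (sparsity) structure separates from the learned mapping itself.

```python
# Hypothetical sketch of the abstract's idea: the user fixes which features
# each effect depends on (the locality/sparsity structure), while the
# mappings themselves are slots to be filled by trainable models.
from typing import Callable, Dict, List


class TransitionSketch:
    """User-declared structure: features, and which inputs each effect reads."""

    def __init__(self) -> None:
        self.features: List[str] = []
        self.effects: Dict[str, List[str]] = {}   # target feature -> declared inputs
        self.models: Dict[str, Callable] = {}     # target feature -> learned function

    def declare_feature(self, name: str) -> None:
        self.features.append(name)

    def declare_effect(self, target: str, depends_on: List[str]) -> None:
        # The dependency structure is fixed by the user; the mapping is
        # left blank, to be filled in by a trained model later.
        self.effects[target] = depends_on

    def fill(self, target: str, model: Callable) -> None:
        # In the paper this slot would hold a trained neural network.
        self.models[target] = model

    def step(self, state: Dict[str, float]) -> Dict[str, float]:
        # Apply each effect only to its declared inputs; features without
        # a declared effect are copied unchanged, exploiting locality.
        next_state = dict(state)
        for target, inputs in self.effects.items():
            next_state[target] = self.models[target](*(state[k] for k in inputs))
        return next_state


# Usage: toy "push" dynamics where only the pushed object's position changes.
sketch = TransitionSketch()
for f in ("obj_x", "force"):
    sketch.declare_feature(f)
sketch.declare_effect("obj_x", depends_on=["obj_x", "force"])
sketch.fill("obj_x", lambda x, f: x + 0.5 * f)  # stand-in for a trained net

print(sketch.step({"obj_x": 1.0, "force": 2.0}))  # → {'obj_x': 2.0, 'force': 2.0}
```

Because the declared dependency graph is known ahead of time, a planner can reason about which features an action can possibly change, which is the structural information the paper's derived heuristics exploit.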

Supplementary Material

Additional material (3600270.3602949_supp.pdf)
Supplemental material.


        Published In

        NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems
        November 2022
        39114 pages

        Publisher

Curran Associates Inc., Red Hook, NY, United States
