
DreamCoder: bootstrapping inductive program synthesis with wake-sleep library learning

Authors:

Catherine Wong,

Mathias Sablé-Meyer,

Armando Solar-Lezama,

Joshua B. Tenenbaum

PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation

Pages 835 - 850

https://doi.org/10.1145/3453483.3454080

Published: 18 June 2021

Abstract

We present a system for inductive program synthesis called DreamCoder, which inputs a corpus of synthesis problems each specified by one or a few examples, and automatically derives a library of program components and a neural search policy that can be used to efficiently solve other similar synthesis problems. The library and search policy bootstrap each other iteratively through a variant of "wake-sleep" approximate Bayesian learning. A new refactoring algorithm based on E-graph matching identifies common sub-components across synthesized programs, building a progressively deepening library of abstractions capturing the structure of the input domain. We evaluate on eight domains including classic program synthesis areas and AI tasks such as planning, inverse graphics, and equation discovery. We show that jointly learning the library and neural search policy leads to solving more problems, and solving them more quickly.
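The library-learning idea described in the abstract can be illustrated with a toy sketch. It is not the paper's E-graph refactoring algorithm and it has no neural search policy: it only looks for syntactically identical subexpressions that recur across a small corpus of programs, picks the one whose reuse would save the most nodes, and rewrites the corpus to call it by name. The program encoding (nested tuples), the cost measure, and every function name below are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical encoding: a program is a nested tuple such as
# ("map", ("lambda", "x", ("+", "x", "x")), "xs"); leaves are strings.

def size(expr):
    """Count nodes in an expression tree (one per tuple, one per leaf)."""
    if not isinstance(expr, tuple):
        return 1
    return 1 + sum(size(child) for child in expr)

def subtrees(expr):
    """Yield every compound subexpression of expr, including expr itself."""
    if isinstance(expr, tuple):
        yield expr
        for child in expr:
            yield from subtrees(child)

def propose_abstraction(corpus):
    """Return the repeated subexpression whose reuse saves the most nodes."""
    counts = Counter(sub for prog in corpus for sub in subtrees(prog))
    best, best_saving = None, 0
    for sub, n in counts.items():
        if n < 2:
            continue
        # Before: n copies of sub. After: one definition plus n one-node uses.
        saving = n * size(sub) - (size(sub) + n)
        if saving > best_saving:
            best, best_saving = sub, saving
    return best

def rewrite(expr, pattern, name):
    """Replace every occurrence of pattern in expr with the symbol name."""
    if expr == pattern:
        return name
    if isinstance(expr, tuple):
        return tuple(rewrite(child, pattern, name) for child in expr)
    return expr

corpus = [
    ("map", ("lambda", "x", ("+", "x", "x")), "xs"),
    ("map", ("lambda", "x", ("+", "x", "x")), ("tail", "xs")),
    ("fold", ("lambda", "x", ("+", "x", "x")), "zero"),
]
best = propose_abstraction(corpus)
compressed = [rewrite(prog, best, "f0") for prog in corpus]
print(best)
print(compressed)
```

Because this sketch matches only exact, closed subexpressions, it cannot discover abstractions that differ in their arguments. That limitation is one reason a refactoring-based approach like the one the abstract describes would be needed in practice.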

Supplementary Material

Auxiliary Archive (pldi21main-p355-p-archive.zip)

Appendix for "DreamCoder: Bootstrapping Inductive Program Synthesis with Wake-Sleep Library Learning"




Index Terms

DreamCoder: bootstrapping inductive program synthesis with wake-sleep library learning
1. Computing methodologies
  1. Machine learning
2. Software and its engineering
  1. Software notations and tools


Published In

PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation

June 2021

1341 pages

ISBN:9781450383912

DOI:10.1145/3453483

General Chair: Stephen N. Freund, Williams College, USA

Program Chair: Eran Yahav, Technion, Israel

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Conference

PLDI '21

Sponsor: SIGPLAN

PLDI '21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation

June 20 - 25, 2021

Virtual, Canada

Acceptance Rates

Overall Acceptance Rate 406 of 2,067 submissions, 20%

Bibliometrics

Total Citations: 40
Total Downloads: 7,787
Downloads (Last 12 months): 3,010
Downloads (Last 6 weeks): 540

Reflects downloads up to 24 Nov 2024


Citations

Cited By

Lubin J, Ferguson J, Ye K, Yim J, Chasins S. (2024). Equivalence by Canonicalization for Synthesis-Backed Refactoring. Proceedings of the ACM on Programming Languages, 8:PLDI (1879-1904). Online publication date: 20-Jun-2024.
https://dl.acm.org/doi/10.1145/3656453
Nazari A, Swayamdipta S, Chattopadhyay S, Raghothaman M. (2024). Generating Function Names to Improve Comprehension of Synthesized Programs. 2024 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) (248-259). Online publication date: 2-Sep-2024.
https://doi.org/10.1109/VL/HCC60511.2024.00035
Tziafas G, Kasaei H. (2024). Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models. 2024 IEEE International Conference on Robotics and Automation (ICRA) (515-522). Online publication date: 13-May-2024.
https://doi.org/10.1109/ICRA57147.2024.10611448
Bengio Y, Malkin N. (2024). Machine Learning and Information Theory Concepts towards an AI Mathematician. Bulletin of the American Mathematical Society, 61:3 (457-469). Online publication date: 15-May-2024.
https://doi.org/10.1090/bull/1839
Collins K, Sucholutsky I, Bhatt U, Chandra K, Wong L, Lee M, Zhang C, Zhi-Xuan T, Ho M, Mansinghka V, Weller A, Tenenbaum J, Griffiths T. (2024). Building machines that learn and think with people. Nature Human Behaviour, 8:10 (1851-1863). Online publication date: 22-Oct-2024.
https://doi.org/10.1038/s41562-024-01991-9
Webb T, Frankland S, Altabaa A, Segert S, Krishnamurthy K, Campbell D, Russin J, Giallanza T, O’Reilly R, Lafferty J, Cohen J. (2024). The relational bottleneck as an inductive bias for efficient abstraction. Trends in Cognitive Sciences. Online publication date: May-2024.
https://doi.org/10.1016/j.tics.2024.04.001
Zhou Y, Feinman R, Lake B. (2024). Compositional diversity in visual concept learning. Cognition, 244 (105711). Online publication date: Mar-2024.
https://doi.org/10.1016/j.cognition.2023.105711
Eberhardinger M, Rupp F, Maucher J, Maghsudi S. (2024). Unveiling the Decision-Making Process in Reinforcement Learning with Genetic Programming. Advances in Swarm Intelligence (349-365). Online publication date: 22-Aug-2024.
https://dl.acm.org/doi/10.1007/978-981-97-7181-3_28
Bednarek J, Krawiec K. (2024). Learning to Solve Abstract Reasoning Problems with Neurosymbolic Program Synthesis and Task Generation. Neural-Symbolic Learning and Reasoning (386-402). Online publication date: 9-Sep-2024.
https://dl.acm.org/doi/10.1007/978-3-031-71167-1_21
Sun C, Sheng Y, Padon O, Barrett C. (2024). Clover: Closed-Loop Verifiable Code Generation. AI Verification (134-155). Online publication date: 22-Jul-2024.
https://dl.acm.org/doi/10.1007/978-3-031-65112-0_7
