Polynomial-time proactive synthesis of tree-to-string functions from examples

M Mayer, J Hamza, V Kuncak - arXiv preprint arXiv:1701.04288, 2017 - arxiv.org
arXiv preprint arXiv:1701.04288, 2017arxiv.org
Synthesis from examples enables non-expert users to generate programs by specifying
examples of their behavior. A domain-specific form of such synthesis has been recently
deployed in a widely used spreadsheet software product. In this paper we contribute to
foundations of such techniques and present a complete algorithm for synthesis of a class of
recursive functions defined by structural recursion over a given algebraic data type
definition. The functions we consider map an algebraic data type to a string; they are useful …
Synthesis from examples enables non-expert users to generate programs by specifying examples of their behavior. A domain-specific form of such synthesis has been recently deployed in a widely used spreadsheet software product. In this paper we contribute to foundations of such techniques and present a complete algorithm for synthesis of a class of recursive functions defined by structural recursion over a given algebraic data type definition. The functions we consider map an algebraic data type to a string; they are useful for, e.g., pretty printing and serialization of programs and data. We formalize our problem as learning deterministic sequential top-down tree-to-string transducers with a single state. The first problem we consider is learning a tree-to-string transducer from any set of input/output examples provided by the user. We show that this problem is NP-complete in general, but can be solved in polynomial time under a (practically useful) closure condition that each subtree of a tree in the input/output example set is also part of the input/output examples. Because coming up with relevant input/output examples may be difficult for the user while creating hard constraint problems for the synthesizer, we also study a more automated active learning scenario in which the algorithm chooses the inputs for which the user provides the outputs. Our algorithm asks a worst-case linear number of queries as a function of the size of the algebraic data type definition to determine a unique transducer.
arxiv.org