The 15th International Conference on Principles
and Practice of Constraint Programming
Doctoral Program Proceedings
September 20-24, 2009
Welcome to the proceedings of the 2009 Constraint Programming Doctoral
Program, held in conjunction with the 15th annual Constraint Programming
conference in Lisbon, Portugal. The doctoral program is open to PhD students
in all areas related to constraint programming. All participants present work
either within the doctoral program or in the main Constraint Programming
Conference.
The papers in this proceedings are those which have been submitted directly
to the doctoral program. They contain a wide variety of work, either completed
or in progress, being undertaken by the current generation of PhD students. We
also list all students who have papers accepted into the main conference.
The Doctoral Program would not be possible without the support of many
people and organisations. In particular, we would like to thank the program
committee members and the sponsors of the Constraint Programming conference.
We hope you have an enjoyable time in Portugal and a productive conference.
Karen Petrie and Olivia Smith
Doctoral Program Chairs, 2009
The Program Committee consists of:
Peter Nightingale - University of St Andrews
Chris Jefferson - University of St Andrews
Neil Yorke-Smith - SRI International
Zeynep Kiziltan - University of Bologna
Sebastian Brand - University of Melbourne
Hubie Chen - Universitat Pompeu Fabra
Claude-Guy Quimper - University of Waterloo
Standa Zivny - University of Oxford
Justin Pearson - Uppsala University
Roland Yap - National University of Singapore
Students with papers accepted into the main conference:
Carleton Coffrin - Brown University
Alberto Delgado - IT University of Copenhagen
Aurelie Favier - INRA, Centre de recherches de Toulouse
Mohammad Fazel-Zarandi - University of Toronto
Andy Grayland - University of St Andrews, Scotland
Serdar Kadioglu - Brown University
Ronan Le Bras - Centre de recherche sur les transports
Nina Naroditskaya - University of New South Wales and NICTA, Sydney, Australia
Aurélien Rizk - INRIA
Elaine Sonderegger - University of Connecticut
David Stynes - Cork Constraint Computation Centre, University College Cork
Kevin Tierney - Brown University
Mohamed Wahbi - LIRMM/CNRS, U. of Montpellier 2, France and LIMIARF/FSR, U. of Mohammed V Agdal, Morocco
Justin Yip - Brown University
Alessandro Zanarini - Ecole Polytechnique de Montreal
Standa Zivny - University of Oxford
Table of Contents
A Filtering Technique for Non-normalized CSPs
  Marlene Arangu, Miguel A. Salido and Federico Barber . . . . . . . . 1
Continuous Search in Constraint Programming: An Initial Investigation
  Alejandro Arbelaez and Youssef Hamadi . . . . . . . . 7
Constraint Based Languages for Biological Reactions
  Marco Bottalico and Stefano Bistarelli . . . . . . . . 13
Capturing fair computations on Concurrent Constraint Languages
  Paola Campli and Stefano Bistarelli . . . . . . . . 19
Finding Stable Solutions in Constraint Satisfaction Problems
  Laura Climent, Miguel A. Salido and Federico Barber . . . . . . . . 25
Preliminary studies of BnB-ADOPT related with Soft Arc Consistency
  Patricia Gutierrez and Pedro Meseguer . . . . . . . . 31
An automaton Constraint for Local Search
  Jun He, Pierre Flener and Justin Pearson . . . . . . . . 37
Research Overview: Improved Boolean Satisfiability Techniques for Haplotype Inference
  Eric Hsu and Sheila McIlraith . . . . . . . . 43
Foundations of Symmetry Breaking Revisited
  Tim Januschowski, Barbara Smith and Marc van Dongen . . . . . . . . 48
Energetic Edge-Finder For Cumulative Resource
  Roger Kameugne and Laure Pauline Fotso . . . . . . . . 54
Dominion – A constraint solver generator
  Lars Kotthoff, Ian Miguel and Ian Gent . . . . . . . . 64
On learning CSP specifications
  Matthieu Lopez and Arnaud Lallouet . . . . . . . . 70
Propagating equalities and disequalities
  Neil Charles Armour Moore, Ian Gent and Ian Miguel . . . . . . . . 76
Tractable Benchmarks
  Justyna Petke and Peter Jeavons . . . . . . . . 82
A simple efficient exact algorithm based on independent set for Maxclique problem
  Zhe Quan and Chu Min Li . . . . . . . . 88
Exploring Local Acyclicity within Russian Doll Search
  Margarita Razgon and Gregory M. Provan . . . . . . . . 94
Optimal Solutions for Conversational Recommender Systems Based on Comparative Preferences and Linear Inequalities
  Walid Trabelsi, Nic Wilson and Derek Bridge . . . . . . . . 100
A multithreaded solving algorithm for QCSP+
  Jeremie Vautard and Arnaud Lallouet . . . . . . . . 106
A Filtering Technique for Non-normalized CSPs
Marlene Arangú (Student)
Miguel A. Salido and Federico Barber (Supervisors)
Instituto de Automática e Informática Industrial
Universidad Politécnica de Valencia.
Valencia, Spain
Abstract. In this work we present a new algorithm, called 2-C3OP, that achieves 2-consistency in binary and non-normalized CSPs. This algorithm is a reformulation of the 2-C3 algorithm; it performs the constraint checks bidirectionally and uses inference. The evaluation section shows that 2-C3OP achieves the same 2-consistency as 2-C3 while being 40% faster, and it is therefore able to prune more search space than AC3. Furthermore, 2-C3OP performs fewer constraint checks than both AC3 and 2-C3.
1 Introduction
Proposing efficient algorithms for enforcing arc-consistency (which involves two variables) has always been considered a central question in the constraint reasoning community. Thus, there are many arc-consistency algorithms such as AC1, AC2, and AC3 [13]; AC4 [14]; AC5 [15, 12]; AC6 [5]; AC7 [6]; AC8 [8]; AC2001, AC3.1 [7]; and more. However, AC3 and AC4 are the most often used [3]. Algorithms that perform arc-consistency have focused their improvements on time complexity and space complexity. The main improvements have been achieved by changing the way of propagation, appending new structures, performing bidirectional search, changing the support search, improving the propagation, etc. However, little work has been done on developing algorithms to achieve 2-consistency in binary CSPs (binary: all constraints involve only two variables).
The concept of consistency was generalized to k-consistency by [11], and an optimal k-consistency algorithm for the labeling problem was proposed by [9]. Nevertheless, it is only based on normalized CSPs. If the constraints have two variables in k-consistency (k = 2), then we talk about 2-consistency. Also, much work on arc-consistency made the simplifying assumptions that CSPs are binary and normalized (two different constraints do not involve exactly the same variables), because these notations are much simpler and new concepts are easier to present. But [16] shows a strange effect of associating arc-consistency with binary normalized CSPs: the confusion between the notions of arc-consistency and 2-consistency (2-consistency guarantees that any instantiation of a value to a variable can be consistently extended to any second variable). On binary CSPs, 2-consistency is at least as strong as arc-consistency. When the CSP is binary and normalized, arc-consistency and 2-consistency perform the same pruning. A non-normalized CSP can be transformed into a normalized one by using the intersection of valid tuples [2], but this can be a very hard task in problems with large domains.
We will focus our attention on binary and non-normalized CSPs. Figure 1 (left) shows a binary CSP with two variables X1 and X2, D1 = D2 = {1, 2, 3}, and two constraints R12 : X1 ≤ X2 and R′12 : X1 ≠ X2, presented in [16]. It can be observed that this CSP is arc-consistent due to the fact that every value of every variable has a support for constraints R12 and R′12. In this case, arc-consistency does not prune any value of the domain of variables X1 and X2. However (as the authors say in [16]), this CSP is not 2-consistent because the instantiation X1 = 3 cannot be extended to X2 and the instantiation X2 = 1 cannot be extended to X1. Thus, Figure 1 (right) presents the resulting CSP filtered by arc-consistency and 2-consistency. It can be observed that 2-consistency is at least as strong as arc-consistency.
Fig. 1. Example of Binary CSP.
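To make the difference concrete, the following small Python sketch (ours, not part of the paper) enumerates supports for the CSP of Figure 1 and reproduces the pruning described above: arc-consistency keeps every value, while 2-consistency removes X1 = 3 and X2 = 1.

D1 = D2 = {1, 2, 3}
constraints = [lambda a, b: a <= b,      # R12:  X1 <= X2
               lambda a, b: a != b]      # R'12: X1 != X2

# Arc-consistency: a value only needs some support for each constraint, taken separately.
ac1 = {a for a in D1 if all(any(c(a, b) for b in D2) for c in constraints)}
ac2 = {b for b in D2 if all(any(c(a, b) for a in D1) for c in constraints)}
print(ac1, ac2)   # {1, 2, 3} {1, 2, 3}: nothing is pruned

# 2-consistency: a single value of the other variable must satisfy both constraints at once.
tc1 = {a for a in D1 if any(all(c(a, b) for c in constraints) for b in D2)}
tc2 = {b for b in D2 if any(all(c(a, b) for c in constraints) for a in D1)}
print(tc1, tc2)   # {1, 2} {2, 3}: X1 = 3 and X2 = 1 are pruned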
Following standard notations and definitions in the literature [3, 4, 10], we summarize the basic definitions that we will use in the rest of the paper.
Constraint Satisfaction Problem (CSP): a triple P = ⟨X, D, R⟩ where X is a finite set of variables {X1, X2, ..., Xn}; D is a set of domains D = D1, D2, ..., Dn such that for each variable Xi ∈ X there is a finite set of values that variable Xi can take; and R is a finite set of constraints R = {R1, R2, ..., Rm} which restrict the values that the variables can simultaneously take. We denote Rij ≡ (Rij, 1) ∨ (Rij, 3)¹ as the direct constraint defined over the variables Xi and Xj, and Rji ≡ (Rij, 2) as the same constraint in the inverse direction over the variables Xi and Xj (inverse constraint). A block of constraints Cij is a set of binary constraints that involve the same variables Xi and Xj.
Instantiation: a pair (Xi = a) that represents an assignment of the value a to the variable Xi, where a is in Di. A constraint Rij is satisfied if the instantiation Xi = a, Xj = b holds in Rij. Constraint symmetry: if the value b ∈ Dj supports a value a ∈ Di, then a supports b as well.
Arc-consistency: A value a ∈ Di is arc-consistent relative to Xj iff there exists a value b ∈ Dj such that (Xi, a) and (Xj, b) satisfy the constraint Rij ((Xi = a, Xj = b) ∈ Rij). A variable Xi is arc-consistent relative to Xj iff all values in Di are arc-consistent. A CSP is arc-consistent iff all the variables are arc-consistent, i.e., all the constraints Rij and Rji are arc-consistent. (Note: here we are talking about full arc-consistency.)
¹ More information regarding the direct constraint is presented in the next section.
2-consistency: A value a ∈ Di is 2-consistent relative to Xj iff there exists a value b ∈ Dj such that (Xi, a) and (Xj, b) satisfy all the constraints Rᵏij (∀k : (Xi = a, Xj = b) ∈ Rᵏij). A variable Xi is 2-consistent relative to Xj iff all values in Di are 2-consistent. A CSP is 2-consistent iff all the variables are 2-consistent, i.e., any instantiation of a value to a variable can be consistently extended to any second variable.
2 Algorithm 2-C3OP
We present a new coarse-grained algorithm called 2-C3OP that achieves 2-consistency in binary and non-normalized CSPs (see Algorithm 1). This algorithm deals with blocks of constraints as 2-C3 [1] does, but it only needs to keep half of the blocks of constraints in Q and calls two procedures: Revise and AddQ. 2-C3OP performs bidirectional checks (as AC7 does) and performs inference to avoid unnecessary checks. This is done by using structures that are shared by all the constraints:
– suppInv: a vector whose size is the maximum size of all domains (maxD); it stores the value of Xi that supports each value of Xj.
– minSupp: an integer variable that stores the first value b ∈ Dj that supports some a ∈ Di. By using minSupp, inference is carried out to avoid unnecessary checks, because all b ∈ Dj with b < minSupp are pruned without any check once suppInv is evaluated.
– t: an integer parameter that takes values from {1, 2, 3}. This value is used to guide the Revise procedure (direct or inverse order, bidirectional search; see below).
Initially the 2-C3OP procedure stores in the queue Q the constraint blocks (Cij, t) with t = 1. Then, a simple loop is performed to select and revise the blocks of constraints stored in Q, until no change occurs (Q is empty) or the domain of a variable becomes empty. In the first case every value of every domain is 2-consistent; in the second case the procedure reports that the problem is not consistent.
The AddQ procedure adds tuples to be evaluated again. The added tuples depend on the tuples stored in Q and on the variable change. Depending on the value of t generated by AddQ, the Revise procedure will guide the search. If t = 1 the search is fully bidirectional (direct and inverse); if t = 2 the search is inverse, and it is also direct if and only if some pruning is carried out; finally, if t = 3 the search is direct, and it is also inverse if and only if some pruning is carried out.
The Revise procedure (see Algorithm 2) requires two internal variables, changei and changej. They are initialized to zero and are used to remember which domains were pruned. For instance, if Di was pruned then changei = 1, and if Dj was pruned then changej = 2; if both Di and Dj were pruned then change = 3 (because change = changei + changej). During the loop of steps 3-9, each value in Di is checked². If the value b ∈ Dj supports the value a ∈ Di then suppInv[b] = a,
² If t = 2 the inverse operator is used.
Algorithm 1: Procedure 2-C3OP
Input: A CSP, P = ⟨X, D, R⟩
Result: true and P′ (which is 2-consistent), or false and P′ (which is 2-inconsistent because some domain remains empty)
begin
 1   for every i, j do
 2       Cij = ∅
 3   for every arc Rij ∈ R do
 4       Cij ← Cij ∪ Rij
 5   for every set Cij do
 6       Q ← Q ∪ {(Cij, t) : t = 1}
 7   for each d ∈ Dmax do
 8       suppInv[d] = 0
 9   while Q ≠ ∅ do
10       select and delete (Cij, t) from queue Q, with t ∈ {1, 2, 3}
11       change = Revise((Cij, t))
12       if change > 0 then
13           if change ≥ 1 ∧ change ≤ 3 then
14               Q ← Q ∪ AddQ(change, (Cij, t))
15           else
16               return false   /* empty domain */
17   return true
end
Algorithm 2: Procedure Revise
Input: A CSP P′ defined by two variables X = (Xi, Xj), domains Di and Dj, the tuple (Cij, t) and the vector suppInv.
Result: Di such that Xi is 2-consistent relative to Xj, Dj such that Xj is 2-consistent relative to Xi, and the integer variable change
begin
 1   changei = 0; changej = 0
 2   minSupp = dummyvalue
 3   for each a ∈ Di do
 4       if ∄ b ∈ Dj such that (Xi = a, Xj = b) ∈ (Cij, t) then
 5           remove a from Di
 6           changei = 1
 7       else
 8           suppInv[b] = a
 9           if minSupp = dummyvalue then minSupp = b
10   if ([(t = 2 ∨ t = 3) ∧ changei = 1] ∨ t = 1) then
11       for each b ∈ Dj do
12           if b < minSupp then
13               remove b from Dj
14               changej = 2
15           else
16               if suppInv[b] > 0 then
17                   suppInv[b] = 0
18               else
19                   if ∄ a ∈ Di such that (Xi = a, Xj = b) ∈ (Cij, t) then
20                       remove b from Dj
21                       changej = 2
22   change = changei + changej
23   return change
end
due to the fact that the support is bidirectional (it is based on the symmetry of the constraint). Furthermore, the first value b ∈ Dj that supports a value in Di is stored in minSupp.
The second part of Algorithm 2 depends on the values of t and changei. If t = 2 or t = 3 and changei = 0, then Cij does not need to be checked, because in the previous loop the constraint has not generated any pruning. However, if t = 1 then Cij requires a full bidirectional evaluation; if t = 2 or t = 3 and changei = 1, then Cij also requires a full bidirectional evaluation. In both cases, the processing of suppInv can be done in three different ways: 1) the values b ∈ Dj with b < minSupp are pruned without performing any checking, and the variable changej is updated to changej = 2 to indicate that a change in the second loop occurred; 2) the values b with suppInv[b] > 0 are not checked, because they are supported in Di, and their entries are reset to 0 for further use of this vector; 3) the values b with b > minSupp and suppInv[b] = 0 are checked until a support a is found, and are removed otherwise. In the last case changej is set to 2 in order to indicate that a change in the second loop occurred.
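As an illustration only, here is a rough Python transcription (our reading of Algorithm 2, not the authors' C implementation) of Revise for a block Cij; domains are assumed to be sets of integers scanned in increasing order, and flipping the arguments when t = 2 is our interpretation of the "inverse operator" of footnote 2.

def revise(Di, Dj, Cij, t):
    """Di, Dj: sets of ints; Cij: list of binary relations r(a, b); t in {1, 2, 3}."""
    supp_inv = {b: None for b in Dj}            # which a in Di supports b
    min_supp = None
    change_i = change_j = 0
    holds = (lambda r, a, b: r(b, a)) if t == 2 else (lambda r, a, b: r(a, b))

    for a in sorted(Di):                        # steps 3-9: prune Di and record supports
        b = next((b for b in sorted(Dj) if all(holds(r, a, b) for r in Cij)), None)
        if b is None:
            Di.discard(a); change_i = 1
        else:
            supp_inv[b] = a
            if min_supp is None:
                min_supp = b

    if t == 1 or (t in (2, 3) and change_i == 1):   # steps 10-21: prune Dj using suppInv
        for b in sorted(Dj):
            if min_supp is not None and b < min_supp:
                Dj.discard(b); change_j = 2         # pruned by inference, no check needed
            elif supp_inv.get(b) is not None:
                supp_inv[b] = None                  # already supported bidirectionally
            elif not any(all(holds(r, a, b) for r in Cij) for a in Di):
                Dj.discard(b); change_j = 2
    return change_i + change_j                      # steps 22-23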
3 Experimental Results
The experiments were performed on random instances characterized by the 4-tuple ⟨n, d, m, c⟩, where n is the number of variables, d the domain size, m the total number of binary constraints, and c the number of constraints in each block. Constraints were implicitly encoded. We evaluated 50 test cases for each problem type. Performance was measured in terms of the number of values pruned, constraint checks and computation time. All algorithms were implemented in C on a PC with 1 GB RAM and a 3.0 GHz Pentium IV processor.
Table 1. Number of prunings and constraint checks by using AC3, 2-C3 and 2-C3OP in problems 100% non-normalized ⟨n, 20, 800, 2⟩.

          Arc-consistency (AC3)        2-consistency (2-C3)       2-consistency (2-C3OP)
   n    prunings  time   checks     prunings  time   checks          time   checks
  50       50       9    109001       140      20    66774            10    58778
  70       70       9    109401       140      20    64816            13    56816
  90       90      11    109801       140      20    63296            13    55296
 110      110      14    110201       150      20    62801            13    54801
 130      130      11    110601       170      23    63001            13    55001
 150      150      12    111001       180      24    62616            16    54616

(The single prunings column under 2-consistency covers both 2-C3 and 2-C3OP, which perform the same pruning.)
In Table 1, the results show that the number of prunings was lower (53%) in AC3 than in both 2-C3 and 2-C3OP in all cases. This is due to the fact that the problems start with the same domain size and have constraints with operators (=, ≠, ≤, or ≥), so that AC3 does not prune any value by analyzing the constraints individually. However, 2-C3 and 2-C3OP analyze the blocks of constraints, which have a mix of these operators, and so are able to prune more search space. Furthermore, 2-C3OP performed fewer constraint checks than AC3 and 2-C3, by 50% and 13% respectively. Finally, 2-C3OP was 40% faster than 2-C3.
4 Conclusions and Further Work
Filtering techniques are widely used to prune the search space of CSPs. In this paper, we have presented the 2-C3OP algorithm, a reformulation of the 2-C3 algorithm, which achieves 2-consistency in binary and non-normalized CSPs using bidirectionality and inference. The evaluation section shows that 2-C3OP achieves 2-consistency and is thus able to prune more search space than AC3. Furthermore, 2-C3OP performs fewer constraint checks than both AC3 and 2-C3, and its computation time is close to that of AC3 and better than that of 2-C3 by 40%. In further work, we will focus our attention on improved versions of 2-C3OP that avoid redundant checks by using the last value supported (as in AC2001/3.1 [7]). Furthermore, our goal is to apply these filtering techniques to distributed CSPs.
References
1. Marlene Arangú, Miguel Salido, and Federico Barber. 2-C3: From arc-consistency to 2-consistency. In SARA 2009: The Eighth Symposium on Abstraction, Reformulation and Approximation. To appear, 2009.
2. Marlene Arangú, Miguel Salido, and Federico Barber. Extending arc-consistency algorithms
for non-normalized csps. In AI2009: Twenty-ninth SGAI International Conference on Artificial Intelligence. To appear, 2009.
3. Roman Barták. Theory and practice of constraint propagation. In J. Figwer, editor, Proceedings of the 3rd Workshop on Constraint Programming in Decision and Control, 2001.
4. Christian Bessiere. Constraint propagation. Technical report, CNRS/University of Montpellier, 2006.
5. Christian Bessiere and Marie Cordier. Arc-consistency and arc-consistency again. In Proc.
of the AAAI’93, pages 108–113, Washington, USA, July 1993.
6. Christian Bessiere, E.C. Freuder, and J. C. Régin. Using constraint metaknowledge to reduce
arc consistency computation. Artificial Intelligence, 107:125–148, 1999.
7. Christian Bessiere, J. C. Régin, Roland Yap, and Yuanling Zhang. An optimal coarse-grained
arc-consistency algorithm. Artificial Intelligence, 165:165–185, 2005.
8. Assef Chmeiss and Philippe Jegou. Efficient path-consistency propagation. International
Journal on Artificial Intelligence Tools, 7:121–142, 1998.
9. Martin Cooper. An optimal k-consistency algorithm. Artificial Intelligence, 41:89–95, 1994.
10. R. Dechter. Constraint Processing. Morgan Kaufmann, 2003.
11. Eugene Freuder. Synthesizing constraint expressions. Commun ACM, 21:958–966, 1978.
12. P. Van Hentenryck, Y. Deville, and C. M. Teng. A generic arc-consistency algorithm and its
specializations. Artificial Intelligence, 57:291–321, 1992.
13. A. K. Mackworth. Consistency in networks of relations. Artificial Intelligence, 8:99–118,
1977.
14. R. Mohr and T.C. Henderson. Arc and path consistency revised. Artificial Intelligence,
28:225–233, 1986.
15. M. Perlin. Arc consistency for factorable relations. Artificial Intelligence, 53:329–342, 1992.
16. F. Rossi, P. Van Beek, and T. Walsh. Handbook of constraint programming. Elsevier, 2006.
Continuous Search in Constraint Programming:
An Initial Investigation
Alejandro Arbelaez¹ (student) and Youssef Hamadi¹,²
¹ Microsoft-INRIA joint-lab, Orsay, France, alejandro.arbelaez@inria.fr
² Microsoft Research, Cambridge, United Kingdom, youssefh@microsoft.com
Abstract. In this work, we present the concept of Continuous Search, the objective of which is to allow any user to eventually get top performance from their constraint solver. Unlike previous approaches (see [9] for a recent survey), Continuous Search does not require a large set of representative instances to be available in order to properly train and learn parameters. It only assumes that once the solver runs in a real situation (often called production mode), instances will come over time and allow for proper offline continuous training. The objective is therefore not to instantly provide good parameters for top performance, but to take advantage of the real situation to train in the background and improve the performance of the system in an incremental manner.
1 Introduction
In Constraint Programming, properly crafting a constraint model which captures all the constraints of a particular problem is often not enough to ensure acceptable runtime performance. One way to improve efficiency is to use well-known tricks like redundant and channeling constraints, or to be aware that the constraint solver has a particular global constraint that can do part of the job more efficiently. The problem with these improvements (or tricks) is that they are far from obvious. Indeed, they do not change the solution space of the original model, and for a normal user (with a classical mathematical background) it is difficult to understand why adding redundancy helps.
Because of that, normal users are often left with the tedious task of tuning the search parameters of their constraint solver, and this again is both time consuming and not necessarily straightforward. Indeed, even if tuning is conceptually simple (try different parameters, pick the best), it requires a set of representative instances in order to work properly. This might be obvious for a constraint programmer, but not for a normal user, who could train on instances far different from the ones faced by his application.
In this work, we present the concept of Continuous Search (CS), the objective of which is to allow any user to eventually get top performance from their constraint solver. Unlike previous approaches (see [9] for a recent survey), Continuous Search does not require a large set of representative instances to be available in order to properly train and learn parameters. It only assumes that once the solver runs in a real situation (often called production mode), instances will come over time and allow for proper offline continuous training. The objective is therefore not to instantly provide good parameters for top performance, but to take advantage of the 'real' situation to train in the background and improve the performance of the system in an incremental manner.
The Continuous Search paradigm uses an online learning algorithm to update a prediction function which matches instance features to the most efficient set of parameters for a given instance (e.g. variable/value selection algorithms). Since CS can start without offline training, this prediction function might be initially undefined. If this is the case, an instance is solved by running the solver with its default parameters. Once the instance is solved and the solution is given back to the application¹, we start our Continuous Search training. At that point, the instance is reused and the goal is to refine the strategy used to tackle it. This is done by a specific repair-like algorithm which perturbs the strategy to find a new strategy able to solve the problem more efficiently. If such a strategy is found, its parameters are stored with the instance features and can therefore be reused to solve similar instances more efficiently.
Technically, this means that there are two different search efforts: the one done to solve the real problem, and the one related to the long-term improvement of the constraint solver. We believe that this extra use of computational resources is realistic, since nowadays systems (especially production ones) are almost always on. Moreover, this has to be balanced against the huge computational cost of offline training [10]. Last, this late adaptation is the only way to face the 'real' instances and even to adapt to changes in the modeling or to the arrival of a completely new class of problem.
The paper is organized as follows. Background material is presented in Section 2. Section 3 introduces the continuous search paradigm. Section 4 presents
experimental results. Finally, before our general conclusion, Section 5 presents
related work.
2 Background
In this section, we briefly introduce definitions and notations used hereafter.
2.1 Constraint Satisfaction Problems
Definition 1 A Constraint Satisfaction Problem (CSP) is a triple (X, D, C)
where,
– X = {X1 , X2 , . . . , Xn } represents a set of n variables.
– D = {D1 , D2 , . . . , Dn } represents the set of associated domains, i.e., possible
values for the variables.
– C = {C1 , C2 , . . . , Cm } represents a finite set of constraints.
¹ Since our standpoint is real settings, we consider the full application stack where solvers are not isolated pieces of software called from a command line, but are critically embedded in large and complex applications.
Each constraint Ci is associated to a set of variables vars(Ci), and is used to restrict the combinations of values between these variables. Similarly, the degree deg(Xi) of a variable is the number of constraints associated to Xi, and dom(Xi) corresponds to the current domain of Xi.
Solving a CSP involves finding a solution, i.e., an assignment of values to variables such that all constraints are satisfied. If a solution exists, the problem is stated as satisfiable, and unsatisfiable otherwise.
In this paper, we consider four well-known variable selection heuristics: min-dom selects the variable with the smallest domain [4]; wdeg [2] selects the variable which is involved in the most failed constraints; dom/wdeg [2] selects the variable which minimizes the ratio dom/wdeg; and impacts [7], whose objective is to select the variable-value pair that maximizes the reduction of the remaining search space.
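For instance, a toy Python sketch (ours, purely illustrative) of the dom/wdeg rule could look as follows, where the weight of a variable stands for the accumulated weight of the constraints involving it:

def pick_dom_wdeg(domains, weights):
    """domains: {var: set of values}; weights: {var: wdeg >= 1}."""
    return min(domains, key=lambda x: len(domains[x]) / weights[x])

# x2 has the smallest |dom|/wdeg ratio (2/2 = 1), so it is selected first.
print(pick_dom_wdeg({"x1": {1, 2, 3}, "x2": {1, 2}, "x3": {1, 2, 3, 4}},
                    {"x1": 1, "x2": 2, "x3": 1}))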
2.2 Support Vector Machines
Among the prominent Machine Learning (ML) algorithms are Support Vector Machines (SVM) [3]. This algorithm is widely used in binary classification due to its statistical learning properties; it determines the separating hyperplane with maximum distance, or margin, to the closest examples (the so-called support vectors) of the training set.
Learning a high-quality model depends on the quality of the training set. On the one hand, the description of the examples must make it possible to discriminate between positive and negative examples; on the other hand, the available examples must make it possible to accurately localize the frontier between the two classes.
3 Continuous Search in Constraint Programming
The goal of this paper is to take advantage of real world situations as shown in
Figure 1, where instances are presented one at a time. Therefore, in Continuous
Search settings we consider two different phases, exploitation (or solving time)
and exploration (or learning time). The former tries to solve new instances using
the acquired knowledge, and the latter is focused on improving the classification
accuracy by means of re-using previously seen instances.
Fig. 1. Continuous Search scenario: instances I0, I1, . . . , Ik arrive one at a time, alternating solving time and learning time.
3.1 Online Learning
To deal with the continuous search scenario, we follow the same approach as in
[1], where the authors propose to use a supervised Machine Learning algorithm to
select the most appropriate heuristic at different states (so-called checkpoints) of
the search tree. To this end, we characterize CSP instances by means of features
(i.e., general information common to all CSPs). Those features are the input of
an Online SVM algorithm which selects the best heuristic within the checkpoint
window.
The features set is divided into two main categories, static and dynamic.
The former intends to distinguish instances from each other (e.g., number of
variables, constraints, etc.), while the latter is used to monitor the progress of
the search process (e.g., max. number of failures, variable’s weight, etc.). For
more details about these features see [1].
Our choice of an online SVM algorithm is motivated by the fact that it does not need to re-train the classifier from scratch once a new example arrives. Note that training a classical SVM (so-called batch learning) requires solving a full optimization problem, which is not an ideal situation in Continuous Search.
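A minimal sketch (ours, not the authors' implementation) of this prediction step could use scikit-learn's SGDClassifier as a stand-in for the online SVM; the feature encoding and the heuristic IDs of Table 1 below are illustrative assumptions:

from sklearn.linear_model import SGDClassifier
import numpy as np

HEURISTICS = [1, 2, 3, 4]             # IDs taken from Table 1 (4-heuristic setting)
clf = SGDClassifier(loss="hinge")     # linear SVM trained incrementally

def choose_heuristic(features, default=1):
    """Predict the best heuristic for a checkpoint; fall back to the default early on."""
    try:
        return int(clf.predict(np.array([features]))[0])
    except Exception:                 # model not fitted yet
        return default

def learn_from_exploration(features, winning_heuristic):
    """Feed one (checkpoint features -> best heuristic) example without re-training from scratch."""
    clf.partial_fit(np.array([features]), [winning_heuristic], classes=HEURISTICS)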
4 Experiments
This section describes the experimental validation of the proposed approach.
In these experiments we included a collection of 100 nurse-scheduling problems from the MiniZinc³ repository.
4.1 Experimental setting
The learning algorithm used in the experimental validation of the proposed approach is a Support Vector Machine with Gaussian kernel; we used the libSVM
implementation. All our CSP heuristics (see Section 2.1) are home-made implementations integrated in the Gecode-2.1.1 constraint solver.
  ID   Variable sel.   Value sel.      ID   Variable sel.   Value sel.
   1   dom/wdeg        min              5   dom/deg         min
   2   dom/wdeg        max              6   dom/deg         max
   3   wdeg            min              7   min-dom         min
   4   wdeg            max              8   impacts         —
Table 1. Candidate heuristics; the default heuristic is the first one.
Two CSP adaptive strategies have been experimented with, considering respectively the first 4 and all 8 strategies in Table 1. In all cases, the default heuristic is the first one: dom/wdeg for variable selection and min-value for value selection.
The exploration examples are generated by adding minor perturbations to the default execution of the heuristics (i.e., executing the default heuristic at each checkpoint). Thus, during the learning time, we re-ran the last seen instance replacing the default heuristic by another candidate in exactly one checkpoint; this process is repeated for each heuristic in Table 1 for a limited number of checkpoints (10 in this paper). Currently, we are working on a more informative way of selecting the exploration points, considering the distance of unlabeled examples to the decision boundary.
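Schematically, and with hypothetical function names, the exploration step just described could be sketched as:

def explore(instance, candidates, default_heuristic, solve, max_checkpoints=10):
    """solve(instance, schedule) is assumed to run the solver and return its runtime;
    schedule maps a checkpoint index to the heuristic substituted there."""
    examples = []
    baseline = solve(instance, {})                  # default heuristic at every checkpoint
    for cp in range(max_checkpoints):
        for h in candidates:
            if h == default_heuristic:
                continue
            runtime = solve(instance, {cp: h})      # exactly one perturbed checkpoint
            label = +1 if runtime < baseline else -1
            examples.append((instance, cp, h, label))
    return examples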
All experiments were performed on an 8-machine cluster running Linux Mandriva 2009; all machines are 64-bit with two quad-core 2.33 GHz processors and 8 GB of RAM. A time-out of 10 minutes was used for each experiment.
³ Available at http://www.g12.cs.mu.oz.au/minizinc/download.html
It can be observed in Figure 2 that the dynamic approach is able to solve more instances than its competitors (i.e., the default strategy and a random heuristic selection). However, the performance goes down as the number of candidate heuristics increases. The main explanation for this phenomenon lies in the fact that we are not using any sophisticated strategy for breaking ties (i.e., if several heuristics are predicted to outperform the default one, we pick one at random). We are currently studying different approaches to breaking ties, such as selecting the best algorithm using the decision value, exploiting the fact that examples that are far from the classification boundary are more likely to be correctly predicted.
[Figure 2: two plots, "flatzinc nsp-14 (4 heuristics)" and "flatzinc nsp-14 (8 heuristics)", showing run-time (s) against the number of solved instances for the def, dyn and random strategies.]
Fig. 2. Nurse Scheduling-14 (nsp-14). Note that this data shows the performance of the continuous search approach with a particular ordering of the problem instances.
5 Related work
In this section, we describe some related work that has been proposed to integrate Machine Learning algorithms into CSP and related areas such as SAT and Quantified Boolean Formulas (QBF).
SATzilla [10] is a well-known SAT portfolio solver which is built upon a set of features. Broadly speaking, SATzilla includes two kinds of features: basic features such as the number of variables, number of propagators, etc., and local search features which actually probe the search space in order to estimate the difficulty of each problem instance. The goal of SATzilla is to learn a runtime predictor using a simple linear regression model.
CPHydra [6], one of the best constraint solvers in the latest CSP competition⁴, is a portfolio approach based on case-based reasoning. Broadly speaking, CPHydra maintains a database with all solved instances (so-called cases). Later on, once a new instance arrives, a set of similar cases C is computed, and the heuristic that is able to solve the majority of instances in C is selected. The main drawback of this portfolio approach is that, due to the high complexity of selecting the best solver, it is limited to a small number of solvers (in competition settings fewer than 6 solvers were used).
⁴ http://www.cril.univ-artois.fr/CPAI08/
Our work is related to [8] in that they also apply machine learning techniques to perform an on-line combination of heuristics within search tree procedures. Their paper proposes to use a multinomial logistic regression method in order to maximize the probability of predicting the right heuristic at different states of the search procedure. Unfortunately, this work requires an important number of training instances to obtain enough generalization of the target distribution of problems, and it does not fulfill all the requirements of the Continuous Search setting.
6 Conclusion
This paper has presented an introduction to Continuous Search; this new concept includes an online algorithm to adaptively tune a constraint solver. At different states of the search, the instance features are combined with dynamic information collected by the search engine to dynamically adapt the search strategy of a well-known CP solver in order to solve the current instance more efficiently. The results in this paper show that the online approach to continuous search outperforms a very good default heuristic for solving CSPs.
7 Acknowledgements
We would like to thank the anonymous reviewers and Michele Sebag for helpful discussions on the integration of Machine Learning and Constraint Programming.
References
1. A. Arbelaez, Y. Hamadi, and M. Sebag. Online heuristic selection in constraint programming. In Symposium on Combinatorial Search (SoCS), 2009.
2. F. Boussemart, F. Hemery, C. Lecoutre, and L. Sais. Boosting systematic search
by weighting constraints. In ECAI’04, 2004.
3. N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines
and other kernel-based learning methods. Cambridge University Press, 2000.
4. R. M. Haralick and G. L. Elliott. Increasing tree search efficiency for constraint satisfaction problems. In Artificial Intelligence, 1980.
5. F. Hutter and Y. Hamadi. Parameter adjustment based on performance prediction: Towards an instance-aware problem solver. Technical Report MSR-TR-2005-125, Microsoft Research, Cambridge, UK, January 2005.
6. E. O'Mahony, E. Hebrard, A. Holland, C. Nugent, and B. O'Sullivan. Using case-based reasoning in an algorithm portfolio for constraint solving. In AICS'08, 2008.
7. P. Refalo. Impact-based search strategies for constraint programming. In CP’04,
2004.
8. H. Samulowitz and R. Memisevic. Learning to solve QBF. In AAAI'07, 2007.
9. K. Smith-Miles. Cross-disciplinary perspectives on meta-learning for algorithm
selection. ACM Comput. Surv., 41(1), 2008.
10. L. Xu, F. Hutter, H. H. Hoos, and K. Leyton-Brown. SATzilla-07: The design and analysis of an algorithm portfolio for SAT. In CP'07, 2007.
Constraint Based Languages for Biological Reactions
Marco Bottalico¹ (student)
Stefano Bistarelli¹,²,³ (supervisor)
¹ Università G. d'Annunzio, Pescara, Italy, [bottalic,bista]@sci.unich.it
² Istituto di Informatica e Telematica (CNR), Pisa, Italy, [stefano.bistarelli]@iit.cnr.it
³ Dipartimento di Matematica Informatica, Università di Perugia, Italy, bista@dipmat.unipg.it
Abstract. In this paper, we study the modeling of biochemical reactions using concurrent constraint programming idioms. In particular we consider stochastic concurrent constraint programming (sCCP), the Hybrid concurrent constraint programming language (Hcc) and the Biochemical Abstract Machine (BIOCHAM).
Keywords: Biochemical Reactions, (Stochastic/Hybrid) Concurrent Constraint Programming, Biocham.
1 Introduction
Systems biology is a science integrating experimental activity and mathematical modeling; it studies the dynamical behavior of biological systems. While current genome projects provide a huge amount of data on genes or proteins, a lot of research is still necessary to understand how the different parts of a biological system interact in order to perform complex biological functions. Mathematical and computational techniques are central in this approach to biology, as they provide the capability of formally describing living systems and studying their properties.
A variety of formalisms for modeling biological systems has been proposed in the literature. In [1], the author distinguishes three basic approaches: discrete, stochastic and continuous; additionally there are various combinations of them. Discrete models are based on discrete variables and discrete state changes; continuous models are based on differential equations that typically model biochemical reactions; finally, in the stochastic approach, probabilities may appear explicitly in random variables and random numbers, or implicitly as in kinetic laws. In the latter approach we have a simplified representation of processes and an integration of stochastic noise in order to get more realistic models. The need to capture both discrete and continuous phenomena motivates the study of hybrid dynamical systems [5].
The goal of this paper is to show different kinds of languages to model biochemical reactions in order to compare them and to use their appropriate features in different ways.
2 Background on Biochemical reactions and Blood Coagulation
In this paper, we want to examine biochemical reactions: they are chemical reactions involving mainly proteins. In a cell there are many different proteins; hence, the number of reactions that can take place can be very high. All the interactions that take place in a cell can be used to create a diagram, obtaining a biochemical reaction network.
We will examine in the following one of the thirteen enzymatic reactions of blood coagulation [12], in the generic form, through the Michaelis-Menten kinetics:

    XI + XIIa  ⇌  XI:XIIa  ⇀  XI + XIa

with rates k1 (binding), k−1 (unbinding) and k2 (product formation), where XI is the enzyme (E) that binds the substrate (S) = XIIa to form an enzyme-substrate complex (ES) = XI:XIIa. Afterwards we have the formation of the product (P) = XIa and the release of the unchanged enzyme (E) = XI, ready for a new reaction.
We are interested in blood coagulation [12]. It is part of an important host defense mechanism termed hemostasis (the cessation of blood loss from a damaged vessel). Blood clotting is a very delicately balanced system; when hemostatic functions fail, hemorrhage or thromboembolic phenomena result. The chemical reactions that constitute the whole process can be seen as a decomposition into many kinds of enzymatic reactions, involving reactants, products, enzymes, substrates, stoichiometric coefficients, proteins, inhibitors and chemical accelerators.
Under the Michaelis-Menten hypotheses [4], the most important equations are:

KM = (k−1 + k2)/k1 : the Michaelis constant. It measures the affinity of the enzyme for the substrate: if KM is small there is a high affinity, and vice versa.

VMAX = V0 = k2[E0] : the maximum rate, which would be achieved when all of the enzyme molecules have substrate bound (Hp1). [E0] is the starting quantity of enzyme E. k2 is also called kcat.

d[P]/dt = VMAX [S] / (KM + [S]) : this final equation is usually called the “Michaelis-Menten equation”. It shows the speed of the formation of the product.

When the amount of product P is small, this will be a good approximation and the equations can now be integrated:

E:  d[E]/dt  = (k2 + k−1)[ES] − k1[E][S]
S:  d[S]/dt  = k−1[ES] − k1[E][S]
ES: d[ES]/dt = k1[E][S] − (k−1 + k2)[ES]
P:  d[P]/dt  = k2[ES]
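As a purely illustrative aside (not in the paper), these four equations can be integrated numerically in a few lines of Python; the rate constants and initial amounts below are the ones appearing in the Hybrid cc fragment of Section 4:

k1, km1, k2 = 1.0, 0.1, 0.01
e, s, es, p = 10.0, 5.0, 0.01, 0.0      # [E], [S], [ES], [P]
dt = 0.001
for _ in range(int(50 / dt)):           # explicit Euler over 50 time units
    bind, unbind, cat = k1 * e * s, km1 * es, k2 * es
    e  += dt * (unbind + cat - bind)    # d[E]/dt  = (k2 + k-1)[ES] - k1[E][S]
    s  += dt * (unbind - bind)          # d[S]/dt  = k-1[ES] - k1[E][S]
    es += dt * (bind - unbind - cat)    # d[ES]/dt = k1[E][S] - (k-1 + k2)[ES]
    p  += dt * cat                      # d[P]/dt  = k2[ES]
print(round(e, 3), round(s, 3), round(es, 3), round(p, 3))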
3 Stochastic Concurrent Constraint Programming
sCCP [2] is obtained by adding a stochastic duration to the instructions interacting with the constraint store C, i.e. ask and tell. The most important feature added in sCCP is the continuous random variable T, associated with each instruction; it represents the time needed to perform the corresponding operation in the store. T is exponentially distributed, and its probability density function is f(τ) = λe^(−λτ), where λ is a positive real number (the rate of the exponential random variable) representing the expected frequency per unit of time. The duration of an ask or a tell can depend on the state of the store at the moment of the execution.
The main difference of sCCP with respect to classical cc is the presence of two different actions with temporal duration, ask and tell, identified by a rate function λ: tellλ(c) and askλ(c), following the probability law above. It means that the reaction occurs in a stochastic time T with f(τ) = λe^(−λτ), whose mean is 1/λ; i.e. tell∞ is an instantaneous execution while tell0 never occurs.
The other constructs are the same as in CCP, except for the variables, which in CCP are rigid, in the sense that, whenever they are instantiated, they keep that value forever. Time-varying variables (called stream variables) can be easily modeled in sCCP as growing lists with an unbounded tail: X = [a1, ..., an | T]. When the quantity changes, we simply need to add the new value, say b, at the end of the list by replacing the old tail variable with a list containing b and a new tail variable: T = [b | T′]. When we need to know the current value of the variable X, we extract from the list the value immediately preceding the unbounded tail. Stream variables are denoted with the assignment $=.
We model the biochemical equation [4] in sCCP with the following recursively defined method:

react(XIIa, XIa, KM, V0) :−
    ask_rMM(KM,V0,XIIa) (XIIa > 0).
    (tell∞(XIIa $= XIIa − 1) || tell∞(XIa $= XIa + 1)).
    react(XIIa, XIa, KM, V0)

where the rate λ of the ask is computed by the Michaelis-Menten kinetics: rMM(KM, V0, XIIa) = (V0 × XIIa)/(XIIa + KM). Roughly, the program inserts in the store the current values of the variables KM, V0 and XIIa; it checks the value of the factor XIIa; then, with an immediate effect, it updates the values of the factors XIIa (reagent) and XIa (product) with the new values. Subsequently it executes a new instance of the program.
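To give a feel for the behaviour this encodes, the following Python sketch (our interpretation, not the authors' code) fires the ask after an exponentially distributed delay with rate rMM and then applies the two instantaneous tells; KM, V0 and the initial amounts are illustrative values:

import random

def r_mm(km, v0, xiia):
    return (v0 * xiia) / (xiia + km)        # Michaelis-Menten rate of the ask

def react(xiia, xia, km=1.0, v0=0.5, t_end=100.0):
    t = 0.0
    while xiia > 0:
        t += random.expovariate(r_mm(km, v0, xiia))   # stochastic duration of the ask
        if t > t_end:
            break
        xiia -= 1                           # tell_inf(XIIa $= XIIa - 1)
        xia += 1                            # tell_inf(XIa $= XIa + 1)
    return xiia, xia

print(react(xiia=100, xia=0))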
4 Hybrid cc
Hybrid concurrent constraint programming (Hybrid cc [8]) is a powerful framework for modeling, analyzing and simulating hybrid systems, i.e., systems that exhibit both discrete and continuous change. It is an extension of Timed Default cc [11] over continuous time. One of the major difficulties in the original cc framework is that cc programs can detect only the presence of information, not its absence [1]. Default cc extends cc with a negative ask combinator (if a else A), which executes the program A if the constraint a cannot be inferred from the store.
e = 10,
s = 5,
es = 0.01,
p = 0,
always { k1 = 1, km1 = 0.1, k2 = 0.01,
  cont(e), cont(s),
  if (e >= 0.000000001) then {
    e'  = ((km1 + k2) * es) - (k1 * e * s),
    s'  = (km1 * es) - (k1 * e * s),
    es' = (k1 * e * s) - ((km1 + k2) * es),
    p'  = k2 * es }
  else { e' = 0, s' = 0, es' = 0, p' = 0 }
},
We can observe that in the first rows we have the initial conditions with the quantities of enzyme, substrate, enzyme-substrate complex and product, and the rates of the three reactions: k1 = 1 for the first one, k−1 = 0.1 for the second one and k2 = 0.01 for the third reaction. With the syntax cont(e), cont(s), . . . we assert that the evolution of e, s, es, p is continuous. Subsequently we check the current values of e and s and then we can start the reaction; the && and = operators have the usual meanings of “and” and “equal”. We can easily obtain the quantities of e, s, es, p in the next time instant. If the if-condition does not hold, the derivatives are set to zero and the factors keep their previous amounts.
5 Biochemical Abstract Machine
The Biochemical Abstract Machine (BIOCHAM [6]) is a software environment for modeling complex cell processes, making simulations (i.e. in silico experiments), formalizing the biological properties of the system known from real experiments, checking them and using them as a specification when refining a model. BIOCHAM is based on two aspects: the analysis and simulation of boolean, kinetic and stochastic models, and the formalization of biological properties in temporal logic. For kinetic models, BIOCHAM can search for appropriate parameter values in order to reproduce a specific behavior observed in experiments and formalized in temporal logic.
We can use the Michaelis-Menten kinetics to represent the first enzymatic reaction of blood coagulation. To explain the language used to model the reaction in BIOCHAM [3], we can translate the syntax in the following way:
(k1*[E]*[S],km1*[ES]) for E + S <=> ES.
k2*[ES] for ES => E + P.
parameter(k1,1).
parameter(km1,0.1).
parameter(k2,0.01).
present(E,100).
present(S,10).
absent(ES).
absent(P).
There are two different syntax operators used to model the different kinds of reaction: <=> and =>. The first one models the reversible reaction involved in the ES formation. The second one models the irreversible reaction which produces the P factor. The for operator shows us for which substances the reaction is performed: the first for is for the E + S ⇌ E:S reaction (with rates k1 and k−1), the second one is for the E:S ⇀ E + P reaction (with rate k2).
In BIOCHAM, this simulation generates a graph and a table relating time and nanomolar variation. In the graph (Fig. 1), we have the plot of four curves, referring to the four substances involved in the reaction: the initial decrease of the enzyme and of the substrate (violet and red curves respectively), the formation of the enzyme-substrate complex (green curve) and finally the product formation (sky blue line).
Fig. 1. Graphic
6 Related and future works
The most important related features are presented in [10, 7]. In [10] the authors suggest modeling biomolecular processes, i.e. protein networks, using the pi-calculus, while [7] shows that there are two formalisms for mathematically describing the time behavior of a spatially homogeneous chemical system: the deterministic approach and the stochastic one. The first regards the time evolution as a continuous and predictable process which is governed by a set of ordinary differential equations (the “reaction-rate equations”), while the second regards the time evolution as a kind of random-walk process which is governed by a single differential-difference equation (the “master equation”).
From the application point of view, the examined languages allow the biologist to model biological systems in a high-level and declarative way, using different kinds of applications and language constructs that directly capture a variety of biological phenomena. We are interested in the non-deterministic timed concurrent constraint calculus (ntcc [9]), because it is a concurrent constraint programming language which includes timed processes with a graphical formalism, in order to describe our biochemical reactions.
In this paper we have modeled only one of the thirteen kinetic reactions used in blood coagulation; our future work is to extend the coagulation chain using all the reactions, with the goal of comparing the formation of fibrin with previous results in the scientific literature. The second step is the in silico study of a drug which decreases the hepatic secretion of four clotting factors, and we want to analyze the derived outcomes.
References
1. Alexander Bockmayr and Arnaud Courtois. Using hybrid concurrent constraint
programming to model dynamic biological systems. In ICLP, pages 85–99, 2002.
2. Luca Bortolussi and Alberto Policriti. Modeling biological systems in stochastic
concurrent constraint programming. Constraints, 13(1-2):66–90, 2008.
3. Nathalie Chabrier-Rivier, François Fages, and Sylvain Soliman. The biochemical
abstract machine biocham. In Proc. CMSB, pages 172–191, 2004.
4. Thomas Devlin. Textbook of biochemistry with clinical correlations. McGraw Hill Book co., 2001.
5. A.J. van der Schaft and J.M. Schumacher. Introduction to Hybrid Dynamical Systems. Springer-Verlag, 1999.
6. François Fages. Temporal logic constraints in the biochemical abstract machine
biocham. In Patricia M. Hill, editor, LOPSTR, volume 3901 of Lecture Notes in Computer Science, pages 1–5. Springer, 2005.
7. Daniel T. Gillespie. Exact stochastic simulation of coupled chemical reactions. In The
Journal of Physical Chemistry, pages 2340–2352, 1977.
8. Vineet Gupta, Radha Jagadeesan, and Vijay A. Saraswat. Computing with continuous change. Sci. Comput. Program., 30(1-2):3–49, 1998.
9. Julian Gutiérrez, Jorge A. Pérez, Camilo Rueda, and Frank D. Valencia. Timed
concurrent constraint programming for analysing biological systems. Electr. Notes
Theor. Comput. Sci., 171(2):117–137, 2007.
10. Aviv Regev, William Silverman, and Ehud Y. Shapiro. Representation and simulation of biochemical processes using the pi-calculus process algebra. In Proc. Pacific
Symposium on Biocomputing, pages 459–470, 2001.
11. Vijay A. Saraswat, Radha Jagadeesan, and Vineet Gupta. Timed default concurrent
constraint programming. J. Symb. Comput., 22(5/6):475–520, 1996.
12. Williams. Williams Hematology. McGraw Hill Book co., 2006.
Capturing fair computations on Concurrent
Constraint Language
Paola Campli (campli@sci.unich.it),
Supervisor: Stefano Bistarelli (bista@sci.unich.it)
University G.D’Annunzio - Pescara, Italy
Abstract. The study of concurrency and nondeterminism in programming languages is often related to fairness. It is a feature that should be included in contexts where there are repetitive choices among alternatives and it is desirable that no alternative is ignored forever or postponed infinitely often.
The aim of my research is to guarantee equitable computations in Concurrent Constraint languages. This paper presents an extension of the language regarding the parallelism operator, using as selection criterion a metric provided by a fair carpool scheduling algorithm. The new parallelism operator (||m) is able to deal with a finite number m of agents and guarantees a fair progression of the computation, ensuring that no (enabled) agent is never executed.
Introduction
This paper contains an extension of Concurrent Constraint Programming (CCP) in order to guarantee a fair criterion of selection between parallel agents and to restrict or remove possible unwanted behaviors of a program.
This document is structured as follows: in the first section the common definitions of fairness are briefly presented; in section two the Concurrent Constraint language is described; in section three a fair CC version is presented, through an extension of the parallelism operator based on the idea proposed by a fair carpooling algorithm.
1 Fairness in programming languages
Some of the most common notions of fairness in computer science are given by Nissim Francez in [1]: weak fairness requires that if an action (or agent) is continuously enabled (so it can almost always proceed) then it must eventually do so, while strong fairness requires that if an action (or agent) can proceed infinitely often then it must eventually proceed (be executed). In order to guarantee the above fairness properties for the CC language, we add a condition (or guard) that enables the agents with a lower cost to succeed.
2 Concurrent Constraint Programming
The concurrent constraint (cc) programming paradigm [4] concerns the behavior of a set of concurrent agents with a shared store, which is a conjunction of constraints (relations among a specified set of variables). Each computation step possibly adds new constraints to the store. Thus information is monotonically added until all agents have evolved. The final store is a refinement of the initial one and is the result of the computation. The concurrent agents communicate with the shared store, either by checking if it entails a given constraint (ask operation) or by adding a new constraint to it (tell operation).
The syntax of a cc program is shown in Table 1: P is the class of programs, F is the class of sequences of procedure declarations (or clauses), A is the class of agents, c ranges over constraints, and x is a tuple of variables. The + combinator expresses nondeterminism. We also assume that, in p(x) :: A, vars(A) ⊆ x, where vars(A) is the set of all variables occurring free in agent A. In a program P = F.A, A is the initial agent, to be executed in the context of the set of declarations F.
The intuitive behavior of the agents is: agent “success” succeeds in one step;
Table 1. cc syntax
P ::= F.A
F ::= p(x) :: A | F.F
A ::= success | fail | tell(c) → A | E | A||A | ∃x A | p(x)
E ::= ask(c) → A | E + E
agent “fail” fails in one step; agent “ask(c) → A” checks whether constraint c is entailed by the current store and then, if so, behaves like agent A; if c is inconsistent with the current store, it fails, and otherwise it suspends until c is either entailed by the current store or is inconsistent with it. Agent “ask(c1) → A1 + ask(c2) → A2” may behave either like A1 or like A2 if both c1 and c2 are entailed by the current store; it behaves like Ai if only ci is entailed; it suspends if both c1 and c2 are consistent with but not entailed by the current store; and it behaves like “ask(c1) → A1” whenever “ask(c2) → A2” fails (and vice versa). Agent “tell(c) → A” adds constraint c to the current store and then, if the resulting store is consistent, behaves like A, otherwise it fails. Agent A1||A2 behaves like A1 and A2 executing in parallel; agent ∃x A behaves like agent A, except that the variables in x are local to A; p(x) is a call of procedure p.
Here is a brief description of (some of) the transition rules:
tell:             ⟨tell(c) → A, σ⟩ → ⟨A, σ ⊗ c⟩
ask:              if σ ⊢ c, then ⟨ask(c) → A, σ⟩ → ⟨A, σ⟩
parallelism (1):  if ⟨A1, σ⟩ → ⟨A1′, σ′⟩, then ⟨A1||A2, σ⟩ → ⟨A1′||A2, σ′⟩
parallelism (2):  if ⟨A1, σ⟩ → ⟨success, σ′⟩, then ⟨A1||A2, σ⟩ → ⟨A2, σ′⟩
nondeterminism:   if ⟨E1, σ⟩ → ⟨A1, σ′⟩, then ⟨E1 + E2, σ⟩ → ⟨A1, σ′⟩
hidden variables: if ⟨A[y/x], σ⟩ → ⟨A1, σ′⟩, then ⟨∃x A, σ⟩ → ⟨A1, σ′⟩
procedure call:   ⟨p(y), σ⟩ → ⟨A[y/x], σ⟩ where p(x) :: A
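As a small illustration (ours, not taken from the paper), consider A1 = tell(x = 1) → success and A2 = ask(x = 1) → tell(y = 2) → success running in parallel from the empty store true. Since ⟨A1, true⟩ → ⟨success, x = 1⟩, rule parallelism (2) gives ⟨A1||A2, true⟩ → ⟨A2, x = 1⟩; now x = 1 ⊢ x = 1, so the ask rule yields ⟨tell(y = 2) → success, x = 1⟩, and a final tell step produces the store x = 1 ⊗ y = 2.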
Notice that || and + are commutative and associative operators.
3 Fair Concurrent Constraint Programming
To obtain a fair version of the parallel execution, we modify the parallel operator || to use a quantitative metric that provides a more accurate way to establish which of the agents can succeed. The metric we use is the one proposed by a fair carpool scheduling algorithm [2], which is described in more detail below.
3.1 The fair carpool scheduling algorithm
Carpooling consists in sharing a car among a driver and one or more passengers to divide the costs of the trip. We want a scheduling algorithm that will be perceived as fair by all the members, so as to encourage their continued participation.
Let U be a value that represents the total cost of the trip. It is convenient (but not necessary) to take U to be the least common multiple of 1, 2, . . . , m, where m is the largest number of people who ever ride together at a time in the carpool. Since on a given day the participants can be fewer than m, we define n as the number of participants in the carpool on that day (n ∈ [1 . . . m]).
Each day we calculate the passengers' and driver's scores. On the first day each member has score zero; on the following days the driver increases his score by αn = U(n − 1)/n, while the remaining n − 1 passengers decrease their score by βn = U/n.
As proved in [2], this algorithm is fair.
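As a small aside (ours, not from [2]), the bookkeeping is easy to sketch in Python; we also assume here, as the guard k[li] ≤ k[lj] of Section 3.2 suggests, that the member with the lowest score drives:

def carpool_day(scores, riders, U=60):
    """scores: member -> score; riders: the n members travelling today (n <= m)."""
    n = len(riders)
    driver = min(riders, key=lambda member: scores[member])
    for member in riders:
        if member == driver:
            scores[member] += U * (n - 1) / n     # alpha_n
        else:
            scores[member] -= U / n               # beta_n
    return driver

scores = {"a": 0, "b": 0, "c": 0}
for day in range(3):
    print(carpool_day(scores, ["a", "b", "c"]), scores)   # scores always sum to zero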
3.2 Applying the carpooling algorithm over CC
A problem we encounter with the CC semantics is that only two parallel agents are represented in the Parallelism rule; therefore it is not possible to express fairness with more than 2 agents, since we need to associate a value (cost or preference) to each agent.
In this section the syntax and the semantics of the language are extended with the new operator ||m in order to model the parallel execution of m agents (with m finite). Moreover we include an array k (with m elements) that allows us to keep track of the actions performed by each agent during the computational steps (we use the notation ⟨Ai, k[i], σ⟩ instead of ⟨Ai, σ⟩). In this way we can associate a value k[i] to each agent Ai. The syntax is modified as in Table 2.
Table 2. Fair version of cc syntax
P ::= F.A
F ::= p(x) :: A | F.F
A ::= success | fail | tell(c) → A | E | ||m (A1 . . . Am) | ∃x A | p(x)
E ::= ask(c) → A | E + E
With reference to the carpooling algorithm, we consider the participants in
the carpool on a given day as the n enabled agents (Al1 →, . . . , Aln →), and we
represent the driver by the agent Ali , while the passengers are the remaining
agents Alj (j ∈ [1, . . . , n], j ≠ i).
Agent Ali increases its score by αn , while the n − 1 passengers decrease their
score by βn . Initially all elements of the array k are equal to 0.
In the new transition rule we add the set of values associated with each agent
by the array k; we also insert into the precondition of the rule a guard (k[i] ≤ k[j])
that compares the agents’ values and establishes which of the agents can evolve or
succeed.
In the initial phase, the enabled agent Ali evolves into A′li with value αn . The
other enabled agents instead assume the value −βn .
In the following steps we update the values of the array k to obtain k′ , increasing or decreasing the previous values of k by αn or βn . We obtain the transition rule
in Table 3.
Since ‖m is an associative and commutative operator, we can omit rewriting the second, symmetric, rule of parallelism (1).
We use the same criterion for the parallelism (2) rule (Table 4).
If agent Ali succeeds, the rule deletes it together with the corresponding element of k. Consequently, the value m of the operator ‖m is continuously
Table 3. Carpooling fairness - parallelism (1) rule

    k[li ] ≤ k[lj ]  ∀ j = 1, . . . , n      Al1 →, . . . , Aln →      ⟨Ali , σ⟩ → ⟨A′li , σ′ ⟩
    ------------------------------------------------------------------------------
    ⟨‖m (A1 , . . . , Ali , . . . , Am ), k, σ⟩ → ⟨‖m (A1 , . . . , A′li , . . . , Am ), k′ , σ′ ⟩

    where k′ [x] =  k[x]         if x = 1, . . . , m, x ≠ li , lj
                    k[x] + αn    if x = li
                    k[x] − βn    if x = lj , j ≠ i
Table 4. Carpooling fairness - parallelism (2) rule

    k[li ] ≤ k[lj ]  ∀ j = 1, . . . , n      Al1 →, . . . , Aln →      ⟨Ali , σ⟩ → ⟨success, σ′ ⟩
    ------------------------------------------------------------------------------
    ⟨‖m (A1 , . . . , Ali , . . . , Am ), k, σ⟩ → ⟨‖m−1 (A1 , . . . , Ali−1 , Ali+1 , . . . , Am ), k′ , σ′ ⟩

    where k′ [x] =  k[x]    if x ≠ li
                    null    if x = li
decreased by 1 every time an agent succeeds. We therefore end with a final computation involving a single agent and a single element of the array, which
reinstates the original rule: ⟨‖1 (A1 ), k[1], σ⟩ → ⟨A1 , σ⟩.
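A minimal, purely illustrative Python sketch of the bookkeeping behind the ‖m rules follows; the agent interface (enabled/step) is an assumption of ours and not part of any CC implementation.

    def fair_parallel_step(agents, k, store, U):
        """One application of the fair parallelism rules: among the enabled
        agents, one with minimal score (the guard k[li] <= k[lj]) performs a
        step; its score grows by alpha_n, the other enabled scores shrink by
        beta_n; a succeeding agent is dropped together with its k entry."""
        enabled = [i for i, a in enumerate(agents) if a.enabled(store)]
        if not enabled:
            return agents, k
        n = len(enabled)
        alpha_n, beta_n = U * (n - 1) / n, U / n
        li = min(enabled, key=lambda i: k[i])   # guard: a minimal-score agent moves
        outcome = agents[li].step(store)        # "progress", "suspend" or "success"
        for j in enabled:
            k[j] += alpha_n if j == li else -beta_n
        if outcome == "success":                # parallelism (2): m decreases by 1
            agents = agents[:li] + agents[li + 1:]
            k = k[:li] + k[li + 1:]
        return agents, k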
Acknowledgement
We wish to thank Paolo Baldan for his help in discussing and improving the
contents of the paper.
4 Future Work
The existing definitions of fairness in the literature (strong fairness, weak fairness, unconditional fairness, equifairness) have led to the development of various fairness-like constructs, which vary according to the computational model considered. The
limitation we observe in such notions is that they can only be used to model
infinite computations. In fact, when dealing with finite computations, an agent cannot
be continuously or infinitely often enabled; for this reason we need a quantitative
and more accurate notion of fairness.
As proved in [3], it is not possible to write an expression that captures all
the fair behaviours of a system in a programming language with a primitive for
finitary choice, because expressing fairness requires unbounded nondeterminism.
The aim of our research is to provide new notions of fairness suitable for
finite computations and to guarantee equitable computations in the Concurrent
Constraint language and its extensions.
Moreover, we plan to provide quantitative valuations by using soft constraints;
we will use a semiring structure to measure how fair the service is. To do
this we plan to use the same indexes that are used to measure inequality in
economics, such as the Gini coefficient, which indicates the inequality of income or
wealth.
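As a pointer, the Gini coefficient mentioned above can be computed from a list of per-agent values with the standard mean-absolute-difference formula; the following small Python sketch (with made-up data) only illustrates the index, not our planned semiring-based valuation.

    def gini(values):
        """Gini coefficient of a list of non-negative values:
        0 = perfect equality, values close to 1 = strong inequality."""
        n = len(values)
        mean = sum(values) / n
        if mean == 0:
            return 0.0
        diff_sum = sum(abs(x - y) for x in values for y in values)
        return diff_sum / (2 * n * n * mean)

    print(gini([1, 1, 1, 1]))   # 0.0  -> perfectly fair
    print(gini([0, 0, 0, 4]))   # 0.75 -> very unfair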
References
1. Nissim Francez, Fairness (Texts and Monographs in Computer Science), isbn:
0387962352, Springer-Verlag (1986)
2. Ronald Fagin and John H. Williams, A Fair Carpool Scheduling Algorithm, International Business Machines Corporation, 1983, pages 133–139
3. Abha Moitra, Prakash Panangaden, Finitary choice cannot express fairness: A metric space technique, 1986
4. Vijay A. Saraswat and Martin Rinard, Concurrent constraint programming, POPL
1990
Finding Stable Solutions in Constraint Satisfaction
Problems
Laura Climent, Miguel A. Salido and Federico Barber
Instituto de Automática e Informática Industrial
Universidad Politécnica de Valencia.
Valencia, Spain
Abstract. Constraint programming is a successful technology for solving combinatorial problems modeled as constraint satisfaction problems (CSPs). An important extension of constraint technology involves problems that undergo changes
that may invalidate the current solution. These problems are called Dynamic
Constraint Satisfaction Problems (DynCSP). Many works on dynamic problems
have sought methods for finding new solutions. In this paper, we focus our attention on
finding stable solutions, which are able to absorb changes. To this end, each constraint maintains a label that measures its degree of variability. Thus, we restrict
the original search space by using these labels, and solutions inside the restricted
search space remain valid under changes.
1 Introduction
Nowadays many real problems can be modeled as constraint satisfaction problems and
are solved using constraint programming techniques. Much effort has been spent to increase the efficiency of the constraint satisfaction algorithms: filtering, learning and distributed techniques, improved backtracking, use of efficient representations and heuristics, etc. This effort resulted in the design of constraint reasoning tools which were used
to solve numerous real problems. However all these techniques assume that the set of
variables and constraints which compose the CSP is completely known and fixed. This
is a strong limitation when dealing with real situations where the CSP under consideration may evolve.
A Dynamic Constraint Satisfaction Problem [2] is an extension of a static CSP that
models the addition and retraction of constraints, and hence it is more appropriate for handling dynamic real-world problems. It is indeed easy to see that all the possible changes
to a CSP (constraint or domain modifications, variable additions or removals) can be
expressed in terms of constraint additions or removals [4].
Several techniques have been proposed to solve Dynamic CSPs, including: searching for stable solutions that are valid after small problem changes [5]; searching for a
new solution that minimizes the number of changes from the original solution [3]; or
reusing the original solution to produce a new solution [4].
– Searching for stable solutions that are valid after small problem changes explores
methods for finding solutions that are more likely to remain valid after changes,
i.e., stable solutions [5]. In order to develop this method, it is necessary to find
all the solutions if we want to obtain the most stable one according to an objective function.
– Searching for a new solution that minimizes the number of changes from the original solution considers stability as closeness of solutions in case of changes [3]. Nevertheless, with this method we do not obtain stable solutions.
– Reusing the original solution to produce a new solution consists of producing the
new solution by local changes on the previous one [4]. In order to develop this
technique, we must explore the search space.
In this paper we focus our attention on the first set of techniques: searching for stable
solutions that are valid after problem changes. Thus, we assume that the user knows
the degree of stability of each constraint, and constraints can be labeled to identify this stability. Thus, the constrainedness of the CSP can be altered by modifying/restricting the dynamic constraints.
1.1 Definitions
Following standard notation and definitions from the literature [1], we summarize the basic definitions that we will use in the rest of the paper.
Definition 1. A Constraint Satisfaction Problem (CSP) is a triple P = ⟨X, D, C⟩
where X is a finite set of variables {x1 , x2 , ..., xn }; D is a set of domains D =
D1 × D2 × ... × Dn such that for each variable xi ∈ X there is a finite set of values that
the variable can take; and C is a finite set of constraints C = {C1 , C2 , ..., Cm } which restrict
the values that the variables can simultaneously take.
Definition 2. An instantiation is a pair (xi = a) that represents the assignment of the
value a ∈ Di to the variable xi .
Definition 3. Checking a constraint: once all the variables involved in a constraint
are instantiated, the constraint can be evaluated/checked to analyze whether this assignment of values to variables satisfies the constraint or not.
Definition 4. The Solution Space (S) is the portion of the search space that satisfies
all constraints. The instantiations of all variables that satisfy all constraints belong
to the solution space. If all constraints are convex, the solution space is a convex hyper-polyhedron, which is included in the Cartesian product of the variable domains.
2 Finding Stable Solutions
Our aim is to find solutions to problems that remain valid under changes. Many real
problems, such as planning and scheduling problems, can be modeled as a CSP. However, many of the constraints are set knowing that they are likely to be modified. For
instance, in scheduling problems, if a worker starts working at time T 1begin and his
first task must be finished within 50 minutes, then the constraint is (T 1end − T 1begin < 50).
However, the worker could arrive late to work due to many factors. Thus, we can restrict this constraint to (T 1end − T 1begin < 40): if something goes wrong, the task can still be
delayed by some minutes. Thus, we can label this constraint as a dynamic constraint of
level 2 (for instance). If the problem is feasible with the more restricted constraints, the
obtained solutions will be more stable than the solutions of the original problem, because
if the worker arrives late (by less than 10 minutes) the solution remains valid. If the
problem is not feasible, the dynamic constraints can be relaxed until they become the
original ones. It must be taken into account that the search space previously explored
must not be explored again: we can add the inverse constraints to
avoid searching the previously explored region. For instance, in the previous example, if
the constraint (T 1end − T 1begin < 40) leaves the problem infeasible, then we add the
constraint (T 1end − T 1begin ≥ 40) and relax the restricted one to (T 1end − T 1begin < 45).
This modeling technique will be appropriate mainly in under-constrained problems.
Without loss of generality we consider four levels of stability for the constraints:
– Level 0 means that the constraint is very stable so there is no probability that it
changes (no dynamic constraint).
– Level 1 means that the constraint is stable so the probability that it changes is low.
– Level 2 means that the constraint is not very stable, so the probability that it changes is
high.
– Level 3 means that the constraint is very unstable so the probability that it changes
is very high (very dynamic constraint).
The probability that the domains of a CSP change can also be represented
by the levels of stability explained above. At the lowest level the domains cannot
change, whereas at the highest level a reduction in the domain size is very likely.
2.1 By Restricting Constraints
In order to find stable solutions by restricting constraints, each constraint is labeled with
one of the previous integer levels. We assume constraints of the form Ax − k ≥ 0, that is,
a1 x1 + a2 x2 + ... + an xn − k ≥ 0. Thus, from each constraint Ci and its label li we
generate a new constraint Ci′ : Ax − (li d + k) ≥ 0, where d > 0 is a parameter given by the
user to restrict the constraint (for d = 0 the CSP’ would be the same as the original CSP).
The parameter d allows the user to restrict the constraints to a greater or lesser degree;
this is very useful because if the CSP’ has no solution, the user can relax the constraints.
Thus, if li = 0 the constraint remains the original one: Ci′ : Ax − (0d + k) ≥ 0 → Ci′ : Ax − k ≥ 0.
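The transformation can be summarised by the following small Python sketch (ours); constraints are assumed to be given as coefficient vectors together with the constant k, following the form Ax − k ≥ 0 above.

    def restrict(constraints, labels, d):
        """constraints: list of (a, k) meaning a1*x1 + ... + an*xn - k >= 0;
        labels: stability label l_i of each constraint; d > 0: user parameter.
        Returns the tightened constraints C_i': a.x - (l_i * d + k) >= 0."""
        assert d > 0
        return [(a, l * d + k) for (a, k), l in zip(constraints, labels)]

    def satisfies(x, constraints):
        """Check an instantiation x against a set of linear constraints."""
        return all(sum(ai * xi for ai, xi in zip(a, x)) - k >= 0
                   for a, k in constraints)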
Input: A CSP P = ⟨X, D, C⟩, with C = {C1 , C2 , ..., Cm }, and a set of labels L = {l1 , l2 , ..., lm }.
Output: level of stability of solutions Sol and Sol′ .

    for every constraint Ci and label li
        generate a new constraint Ci′
    endfor
    C ′ = {C1′ , C2′ , ..., Cm′ }
    compose a CSP P ′ = ⟨X, D, C ′ ⟩
    Sol ← Solve(P )
    if ∃ Sol
        do
            Sol′ ← Solve(P ′ )
            if ¬∃ Sol′ then relax P ′
        while ¬∃ Sol′
        if Sol ≠ Sol′ then Check-Stability(Sol, Sol′ )
        return Stability(Sol), Stability(Sol′ )
    else
        no solution exists
    end
Fig. 1. Example of CSP (left). Resultant CSP once constraints are modified by their labels (right).
2.2 By Reducing Domain Size
This method consists in reducing the domain size of the variables in the upper and lower
bound. This technique is similar to the previous one by adding new unary constraints to
the set of constraints. In this case, the user can also select the stability of each variable.
These values are represented by labels that are used to insert unary constraints into the
system. Figure 2 left shows the previous example with the original domains and Figure
2 right shows the same problem with the reduced domain sizes. Thus, the solution space
is reduced, and solutions inside this solution space are more stable.
3 Example
In the following, we present an example to clarify the stability methods presented in the
previous section. For simplicity, we consider a CSP with only two variables x1 and x2
with domains D1 : {3..7} and D2 : {2..6} (see Figure 1). The constraints are:
C1 : x1 + x2 ≥ 6
C2 : x1 + x2 ≤ 12
C3 : x1 − x2 ≤ 4
C4 : −x1 + x2 ≤ 2
Figure 1 (left) shows a graphic representation of the original problem. It can be
observed that the Cartesian product of the domains generates 25 different instantiations. The original constraints remove the non-valid instantiations (x1 = 3, x2 = 2),
(x1 = 7, x2 = 2), (x1 = 3, x2 = 6), (x1 = 7, x2 = 6). Thus, the remaining 21 instantiations are solutions of the CSP. If a CSP solver gives us a subset of solutions, which one
is more robust? To answer this question, we make use of the constraint labels.
In Figure 1 (left) we present the four constraints with their corresponding labels:
C1 (0), C2 (2), C3 (1), C4 (1). If we consider d = 1, the resultant constraints are:
C1 (0) : x1 + x2 ≥ 6 → C1′ : x1 + x2 ≥ 6,
C2 (2) : x1 + x2 ≤ 12 → C2′ : x1 + x2 ≤ 10,
C3 (1) : −x1 + x2 ≤ 2 → C3′ : −x1 + x2 ≤ 1,
C4 (1) : x1 − x2 ≤ 4 → C4′ : x1 − x2 ≤ 3.
The resultant CSP P ′ is shown in Figure 1 (right). It can be observed that constraint
C1 has label l1 = 0, so it is a static constraint. The remaining constraints have labels 1
and 2, which means that they are dynamic constraints and a tighter CSP is possible. The
new constraints remove some instantiations, for instance (x1 = 7, x2 = 5),
(x1 = 7, x2 = 4), (x1 = 6, x2 = 6), (x1 = 6, x2 = 5). These instantiations, which are
solutions to the original CSP, are candidates to become invalid if constraint C2 changes.
As can be seen in Figure 1 (right), the solution space has been reduced, and the set of
solutions S ′ ⊂ S can be considered more robust than the set of solutions S \ S ′ .
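The example can be checked by brute force; the following Python snippet (ours) enumerates the Cartesian product of the domains and counts the solutions of the original and of the tightened CSP.

    from itertools import product

    D1, D2 = range(3, 8), range(2, 7)

    original = [lambda x1, x2: x1 + x2 >= 6,
                lambda x1, x2: x1 + x2 <= 12,
                lambda x1, x2: x1 - x2 <= 4,
                lambda x1, x2: -x1 + x2 <= 2]

    tightened = [lambda x1, x2: x1 + x2 >= 6,     # label 0: unchanged
                 lambda x1, x2: x1 + x2 <= 10,    # label 2: 12 - 2*1
                 lambda x1, x2: -x1 + x2 <= 1,    # label 1: 2 - 1
                 lambda x1, x2: x1 - x2 <= 3]     # label 1: 4 - 1

    def solutions(cs):
        return [(x1, x2) for x1, x2 in product(D1, D2) if all(c(x1, x2) for c in cs)]

    print(len(solutions(original)), len(solutions(tightened)))   # 21 12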
Fig. 2. Resultant CSP once domain sizes are reduced
4 Conclusions and Further Work
Many real problems can be modeled as Dynamic Constraint Satisfaction Problems. Several techniques have been developed to manage Dynamic CSPs. Some of them are based
on searching for stable solutions that are valid after small problem changes. However,
this is a very hard task because all solutions must be found to obtain the most
stable one based on an objective function. Other techniques are based on searching for
a new solution that minimizes the number of changes from the original solution. These
techniques are appropriate for dynamic CSPs but the obtained solutions are not stable.
Finally, some techniques are based on reusing the original solution to search for a new solution. In this case, a CSP solver must explore the search space to find a solution
to the new problem.
In this paper, we focus our attention on finding stable solutions by restricting the
original CSP with the aim of obtaining solutions able to absorb changes. To this end,
we have developed a tool for simulating a CSP’ with tighter constraints and/or domains,
depending on the dynamics of each constraint. Thus, solutions of this CSP’ remain stable
solutions for the original CSP.
There is a long line of work centered in this area. Using the fact that the reduced
CSP (CSP ′ ) has tighter constraints, filtering techniques such as arc-consistency can
reduce the domain size of the variables. So we can apply some strategies to find even
more stable solutions:
– by carrying out a value ordering. Once filtering techniques such as node-consistency
and arc-consistency have been carried out and the domains have been reduced by the previous
techniques, a value ordering heuristic can be applied to start the search at the
center of the search space. Thus, given a domain [di , Di ] of a variable xi , the search
starts by assigning the value xi = (di + Di )/2, that is, the center of the interval. For
instance, in the example of Figure 2, the search will start by assigning the values
(x1 = 5, x2 = 4).
– by searching for several solutions and carrying out convex combinations between
two solutions (see the sketch after this list). This technique is only valid for CSPs with convex constraints and
continuous domains. Once the constraints have reduced the solution space, the more
internal solutions are candidates to be more stable. Thus, a convex combination of two opposite extremal points is probably more stable than its own extremal
points.
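The convex-combination idea can be sketched as follows (an illustration of ours; it assumes continuous domains and convex constraints, as stated in the item above).

    def convex_combination(sol_a, sol_b, lam=0.5):
        """Any point on the segment between two solutions of a convex CSP is
        itself a solution; the midpoint (lam = 0.5) is a natural inner candidate."""
        return tuple(lam * a + (1 - lam) * b for a, b in zip(sol_a, sol_b))

    print(convex_combination((3.0, 4.0), (7.0, 4.0)))   # (5.0, 4.0)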
References
1. Christian Bessiere. Constraint propagation. Technical report, CNRS/University of Montpellier, 2006.
2. R. Dechter and A. Dechter. Dynamic constraint networks. In Proc. of the 7th National Conference on Artificial Intelligence (AAAI-88), pages 37–42, 1988.
3. E. Hebrard, B. Hnich, and T. Walsh. Super solutions in constraint programming. In Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization
Problems. CPAIOR-04, pages 157–172, 2004.
4. G. Verfaillie and T. Schiex. Solution reuse in dynamic constraint satisfaction problems. In
Proceedings of the 12th National Conference on Artificial Intelligence (AAAI-94), pages
307–312, 1994.
5. R. Wallace and E.C. Freuder. Stable solutions for dynamic constraint satisfaction problems.
In Proc. 4th International Conference on Principles and Practice of Constraint Programming,
pages 447–461. Springer, 1998.
Preliminary studies of BnB-ADOPT related with Soft
Arc Consistency ⋆
Patricia Gutierrez (student) and Pedro Meseguer (supervisor)
IIIA, Institut d’Investigació en Intel.ligència Artificial
CSIC, Consejo Superior de Investigaciones Cientı́ficas
Campus UAB, 08193 Bellaterra, Spain.
{patricia|pedro}@iiia.csic.es
Abstract. BnB-ADOPT is a distributed asynchronous search algorithm for Distributed Constraint Optimization Problem (DCOP) solving. When searching, an
agent may realize that some of its values will never be in the solution. In that case,
these values are unconditionally deleted and their deletions are propagated. Using soft arc consistency techniques, more values in other agents could be pruned
as consequence of this propagation. This causes reductions in the search effort
required to solve DCOP instances. In this paper we propose the idea of an algorithm based on BnB-ADOPT that identifies and propagates unconditional value
deletions as search progresses, expecting an improvement in the overall search
effort.
1 Introduction
A classical problem in computer science is finding the global optimum of the aggregation of some elementary functions. When these elementary functions are distributed
among several agents, and cannot be centralized into one agent (due to privacy issues,
the natural distributed origin of the problem data and the high costs of its translation,
among others), the problem is called Distributed Constraint Optimization (DCOP).
This problem has received a lot of attention in recent years. DCOP solving algorithms,
such as ADOPT [5] or DPOP [6] have been developed. Considering DCOP solving
by distributed asynchronous search, BnB-ADOPT [8] offers better performance than
ADOPT while keeping its good theoretical properties (correctness, completeness, termination). It requires a relatively simple implementation.
During the search process, some agents executing BnB-ADOPT may discover that
some values will never be in the solution. In these cases, we propose to remove those
values from the DCOP instance, which when propagated, may cause further value removals. As the search space becomes smaller, we expect to get some benefits (in terms
of the number of messages exchanged or the number of cycles required) from this approach. We propose to propagate these value removals using a new message type. Maintaining the simplest form of soft arc consistency will allow us to prune further values.
⋆
Supported by the Spanish project TIN2006-15387-C03-01.
The paper is organized as follows. In section 2, we provide the basic definitions and
a description of BnB-ADOPT. In section 3, we present the idea of propagating unconditional deletions within the BnB-ADOPT algorithm by applying soft arc consistency
techniques.
2 Preliminaries
In a centralized setting, a Constraint Optimization Problem (COP) involves a finite
set of variables, each taking a value in a finite domain [1]. Variables are related by
cost functions that specify the cost of value tuples on some variable subsets. Costs
are positive natural numbers (including 0 and ∞). A finite COP is (X, D, C) where,
X = {x1 , . . . , xn } is a set of n variables; D = {D(x1 ), . . . , D(xn )} is a collection of
finite domains; D(xi ) is the initial set of xi possible values; C is a set of cost functions;
fi ∈ C, defined on the ordered set of variables var(fi ) = (xi1 , . . . , xir(i) ), specifies the cost of
every combination of values of var(fi ), that is, fi : D(xi1 ) × . . . × D(xir(i) ) → N+ . The arity
of fi is |var(fi )|. The overall cost of a complete tuple (involving all variables) is the
addition of all individual cost functions on that particular tuple. A solution is a complete
tuple with acceptable overall cost, and it is optimal if its overall cost is minimal.
Moving into a distributed context, a Distributed Constraint Optimization Problem
(DCOP), is a COP where variables, domains and cost functions are distributed among
automated agents [5]. A variable-based DCOP is a 5-tuple (X, D, C, A, α), where
X, D, C define a COP, A is a set of p agents and α maps each variable to one agent.
For simplicity, we assume that each agent holds exactly one variable (so the terms variable and agent can be used interchangeably) and cost functions are unary and binary
only (in the following a cost function is denoted as C with the indexes of variables involved). DCOPs can be solved by ADOPT, as the reference algorithm for asynchronous
distributed search [5]. Aiming at improving its performance, several ADOPT-based algorithms have been proposed, like BnB-ADOPT [8] and ADOPT-ng [7]. This paper
deals with BnB-ADOPT, so we provide a brief summary of this algorithm.
BnB-ADOPT performs a distributed depth-first branch and bound search. It appears
in Figure 1. It uses message passing as the communication form between agents, using
three message types: VALUE, COST and STOP. Agents are arranged in a pseudotree as
defined in [5]. A generic agent self executes BnB-ADOPT with these data structures:
the current context (values that self believes that are assigned to higher agents in the
branch from self to the root), and for each value of its domain and each child agent:
a lower bound lb, an upper bound ub and a context. With this self calculates its own
lower bound LB and upper bound U B based on its local cost plus any cost reported by
its childrenPas follows:
P
LB[d] ← (xi ,di )∈myContext Cself,i (myV alue, di )+ xk ∈myChildren lb[myV alue, xk ];
P
P
U B[d] ← (xi ,di )∈myContext Cself,i (myV alue, di )+ xk ∈myChildren ub[myV alue, xk ];
U B ← mind∈Dself U B[d].
LB ← mind∈Dself LB[d];
33
procedure BnB-ADOPT()
myContext ← tableContexts ← empty table;
for each xi ∈ myChildren ∧ v ∈ D(self ) do lb[v, xi ] ← 0; ub[v, xi ] ← ∞;
end ← false;
InitSelf(); Backtrack();
while (¬end) do
while message queue is not empty do
msg ← getMsg();
switch(msg.type)
VALUE: ProcessValue(msg);  COST: ProcessCost(msg);
STOP: end ← true;
Backtrack();
procedure ProcessValue(msg)
if (myContext[msg.sender] ≠ msg.value) then
myContext[msg.sender] ← msg.value; CheckCurrentContextWithChildren(); InitSelf();
if msg.sender = myParent then T H ← msg.threshold;
procedure InitSelf()
myValue ← argminv∈Dself LB(v);
T H ← ∞;
procedure ProcessCost(msg)
contextChange ← false;
for each xi ∈ msg.context, xi ∉ myNeighbors do
if myContext[xi ] ≠ msg.context[xi ] then
myContext[xi ] ← msg.context[xi ]; contextChange ← true;
if contextChange = true then
CheckCurrentContextWithChildren(); InitSelf();
if isCompatible(myContext, msg.context) then
lb[msg.context[self ], msg.sender] ← max{msg.lb, lb[msg.context[self ], msg.sender]};
ub[msg.context[self ], msg.sender] ← min{msg.ub, ub[msg.context[self ], msg.sender]};
tableContext[msg.context[self ], msg.sender] ← msg.context;
procedure Backtrack()
if T H ≤ LB() then T H ← ∞;
if LB(myValue) ≥ min(T H, UB()) then myValue ← argminv∈Dself LB(v);
SendValueMessageToLowerNeighbors(myValue);
if (self = root∧ LB() = UB()) ∨ end then
end ← true;
for each child ∈ myChildren do sendMsg(ST OP, self, child);
else sendMsg(COST, self, myP arent, myContext,LB(), UB());
procedure CheckCurrentContextWithChildren()
for each val ∈ D(self ) ∧ child ∈ myChildren do
if ¬isCompatible(myContext, tableContexts[val, child]) then
tableContexts[val, child] ← empty; lb[val, child] ← 0; ub[val, child] ← ∞;
procedure SendValueMessageToLowerNeighbors(myValue)
cost ← Σj∈myParent∪myPseudoparents Cself,j (myValue, myContext[j]);
for each child ∈ myChildren do
th ← min{T H, UB()} − cost − Σj∈myChildren,j≠child lb[myValue, j];
sendMsg(V ALU E, self, child, myValue, th);
Fig. 1. The original BnB-ADOPT algorithm.
self also maintains a threshold T H used for pruning during the depth-first search, initialized to ∞ (in the root agent it always remains infinite). Variable myValue stands
for the agent’s current assignment, and myChildren stands for its children in
the pseudotree as defined in [5]. At the beginning, self chooses the value that minimizes
its LB and sends a VALUE message to its children and pseudochildren. In response, a child sends
a COST message to its parent reporting the child’s LB and U B under the current context. When
self receives a VALUE, this indicates that its parent or a pseudoparent has changed its
value, and self’s context is updated. self checks its context against each child context: if
they are not compatible, the information provided by that child (stored in the lb and ub
tables) is reinitialized and the current value may change. The threshold value sent by
the parent is also updated.
When self receives a COST, a child is informing self of its LB and U B. As these
bounds depend on the values of higher variables, the child must attach the
context under which they were calculated. self checks the received context and
adds to its own context any higher variable assignment it is not constrained with. If the received context is compatible with its own context, self updates the information
in the lb and ub tables; otherwise it ignores it.
Once every message in the self input queue has been processed, self performs Backtrack,
where it decides whether it must change its value. If the LB of the current value is greater
than or equal to min(T H, U B()), self changes its value to the one with the smallest LB. Notice
that in the first iterations the U B of an agent will be infinite until every child has reported
its corresponding COST (so an agent will not backtrack until all information is received), but in later iterations it is not necessary to wait for every child to report
its cost, since an agent can decide to discard the current value with partial information
if the LB of this value is greater than the stored T H, which works as a global upper
bound for that child subtree. In any case, the agent will send a VALUE to each child
ch with its current value and the desired threshold, calculated as

    min(T H, U B()) − Σ(xi ,di )∈myContext Cself,i (myValue, di ) − Σxk ∈myChildren, xk ≠ch lb[myValue, xk ].
This desired threshold will be used for pruning once the child has reached this value;
it is chosen so that LB(d) for the agent reaches min(T H, U B()) and, therefore,
the agent takes a new value when the costs received from the children reach the desired
threshold. In the next step the termination condition is checked; it is triggered
by the root when LB = U B, which can only happen when all values in
the root domain have been explored or pruned. Finally, a COST is sent to the parent.
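For illustration only, the threshold sent to a child ch can be written as the following Python function; the dictionaries are stand-ins of ours for the agent's local data structures, not BnB-ADOPT's actual interface.

    def threshold_for_child(ch, TH, UB, my_value, my_context, cost, lb, children):
        """th = min(TH, UB) - local binary costs under my_context
                            - lower bounds reported by the other children.
        cost[(i, my_value, di)]: cost with higher neighbour i taking value di;
        lb[(my_value, k)]: lower bound reported by child k for my_value."""
        local = sum(cost[(i, my_value, di)] for i, di in my_context.items())
        others = sum(lb[(my_value, k)] for k in children if k != ch)
        return min(TH, UB) - local - others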
3 Propagating Unconditional Deletions
Let us consider a DCOP instance, where agents are arranged in a pseudotree and each
executes BnB-ADOPT. Imagine the root agent, with Droot = {a, b, . . .}, and assume
that it takes value a. After a while, root will know cost(a) = lb(a) = ub(a), and it
decides to change its assignment to b, informing its children with the corresponding VALUE messages. The children start answering about the cost of b (this is a change
of context in BnB-ADOPT terms) with COST messages. As soon as root realizes that
cost(b) > cost(a), b can be removed from Droot , since it will not be in the solution
and it will never be considered again (a similar situation happens if cost(a) > cost(b);
then a can be removed from Droot ). Just removing b from Droot has no effect in
BnB-ADOPT, because it will not consider b again as a possible value for root. However,
if we inform the constrained agents that b is no longer in Droot , this may cause some values
of other agents to become infeasible, so they can be deleted as well.
A related situation may happen when the cost of the best solution found so far decreases, passing from ⊤1 to ⊤2 . It may occur that a value a of agent i which could
not be pruned with ⊤1 (Ci (a) < ⊤1 ) can now be pruned with ⊤2 (Ci (a) ≥ ⊤2 ).
These deletions can be further propagated in the same way, decreasing the size of
the search space. It is worth noting that these deletions (whether coming from root or
from any other agent) are unconditional, since (i) root values do not depend on any
context (contexts at root are always empty because there is no higher agent), and (ii)
unary constraints on values of variables do not depend on any context. Any deletion
caused by the propagation of unconditional deletions is also unconditional.
To propagate these value deletions in other agents we need to maintain soft arc
consistency in the distributed instance. In the centralized case, let us consider variables
i and j with unary costs Ci and Cj , constrained by the cost function Cij . The simplest
form of soft arc consistency for weighted CSPs (WAC) is defined as follows:
1. value (i, v) is node consistent (NC) if Ci (v) < ⊤; variable i is NC if every value is
NC; a problem instance is NC if every variable is NC;
2. value (i, v) is arc consistent (AC) with respect to cost function Cij if it is NC and
there exists a value w ∈ Dj such that Cij (v, w) = ⊥. Value w is called a
support for v. Variable i is AC if all its values are AC, and a problem instance
is AC if every variable is AC [3].
Values ⊤ and ⊥ represent the first unacceptable cost (any cost lower than ⊤ is acceptable) and no cost (with the usual addition for cost aggregation, as search progresses
⊤ is the global cost of the best solution found so far and ⊥ is 0). This definition can be
translated into the distributed context: variables are agents, and each agent keeps the unary
and binary constraints in which it is involved (so every agent knows the domain of every
other agent it is constrained with). Node consistency remains unchanged, enforced by
the owner agent, and arc consistency also remains unchanged, enforced by any of the
two agents involved in the constraint.
To apply this idea to DCOPs, we assume that the distributed instance is initially
WAC (otherwise, it can be made WAC by preprocessing). If value a is unconditionally
deleted from Di , it might happen that value a was the only support of a value
b ∈ Dj . For this reason, after the notification of a deletion, directional WAC has to be
enforced on cost function Cij from i to j (observe that enforcing in the other direction
will cause no change). This has to be done in both agents i and j, to ensure that both
maintain the same representation of Cij . In addition, agent j may pass binary costs to
unary costs, which might result in some value b ∈ Dj becoming node inconsistent. In
that case, b should be deleted from Dj and its deletion should be propagated.
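A possible (centralized, purely illustrative) Python sketch of this directional enforcement step is given below; Di, Dj, Cj and Cij are plain Python containers standing in for the agents' local copies, and TOP plays the role of ⊤.

    def enforce_dwac_i_to_j(Di, Dj, Cj, Cij, TOP):
        """Project binary costs C_ij towards j, then prune values of j that
        lose node consistency; returns the deleted values (to be sent in DEL
        messages and propagated further)."""
        for w in list(Dj):
            alpha = min(Cij[(v, w)] for v in Di)   # smallest cost over supports of w
            if alpha > 0:                          # move it into the unary cost of w
                Cj[w] += alpha
                for v in Di:
                    Cij[(v, w)] -= alpha
        deleted = [w for w in Dj if Cj[w] >= TOP]  # node-inconsistent values
        for w in deleted:
            Dj.remove(w)
        return deleted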
The idea of propagating unconditionally deleted values can be included in BnB-ADOPT, producing a new algorithm that exploits the fact that a constraint Cij is
known by both agents i and j. This approach would require some minor changes with
respect to BnB-ADOPT:
1. the domain of every variable constrained with self has to be represented in self ;
2. a new message type, DEL, is required. When self deletes value a in D(self ), it
sends a DEL message to every agent constrained with it. When self receives a
DEL message, it registers that the message value has been deleted from the domain
of sender, and it enforces soft arc consistency on the constraint between self and
sender. If, as a result of this enforcement, some value is deleted in D(self ), it is propagated as above.
Including the propagation of unconditionally deleted values does not change the
semantics of the original BnB-ADOPT messages. In addition, ⊤ should be included in
VALUE messages: from the second value tried at the root, ⊤ is a finite value which
is propagated downwards, informing the other agents of the lowest unacceptable cost.
It is worth noting that this approach keeps the good BnB-ADOPT properties, namely
correctness, completeness and termination: since we are eliminating values which are
either suboptimal (values at root) or not WAC, their removal cannot cause any solution
to be missed. If the value assigned to self is found not to be WAC, this is recorded, the
value is removed when possible, and another value is tried for self. Any value removal
is propagated to the agents constrained with self.
Connecting BnB-ADOPT with the simplest form of soft arc consistency (WAC) is
expected to be beneficial for communication. More complex forms of soft arc consistency remain to be investigated [2], [4]. The use of soft arc consistency in a distributed
context requires a careful analysis, to avoid the extra communication effort needed
to enforce it surpassing the expected benefits in search-space reduction. Preliminary
experimental results indicate an improvement in performance with respect to the BnB-ADOPT algorithm.
References
1. Dechter R., Constraint Processing, Morgan Kaufmann, 2003.
2. de Givry S., Heras F., Larrosa J., Zytnicki M. Existential arc consistency: getting closer to
full arc consistency in weighted CSPs Proc. of IJCAI-05, 2005.
3. Larrosa J. Node and arc consistency in weighted CSP Proc. of AAAI-02, 2002.
4. Larrosa J., Schiex T. In the quest of the best form of local consistency for Weighted CSP
Proc. of IJCAI-03, 2003.
5. Modi P. J., Shen W.M., Tambe M., Yokoo M. Adopt: asynchronous distributed constraint
optimization with quality guarantees. Artificial Intelligence, 161, 149–180, 2005.
6. Petcu A., Faltings B. A scalable method for multiagent constraint optimization Proc. of
IJCAI-05, 266–271, 2005.
7. Silaghi M., Yokoo M. Nogood-based Asynchronous Distributed Optimization (ADOPT-ng).
Proc. of AAMAS-06, 2006.
8. Yeoh W., Felner A., Koenig S. BnB-ADOPT: An Asynchronous Branch-and-Bound DCOP
Algorithm. Proc. of AAMAS-08, 591–598, 2008.
An automaton Constraint for Local Search
Jun He (student), Pierre Flener, and Justin Pearson
Department of Information Technology
Uppsala University, Box 337, SE – 751 05 Uppsala, Sweden
Firstname.Lastname@it.uu.se
Abstract In this paper we explore the idea of using deterministic finite
automata (DFAs) to implement constraints for local search (this is already a successful technique in global search). We show how it is possible
to automatically incrementally maintain violation counts from a specification of a DFA. We show the practicality of the idea on a real-life
scheduling example.
1 Introduction
When a high-level constraint programming (CP) language lacks a (possibly
global) constraint that would allow the formulation of a particular model of a
combinatorial problem, then the modeller traditionally has the choice of (1) switching to another CP language that has all the required constraints, (2) formulating
a different model that does not require any lacking constraints, or (3) implementing the lacking constraint in the low-level implementation language of the
chosen CP language. This paper addresses the core question of facilitating the
third option, and as a side effect makes the first option unnecessary.
The user-level extensibility of CP languages has been an important goal
for over a decade. In the traditional global search approach to CP (namely
heuristic-based tree search interleaved with propagation), higher-level abstractions for describing new constraints include indexicals [15], (possibly enriched)
deterministic finite automata (DFAs) via the automaton [2] and regular [10]
generic constraints, and multi-valued decision diagrams (MDDs) via the mdd
[5] generic constraint. In the more recent local search approach to CP (called
constraint-based local search, CBLS, in [12]), higher-level abstractions for describing new constraints include invariants [8], a subset of first-order logic with
arithmetic via combinators [14] and differentiable invariants [13]; and existential
monadic second-order logic for constraints on set variables [1]. In the former approach, a generic but efficient propagation algorithm achieves a suitable level of
local consistency by processing the higher-level description of the new constraint,
while in the latter approach, a generic but incremental violation maintenance
algorithm is proposed.
In this paper, we revisit the description of new constraints via DFAs, already
successfully tried within the global search approach to CP, and show that it can
also be successfully used within the local search approach. The rest of this paper
is organised as follows. In Section 2, we present our algorithm for incrementally
Figure 1. An automaton for a simple work scheduling constraint
maintaining both the violation degree of a constraint described by a DFA, and
the violation degrees of each decision variable of that constraint. In Section 3, we
present experimental results establishing the practicality of our results. Finally,
in Section 4, we summarise this work and discuss related as well as future work.
2
Incremental Violation Maintenance with DFAs
Our method for implementing a constraint described by a DFA has two major
components: a pre-processing to set up useful data structures, including the
computation of the violations of the constraint and each of its decision variables
for the initial assignment, and the incremental maintenance of these violations
when an actual move is made.
Our running example is the following, for a simple work scheduling constraint.
There are values for two work shifts, day (d) and evening (e), as well as a value for
enjoying a day off (x). Work shifts are subject to the following three conditions:
one must take at least one day off before a change of work shift; one cannot work
for more than two days in a row; and one cannot have more than two days off in
a row. A DFA for this constraint is given in Figure 1. The start state 1 is marked
by a transition entering from nowhere, and the success states 5, 6 are marked
by double circles. Missing transitions, say from state 2 upon reading a value e,
are assumed to go to an implicit failure state, with a self-looping transition for
every value (so that no success state is reachable from it).
The pre-processing is similar to the pre-processing in [10], namely unrolling
the DFA for a given length n of the sequence V = V1 , . . . , Vn of decision variables.
Definition 1 (Layered Graph). Given a DFA with m states, the layered graph
over n variables is a graph with m · (n + 1) nodes. Each of the n + 1 vertical
layers has a node for each of the m DFA states. The node for the start state of
the DFA in the layer 1 is labelled as the start node. There is an arc labelled w
from node f in layer i to node t in layer i + 1 if and only if there is a transition
labelled w from f to t in the DFA. A node in layer n + 1 is labelled as a success
node if it corresponds to a success state in the DFA.
The layered graph is further processed by removing all nodes and paths that
do not lead to a success node. Next, the resulting graph, seen as a DFA (or
ordered MDD), is minimised (or reduced)[6], as the size of the graph influences
the run time of algorithms. For instance, the unrolled version for n = 6 decision
variables of the DFA in Figure 1 is given in Figure 2 (which is best viewed in
colour). The purple number by each node is the number of paths from that node
to a success node in the last layer.

Figure 2. The unrolled automaton of Figure 1
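For concreteness, the unrolling, pruning and path counting described above can be sketched in Python as follows; the DFA encoding (a dictionary of transitions) is our own assumption, and the final minimisation step is omitted.

    def unroll(delta, start, accepting, n):
        """delta: {(state, value): next_state}.  Returns, for every layer
        i = 0..n, a dict mapping each surviving node to its number of paths
        to a success node in the last layer, plus the surviving arcs."""
        # forward pass: nodes reachable from the start node, layer by layer
        layers = [{start}]
        for _ in range(n):
            prev = layers[-1]
            layers.append({u for (t, v), u in delta.items() if t in prev})
        # backward pass: drop nodes and arcs from which no success node is reachable
        paths = [dict() for _ in range(n + 1)]
        paths[n] = {s: 1 for s in layers[n] if s in accepting}
        arcs = []
        for i in range(n - 1, -1, -1):
            for s in layers[i]:
                total = 0
                for (t, v), u in delta.items():
                    if t == s and u in paths[i + 1]:
                        arcs.append((i, s, v, u))
                        total += paths[i + 1][u]
                if total:
                    paths[i][s] = total
        return paths, arcs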
Next, we compute the violations of the automaton constraint and each of its
decision variables for the initial assignment. Toward this, we show how an assignment is transformed into a sequence of disconnected segments in the layered
graph G, where each segment is a connected sequence of arcs in G.
First, if the current assignment is a solution, then there exists a path P in
the layered graph from the start node in the first layer through arcs labelled by
V1 , V2 , . . . , Vn and finally to a success node in the last layer. In this case, we get
only one segment, which is P . Second, if the constraint is unsatisfied, then there is
some index i such that there is a path from the start node in the first layer through arcs labelled
with V1 , V2 , . . . , Vi−1 to a node Si in layer i that has no outgoing arc
labelled with the value Vi . The sequence of arcs labelled with V1 , V2 , . . . , Vi−1 is the
first segment. From the node Si in layer i, we select one of Si ’s successor nodes
and start the next segment. Ideally we want to select a node that corrects the
path in the best possible way. This is done by selecting the next node randomly
from the set of nodes in the next layer that are reachable from the current node
according to a distribution where nodes are weighted according to the number
of paths from the current node to a success node in the last layer. Repeating
this procedure until we arrive at the last layer, we will finally get several disjoint
segments. For instance, in Figure 2, with the assignment V = hx, e, d, e, x, di,
the first segment will be hx, ei. Next, V3 is a violated variable because there is
no arc labelled with d that connects node 4 in layer 3 with any nodes in layer 4.
Node 4 in layer 3 has two out-going arcs that are connected with nodes 3 and 5
in layer 4. In layer 4, there are 4 paths from node 3 compared to 2 paths from
node 5, so node 3 is chosen with probability 4/(4 + 2) = 2/3, and we assume it is chosen.
From node 3 in layer 4, we will get the second segment he, x, di, which stops at
success node 5 in the last layer.
Definition 2 (Violation of an Automaton Constraint and its Variables).
Given an assignment yielding the segments σ1 , . . . , σℓ of the sequence of n decision variables Vi of an automaton constraint c:
– The violation Violation of c is n − Σℓi=1 |σi |.
– The violation Violation(Vi ) of decision variable Vi of c is 0 if there exists a
segment index j in 1, . . . , ℓ such that i ∈ σj , and 1 otherwise.
It can easily be seen that the violation of an automaton constraint is the sum
of the violations of its decision variables, and that it is never an underestimate of
the minimal Hamming distance between the current assignment and any solution
assignment.
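The computation of segments and violations for a given assignment can be sketched (again illustratively, reusing the unroll sketch above) as follows; the weighted random jump follows the description given before Definition 2.

    import random

    def violations(delta, start, assignment, paths):
        """paths[i][s]: number of paths from node s in layer i to a success
        node, as computed by unroll(); assumes the constraint has at least one
        solution (so the start node survives the pruning).  Returns the
        constraint violation and a 0/1 violation per decision variable."""
        n = len(assignment)
        node, seg_len, seg_lens, var_viol = start, 0, [], [0] * n
        for i, v in enumerate(assignment):
            nxt = delta.get((node, v))
            if nxt is not None and nxt in paths[i + 1]:
                node, seg_len = nxt, seg_len + 1        # extend the current segment
            else:                                       # variable V_{i+1} is violated
                var_viol[i] = 1
                seg_lens.append(seg_len)
                seg_len = 0
                # jump to a successor, weighted by its number of paths to success
                succ = [(u, paths[i + 1][u]) for (t, w), u in delta.items()
                        if t == node and u in paths[i + 1]]
                node = random.choices([u for u, _ in succ],
                                      weights=[c for _, c in succ])[0]
        seg_lens.append(seg_len)
        return n - sum(seg_lens), var_viol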
Local search proceeds from a given assignment by checking a number of
neighbours of that assignment and picking a neighbour that ideally reduces the
violation: the exact heuristics are often problem dependent. But in order to
make local search computationally efficient, the violations of the constraint and
its variables have to be computed in an incremental fashion whenever a variable
changes value. When a variable changes value, only the segments after that variable
need to be recomputed. Initialisation and each incremental update only take a linear
amount (in n) of work (algorithm and proof omitted for space reasons).
3 Experiments
Many industries and services need to function around the clock. Rotating schedules are a popular way of guaranteeing a maximum of equity to the involved work
teams (see [7]). In our example in Figure 3(a), there are day (d), evening (e),
and night (n) shifts of work, as well as days off (x). Each team works at most
one shift per day. The scheduling horizon has as many weeks as there are teams.
In the first week, team i is assigned to the schedule in row i. For any next week,
each team moves down to the next row, while the team on the last row moves up
to the first row. The daily workload may be uniform: in Figure 3(a) each day has
exactly one team on-duty for each work shift, and hence two teams entirely offduty; assuming the work shifts average 8h, each employee will work 7·3·8 = 168h
over the five-week-cycle, or 33.6h per week. Daily workload, whether uniform or
not, can be enforced by global cardinality (gcc) constraints on the columns.
Further, any number of consecutive workdays must be between two and seven,
and any change in work shift can only occur after two to seven days off. This
can be enforced by a pattern(X, {(d, x), (e, x), (n, x), (x, d), (x, e), (x, n)}) constraint [4] and a circular stretch(X, [d, e, n, x], [2, 2, 2, 2], [7, 7, 7, 7]) constraint [9]
on the table flattened row-wise into a sequence X. Our model posts the two
required pattern and stretch constraints on the row-wise flattened matrix of decision variables, by actually using the conjunction automaton of the pattern and
stretch DFAs, as this gives better run times than the decomposition. The gcc
constraints on the columns of the matrix are kept invariant: the first assignment
is chosen so as to satisfy them, and then only swap moves (between distinct
values) inside a column are considered. As a meta-heuristic, we use tabu search
         Mon Tue Wed Thu Fri Sat Sun
    1     x   x   x   d   d   d   d
    2     x   x   e   e   e   x   x
    3     d   d   d   x   x   e   e
    4     e   e   x   x   n   n   n
    5     n   n   n   n   x   x   x

(a) Classical rotating schedule

    instance          total time   opt time   iterations
    1d, 1e, 1n, 1x        18           2          14
    1d, 1e, 1n, 2x        26           2           7
    2d, 1e, 1n, 2x        26           2           9
    2d, 2e, 1n, 2x        33           5          17
    2d, 2e, 2n, 2x        42           6          20
    2d, 2e, 2n, 3x        43           7          21
    3d, 2e, 2n, 3x        52           8          25

(b) Experimental results on rotating schedules

Figure 3. (a) A five-week rotating schedule with uniform workload. (b) Average run
times (total and optimisation, in milliseconds) and numbers of iterations to the first
solutions of rotating schedules
with a restarting mechanism. The search has been implemented in Comet using the violations from the algorithm to guide the search by selecting violated
variables at each iteration to be changed. In addition to the classical instance in
Figure 3(a), here denoted 1d, 1e, 1n, 2x, we ran experiments over other instances
with uniform daily workload, namely those over 4 to 10 weeks where the weekly
workload was between 33h and 42h. Figure 3(b) gives the average run times (in
milliseconds) and numbers of iterations over 50 runs to the first solution of each
of these instances. All runs were made under Comet (revision 2.0 Beta) on an
Intel 2.4 GHz Linux machine with 512 MB memory. Although much more experimentation is required, these initial results show that even on instances with
70 decision variables (3d, 2e, 2n, 3x) it is possible to find solutions very quickly.
4 Conclusion
In summary, we have shown that the idea of describing novel constraints by
automata can be successfully imported from classical (global search) constraint
programming to constraint-based local search.
The only related work we are (now) aware of is a Comet implementation [11]
of the regular constraint [10], based on the ideas for the propagator of the soft
regular constraint [16]. The difference is that they estimate the violation change
compared to the nearest solution (in terms of Hamming distance from the current
assignment), whereas we estimate it compared to one randomly picked solution.
Experiments are needed to compare the two approaches.
We do not claim that automata boost the expressive power of Comet, which
already has a very powerful means of achieving extensibility of its modelling
language, namely differentiable invariants [13]. Indeed, re-consider the unrolled
automaton in Figure 2: its paths from source to success nodes correspond to an
extensional definition of the constraint, and can be read off as the nested and/or
formula (V1 = d∧((V2 = x∧. . . )∨(V2 = d∧. . . )))∨(V1 = x∧. . . )∨(V1 = e∧. . . ),
which can be posted as a differentiable invariant (and is more compact than the
disjunctive normal form obtained by the disjunction over all such paths). Our
first experiments (omitted here for space reasons) indicate that our incremental
algorithms for automata exploit the problem structure better than those for
differentiable invariants, but a deeper comparison is needed.
Future work includes a variety of extensions: it is possible to extend the
implementation to handle non-deterministic finite automata, which is interesting
since they are often smaller than the equivalent DFAs; incorporating
reification should also be investigated; and finally, extensions such as the counters
used in [2] should be considered.
Acknowledgements. The authors are supported by grant 2007-6445 of the
Swedish Research Council (VR), and Jun He is also supported by grant 2008611010 of China Scholarship Council and the National University of Defence
Technology of China. Many thanks to Magnus Ågren (SICS) for some useful
discussions on this work, and to the anonymous referees, especially for pointing
out the existence of [11,16].
References
1. M. Ågren, P. Flener, and J. Pearson. Generic incremental algorithms for local
search. Constraints, 12(3):293–324, September 2007.
2. N. Beldiceanu, M. Carlsson, and T. Petit. Deriving filtering algorithms from constraint checkers. Proc. of CP’04, LNCS 3258:107–122. Springer, 2004.
3. N. Beldiceanu, M. Carlsson, and J.-X. Rampon. Global constraint catalogue: Past,
present, and future. Constraints, 12(1):21–62, 2007. Dynamic on-line version at
www.emn.fr/x-info/sdemasse/gccat.
4. S. Bourdais, P. Galinier, and G. Pesant. HIBISCUS: A CP application to staff
scheduling in health care. Proc. of CP’03, LNCS 2833:153–167. Springer, 2003.
5. K.C.K. Cheng and R.H.C. Yap. Maintaining generalized arc consistency on ad hoc
r-ary constraints. Proc. of CP’08, LNCS 5202:509–523. Springer, 2008.
6. M.Z. Lagerkvist. Techniques for Efficient Constraint Propagation. Licentiate Thesis, KTH – The Royal Institute of Technology, Stockholm, Sweden, November 2008.
7. G. Laporte. The art and science of designing rotating schedules. Journal of the
Operational Research Society, 50(10):1011–1017, October 1999.
8. L. Michel and P. Van Hentenryck. Localizer: A modeling language for local search.
Proc. of CP’97, LNCS 1330:237–251. Springer, 1997.
9. G. Pesant. A filtering algorithm for the stretch constraint. Proc. of CP’01, LNCS
2239:183–195. Springer, 2001.
10. G. Pesant. A regular language membership constraint for finite sequences of variables. Proc. of CP’04, LNCS 3258:482–495. Springer, 2004.
11. B. Pralong. Implémentation de la contrainte Regular en Comet. Master Thesis,
École Polytechnique de Montréal, Canada, 2007.
12. P. Van Hentenryck and L. Michel. Constraint-Based Local Search. MIT Press, 2005.
13. P. Van Hentenryck and L. Michel. Differentiable invariants. Proc. of CP’06, LNCS
4204:604–619. Springer, 2006.
14. P. Van Hentenryck, L. Michel, and L. Liu. Constraint-based combinators for local
search. Proc. of CP’04, LNCS 3258:47–61. Springer, 2004.
15. P. Van Hentenryck, V. Saraswat, and Y. Deville. Design, implementation, and
evaluation of the constraint language cc(FD). Journal of Logic Programming 37(1–
3):293–316, 1998. Unpublished manuscript Constraint Processing in cc(FD), 1991.
16. W.-J. van Hoeve, G. Pesant, and L.-M. Rousseau. On global warming: Flow-based
soft global constraints. Journal of Heuristics 12(4-5):347-373, September 2006.
Research Overview: Improved Boolean Satisfiability
Techniques for Haplotype Inference
Student: Eric I. Hsu
Supervisor: Sheila A. McIlraith
Department of Computer Science
University of Toronto
{eihsu,sheila}@cs.toronto.edu,
http://www.cs.toronto.edu
Abstract. Here the authors overview an ongoing effort to extend satisfiabilitybased methods for haplotype inference by pure parsimony (HIPP). This genome
analysis task, first formulated as a boolean satisfiability problem by Lynce and
Marques-Silva [1], has been performed successfully by modern SAT-solvers. But,
it is not as widely used as some better-publicized statistical tools, such as PHASE
[2]. This paper presents the authors’ assessment of the current situation, and a
preliminary statement of intention concerning their aims in this area. Namely,
the situation suggests three categories of improvements for making HIPP more
widely-used within the biological community: 1) the ability to handle larger problems; 2) more detailed empirical understanding of the accuracy of the “pure parsimony” criterion; and 3) additional criteria and methods for improving on this
level of accuracy.
1 Background
As detailed in a recent overview paper [3], the haplotype inference problem is defined
over a population of individuals represented by their respective genotypes. Each genotype can be viewed as a sequence of nucleotide pairs, where the two values of each pair
are split across the individual’s two chromosomes as inherited from their two parents.
Much of the genetic variation between individuals consists of point mutations at known
sites within this sequence, known as single nucleotide polymorphisms (SNP’s). Thus, a
genotype can be represented (with loss of information) as a sequence of nucleotide pairs
at successive SNP sites.
In particular, at each such site, an individual might have two copies of the “minor allele”, in this case the presumably mutated or at least rarer of the two possible
nucleotide values; at this site the individual is then homozygous (minor). Similarly, if a
SNP site is realized on each chromosome by the nucleotide value that is most common for the species, then the site is homozygous (major) for this individual. The third
possibility is that one of the individual’s chromosomes has the major allele at a given
site, while the other chromosome has the minor allele at that site; then the genotype is
heterozygous at the site in question. Notationally, the first case can be represented by
the character ‘0’, the second by ‘1’, and the third, heterozygous case, by ‘2’. So, the
sequence “0102” indicates an individual with two minor alleles at both the first and the
third SNP sites measured by a particular study, and two major alleles at the second site.
At the fourth site, we know that one chromosome exhibits the major allele and the other
exhibits the minor. Thus, an individual’s genotype is well-defined by the sequences of
its two constituent chromosomes; these two individually inherited sequences are called
haplotypes. If we overload ‘0’ and ‘1’ to indicate a single minor or single major allele
in a particular haplotype, then we can denote that the genotype “0102” arises from the
two haplotypes “0100” and “0101”.
Conversely, though, a genotype with more than one heterozygous site can be explained by multiple pairs of haplotypes, as the major and minor alleles at all heterozygous sites in the genotype can be permuted across corresponding sites in the
two haplotypes. More concretely, the genotype “02122” could be realized by the pairs
“00100”/“01111”, “00101”/“01110”, “00110”/“01101”, or “01100”/“00111”. (Naturally, the term “pair” is used colloquially here, and does not signify any sense of ordering within the haplotype sets of size two that arise in this domain.) Accordingly, there
are 2k−1 candidate haplotype pairs for explaining a genotype with k heterozygous sites.
In practice, this distinction is made relevant by the lack of any practical experimental
method to measure an individual’s two haplotypes instead of its genotype; in other
words, the machinery can identify sites at which the two haplotypes have opposing
values, but cannot tell which values are grouped together on which chromosome. The
goal of haplotype inference is to guess the most likely haplotypes that generated a given
set of genotypes.
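As a concrete illustration of this combinatorial explosion, the following short Python sketch (ours, purely expository, not part of the described system) enumerates the 2^(k−1) candidate haplotype pairs for a genotype in the 0/1/2 encoding described above:

from itertools import product

def haplotype_pairs(genotype):
    """Enumerate the unordered haplotype pairs explaining a 0/1/2 genotype."""
    het = [i for i, g in enumerate(genotype) if g == "2"]  # heterozygous sites
    base = list(genotype)           # '0'/'1' sites appear in both haplotypes
    pairs = set()
    for bits in product("01", repeat=len(het)):
        h1, h2 = base[:], base[:]
        for site, b in zip(het, bits):
            h1[site] = b
            h2[site] = "1" if b == "0" else "0"           # complementary allele
        pairs.add(frozenset({"".join(h1), "".join(h2)}))  # unordered pair
    return pairs

print(sorted(map(sorted, haplotype_pairs("02122"))))  # the 4 = 2^(3-1) pairs above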
2 Underlying Biological Principles
How can one answer to the haplotype inference problem be preferred over any other? Because the ultimate goal is to accurately predict haplotypes appearing in a particular subject (i.e., human) population, haplotype inference frameworks must apply standards that seek to model the types of phenomena that actually drove the true state of affairs in the evolution of the subject species’ genome. In other words, human haplotypes are not drawn uniformly from the space of all possible pairs that could explain human genotypes. Rather, under the coalescent model of evolution there should be only a small number of human haplotypes that were recombined and mutated to produce any of the genotypes within a given haplotype inference problem instance. Thus, early researchers used greedy methods to try to minimize the set of answers to a haplotype inference problem [4], while later systems integrated models based on “perfect phylogeny” [5] or other hierarchical organizations of answer haplotypes [6]. The most widely used techniques at this point in time integrate empirical statistical measures of likely haplotypes [7–9, 2, 10], by analyzing the population in question or consulting outside sources [11]. Such approaches may not require that the inferred set of answer haplotypes can be arranged into a particular evolutionary structure, but they all require the haplotype set to be maximally likely according to a particular statistical model that has been fit to the problem and/or outside frequency data.
On the other hand, the “pure parsimony” methodology seeks to capture such phenomena implicitly by asking directly for the smallest possible set of haplotypes that as a whole can explain a given population of genotypes [12]. Methods that are based on this criterion thus perform “HIPP”, indicating haplotype inference by pure parsimony. This pure parsimony principle has been achieved optimally and efficiently by employing a variety of satisfiability-based techniques on small to moderately sized data sets of about 200 sites and 100 individuals [1, 13–15]. However, aside from a preliminary and limited evaluation of pure parsimony, which did not include such satisfiability-based techniques [16], the overall accuracy and general feasibility of HIPP do not seem to be well understood within the biological community.
That is, satisfiability-based HIPP methods must demonstrate two forms of feasibility in order to achieve wider adoption within the biological community: the empirical accuracy of the pure parsimony principle itself, and the efficiency and scalability of SAT methods for achieving this parsimony criterion. While SAT-based techniques can achieve optimal parsimony on reasonably large data sets, they still cannot be applied to the massive collections characteristic of more popular biological applications, which may require on the order of half a million sites. Addressing the issue of scalability will not only enable the assessment of model accuracy and solver efficiency, but can additionally lead to more accurate results. This is because multiple minimum haplotype sets can explain the same population of genotypes [3], but some of them can be safely considered more likely a priori. For instance, solution sets whose members are more similar to each other are more realistic with respect to evolutionary theory [2, 10], and certainly one must be preferred a posteriori with respect to the truth: in the real world there was a single haplotype set that produced a given set of genotypes (and this set may or may not be minimum). So while the initial goal is evaluation of the HIPP principle and HIPP solvers, and the prerequisite goal is improved solver scalability, in pursuing these we would like to synergistically attain the overarching goal of making the most accurate predictions possible, as opposed to merely minimal ones.
We propose to do this by making finer-grained use of biological principles in designing SAT-based methodologies for HIPP. One prominent example of such phenomena would be “linkage disequilibrium”, or correlation between sites within a sequence [17]. The sequential and mostly non-random nature of genotypes and haplotypes is at the core of many of the optimizations and models that underlie statistical alternatives to HIPP; finding a way to exploit it within a discrete reasoning framework would make HIPP competitive in efficiency and accuracy. Three proposals for doing so are outlined in the next section.
3 Proposals for Extending the Satisfiability-Based HIPP Framework
In this section we propose three general types of improvements to the HIPP framework that can make it competitive with the current methods of choice within the biological community.
– Exploiting sequential structure to improve scalability. The best-performing (and most widely used) statistical systems [7, 8, 2] for haplotype inference utilize explicit or implicit variants of the “partition ligation” scheme of Niu et al. [18]. The basic idea is to escape the combinatorial explosion of considering the entire space of possible haplotypes for a given sequence, and instead break the sequence into blocks. Each such block is small enough to be solved efficiently and to high accuracy; the blocks are then recombined heuristically through a polynomial-time merging scheme. With the merging scheme comes a loss of optimality; in the case of HIPP we may not get the smallest possible explaining haplotype set by means of this scheme. But the insight of the partition ligation scheme is that evolution does not create haplotypes uniformly at random over all possible explanations for the human genotype, and in practice the loss in optimality has proved negligible in comparison to the gains in accuracy for statistical methods [18, 7, 2]. Beyond being biologically justifiable, block partitioning can actually produce more accurate overall results, because each block can be solved to higher standards of likelihood using more computationally expensive methods. For instance, a statistical approach based on MCMC sampling can choose to perform vastly more iterations on a more complex model within the confines of a small block, while realistically hoping to retain much of the benefit during the merging process [2]. Applying this framework to SAT-based approaches can provide similar gains, whether using parsimony alone or integrating statistical information in solving blocks (see the sketch after this list).
– Assess the accuracy of HIPP on large-scale data. When HIPP is able to handle larger data sets, it will be possible to directly compare its accuracy with other, more popular approaches. This will allow an assessment of the model’s strengths and weaknesses and inform any attempts to improve this accuracy. At this point, SAT methods have been highly successful at solving HIPP, but it remains to be seen whether the resulting answers are themselves successful at modeling the human genome; as mentioned previously here and elsewhere [3], there can be many minimum sets that all qualify as HIPP solutions, while some of these are much more empirically likely than others. Characterizing which types of these solutions usually turn out to be more accurate will go a long way to improving the parsimony model’s fit to real populations.
– Exploiting linkage disequilibrium to improve accuracy. In the same spirit as the first two points, there are important correlations between various regions of the vast majority of a species’ haplotypes, due to the recombination and mutation processes that drive evolution. To compete with statistically oriented tools, HIPP can be extended to encompass the same sort of empirical information concerning such correlations [19, 20]. This may entail a weighted SAT or MaxSAT formulation that favors solutions that adhere to stronger correlations that have been observed from the same sources of data as used by the competitor techniques.
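To make the first proposal concrete, here is a minimal Python sketch (ours, not the authors’ implementation) of the partition-ligation control flow; solve_block and merge are hypothetical callables standing in for a block-level HIPP solver and a heuristic ligation step:

def partition_ligation(genotypes, block_size, solve_block, merge):
    # genotypes:   list of 0/1/2 strings, all of the same length
    # solve_block: hypothetical oracle returning block-level haplotype pairs
    # merge:       hypothetical heuristic combining two adjacent solutions
    n = len(genotypes[0])
    # Split every genotype into consecutive blocks of at most block_size sites.
    blocks = [[g[i:i + block_size] for g in genotypes]
              for i in range(0, n, block_size)]
    # Solve each block independently (this is where SAT-based HIPP would run).
    partial = [solve_block(b) for b in blocks]
    # Ligate: fold adjacent block solutions together, left to right.
    result = partial[0]
    for nxt in partial[1:]:
        result = merge(result, nxt)
    return result

The loss of optimality mentioned above enters only in merge: each call commits to one way of concatenating block-level haplotypes rather than reconsidering the whole sequence.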
4 Conclusion
The authors have begun to implement block decomposition within the SAT-based HIPP framework, but this description of research is decidedly preliminary and strictly for expository purposes. Once the system is able to handle large problems, the next step will be to study its accuracy, especially in terms of adding additional solution criteria to pure parsimony. The concluding step will be to use information derived from individual problem instances and/or reference data to actually achieve such criteria. It would be a great opportunity to be able to discuss such plans with others who are working on this problem and related areas!
References
1. Lynce, I., Marques-Silva, J.: Efficient haplotype inference with boolean satisfiability. In: Proc. of 21st National Conference on Artificial Intelligence (AAAI ’06), Boston, MA (2006)
2. Stephens, M., Scheet, P.: Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. American Journal of Human Genetics 76 (2005) 449–462
3. Lynce, I., Graça, A., Marques-Silva, J., Oliveira, A.L.: Haplotype inference with boolean constraint solving: An overview. In: Proc. of 20th IEEE Int’l Conf. on Tools with Artificial Intelligence (ICTAI ’08), Dayton, OH (2008)
4. Clark, A.G.: Inference of haplotypes from PCR-amplified samples of diploid populations. Molecular Biology and Evolution 7(2) (1990) 111–122
5. Kimmel, G., Shamir, R.: GERBIL: Genotype resolution and block identification using likelihood. PNAS 102(1) (2005) 158–162
6. Xing, E.P., Sohn, K.A., Jordan, M.I., Teh, Y.W.: Bayesian multi-population haplotype inference via a hierarchical dirichlet process mixture. (2006)
7. Marchini, J., Cutler, D., Patterson, N., Stephens, M., Eskin, E., Halperin, E., Lin, S., Qin, Z.S., Munro, H.M., Abecasis, G.R., Donnelly, P.: A comparison of phasing algorithms for trios and unrelated individuals. American Journal of Human Genetics 78 (2006) 437–450
8. Browning, S.R.: Missing data imputation and haplotype phase inference for genome-wide association studies. Human Genetics 124 (2008) 439–450
9. Salem, R.M., Wessel, J., Schork, N.J.: A comprehensive literature review of haplotyping software and methods for use with unrelated individuals. Human Genomics 2
10. Scheet, P., Stephens, M.: A fast and flexible statistical model for large-scale population genotype data: Applications to inferring missing genotypes and haplotypic phase. American Journal of Human Genetics 78 (2006) 629–644
11. The International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature 449 (2007) 851–862
12. Gusfield, D.: Haplotype inference by pure parsimony. In: Proc. of 14th Symp. on Combinatorial Pattern Matching (CPM ’03), Morelia, Mexico (2003) 144–155
13. Graça, A., Marques-Silva, J., Lynce, I., Oliveira, A.L.: Efficient haplotype inference with pseudo-boolean optimization. In: Proc. of 2nd Int’l Conf. on Algebraic Biology (AB ’07), Linz, Austria (2007) 125–139
14. Erdem, E., Türe, F.: Efficient haplotype inference with answer set programming. In: Proc. of 23rd National Conference on A.I. (AAAI ’08), Chicago, IL (2008) 436–441
15. Lynce, I., Marques-Silva, J., Prestwich, S.: Boosting haplotype inference with local search. Constraints 13(1-2) (2008) 155–179
16. Wang, L., Xu, Y.: Haplotype inference by maximum parsimony. Bioinformatics 19(14) (2003) 1773–1780
17. Slatkin, M.: Linkage disequilibrium – understanding the evolutionary past and mapping the medical future. Nature Reviews Genetics 9 (2008) 477–485
18. Niu, T., Qin, Z.S., Xu, X., Liu, J.S.: Bayesian haplotype inference for multiple linked single nucleotide polymorphisms. American Journal of Human Genetics 70(1) (2002) 157–169
19. Excoffier, L., Slatkin, M.: Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Molecular Biology and Evolution 12(5) (1995) 921–927
20. Kuhner, M.K.: LAMARC 2.0: Maximum likelihood and Bayesian estimation of population parameters. Bioinformatics 22(6) (2006) 768–770
Foundations of Symmetry Breaking Revisited
Tim Januschowski (student),1,2⋆ Barbara M. Smith,3 and M.R.C. van Dongen2
1 Cork Constraint Computation Centre
2 Computer Science Department, University College Cork, Cork, Ireland
3 School of Computing, University of Leeds, UK
janus@cs.ucc.ie
⋆ Tim Januschowski is supported by the EMBARK initiative of the Irish Research Council for Science, Engineering and Technology.
Abstract. Puget’s article “On the Satisfiability of Symmetrical Constraint Satisfaction Problems” [6] is widely acknowledged to be one of
the founding papers in symmetry breaking (see for example [3, 7]). To
date, Puget’s definition of a valid reduction seems to be the only common
ground that all authors on static symmetry breaking constraints agree
on. His original definition of a valid reduction is for a restricted form
of symmetries. We extend Puget’s definition to the recent, more general
definition of symmetries by Cohen et al. [1]. In our extension, we require
a valid reduction to be a tighter constraint satisfaction problem instead
of one with more constraints. We re-formulate Puget’s central theorems
on valid reductions for our definition and present a stronger result on
the existence of valid reductions.
1 Introduction
Symmetries are a common property of Constraint Satisfaction Problems (csps). In practice, symmetries are often the key to an effective solution of symmetric csps: only if we properly exploit the symmetries may we solve such csps within a reasonable time. The classic approach to dealing with symmetries is the addition of what are traditionally called symmetry breaking constraints before search. This is the most commonly used symmetry breaking approach in practice and it receives considerable attention. In the present work we consider the theoretical foundations of the addition of symmetry breaking constraints, following Puget [6].
Puget introduces his symmetry breaking constraint [6] as a constraint whose addition to a given csp leads to a related csp that he calls a valid reduction. Valid reductions are defined without dependence on any specific symmetry breaking constraint, however. This is why they seem to be the only common ground that authors on symmetry breaking constraints agree on, e.g. [2, 5]. These constraints differ in so many aspects that valid reductions may offer the only framework in which to theoretically compare the different symmetry breaking constraints. Furthermore, valid reductions offer a means to study the theoretical effects of the addition of symmetry breaking constraints. This is the reason why an investigation of valid reductions is important for the current research on symmetry breaking
constraints. We see valid reductions as a first step towards a general theory of
static symmetry breaking.
We informally revisit Puget’s definition of a valid reduction for a restricted form of symmetry. However, the recent definition of symmetries by Cohen et al. [1] is much wider. We carry Puget’s definition of a valid reduction over to this definition of symmetry and require a valid reduction to be a tighter csp than the original csp. Next we provide a simple example showing that adding Puget’s constraints may not change the csp. This underlines the importance of our change of emphasis in the definition. We present a valid reduction that has more symmetries than the original csp. We identify the cases in which we can find a valid reduction that differs from the original csp and we show stronger versions of Puget’s central theorems. Due to space restrictions, we have omitted all proofs. We show an example of a symmetric csp whose symmetries cannot be eliminated.
2 Preliminary Definitions
We define a csp P as the triple (X, D, C), where X is the set of variables of P, where D(x) is the domain of x ∈ X, and where C is the set of constraints of P. Since we are interested in theoretical results, we mainly consider constraints in extensional form. We say that the tuples in the relation of c are allowed by c. All other tuples are forbidden by c. We also call a forbidden tuple a no-good. For ease of presentation, we want to exclude csps that have variables with empty domains. We further assume that the domain of a variable contains no values that are forbidden by a unary constraint. We call a (variable, value)-assignment a literal. A solution of P is an assignment of values to all variables that is allowed by all constraints. If a solution exists, we say that P is satisfiable; otherwise P is unsatisfiable.
A hypergraph G is a tuple (V, E) with V or V(G) the set of nodes and E or E(G) the set of hyperedges. For V̂ ⊆ V, we denote the subgraph of G that is induced by V̂ by G[V̂]. The microstructure complement (msc) of a csp (X, D, C) is the hypergraph G, where V(G) = ⋃{ {x} × D(x) | x ∈ X } and {(x1, t1), (x2, t2), ..., (xk, tk)} is in E(G) if and only if:
– a constraint in C with scope {x1, x2, ..., xk}, k > 1, exists that forbids the tuple ⟨t1, ..., tk⟩, or
– k = 2, x1 = x2 and t1 ≠ t2.
For binary csps, this graph is the complement of the microstructure [4]. We want to define the msc as a loop-free hypergraph, which explains why k > 1. More specifically, we assume that literals forbidden by unary constraints do not appear as nodes in the msc.
A stable set in a hypergraph G is a set of nodes, V̂, such that G[V̂] does not contain any hyperedge. A csp with n variables is satisfiable if and only if its msc contains a stable set with n nodes. A graph automorphism of hypergraph G is a permutation of V(G) that preserves adjacency. We denote the group of automorphisms of G by aut(G). The orbit of a set of nodes V̂ in V(G) is the set of all images of V̂ under elements of aut(G). We denote it by orbit(V̂). From now on, we shall write graph instead of hypergraph. The following definition, by Cohen et al. [1], is the current state of the art for symmetry definitions.
Definition 1 (Constraint Symmetry [1]). Let G be a msc of a csp P . A
symmetry or constraint symmetry of P is an element of aut(G).
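To make these notions concrete, here is a small Python sketch (ours, purely illustrative) that builds the msc of the toy csp with variables x, y ∈ {1, 2} and the single constraint x ≠ y, and recovers its solutions as stable sets of size n = 2:

from itertools import combinations

# Toy csp (not from the paper): x, y in {1, 2}, constraint x != y.
nodes = [("x", 1), ("x", 2), ("y", 1), ("y", 2)]
edges = {frozenset({("x", 1), ("x", 2)}),      # same-variable edges
         frozenset({("y", 1), ("y", 2)})}
edges |= {frozenset({("x", v), ("y", v)})      # no-goods of x != y
          for v in (1, 2)}

def stable_sets(k):
    # Stable sets of size k: node sets whose induced subgraph has no edge.
    return [set(s) for s in combinations(nodes, k)
            if not any(e <= set(s) for e in edges)]

# The stable sets of size n = 2 are exactly the two solutions of the csp.
print(stable_sets(2))  # [{('x', 1), ('y', 2)}, {('x', 2), ('y', 1)}]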
Puget [6] considers a strict subclass of the symmetries in Def. 1. Having defined what the symmetries of a csp are, we proceed by defining a symmetric csp. Clearly, the identity is an automorphism of any csp. It is natural to exclude csps whose only automorphism is the identity from being called symmetric csps. For technical reasons we also exclude csps whose domains are all singletons.
Definition 2 (Symmetric CSP). A csp having at least one variable with domain size strictly larger than 1 is symmetric if it admits more than one symmetry.
3 Valid Reductions
Symmetries partition the set of solutions into orbits. The addition of a symmetry
breaking constraint always produces a csp with at least one representative of
each orbit of solutions: an orbit representative. Puget defines a valid reduction
as a csp with more constraints than the original csp such that for every orbit of
solutions (under Puget’s restricted symmetries) at least one orbit representative
remains. The following is our definition.
Definition 3 (Valid Reduction). Let P = (X, D, C) be a csp with n variables and msc G. Let P̃ = (X, D̃, C̃) be a csp with the same variables as P, with ∅ ≠ D̃(x) ⊆ D(x) for all x ∈ X. We call P̃ a valid reduction of P if the msc G̃ of P̃ fulfils the following conditions:
1. a) V(G̃) ⊆ V(G) and b) E(G[V(G̃)]) ⊆ E(G̃), and
2. for every maximum stable set S of size n in G, at least one element of orbit(S, aut(G)) exists in G̃.
According to the definition, we can produce a valid reduction by adding hyperedges to the msc of the csp, which is equivalent to adding a constraint of arity 2 or higher to the csp, or by removing nodes from the msc, which is equivalent to adding a unary constraint.
Definition 3 is an extension of Puget’s original definition since we allow for more general symmetries to define the orbits. We stress that we may add unary constraints to a csp (Condition 1a). Puget implicitly allows unary constraints in his definition, though “at the core of [his] method” is “adding ordering constraints” [6]. We disallow unary constraints that remove all literals corresponding to a variable. Though this may reduce symmetries, it may make it too difficult to relate a valid reduction to the original csp and to reconstruct solutions. Further work will address this. In Def. 3, we substitute fewer stable sets (partial or full solutions) in the msc for a higher number of constraints. We ensure fewer stable sets in our definition by requiring the msc of a valid reduction either to have fewer nodes or to have more edges on the set of nodes that it and the msc of the original csp have in common (Condition 1b). We shall say that a valid reduction with fewer stable sets in its msc than the original msc is tighter. A higher number of constraints in Puget’s sense could mean a mere repetition of an already present constraint.
Valid reductions are defined in terms of removing symmetric solutions rather
than eliminating symmetries. However, by removing symmetric solutions, we
may also eliminate symmetries. The existence of orbit representatives ensures
that we can reconstruct all solutions of the original csp by applying the symmetries to the orbit representatives in the original csp. Puget proved a weaker
version of the following theorem.
Theorem 4. Any csp P is satisfiable if and only if any valid reduction of P is
satisfiable.
Besides Theorem 4, Puget’s other central result is the existence of a valid reduction for symmetric csps. He proves that, for a csp with a symmetry φ that maps a variable x to a different variable y, adding the constraint x ≤ y always produces a valid reduction. The following example shows that adding this constraint may not produce a tighter valid reduction.
Consider a csp P with 2 variables x and y, each with domain {1, 2}, and one extensional constraint that only allows {(x, 2), (y, 2)} and forbids all other tuples. This csp has a symmetry (also in Puget’s restricted terminology) that swaps the variables x and y. Adding x ≤ y does not change the msc, but raises the number of constraints of P. Hence, producing a tighter valid reduction with Puget’s constraints is not always possible. We believe that inventing the symmetry breaking constraint x ≤ y is a more important achievement than the statement of the existence theorem for valid reductions. Otherwise, Puget could have proved the existence theorem also by introducing a constraint that is a copy of an already existing constraint. Let us give another example, which shows that adding ordering constraints may result in a valid reduction that has more symmetries than the original csp.
Example 5. Consider a csp with two variables x and y, each with domain {1, 2, 3}.
The constraint on x and y allows {(x, 1), (y, 3)}, {(x, 3), (y, 1)} and {(x, 3), (y, 3)}.
The microstructure of this csp with symmetry group of size 4 is depicted in
Fig. 1a. Consider symmetry ψ which swaps (x, 2) with (y, 2) and is the identity
on all other literals. Symmetry ψ cannot be eliminated through the addition
of an ordering constraint, e.g., y ≤ x. We show the microstructure of a valid
reduction in Fig. 1b. The symmetry group of the valid reduction has size 12.
Example 5 shows that adding ordering constraints may not always eliminate
symmetries. However, removing nodes would eliminate the symmetries of Ex. 5.
[Fig. 1: The microstructure of the csp in Ex. 5 and one of its valid reductions. (a) The microstructure of Ex. 5, with nodes (x, 1), (x, 2), (x, 3), (y, 1), (y, 2), (y, 3). (b) Adding y ≤ x to the csp in Ex. 5 produces a valid reduction with a larger symmetry group than the original csp.]
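The claimed symmetry group sizes can be checked mechanically. The following brute-force Python sketch (ours, purely expository, feasible only for such tiny graphs) counts the automorphisms of the msc of Ex. 5 before and after adding y ≤ x:

from itertools import permutations

nodes = [(v, a) for v in "xy" for a in (1, 2, 3)]
allowed = {(1, 3), (3, 1), (3, 3)}              # tuples allowed on (x, y)
edges = {frozenset({("x", s), ("y", t)})        # binary no-goods
         for s in (1, 2, 3) for t in (1, 2, 3) if (s, t) not in allowed}
edges |= {frozenset({(v, s), (v, t)})           # same-variable edges
          for v in "xy" for s in (1, 2, 3) for t in (1, 2, 3) if s < t}

def count_automorphisms(edges):
    count = 0
    for perm in permutations(nodes):            # 6! = 720 candidate maps
        f = dict(zip(nodes, perm))
        if all(frozenset(f[u] for u in e) in edges for e in edges):
            count += 1
    return count

print(count_automorphisms(edges))               # 4, as stated in Ex. 5

# Adding y <= x forbids every tuple with s < t, adding the missing no-good.
reduced = edges | {frozenset({("x", s), ("y", t)})
                   for s in (1, 2, 3) for t in (1, 2, 3) if s < t}
print(count_automorphisms(reduced))             # 12, the valid reduction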
In Def. 3, we do not require a valid reduction to be strictly tighter. Any csp, symmetric or not, is a valid reduction of itself. This generalises Puget’s original existence theorem for valid reductions. When we reason about the existence of valid reductions, we must therefore be interested in showing the existence of a valid reduction that is strictly tighter than the original, symmetric csp. We call such a valid reduction proper. The following theorem answers the question of which symmetric csps admit a proper valid reduction.
Theorem 6. A symmetric csp P has a proper valid reduction if and only if it has a partial solution that is not part of a solution, or it has a symmetry φ that maps a solution S to a solution φ(S) such that S ≠ φ(S). If P has such a symmetry, then a proper valid reduction can be obtained by adding a unary constraint.
Theorem 6 singles out the symmetries that map a solution to a different solution as the important ones for valid reductions. However, before starting to prove the satisfiability of a csp, we may only know that symmetries exist, e.g., through automatic detection in preprocessing. We cannot assume that we know whether the symmetries map a solution to a different solution.
In Fig. 2, we depict the microstructure of a symmetric csp that does not admit a proper valid reduction. Apart from the identity, only one automorphism exists. This automorphism swaps the grey-coloured literals (z, 3) and (y, 2), and is the identity on all other literals. Every partial solution is part of a solution whose orbit consists only of itself. Hence, we cannot remove a node or an edge. This means that no proper valid reduction of this csp exists. An example of a non-symmetric csp without a proper valid reduction is the empty csp.
4 Conclusion
We have presented Puget’s fundamental work [6] on symmetry breaking in light
of the recent definition of symmetry [1]. We have re-formulated his central results
and strengthened them. Our results suggest that symmetry breaking based on
the idea of valid reductions may, in theory, only supply additional information
to a problem if symmetric solutions exist. This is somewhat disappointing, since
the existence of symmetric solutions can be decided only by solving the csp.
[Fig. 2: The microstructure of a symmetric csp which does not admit a proper valid reduction, with nodes (x, 1), (x, 2), (y, 1), (y, 2), (z, 1), (z, 2), (z, 3). The only non-trivial symmetry of this csp swaps the grey-coloured nodes (z, 3) and (y, 2). We can neither remove a node nor remove an edge.]
Valid reductions are among the requirements for a constraint to be considered a symmetry breaking constraint. Hence, they offer a possibility to analyse the theoretical power of symmetry breaking constraints. While valid reductions allow for constraints that are far from our intuition about symmetry breaking constraints, valid reductions can still give us an idea about the theoretical limits of symmetry breaking constraints. In further work, we will consider valid reductions with further properties. We will analyse and compare the very diverse symmetry breaking constraints using valid reductions.
References
1. D. Cohen, P. Jeavons, C. Jefferson, K. Petrie, and B. Smith. Symmetry definitions for constraint satisfaction problems. Constraints, 11, 2006.
2. J. Crawford, M. Ginsberg, E. Luks, and A. Roy. Symmetry-breaking predicates for search problems. In KR’96: Principles of Knowledge Representation and Reasoning, pages 148–159. Morgan Kaufmann, 1996.
3. P. Flener, J. Pearson, M. Sellmann, and P. Van Hentenryck. Static and dynamic structural symmetry breaking. In Principles and Practice of Constraint Programming – CP 2006, volume 4204 of Lecture Notes in Computer Science, pages 695–699, 2006.
4. E. C. Freuder. Eliminating interchangeable values in constraint satisfaction problems. In Proceedings AAAI’91, pages 227–233, 1991.
5. V. Kaibel and M. E. Pfetsch. Packing and partitioning orbitopes. Mathematical Programming, 114(1):1–36, 2008.
6. J.-F. Puget. On the satisfiability of symmetrical constrained satisfaction problems. In Methodologies for Intelligent Systems, volume 689 of Lecture Notes in Computer Science, pages 350–361, London, UK, 1993. Springer-Verlag.
7. F. Rossi, P. van Beek, and T. Walsh, editors. Handbook of Constraint Programming. Elsevier, 2006.
Energetic Edge-Finder For Cumulative Resource
Roger Kameugne1 (student) and Laure Pauline Fotso2
1 University of Yaounde 1, Department of Mathematics, P.O. Box 812, Yaounde, Cameroon
rkameugne@yahoo.fr or rkameugne@gmail.com
2 University of Yaounde 1, Department of Computer Science, P.O. Box 812, Yaounde, Cameroon
lpfotso@ballstate.bsu.edu
Abstract. Edge-finding, not-first/not-last, sweeping and overload checking are well-known propagation rules used to prune the start and end times of tasks which have to be processed without interruption on a cumulative resource. In this paper, we present a new propagation rule, called the energetic edge-finder, which can achieve additional pruning. The new algorithm is organized in two phases. The first phase precomputes the innermost maximization in the edge-finder specification and identifies the characteristics of the corresponding task interval. The second phase, based on an energetic reasoning condition, uses this precomputation to apply the actual updates. The overall complexity of our algorithm is O(n²) since the running time of each phase is O(n²).
1 Introduction
As suggested in [MV05], for a given pair (i, Θ), where i is a task and Θ a set of tasks, if it can be proved that in all feasible schedules task i ends after the completion time of all tasks of Θ, then the rest-based update of edge-finding is valid. In this paper, we present a two-phase algorithm, a hybridization of edge-finding and energetic reasoning, which can achieve additional pruning. The first phase precomputes the innermost maximization in the edge-finder specification and identifies the characteristics of the corresponding task interval. It is based on our edge-finder presented in [KF09b]. Indeed, in [KF09b], we propose an iterative O(n²) implementation of edge-finding which reaches the same fixpoint as the rule after several runs. This idea was already used for other rules [TL00,Vi04,KF09a]. The second phase, based on an energetic reasoning condition, uses the precomputation to apply the actual updates. The overall complexity of our algorithm is O(n²) since the running time of each phase is O(n²).
The rest of the paper is organized as follows. Section 2 presents the notations used in the paper. In Section 3, we present our motivations on an example. Section 4 specifies the energetic edge-finder rule and a relaxation of this new rule. Section 5 presents an O(n²) implementation of the relaxed energetic edge-finder rule. Section 6 concludes the paper.
2 Notations
An instance of a CuSP consists of a set T of tasks to be performed on a resource of capacity C. Each task i must be executed (without interruption) over pi time units between an earliest start time ri (release date) and a latest end time di (due date). Moreover, it requires a constant amount of resource ci. K = {ci, i ∈ T} denotes the set of distinct capacity requirements of tasks and k = |K|. Throughout the paper, we assume that ri + pi ≤ di and ci ≤ C, otherwise the problem has no solution. We also assume that all data are integer. A solution of a CuSP is a schedule that assigns a starting date si to each task i so that:
1. ∀i ∈ T : ri ≤ si ≤ si + pi ≤ di,
2. ∀t : Σ{ci | i ∈ T, si ≤ t < si + pi} ≤ C.
ei = ci·pi denotes the energy of task i. We extend the notation from tasks to sets of tasks by setting
rΩ = min{rj | j ∈ Ω},  dΩ = max{dj | j ∈ Ω},  eΩ = Σ{ej | j ∈ Ω}
where Ω is a set of tasks. By convention, when Ω is the empty set, rΩ = +∞, dΩ = −∞ and eΩ = 0. CuSP is a sub-problem of the Resource Constrained Project Scheduling Problem (RCPSP), where precedence constraints are relaxed and a single resource is considered at a time. The CuSP is an NP-complete problem [BL99].
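As an illustration, the following small Python sketch (ours, not from the paper) checks the two solution conditions above for given start dates; the instance used is the one of Table 1 in Section 3, together with an assumed feasible schedule:

# Task fields: release date r, due date d, processing time p, demand c.
tasks = {"a": (1, 10, 4, 1), "b": (3, 6, 1, 2), "c": (3, 6, 1, 2),
         "d": (3, 5, 1, 3), "e": (5, 7, 2, 1)}
C = 3  # resource capacity

def is_solution(start):
    # Condition 1: every task runs inside its time window.
    if not all(r <= start[i] and start[i] + p <= d
               for i, (r, d, p, c) in tasks.items()):
        return False
    # Condition 2: at every time point the capacity is not exceeded.
    horizon = range(min(r for r, d, p, c in tasks.values()),
                    max(d for r, d, p, c in tasks.values()))
    return all(sum(c for i, (r, d, p, c) in tasks.items()
                   if start[i] <= t < start[i] + p) <= C
               for t in horizon)

print(is_solution({"a": 6, "b": 4, "c": 5, "d": 3, "e": 5}))  # True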
Definition 1 (Task Intervals). Let i, k ∈ T (possibly the same task). The task interval Ωi,k is the set of tasks
Ωi,k = {j ∈ T | ri ≤ rj ∧ dj ≤ dk}.
Let i be a task and [t1, t2] be a time interval with t1 < t2. The “left-shift/right-shift” required energy consumption of i over [t1, t2], noted WSh, is ci times the minimum of the three following durations:
• t2 − t1, the length of the interval;
• pi+(t1) = max(0, pi − max(0, t1 − ri)), the number of time units during which i executes after time t1 if i is left-shifted, i.e., scheduled as soon as possible;
• pi−(t2) = max(0, pi − max(0, di − t2)), the number of time units during which i executes before time t2 if i is right-shifted, i.e., scheduled as late as possible.
This leads to WSh(i, t1, t2) = ci · min(t2 − t1, pi+(t1), pi−(t2)). The overall required energy consumption over an interval [t1, t2] is defined as WSh(t1, t2) = Σ{WSh(i, t1, t2) | i ∈ T}. This required energy consumption was first defined in [ELH,Lo].
Other varieties of required energy consumption are available, for example the fully elastic required energy, noted WFE, and the partially elastic required energy, noted WPE [BL99,B02]. It is obvious that if there is a feasible schedule, then
∀t1, ∀t2 ≥ t1 : SSh(t1, t2) = C(t2 − t1) − WSh(t1, t2) ≥ 0.   (1)
We present only the algorithm that updates start times, because the algorithm that updates end times is symmetrical.
3 Motivation
For clarity, let us analyze the example given in Table 1.
Example 1. Consider the CuSP instance of Table 1, where five tasks share a resource of capacity C = 3.

Table 1. An instance of CuSP

task   r   d   p   c
a      1   10  4   1
b      3   6   1   2
c      3   6   1   2
d      3   5   1   3
e      5   7   2   1
This instance satisfies the necessary condition for the existence of a feasible schedule given by relation (1). Using the edge-finding, extended edge-finding, not-first/not-last and sweeping propagation rules on this instance, no time bound is adjusted. But the hybrid rule described above allows the release date of task a to be updated to 4. Indeed, Δ+(a, 3, 6) = 1 and rest(Θ, ca) = 1 with Θ = {b, c, d}, where
Δ+(a, t1, t2) = WSh(t1, t2) − WSh(a, t1, t2) + ca · pa+(t1) − C · (t2 − t1)
and
rest(Θ, ca) = eΘ − (C − ca)(dΘ − rΘ).
Therefore, it follows that ra = rΘ + ⌈rest(Θ, ca)/ca⌉ = 4, since da > dΘ = 6 and Δ+(a, 3, 6) > 0.
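These quantities are easy to recompute. The following Python sketch (ours, purely expository) evaluates WSh, Δ+ and rest on the instance of Table 1 and reproduces the update of ra:

import math

tasks = {"a": (1, 10, 4, 1), "b": (3, 6, 1, 2), "c": (3, 6, 1, 2),
         "d": (3, 5, 1, 3), "e": (5, 7, 2, 1)}  # (r, d, p, c) per task
C = 3

def p_plus(i, t1):
    r, d, p, c = tasks[i]
    return max(0, p - max(0, t1 - r))

def p_minus(i, t2):
    r, d, p, c = tasks[i]
    return max(0, p - max(0, d - t2))

def w_sh(i, t1, t2):
    # left-shift/right-shift required energy of task i over [t1, t2]
    return tasks[i][3] * min(t2 - t1, p_plus(i, t1), p_minus(i, t2))

def delta_plus(i, t1, t2):
    w = sum(w_sh(j, t1, t2) for j in tasks)
    return w - w_sh(i, t1, t2) + tasks[i][3] * p_plus(i, t1) - C * (t2 - t1)

def rest(theta, ci):
    r = min(tasks[j][0] for j in theta)
    d = max(tasks[j][1] for j in theta)
    e = sum(tasks[j][2] * tasks[j][3] for j in theta)  # e_j = p_j * c_j
    return e - (C - ci) * (d - r)

theta = {"b", "c", "d"}
print(delta_plus("a", 3, 6))              # -> 1
print(rest(theta, tasks["a"][3]))         # -> 1
print(3 + math.ceil(rest(theta, 1) / 1))  # new release date of a -> 4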
4 Energetic edge-finder
In this section, after a specification of the energetic edge-finder rule, we focus on its relaxation, for which we can provide a quadratic algorithm.
4.1 Specification of the energetic edge-finder algorithm
As for energetic reasoning, the energetic edge-finder requires a CuSP instance satisfying the necessary condition for the existence of a feasible schedule of relation (1). We have the following definition.
Definition 2 (Energetic edge-finding algorithm). The energetic edge-finding algorithm receives as input a CuSP instance satisfying the necessary condition of relation (1). It produces as output a vector
⟨ELB(1), ..., ELB(n)⟩
where
ELB(i) = max(ri, ELB′(i))
and
ELB′(i) = max{ rΘ + ⌈rest(Θ, ci)/ci⌉ | t1 < t2, Δ+(i, t1, t2) > 0, di > t2, Θ ⊆ T, rest(Θ, ci) > 0, dΘ ≤ t2 }
with
Δ+(i, t1, t2) = WSh(t1, t2) − WSh(i, t1, t2) + ci · pi+(t1) − C · (t2 − t1)
and
rest(Θ, ci) = eΘ − (C − ci)(dΘ − rΘ) if Θ ≠ ∅, and 0 otherwise.
From our O(n²) implementation of edge-finding [KF09b], we can derive a precomputation phase running in O(n²). In spite of our efforts, we were unable to exhibit a quadratic algorithm computing all the adjustments over the O(n²) relevant time intervals.
4.2 Specification of the relaxed energetic edge-finder algorithm
To provide a quadratic algorithm, we have considered a relaxation of the rule. For a given task i, we pay attention only to time intervals [t1, t2] satisfying t1 = rΘ, t2 = dΘ and rest(Θ, ci) > 0. Therefore, we have the following specification of the relaxed version of the energetic edge-finder.
Definition 3 (Relaxed energetic edge-finding algorithm). The relaxed energetic edge-finding algorithm receives as input a CuSP instance satisfying the necessary condition of relation (1). It produces as output a vector
⟨ELB1(1), ..., ELB1(n)⟩
where
ELB1(i) = max(ri, ELB1′(i))
and
ELB1′(i) = max{ rΘ + ⌈rest(Θ, ci)/ci⌉ | Θ ⊆ T, Δ+(i, rΘ, dΘ) > 0, di > dΘ, rest(Θ, ci) > 0 }
with
Δ+(i, rΘ, dΘ) = WSh(rΘ, dΘ) − WSh(i, rΘ, dΘ) + ci · pi+(rΘ) − C · (dΘ − rΘ)
and
rest(Θ, ci) = eΘ − (C − ci)(dΘ − rΘ) if Θ ≠ ∅, and 0 otherwise.

5 Relaxed energetic edge-finder algorithm

5.1 Precomputation
We call PreComp the adapted version of our O(n²) edge-finding algorithm presented in [KF09b], which computes the innermost maximization of the edge-finder specification and identifies the characteristics of the corresponding task interval. It returns an array g of length 3n, where n is the number of tasks. For a given index i with 1 ≤ i ≤ n, g(i) is the value used to update the release date of task X(i), where X is an array of tasks sorted in non-decreasing order of release dates, as required by the algorithm. g(n + i) and g(2n + i) are respectively the release date and the due date of the task interval used to compute g(i). For more detail, see the appendix and [KF09b].
5.2 Relaxed energetic edge-finder algorithm
Using the precomputation phase PreComp, we can derive a second phase for the relaxed form of the energetic edge-finder. This second phase runs in O(n²) and performs additional adjustments.
Require: X is an array of tasks sorted by non-decreasing release dates;
Require: Y is an array of tasks sorted by non-decreasing due dates;
Ensure: ELB(x) are computed for all 1 ≤ x ≤ n.
1: g := PreComp();
2: for i := 1 to n do
3:   ELB(i) := rX(i);
4: end for
5: for i := 1 to n do
6:   if g(i) > LB(i) then
7:     t1 := g(n + i), t2 := g(2n + i);
8:     if t1 < t2 then
9:       W := 0;
10:      for x := 1 to n do
11:        if x ≠ X(i) then
12:          W := W + WSh(x, t1, t2);
13:        end if
14:      end for
15:      if W + cX(i) · pX(i)+(t1) > C(t2 − t1) then
16:        ELB(i) := max(LB(i), g(i));
17:      end if
18:    end if
19:  end if
20: end for
Algorithm 1: CompEEF: Energetic edge-finder algorithm in O(n²) time and O(n) space.
It is possible to increase the power of the algorithm without changing its complexity. Line 16 of Algorithm 1 can be replaced by the following line:
16′: ELB(i) := max(LB(i), g(i), t2 + ⌈(W + cX(i) · pX(i)+(t1) − C(t2 − t1))/ci⌉ − pi)
It is possible to compute the complete energetic edge-finder with the partially elastic required energy in O(n² log(k)). Indeed, with the partially elastic required energy, the time points t1 and t2 correspond to release dates and due dates of some tasks. We can therefore derive an O(n² log(k)) algorithm for the energetic edge-finder, since Baptiste’s algorithm runs in O(n² log(k)) [BL99,B02].
6 Conclusion
Edge-finding, not-first/not-last, sweeping, overload checking and energetic reasoning are well-known propagation rules used to prune the start and end times of tasks which have to be processed without interruption on a cumulative resource. In this paper, we have presented a new propagation rule, called the energetic edge-finder, which can achieve additional pruning. The new algorithm is organized in two phases. The first phase precomputes the innermost maximization in the edge-finder specification and identifies the characteristics of the corresponding task interval. The second phase, based on an energetic reasoning condition, uses this precomputation to apply the actual updates. The overall complexity of our algorithm is O(n²) since the running time of each phase is O(n²).
References
[BL99] Ph. Baptiste, C. Le Pape and W. Nuijten. Satisfiability Tests and Time-Bound Adjustments for Cumulative Scheduling Problems. Annals of Operations Research 92, 305–333 (1999).
[TL00] P. Torres and P. Lopez. On Not-First/Not-Last Conditions in Disjunctive Scheduling. European Journal of Operational Research 127(2), 332–343 (2000).
[Vi04] P. Vilím. O(n log n) Filtering Algorithms for Unary Resource Constraint. In Proceedings of CP-AI-OR (2004).
[KF09b] R. Kameugne, L.P. Fotso and Y. Ngo-Kateu. Cumulative Edge-Finding Algorithm in O(n²). Submitted for publication to INFORMS Journal on Computing (2009).
[KF09a] R. Kameugne, L.P. Fotso and E. Kouakam. Cumulative Not-First/Not-Last Algorithms in O(n³). Submitted for publication to Journal of Scheduling (2009).
[MV05] L. Mercier and P. Van Hentenryck. Edge Finding for Cumulative Scheduling. INFORMS Journal on Computing 20(1), 143–153 (2008).
[B02] Ph. Baptiste. Résultats de Complexité et Programmation par Contraintes pour l’Ordonnancement. HDR thesis, Compiègne University of Technology (2002).
[ELH] P. Esquirol, P. Lopez and M.-J. Huguet. Ordonnancement de la production, chapitre Propagation de contraintes en ordonnancement. Hermès Science Publications, Paris (2001).
[Lo] P. Lopez. Approche énergétique pour l’ordonnancement de tâches sous contraintes de temps et de ressources. Thèse de doctorat, Université Paul Sabatier, Toulouse (1991).
7 Appendix
Proposition 1 summarizes the main dominance properties of the edge-finding rule.
Proposition 1. Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T. If the edge-finding rule applied to task i with the pair (Ω, Θ) allows to update the earliest start time of i, then there exist four tasks j, k, j′, k′ such that rj ≤ rj′ < dk′ ≤ dk < di ∧ rj ≤ ri, and the edge-finding rule applied to task i with the pair (Ωj,k, Ωj′,k′) allows the same update of the earliest start time of task i.
Proof. See proofs of Propositions 2, 3, 4 and 6 of [MV05].
Proposition 2. Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T such that Θ ⊆ Ω. If the edge-finding rule applied to task i with the pair (Ω, Θ) allows to update the earliest start time of i, then there exist two tasks k and jmax1 such that:
1. dk = dΩ ∧ rjmax1 ≤ ri;
2. for all j ∈ T, if rj ≤ ri then
eΩjmax1,k + C·rjmax1 ≥ eΩj,k + C·rj   (2)
3. the pair (Ωjmax1,k, Θ) allows to update the earliest start time of i.
Proof. See [KF09b].
Definition 4 (Pair of interest). Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T such that Θ ⊆ Ω. A pair (Ω, Θ) is a pair of interest to task i if and only if
α(Ω, i) ∧ di > dΩ ∧ rest(Θ, ci) > 0 ∧ rΘ + (1/ci)·rest(Θ, ci) > ri.
Definition 5. Let i, k be two tasks of an E-feasible CuSP. jmax is a task such that ri < rjmax and, for all tasks j with ri < rj,
eΩj,k / (dk − rj) ≤ eΩjmax,k / (dk − rjmax).
Proposition 3. Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T such that Θ ⊆ Ω. If the edge-finding rule applied to task i with the pair (Ω, Θ) allows to update the earliest start time of i, then
1. if ri < rΘ, there exists a task jmax such that ri < rjmax and the pair (Ω, Ωjmax,k) is a pair of interest to task i, where dΘ = dk;
2. if rΘ ≤ ri, the pair (Ω, Ω) or (Θ, Θ) is a pair of interest to task i.
Proof. See [KF09b].
According to Proposition 3, after detecting all edge-finding conditions, we may sometimes perform only a weak adjustment. This approach can be applied iteratively, until no further adjustment is found (the fixpoint is reached).
This idea was already used for the disjunctive and cumulative not-first/not-last rules [TL00,Vi04,KF09a]. We have the following precomputation phase.

Require: X is an array of tasks sorted by non-decreasing release dates;
Require: Y is an array of tasks sorted by non-decreasing due dates;
Ensure: g(x) are computed for all 1 ≤ x ≤ 3n.
1: for x := 1 to 3n do
2:   g(x) := −∞;
3: end for
4: for y := 1 to n do
5:   W := 0, maxW := 0, maxEst := −∞;
6:   for x := n down to 1 do
7:     if dX(x) ≤ dY(y) then
8:       W := W + eX(x);
9:       if W/(dY(y) − rX(x)) > maxW/(dY(y) − maxEst) then
10:        maxW := W, maxEst := rX(x);
11:      end if
12:    else
13:      restW := maxW − (C − cX(x))(dY(y) − maxEst);
14:      a := if (restW > 0) then maxEst + ⌈restW/cX(x)⌉ else −∞;
15:      if a > max(g(x), rX(x)) then
16:        g(x) := max(g(x), a), g(n + x) := maxEst, g(2n + x) := dY(y);
17:      end if
18:    end if
19:    W(x) := W;
20:  end for
21:  minSL := −∞, maxEst1 := dY(y);
22:  for x := 1 to n do
23:    if (W(x) − C(dY(y) − rX(x))) > minSL then
24:      maxEst1 := rX(x), minSL := W(x) − C(dY(y) − maxEst1);
25:    end if
26:    if dX(x) > dY(y) then
27:      restW := minSL + cX(x)(dY(y) − maxEst1);
28:      b := if (maxEst1 ≤ dY(y)) ∧ (restW > 0) then maxEst1 + ⌈restW/cX(x)⌉ else −∞;
29:      if b > max(g(x), rX(x)) then
30:        g(x) := max(g(x), b), g(n + x) := maxEst1, g(2n + x) := dY(y);
31:      end if
32:    end if
33:  end for
34: end for
Algorithm 2: PreComp: Edge-finding precomputation algorithm in O(n²) time and O(n) space.
Theorem 1. Algorithm 2 iteratively computes the innermost maximization of the edge-finding specification and identifies the characteristics of the corresponding task intervals. It runs in O(n²) time and uses O(n) space.
Proof. Direct consequence of Proposition 3. The preprocessing time for sorting the arrays X and Y is O(n log n). The first loop (lines 1–3) runs in O(n). The second loop (lines 4–34) contains two inner loops (lines 6–20 and 22–33), each of which runs in O(n); since the outer loop iterates n times, the overall complexity of Algorithm 2 is O(n²).
In Algorithm 2:
• The first main loop (lines 1–3) initializes the array g.
• The second main loop (lines 4–34) iterates over all due dates of tasks in non-decreasing order.
• The first inner loop (lines 6–20) iterates over all release dates sorted in non-increasing order and identifies the task jmax of Definition 5. Lines 15–17 identify and update the characteristics of the potential task interval that can be used for the update.
• The second inner loop (lines 22–33) iterates over all release dates sorted in non-decreasing order and identifies the task jmax1 of Proposition 2. Lines 29–31 identify and update the characteristics of the potential task interval that can be used for the update.
Dominion
A constraint solver generator
Lars Kotthoff (student)
supervised by Ian Miguel and Ian Gent
{larsko,ianm,ipg}@cs.st-andrews.ac.uk
University of St Andrews
Abstract. This paper proposes a design for a system to generate constraint solvers that are specialised for specific problem models. It describes the design in detail and gives preliminary experimental results showing the feasibility and effectiveness of the approach.
1 Introduction
Currently, applying constraint technology to a large, complex problem requires
significant manual tuning by an expert. Such experts are rare. The central aim
of this project is to improve the scalability of constraint technology, while simultaneously removing its reliance on manual tuning by an expert. We propose
a novel, elegant means to achieve this – a constraint solver synthesiser, which
generates a constraint solver specialised to a given problem. Constraints research
has mostly focused on the incremental improvement of general-purpose solvers so
far. The closest point of comparison is currently the G12 project [1], which aims
to combine existing general constraint solvers and solvers from related fields into
a hybrid. There are previous efforts at generating specialised constraint solvers
in the literature, e.g. [2]; we aim to use state-of-the-art constraint solver technology employing a broad range of different techniques. Synthesising a constraint
solver has two key benefits. First, it will enable a fine-grained optimisation not
possible for a general solver, allowing the solving of much larger, more difficult
problems. Second, it will open up many new research possibilities. There are
many techniques in the literature that, although effective in a limited number
of cases, are not suitable for general use. Hence, they are omitted from current
general solvers and remain relatively undeveloped. Among these are for example
conflict recording [3], backjumping [4], singleton arc consistency [5], and neighbourhood inverse consistency [6]. The synthesiser will select such techniques as
they are appropriate for an input problem. Additionally, it can also vary basic
design decisions, which can have a significant impact on performance [7].
The system we are proposing in this paper, Dominion, implements a design
that is capable of achieving said goals effectively and efficiently. The design
decisions we have made are based on our experience with Minion [9] and other
constraint programming systems.
The remainder of this paper is structured as follows. In the next section, we
describe the design of Dominion and which challenges it addresses in particular.
We then present the current partial implementation of the proposed system and
give experimental results obtained with it. We conclude by proposing directions
for future work.
[Figure 1. Components and flow of information in Dominion: problem model → Analyser → solver specification → Generator → specialised solver → solution. The part above the dashed line is the actual Dominion system. The dotted arrow from the problem model to the specialised solver designates that either the model is encoded entirely in the solver such that no further information is required to solve the problem, or the solver requires further input such as problem parameters.]
2 Design of a synthesiser for specialised constraint solvers
The design of Dominion distinguishes two main parts. The analyser analyses
the problem model and produces a solver specification that describes what components the specialised solver needs to have and which algorithms and data
structures to use. The generator takes the solver specification and generates a
solver that conforms to it. The flow of information is illustrated in Figure 1.
Both the analyser and the generator optimise the solver. While the analyser
performs the high-level optimisations that depend on the structure of the problem model, the generator performs low-level optimisations which depend on the
implementation of the solver. Those two parts are independent and linked by the
solver specification, which is completely agnostic of the format of the problem
model and the implementation of the specialised solver. There can be different
front ends for both the analyser and the generator to handle problems specified
in a variety of formats and specialise solvers in a number of different ways, e.g.
based on existing building blocks or synthesised from scratch.
2.1 The analyser
The analyser operates on the model of a constraint problem class or instance. It
determines the constraints, variables, and associated domains required to solve
the problem and reasons about the algorithms and data structures the specialised
solver should use. It makes high-level design decisions, such as whether to use
trailing or copying for backtracking memory. It also decides what propagation
algorithms to use for specific constraints and what level of consistency to enforce.
The output of the analyser is a solver specification that describes all the
design decisions made. It does not necessarily fix all design decisions – it may
use default values – if the analyser is unable to specialise a particular part of the
solver for a particular problem model.
In general terms, the requirements for the solver specification are that it
(a) describes a solver which is able to find solutions to the analysed problem
model and (b) describes optimisations which will make this solver perform better
than a general solver.
The notion of better performance includes run time as well as other resources such as memory. It is furthermore possible to optimise with respect to a particular resource; for example, a solver which uses less memory at the expense of run time can be specified for embedded systems with little memory.
The solver specification may include a representation of the original problem model such that a specialised solver which encodes the problem can be produced; the generated solver then requires no input when run, or only values for the parameters of a problem class. The analyser may furthermore modify the original model in a limited way, for example by splitting variables which were defined as one type into several new types. It does not, however, optimise the model as, for example, Tailor [8] does.
The analyser may read a partial solver specification along with the model of
the problem to be analysed to still allow fine-tuning by human experts while not
requiring it. This also allows for running the analyser incrementally, refining the
solver specification based on analysis and decisions made in earlier steps.
The analyser creates a constraint optimisation model of the problem of specialising a constraint solver. The decision variables are the design decisions to
be made and the values in their domains are the options which are available
for their implementation. The constraints encode which parts are required to
solve the problem and how they interact. For example, the constraints could
require the presence of an integer variable type and an equals constraint which
is able to handle integer variables. A solution to this constraint problem is a
solver specification that describes a solver which is able to solve the problem
described in the original model. The weight attached to each solution describes
the performance of the specialised solver and could be based on static measures
of performance as well as dynamic ones; e.g. predefined numbers describing the
performance of a specific algorithm and experimental results from probing a
specific implementation.
This metamodel enables the use of constraint programming techniques for
generating the specialised solver and ensures that a solver specification can be
created efficiently even for large metamodels.
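As a toy illustration of this metamodel idea, the following Python sketch (ours; all names, options and weights are hypothetical, not Dominion's actual model) treats design decisions as variables, implementation options as domain values, and compatibility rules as constraints, then picks the cheapest consistent specification:

from itertools import product

# Hypothetical design decisions and their implementation options.
domains = {
    "backtracking":  ["trailing", "copying"],
    "var_type":      ["bool", "int", "bounds_int"],
    "eq_propagator": ["ac_eq", "bounds_eq"],
}

# Hypothetical compatibility constraint: the (assumed) AC equality
# propagator cannot handle bounds-only integer variables.
def consistent(spec):
    return not (spec["eq_propagator"] == "ac_eq"
                and spec["var_type"] == "bounds_int")

# Static weights standing in for measured or predicted performance.
weight = {"trailing": 1.0, "copying": 1.2, "bool": 0.5, "int": 1.0,
          "bounds_int": 0.8, "ac_eq": 1.0, "bounds_eq": 0.7}

keys = list(domains)
candidates = (dict(zip(keys, vals)) for vals in product(*domains.values()))
best = min((s for s in candidates if consistent(s)),
           key=lambda s: sum(weight[v] for v in s.values()))
print(best)  # the chosen solver specification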
The result of running the analyser phase of the system is a solver specification
which specifies a solver tailored to the analysed problem model.
2.2 The generator
The generator reads the solver specification produced by the analyser and constructs a specialised constraint solver accordingly. It may modify an existing
solver, or synthesise one from scratch. The generated solver has to conform to
the solver specification, but beyond that, no restrictions are imposed. In particular, the generator does not guarantee that the generated specialised solver
will have better performance than a general solver, or indeed be able to solve
constraint problems at all – this is encoded in the solver specification.
In addition to the high-level design decisions fixed in the solver specification,
the generator can perform low-level optimisations which are specific to the implementation of the specialised solver. It could for example decide to represent
domains with a data type of smaller range than the default one to save space.
The scope of the generator is not limited to generating the source code which
implements the specialised solver, but also includes the system to build it.
The result of running the generator phase of the system is a specialised solver
which conforms to the solver specification.
3 Preliminary implementation and experimental results
We have started implementing the design proposed above in a system which
operates on top of Minion [9]. The analyser reads Minion input files and writes a
solver specification which describes the constraints and the variable types which
are required to solve the problem. It does not currently create a metamodel of
the problem. The generator modifies Minion to support only those constraints
and variable types. It furthermore does some additional low-level optimisations
by removing infrastructure code which is not required for the specialised solver.
The current implementation of Dominion sits between the existing Tailor and
Minion projects – it takes Minion problem files, which may have been generated
by Tailor, as input, and generates a specialised Minion solver.
The generated solver is specialised for models of problem instances from the
problem class the analysed instance belongs to. The models have to be the same
with respect to the constraints and variable types used.
Experimental results for models from four different problem classes are shown
in Figure 2. The graph only compares the CPU time Minion and the specialised
solver took to solve the problem; it does not take into account the overhead of
running Dominion – analysing the problem model, generating the solver, and
compiling it, which was in the order of a few minutes for all of the benchmarks.
The problem classes Balanced Incomplete Block Design, Golomb Ruler, n-Queens, and Social Golfers were chosen because they use a range of different
constraints and variable types. Hence the optimisations Dominion can perform
are different for each of these problem classes. This is reflected in the experimental results by different performance improvements for different classes.
Figure 2 illustrates two key points. The first point is that even a quite basic
implementation of Dominion which does only a few optimisations can yield significant performance improvements over standard Minion. The second point is
that the performance improvement does not only depend on the problem class,
but also on the instance, even if no additional optimisations beyond the class
level were performed. For both the Balanced Incomplete Block Design and the
Social Golfers problem classes the largest instances yield significantly higher
improvements than smaller ones.
At this stage of the implementation, our aim is to show that a specialised
solver can perform better than a general one. We believe that Figure 2 conclusively shows that. As the problem models become larger and take longer to solve,
the improvement in terms of absolute run time difference becomes larger as well.
Hence the more or less constant overhead of running Dominion is amortised for
larger and more difficult problem models, which are our main focus. Generating
[Figure 2 appears here: a scatter plot of the CPU time of the specialised solver relative to Minion (y axis, approximately 0.7 to 1.0) against the CPU time of standard Minion in seconds (x axis, logarithmic scale from 1 to 10000), with one point series per problem class: BIBD (v,k,lambda), Golomb Ruler (ticks), n-Queens (queens), and Social Golfers (weeks,groups,players).]
Figure 2. Preliminary experimental results for models of instances of four problem classes. The x axis shows the time standard Minion took to solve the respective instance. The labels of the data points show the parameters of the problem instance, which are given in parentheses in the legend. The times were obtained using a development version of Minion corresponding to release 0.8.1 and Dominion-generated specialised solvers based on the same version of Minion. Symbols below the solid line designate problem instances where the Dominion-generated solver was faster than Minion; points above the line designate instances where it was slower. The dashed line designates the median over all problem instances.
a specialised solver for problem classes and instances is always going to entail
a certain overhead, making the approach infeasible for small and quick-to-solve
problems.
4
Conclusion and future work
We have described the design of Dominion, a solver generator, and demonstrated
its feasibility by providing a preliminary implementation. We have furthermore
demonstrated the feasibility and effectiveness of the general approach of generating specialised constraint solvers for problem models by running experiments
with Minion and Dominion-generated solvers and obtaining results which show
significant performance improvements. These results do not take the overhead
of running Dominion into account, but we are confident that for large problem
models there will be an overall performance improvement despite the overhead.
Based on our experiences with Dominion, we propose that the next step
should be the generation of specialised variable types for the model of a problem
instance. Dominion will extend Minion and create variable types of the sort
“Integer domain ranging from 10 to 22”. This not only allows us to choose
different representations for variables based on the domain, but also to simplify
and speed up services provided by the variable, such as checking the bounds of
the domain or checking whether a particular value is in the domain.
The implementation of specialised variable types requires generating solvers
for models of problem instances because the analysed problem model is essentially rewritten. The instance the solver was specialised for will be encoded in it
and no further input will be required to solve the problem. We expect this optimisation to provide an additional improvement in performance which is more
consistent across different problem classes, i.e. we expect significant improvements for all problem models and not just some.
We are also planning on continuing to specify the details of Dominion and
implementing it.
5
Acknowledgements
The authors thank Chris Jefferson for extensive help with the internals of Minion
and the anonymous reviewers for their feedback. Lars Kotthoff is supported by
a SICSA studentship.
References
1. Stuckey, P.J., de la Banda, M.J.G., Maher, M.J., Marriott, K., Slaney, J.K., Somogyi, Z., Wallace, M., Walsh, T.: The G12 project: Mapping solver independent
models to efficient solutions. In: ICLP 2005. 9–13
2. Minton, S.: Automatically configuring constraint satisfaction programs: A case
study. Constraints 1 (1996) 7–43
3. Katsirelos, G., Bacchus, F.: Generalized nogoods in CSPs. In: AAAI 2005. 390–396
4. Prosser, P.: Hybrid algorithms for the constraint satisfaction problem. Computational Intelligence 9(3) (1993) 268–299
5. Bessière, C., Debruyne, R.: Theoretical analysis of singleton arc consistency and its
extensions. Artificial Intelligence 172(1) (2008) 29–41
6. Freuder, E.C., Elfe, C.D.: Neighborhood inverse consistency preprocessing. In:
AAAI 1996. 202–208
7. Kotthoff, L.: Constraint solvers: An empirical evaluation of design decisions. CIRCA
preprint (2009) http://www-circa.mcs.st-and.ac.uk/Preprints/solver-design.pdf.
8. Rendl, A., Gent, I.P., Miguel, I.: Tailoring solver-independent constraint models: A
case study with Essence’ and Minion. In: SARA 2007. 184–199
9. Gent, I.P., Jefferson, C., Miguel, I.: MINION: A fast scalable constraint solver. In:
ECAI 2006. 98–102
On learning CSP specifications
Matthieu Lopez⋆1 and Arnaud Lallouet⋆⋆2
1
2
1
Université d’Orléans — LIFO
BP6759, F-45067 Orléans
Matthieu.Lopez@univ-orleans.fr
Université de Caen-Basse Normandie — GREYC
BP 5186 - 14032 Caen
arnaud.lallouet@info.unicaen.fr
Introduction
The Constraint Satisfaction Problem (CSP) makes it possible to model a wide range of decision problems, from arithmetic puzzles to scheduling problems. Even though the goal of constraint programming (CP) is to provide a simple way to formulate problems, experience shows that fair expertise is required to perform the modeling task [Fre97]. For this reason, machine learning of CSP models has been studied extensively in the literature. In this paper, we consider two objects: a model, which corresponds to a CSP, and a specification. Our objective is to propose a way to specify a problem automatically, by machine learning from instances of different sizes.
Model learning, also called constraint network learning, involves obtaining a set of constraints from a set of examples of the model (solutions and no-solutions of the model). A model is a tuple (V, D, C) where V is a set of variables, D a set of domains for the variables and C a set of constraints. The pair (V, D) defines the viewpoint of the model. In particular, the viewpoint (at least the variables) is the same for all examples, and they are all of the same size. CONACQ [BCKO06], based on version spaces, is an example of an algorithm that learns a CSP model. Given solutions and no-solutions of a problem for which the variables of the CSP and a set of candidate constraints are known, it tests, for each constraint, whether the constraint is necessary to the model.
A specification of a problem is a formalization more abstract than a model. Consider for example the n-queens problem, which requires putting n queens on an n×n chessboard such that no queen attacks another. There exist models for 8-queens, 12-queens, etc., whereas a specification aims to formalize the n-queens problem directly. We say that a model represents an instance of a problem. To obtain a model from a specification, additional data are needed, such as the number of variables and their domains. This paper deals with specification learning rather than model learning. In our view, a specification is more natural for users who do not know CP, because providing examples of instances already solved is an easier task than providing examples for the current instance to be solved. Imagine that we want to obtain a school timetable for this year. In past years we have proceeded manually, so we already have a set of solutions for past instances of this problem, but they may be of different sizes because of the number of groups, teachers or rooms available.
In this paper, we propose a language, a subset of first-order logic, for describing a subset of the problems that can be handled in CP. Let us take the graph coloring problem as an example. Given a set S of vertices of a graph G and a set C of colors, the problem is to find a coloring of the vertices such that when two vertices are adjacent they do not have the same color.
(⋆: the student, ⋆⋆: the supervisor)
We can specify the problem in the following way, where adj describes the adjacency relation and color gives the color of a vertex:

∀X, Y ∈ S, ∀A, B ∈ C : adj(X, Y) ∧ color(X, A) ∧ color(Y, B) → A ≠ B
∧
∀X ∈ S, ∀A, B ∈ C : color(X, A) ∧ color(X, B) → A = B
We first detail this language. Next, we present how to obtain a naive model from a specification written in our language. After a formalization of inductive logic programming (ILP), we show how our learning task can be expressed as a standard ILP problem. Finally, we present the limits of state-of-the-art learning algorithms.
2
Specification language
In recent years, the number of high-level specification languages has increased considerably ([FHJ+08, MNR+08, NSB+07a, AA06, PF03, Hni03]). Observing these languages, we notice that their principal interest is to work with collections of variables whose size is unknown and whose domains are not specified. This abstraction is then resolved with a data file that provides the missing information needed to create an instance of the problem (e.g. the translation from n-queens to 8-queens). However, these languages were designed to facilitate modeling for human users, and their complexity is such that they cannot be used as the target of a learning algorithm.
In our language, a specification is a conjunction of rules. These rules describe the way a constraint must be posted in an instance of the problem. The vocabulary of our language is composed of a finite set of predicate symbols with fixed arities, a set of domains composed of constants, and a set of variables (note that these variables and domains are not those of the CSP). Predicates are split into two categories. Some are called description predicates and are used to describe observations in our problem (e.g. in the n-queens problem, the positions of the queens). The others are called constraint predicates and represent a constraint type (e.g. =, ≠, before, ...). An atom is an expression of the form P(t1, ..., tk), where P is a k-ary predicate and t1, ..., tk are terms, i.e. variables or constants. We call an atom whose predicate is a description predicate a description atom and, in a similar way, an atom whose predicate is a constraint predicate a constraint atom. The syntax of a rule is:
rule ::= ∀ variables : body → head
variables ::= vs ∈ DOMAIN | variables, variables
vs ::= VARIABLE | vs, vs
body ::= DESC ATOM | CONSTRAINT ATOM | body ∧ body
head ::= CONSTRAINT ATOM | head
2.1
From specification to model
In order to translate a specification into a constraint network, we need additional information that is not present in the specification. In the graph coloring example, the specification describes a representation of any well-colored graph. To color a graph, we need the set of vertices and the set of colors; we also need to know the neighborhood of the vertices. Thus, to translate a specification, we must provide further information such as the sets corresponding to the domains of the variables in the specification and the “extensions” of certain predicates. The problem is then to find the “extension” of the other predicates.
In the following, we denote by E the set of predicates whose extension is given by the user and by V the set of predicates whose extension must be found. Rather than translating the specification into a low-level language, we prefer, for the sake of clarity, to translate it to a model in MiniZinc [NSB+07b], a language which can be further translated into a low-level language.
Let subst_{L=[X|v,...]}(A) be the operator which substitutes, in an atom A, all occurrences of the variables X in L with their associated values v. For each predicate P : K1 × ... × Kk in V, we must create the following CSP variables: array[K1, ..., Kk] of boolean : P;
Next, each rule R : A1 ∧ ... ∧ Al → H in our specification can be translated as:

constraint
  forall(v1 in K1)( ... forall(vi in Kk)(
    subst_{[X1|v1,...,Xi|vi]}(A1) /\ ... /\ subst_{[X1|v1,...,Xi|vi]}(Al)
      -> subst_{[X1|v1,...,Xi|vi]}(H)
  )...)

Example. Let us take the graph coloring example. The generated variables would be: array[S, C] of boolean : Color; and an example of a generated constraint is:

constraint forall(v1 in S)(forall(v2 in S)(
  forall(v3 in C)(forall(v4 in C)(
    adj(v1, v2) /\ Color(v1, v3) /\ Color(v2, v4) -> v3 != v4
  ))
));
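The translation scheme above is mechanical enough to script. Below is an illustrative Python sketch (not part of the system described in this paper) that emits the MiniZinc text of one rule; it assumes the predicates in V and E are represented as MiniZinc arrays, so atoms are written with index syntax (adj[v1, v2]) rather than the functional notation used above.

def rule_to_minizinc(variables, body_atoms, head_atom):
    """variables: list of (logic variable, domain name), e.g. [("v1", "S"), ("v3", "C")].
    Atoms are (predicate, argument tuple); constraint predicates are emitted infix,
    description and searched predicates as array accesses."""
    def atom(pred, args):
        if pred in {"=", "!=", "<", "<=", ">", ">="}:
            return f"{args[0]} {pred} {args[1]}"
        return f"{pred}[{', '.join(args)}]"

    opening = "".join(f"forall({v} in {dom})(" for v, dom in variables)
    closing = ")" * len(variables)
    body = " /\\ ".join(atom(p, a) for p, a in body_atoms)
    return f"constraint {opening}{body} -> {atom(*head_atom)}{closing};"

# The first graph-colouring rule from above:
print(rule_to_minizinc(
    [("v1", "S"), ("v2", "S"), ("v3", "C"), ("v4", "C")],
    [("adj", ("v1", "v2")), ("Color", ("v1", "v3")), ("Color", ("v2", "v4"))],
    ("!=", ("v3", "v4"))))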
Remark. If we have more information about the predicates, for example that color is a function, we can easily obtain a better model with another choice of viewpoint (creating one variable per vertex, with the colors as its domain).
3
Inductive logic programming
Inductive logic programming (ILP) [Mug95] is a field at the crossroads of logic programming and machine learning. Given sets of positive examples E+ and negative examples E− of a concept C, background knowledge B and a description language L, the goal is to find a definition H of the concept C, described with respect to L in the form of a logic program. H must cover all examples from E+ and must reject the examples from E−.
We now define the form of H and B, then the coverage notion and the nature of the concept C, in a basic kind of ILP without negation and recursion. A clause (resp. query) is a disjunction (resp. conjunction) of literals, i.e. atoms or negations of atoms. A Horn clause is a clause containing only one positive literal. Its form is ∃X1, X2, ... : H ← B1 ∧ B2 ∧ ... ∧ Bn, where each Xi is a variable, the literal H is called the head, and the conjunction of literals B1 ∧ B2 ∧ ... ∧ Bn is called the body. The hypothesis H is a disjunction of Horn clauses. In a similar way, B is a set of Horn clauses describing observations about the examples and new predicates defined intensionally. We say that H covers an example e if H ∪ B |= e. The concept C corresponds to an undefined predicate. Thus the problem consists in learning a definition H for C such that the examples are correctly covered.
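As a concrete illustration of the coverage test in this restricted setting, the following small Python sketch checks whether a conjunctive clause body matches the ground facts describing an example, by brute-force substitution; it is illustrative only and ignores intensionally defined background predicates.

from itertools import product

def covers(clause_body, facts, constants):
    """clause_body: list of (predicate, variable tuple); facts: set of ground atoms."""
    variables = sorted({v for _, args in clause_body for v in args})
    for values in product(constants, repeat=len(variables)):
        theta = dict(zip(variables, values))          # candidate substitution
        if all((p, tuple(theta[v] for v in args)) in facts for p, args in clause_body):
            return True
    return False

facts = {("adj", ("a", "b")), ("color", ("a", "red")), ("color", ("b", "red"))}
body = [("adj", ("X", "Y")), ("color", ("X", "A")), ("color", ("Y", "A"))]
print(covers(body, facts, ["a", "b", "red"]))   # True: two adjacent vertices share a colour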
4
Specification rule learning
In this section, we will present our learning problem as an ILP problem like previously defined. We recall a specification in our language is a conjuction of rules such
that:
73
(∀X11 , . . . : A11 ∧ . . . ∧ A1k → A1k+1 ) ∧ . . . ∧ (∀X1n , . . . : An1 ∧ . . . ∧ Anl → Anl+1 )
The rules can be written as disjunctions:
(∀X11 , . . . : ¬A11 ∨ . . . ∨ ¬A1k ∨ A1k+1 ) ∧ . . . ∧ (∀X1n , . . . : ¬An1 ∨ . . . ∨ ¬Anl ∨ Anl+1 )
The first step consists in choosing the concept we aim to learn. Let spec(pb) be the target concept, where pb is a key to an example.
As seen above, the definition of a concept is a set of clauses, whereas our specification is a conjunction of queries. However, if we search for the inverse concept [Lae02], we can easily get back to the framework previously described. Indeed, by inverting the connectives ∨/∧ and the quantifiers ∀/∃ in a definition of a concept, we obtain a definition of the inverse concept. Let negSpec(pb) be the negated concept. Its definition would be:

∃X_1^1, ... : negSpec(pb) ← A_1^1 ∧ ... ∧ A_k^1 ∧ A_{k+1}^1
...
∃X_1^n, ... : negSpec(pb) ← A_1^n ∧ ... ∧ A_l^n ∧ A_{l+1}^n
Passing to the negated concept, and thus searching for hypotheses describing the inverted problem, our specification learning problem can be defined as follows: given a set of solutions G of our problem and a set of no-solutions NG, and background knowledge B = O ∪ I, where O is a set of observations about the examples (for example the colors of the vertices) and I is a set of constraint predicates given intensionally, the problem is to find a definition H for negSpec such that H covers all examples of NG and rejects all examples of G with respect to B.
5
Difficulties to learn and perspectives
ILP problems can be viewed as search problems. A language L sets the possible
hypotheses which can be built, for our problem a conjunction of observation and
constraint atoms. Thus, states of our search space are possible hypotheses with
respect to L. Defined in this way, our search space is clearly infinite. To limit
the search space, we can set, for example, the number of atoms or variables in
hypotheses. Furthermore the search space can be structured with a partial order.
Because of lack of space, we do not detail this part, but the lector can read [LD93].
The idea is to structure the space in a lattice where hypotheses are ordered by a
relation of generality. A majority of state-of-the-art algorithms are based on this
lattice. To move in this lattice, we need two operators, one to generalize a hypothesis
and another to specialize a hypothesis. These operators enable to travel in the
search space, either in a complete way, enumerating all possible hypotheses, or with
incomplete methods using search strategies like hill-climbing or beam search.
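Schematically, the incomplete general-to-specific search used by many rule learners looks as follows (a hedged Python sketch with a placeholder scoring function; as discussed next, defining such a score is precisely what is problematic for specification rules):

def hill_climb(candidate_atoms, score, max_len=5):
    """Greedy top-down refinement: start from the empty body (always true) and
    repeatedly add the atom that best improves a coverage-based score."""
    hypothesis = []                       # the rule body, initially "true"
    best = score(hypothesis)
    while len(hypothesis) < max_len:
        refinements = [hypothesis + [a] for a in candidate_atoms if a not in hypothesis]
        scored = [(score(h), h) for h in refinements]
        top, h = max(scored, key=lambda t: t[0]) if scored else (best, hypothesis)
        if top <= best:                   # no refinement improves the score: stop
            return hypothesis
        best, hypothesis = top, h
    return hypothesis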
For our problem, the complexity of the coverage test makes complete methods prohibitive, so we have turned to incomplete methods. Two principal approaches have been explored in rule learning. The first consists in starting with the rule that is always true (⊤) and specializing it step by step, i.e. adding new atoms. A specialization operator is therefore required to choose the best atom to add to the hypothesis. To define it, we need a criterion, generally based on the coverage of examples by the new hypothesis. For our rules, however, we did not manage to find such a criterion. Indeed, if we look at our rules, for example in the graph coloring problem, adding color(X, A) or any other atom to the rule body adj(X, Y) does not change the coverage of the rule, which still covers all positive and negative examples. We observe that, for all the specifications we studied, removing one literal from a rule of the specification makes the coverage criterion unusable. The second method consists in starting from the other extremity of the lattice, i.e. the most specific hypothesis. [Mug95] proposes to build the most specific clause from an example extended with the background knowledge, where all constants are replaced by variables. This extension is controlled by a depth level, limiting the creation of new variables. If this clause is correctly extended, it rejects all negative examples and covers at least the positive example that was used to build it. With the generalization operator, we can then search for a hypothesis covering more positive examples. The principal flaw of this method is that it applies the coverage test to very large hypotheses, which makes it unusable in practice: in certain problems we obtain clauses of more than 4000 atoms.
We are now building an alternative method that aims to avoid the unusable extremities of the lattice. The idea is to find, at low cost, a hypothesis located in an area of the lattice where we can move in a relevant way, i.e. travel across the different states of our search space, in order to launch a local search from there.
Bibliography
[AA06] H. Ahriz and I. Arana. Specifying constraint problems with z. Technical
report, The Robert Gordon University, 2006.
[BCKO06] Christian Bessière, Remi Coletta, Frédéric Koriche, and Barry
O’Sullivan. Acquiring constraint networks using a sat-based version
space algorithm. In AAAI, 2006.
[FHJ+08] Alan Frisch, Warwick Harvey, Chris Jefferson, Bernadette Martínez-Hernández, and Ian Miguel. Essence: A constraint language for specifying combinatorial problems. Constraints, 13(3):268–306, 2008.
[Fre97] Eugene C. Freuder. In pursuit of the holy grail. Constraints, 2(1):57–61,
1997.
[Hni03] B. Hnich. Thesis: Function variables for constraint programming. AI
Commun., 16(2):131–132, 2003.
[Lae02] Wim Van Laer. From Propositional to First Order Logic in Machine
Learning and Data Mining. PhD thesis, Katholieke Universiteit Leuven,
June 2002.
[LD93] Nada Lavrac and Saso Dzeroski. Inductive Logic Programming: Techniques and Applications. Routledge, New York, NY, 10001, 1993.
[MNR+ 08] Kim Marriott, Nicholas Nethercote, Reza Rafeh, Peter J. Stuckey,
Maria Garcia de la Banda, and Mark Wallace. The design of the zinc
modelling language. Constraints, 13(3):229–267, 2008.
[Mug95] S. Muggleton. Inverse entailment and progol. New Generation Computing, Special issue on Inductive Logic Programming, 13(3-4):245–286,
1995.
[NSB+ 07a] N. Nethercote, P. J. Stuckey, R. Becket, S. Brand, G. J. Duck, and
G. Tack. Minizinc: Towards a standard cp modelling language. In Christian Bessière, editor, Thirteenth International Conference on Principles
and Practice of Constraint Programming, Lecture Notes in Computer
Science, Providence, RI, USA, sep 2007. Springer-Verlag.
[NSB+ 07b] Nicholas Nethercote, Peter J. Stuckey, Ralph Becket, Sebastian Brand,
Gregory J. Duck, and Guido Tack. Minizinc: Towards a standard cp
modelling language. In CP, pages 529–543, 2007.
[PF03] P. Flener, J. Pearson, and M. Ågren. The syntax, semantics, and type system of ESRA. Research report, ASTRA, April 2003.
Propagating equalities and disequalities
Neil C.A. Moore, supervised by Ian Gent and Ian Miguel
School of Computer Science, University of St Andrews, Scotland
1
Introduction
This paper describes an idea to improve propagation in a constraint solver. The most common frameworks for propagation are based on AC5 [5], where propagators can accept bound changed, assignment and value removal events. The idea is to extend this to allow constraints to generate and process equality and disequality events as well, e.g., r → x = y can generate an equality event whenever r = true, meaning that from now on in any solution x = y. Every constraint in the minion solver [3] can produce such events and benefit in some circumstances from receiving them.
In this paper I will give an example where the idea works successfully, describe the additional propagation that various propagators in minion can achieve, and describe implementation issues.
2
Example
Consider the following CSP: variables x1, ..., xn, y1, ..., yn, each with domain {1, ..., n}, and constraints x1 = y1, xn ≠ yn and, for all i, xi = yi ⇔ xi+1 = yi+1. Chris Jefferson has proved [6] that with any static variable ordering it takes time exponential in n for backtracking search and propagation to prove that no solution exists. However, by propagating equalities and disequalities the following will occur:
1. x1 = y1 will generate the event x1 = y1;
2. x1 = y1 ⇔ x2 = y2 will receive the event and produce x2 = y2;
3. ...;
4. xn−1 = yn−1 ⇔ xn = yn will receive xn−1 = yn−1 and produce xn = yn;
5. the constraint xn ≠ yn will receive the event xn = yn and thus fail.
As I show in Section 4 this can be implemented in polynomial time, and hence a significant speedup is achievable on this type of instance. I have implemented this idea and achieved exponential speedups as expected. I hope to be able to present practical results on this and other more realistic instances during my doctoral programme talk.
3
The technique
I propose to allow propagators to generate and receive events of the form x = y and x ≠ y, where x and y are variables. Propagators are not able to detect such events just by inspecting the solver state, because they may be true without being entailed by the store. For example, in a CSP containing x = y with dom(x) = dom(y) = {1, 2, 3} the propagator can remove no values; x = y is true in any solution but not entailed by the constraint store. Conversely, the AC5-style events can always be detected by propagators inspecting the domain state.
As a first example, I exhibit a rule for the constraint abs (defined x = |y|):
Theorem 1. In constraint x = |y|, if x = y then we can infer that y ≥ 0.
Proof. y = x = |y| =⇒ y = |y| =⇒ y ≥ 0.
The standard propagator for x = |y| will prune so that x ≥ 0 anyway, and this fact combined with the event x = y itself subsumes y ≥ 0. However, I will shortly show that the contrapositive of the rule is useful. As it happens, the converse of the rule is also true:
Theorem 2. In constraint x = |y|, if x ≠ y then we can infer that y < 0.
Proof. |y| = x ≠ y =⇒ |y| ≠ y =⇒ y < 0.
These theorems show that abs can benefit from receiving (dis-)equality events, allowing it to prune all negative (positive) values from the domain of y. Can it also produce such events? I now exhibit a simple meta-theorem to show that any constraint that can use (dis-)equalities can also produce them for other constraints to use:
Theorem 3. In constraint C, if event allows us to infer condition, then ¬condition allows us to infer ¬event.
Proof. Omitted.
For example, a rule to produce events from Theorem 1 can be obtained:
Corollary 1. In constraint x = |y|, if y < 0 then we can infer that x ≠ y.
Proof. Immediate from Theorems 1 and 3.
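To make the rules concrete, here is a small illustrative Python sketch (not Minion's actual propagator interface) of how an abs propagator could react to and produce (dis-)equality events:

def abs_on_event(event, dom_y):
    """React to a (dis-)equality event between x and y in x = |y|."""
    if event == "x=y":        # Theorem 1: y >= 0
        return {v for v in dom_y if v >= 0}
    if event == "x!=y":       # Theorem 2: y < 0
        return {v for v in dom_y if v < 0}
    return dom_y

def abs_produce_event(dom_y):
    """Corollary 1: if y < 0 is already forced by the domain, emit x != y."""
    if dom_y and all(v < 0 for v in dom_y):
        return "x!=y"
    return None

print(abs_on_event("x=y", {-2, -1, 0, 3}))   # {0, 3}
print(abs_produce_event({-3, -1}))           # x!=y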
In outline, in my new propagation framework a constraint may choose to receive (dis-)equality events in addition to the normal set of pruning, assignment and bound events. It can also notify other constraints that it has inferred a (dis-)equality by generating an event. The solver will
– add the event to the store; other constraints can check for (dis-)equalities (analogous to variable checks like inDomain, getMin, etc.);
– pass on (dis-)equalities to constraints that have registered an interest (analogous to static and watched triggers on value removals); and
– propagate the (dis-)equality constraint itself (analogously to how producing the AC5 event x ↚ 1 removes 1 from x's domain).
I have analysed the whole set of constraints in minion, and found that every single one can benefit from (dis-)equality propagation. These have not been chosen specially. A selection are tabulated in Table 1 without proof. For example, Theorems 1 and 2 are reproduced as the first and second lines of the table. I am unable to give the complementary rules by Theorem 3 for space reasons. I attempt to only give rules that are not currently captured by the minion propagation algorithms.
[Table 1 appears here. Its column layout (Constraint / Definition / Condition / Event / Notes) was lost in extraction; the rows give (dis-)equality propagation rules for the constraints abs (Theorems 1 and 2), alldifferent, difference, diseq, div, element, eq, gcc (see Section 3.1), ineq, lexleq (see Section 3.2), lexless, max, modulo, pow, product and table (see Section 3.3). Rules for the and, or, reify and reifyimply metaconstraints and for min, watchvecneq, minuseq, occurrence, sumleq, sumgeq and weightedsum are omitted.]
Table 1. Propagation rules
3.1
Global cardinality constraint
The global cardinality constraint (GCC) [8], given a vector of variables v1, ..., vm, count variables c1, ..., cn and constants val1, ..., valn, ensures that there are ci occurrences of vali in v1, ..., vm.
Without being specific about the propagation algorithm, additional propagation could be achieved in the following case, for example:
1. Say that dom(v1), dom(v2) and dom(v3) were all {1, 2, 3}, and 1 doesn't appear in any other domains.
2. Furthermore, 1 has to be repeated twice in v1, ..., vm.
3. It is easy to tell that either v1 or v2 has to be 1, by the pigeonhole principle.
4. If we now know that v1 = v2 then we can tell straight away that v1 = v2 = 1 and v3 ≠ 1.
Dis-equalities can also be exploited:
1. Say that dom(v1) and dom(v2) are both {1, 2}, and 1 doesn't appear elsewhere.
2. Furthermore, 1 can be repeated either once or twice.
3. If we now know that v1 ≠ v2 then straight away we know that 1 can't be repeated twice.
It is comparatively simple to incorporate disequality information into the gcc algorithm, by using an alternate network flow design instead of the standard one described by Régin [8]. I haven't yet found an efficient design that incorporates equality information. I also haven't tried to make gcc produce (dis-)equalities, although it must be possible.
3.2
Lexicographical ordering constraint
The lexicographical ordering constraint ensures that x1 ... xr ≤lex y1 ... yr, e.g., 100 ≤lex 101 but 101 ≰lex 100. The standard literature algorithm obtaining GAC is given in [1]. It works by maintaining two indices α and β. α is the index such that all more significant indices are assigned to the same value. β is the most significant index such that the tails of the vectors from that position must violate the constraint. For example, the following table gives the current domains of the variables in a lexleq constraint.

i        | 0 | 1       | 2       | 3 | 4
dom(xi)  | 2 | 1, 3, 4 | 2, 3, 4 | 1 | 3, 4, 5
dom(yi)  | 2 | 1       | 1, 2, 3 | 0 | 0, 1, 2

In this case α = 1 because x0 and y0 are assigned the same; β = 3 because any assignment now has x3 > y3 and x4 > y4. The propagator maintains these values α and β as domains are narrowed. If α > β the constraint fails. If α + 1 = β the propagator enforces xα < yα. If α + 1 < β then the propagator enforces xα ≤ yα.
It is quite easy to see how to incorporate (dis-)equality information into the algorithm: if we find out that xα = yα then α can be incremented. Once α + 1 = β, produce the event xα ≠ yα.
3.3
Table constraint
Roughly speaking, the table constraint propagator in minion works by finding a valid supporting tuple for each variable/value pair. To take account of (dis-)equality information the criterion for “validity” is changed. Before, it was that each variable/value pair in the tuple is in its respective domain. Afterwards, tuples must also conform to any and all known (dis-)equalities. For example, the tuple (x = 1, y = 1, z = 2) is disallowed if we know x ≠ y and/or if we know y = z. Conversely, if all tuples whose components are in their respective domains are such that a particular pair of components are always equal or unequal, then the corresponding event can be generated.
4
Implementation
(Dis-)equalities can be handled in a similar way to other propagation events like v ← 1 or v ↚ 3. Note that there are a quadratic number of different (dis-)equality events possible, the same as for (dis-)assignment and bound changed events.
Constraints should have the following facilities available to them:
– Set up static triggers on chosen (dis-)equality events (if the (dis-)equality is already true, it should trigger immediately after being set up).
– Generate (dis-)equality events for chosen pairs of variables.
– Check if an equality is true, false or unknown.
(Dis-)equalities should be removed after backtrack. If x = y and x ≠ y are both generated the solver should fail and backtrack. If x = y and y = z are generated, the solver should ensure x = z is generated.
(Dis-)equalities need only be generated explicitly by propagators and when they are not obvious from the variable domains. For example, even if x = 1 and y = 1, the event x = y may not result. This property is not essential, and it would be worth trying to generate all events; the advantages of the restriction are as follows:
– Events that are obvious from the variable domains should be picked up in the normal course of constraint propagation. It is better for (dis-)equality propagation to be orthogonal to normal propagation so it may be easily switched off.
– It avoids introducing vast numbers of events for problems with small domains where equalities are likely, for example in a boolean problem. Also, no matter the domains, as more variables are assigned more spurious equalities and disequalities are produced.
The disadvantage is that this will lead to code duplication: a propagator might like to use a disequality as a trigger, so that it will only propagate when it receives such an event. However, using this framework it cannot assume that it will get the (dis-)equality trigger and hence it must place other triggers. It would be sensible at some point to evaluate whether generating all possible (dis-)equalities is advantageous.
Implied events. If events x = y and y = z are both produced, then x = z is implied. The solver should ensure that these implied events are created, because the variable y may not be known to a propagator with a trigger on x = z. Implied disequality events also exist. Standard algorithms using the union-find data structure can be exploited to solve these problems, see [7].
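As an illustration of the union-find bookkeeping mentioned above (a generic Python sketch, not Minion code), equalities merge equivalence classes, disequalities are recorded on class representatives, and contradictions cause failure; backtracking is not shown:

class EqualityStore:
    """Tracks asserted equalities/disequalities between variables (illustrative)."""
    def __init__(self):
        self.parent = {}
        self.diseq = set()

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def add_eq(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        self.parent[rx] = ry                # merging classes yields implied equalities
        canon = set()
        for a, b in self.diseq:             # re-canonicalise and check for failure
            ra, rb = self.find(a), self.find(b)
            if ra == rb:
                raise RuntimeError("fail: some x = y and x != y both hold")
            canon.add((min(ra, rb), max(ra, rb)))
        self.diseq = canon

    def add_diseq(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            raise RuntimeError("fail: some x = y and x != y both hold")
        self.diseq.add((min(rx, ry), max(rx, ry)))

s = EqualityStore()
s.add_eq("x", "y"); s.add_eq("y", "z")
print(s.find("x") == s.find("z"))           # True: x = z is implied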
When to enable. Potentially, enabling (dis-)equality propagation could be damaging to propagation speed, if the dynamic characteristics of the problem are not suitable. By using a similar technique to [9] the solver can detect cases when (dis-)equality propagation will be unsuccessful. Details are omitted due to space considerations.
5
Previous work
This idea is so simple that it is inevitable that similar work will exist. The aim of the work is to be simple and effective; hence I do not seek to compete with more general ideas. Furthermore, I have tailored the idea to work in a propagation solver, so it is not coincident with similar ideas in other types of automated search and reasoning. I would be interested to hear of any other related work.
Hägglund [4] has worked on allowing arbitrary constraints to be put in the store. This clearly generalises (dis-)equality propagation. My work is on a special case of this idea chosen to be efficient and more common in practice. Constraint handling rule (CHR) solvers [2] can easily incorporate (dis-)equalities into their rules. However, CHR rules are not suitable for replacing constraint propagators for reasons of efficiency. Some satisfiability modulo theories (SMT) solvers use the theory of equalities with uninterpreted functions [7]. Although (dis-)equality propagation applies to this theory and is exploited, the theory is limited in its expressivity compared to what's available in a CSP solver. Some CSP solvers are able to unify variables, meaning that once they are detected to be equal they become the same variable; Eclipse is an example of such a solver. This allows equality propagation to be implemented, but does not include disequalities. Theorem 3 shows that both are necessary to achieve full propagation.
References
1. Alan M. Frisch, Brahim Hnich, Zeynep Kiziltan, Ian Miguel, and Toby Walsh. Propagation algorithms for lexicographic ordering constraints. Artif. Intell., 170(10):803–834, 2006.
2. Thom Frühwirth. Theory and practice of Constraint Handling Rules. J. Logic Programming, Special Issue on Constraint Logic Programming, 37(1–3):95–138, 1998.
3. Ian P. Gent, Christopher Jefferson, and Ian Miguel. Minion: A fast scalable constraint solver. In ECAI, pages 98–102, 2006.
4. Björn Hägglund. A framework for designing constraint stores. Master's thesis, Linköping University, 2007.
5. Pascal Van Hentenryck, Yves Deville, and Choh-Man Teng. A generic arc-consistency algorithm and its specializations. Artificial Intelligence, 57(2–3):291–321, 1992.
6. Chris Jefferson, June 2009. Personal correspondence.
7. R. Nieuwenhuis and A. Oliveras. Decision procedures for SAT, SAT Modulo Theories and beyond. The BarcelogicTools. (Invited paper). In G. Sutcliffe and A. Voronkov, editors, 12th International Conference on Logic for Programming, Artificial Intelligence and Reasoning, LPAR'05, volume 3835 of Lecture Notes in Computer Science, pages 23–46. Springer, 2005.
8. Jean-Charles Régin. Generalized arc consistency for global cardinality constraint. In AAAI/IAAI, Vol. 1, pages 209–215, 1996.
9. Christian Schulte and Peter J. Stuckey. Dynamic analysis of bounds versus domain propagation. In Maria Garcia de la Banda and Enrico Pontelli, editors, Twenty Fourth International Conference on Logic Programming, volume 5366 of Lecture Notes in Computer Science, pages 332–346, Udine, Italy, December 2008. Springer-Verlag.
Tractable Benchmarks
Student: Justyna Petke, Supervisor: Peter Jeavons
Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford,
UK, e-mail: justyna.petke@comlab.ox.ac.uk
Abstract. The general constraint satisfaction problem for variables with
finite domains is known to be NP-complete, but many different conditions have been identified which are sufficient to ensure that classes of
instances satisfying those conditions are tractable, that is, solvable in
polynomial time. Results about tractability have generally been presented in theoretical terms, with little discussion of how these results
impact on practical constraint-solving techniques. In this paper we investigate the performance of several standard constraint solvers on benchmark instances that are designed to satisfy various different conditions
that ensure tractability. We show that in certain cases some existing
solvers are able to automatically take advantage of the problem features
which ensure tractability, and hence solve the corresponding instances
very efficiently. However, we also show that in many cases the existing
pre-processing techniques and solvers are unable to solve efficiently the
families of instances of tractable problems that we generate. We therefore
suggest that such families of instances may provide useful benchmarks
for improving pre-processing and solving techniques. They may also provide good candidates for global constraints that can take advantage of
efficient special-purpose algorithms.
1
Introduction
Software tools for solving finite-domain constraint problems are now freely
available from several groups around the world. Examples include the Gecode [2], G12 [1] and Minion [10] constraint solvers.
One way to drive performance improvements in constraint solvers is to develop challenging benchmark instances. This approach can also help to drive
improvements in the robustness and flexibility of constraint-solving software.
One obvious source of benchmark instances is from practical applications such
as scheduling and manufacturing process organisation. Another source is combinatorial problems such as puzzles and games. The G12 MiniZinc suite includes
several examples.
In this paper we suggest another important source of useful benchmarks
which has not yet been systematically explored: the theoretical study of constraint satisfaction. From the very beginning of the study of constraint programming there has been a strand of research which has focused on identifying
features of constraint problems which make them tractable to solve [6, 8] and
this research has gathered pace recently with the discovery of some deep connections between constraint problems and algebra [3, 4], logic [7], and graph and
hypergraph theory [5, 11].
This research has focused on two main ways in which imposing restrictions
on a constraint problem can ensure that it can be tractably solved. The first of
these is to restrict the forms of constraint which are allowed; these are sometimes
known as constraint language restrictions. The second standard approach to
identifying restrictions on constraint problems which ensure tractability has been
to consider restrictions on the way in which the constraints overlap; these are
sometimes referred to as structural restrictions.
In this paper we begin the process of translating from theoretical results in
the literature to concrete families of instances of constraint problems. We obtain
several families which are known to be efficiently solvable by simple algorithms,
but which cause great difficulties for some existing constraint solvers. We argue
that such families of instances provide a useful addition to benchmark suites
derived from other sources, and can provide a fruitful challenge to the developers
of general-purpose solvers.
2
Definitions
In the theoretical literature the (finite-domain) constraint satisfaction
problem (CSP) is typically defined as follows:
Definition 2.1. An instance of the constraint satisfaction problem is specified by
a triple (V, D, C), where
– V is a finite set of variables
– D is a finite set of values (this set is called the domain)
– C is a finite set of constraints. Each constraint in C is a pair (Ri , Si ) where
• Si is an ordered list of ki variables, called the constraint scope;
• Ri is a relation over D of arity ki , called the constraint relation.
Two proposed standard higher-level languages for specifying constraint problems in practice are Zinc [14] and Essence [9]. However, both of these languages
are considered too abstract and too general to be used directly as the input language for current constraint solvers, so they both have more restricted subsets
which are more suitable for solver input: MiniZinc and Essence′ . There exists a
software translation tool, called Tailor [15], which converts from Essence′ specifications to the input language for the Minion solver (or Gecode). Another software
translation tool distributed with the G12/MiniZinc software [1], converts from
MiniZinc to a more restricted language known as FlatZinc, that serves as the input language for the G12 solver; FlatZinc input is also accepted by Gecode. The
FznTini solver, developed by Huang, transforms a FlatZinc file into DIMACS
CNF format and then uses a Boolean Satisfiability problem (SAT) solver, called
TiniSAT, to solve the resulting SAT problem [12].
Definition 2.2. A class of CSP instances will be called tractable if there exists
an algorithm which finds a solution to all instances in that class, or reports that
there are no solutions, whose time complexity is polynomial in the size of the
instance specification.
3
Max-closed constraints
One of the first non-trivial classes of tractable constraint types described in
the literature is the class of max-closed constraints introduced in [13].
Definition 3.1 ([13]). A constraint (R, S) with relation R of arity r over an ordered domain D is said to be max-closed if for all tuples (d1, ..., dr), (d′1, ..., d′r) ∈ R we have (max(d1, d′1), ..., max(dr, d′r)) ∈ R.
In particular, one useful form of max-closed constraint is an inequality of the
form a1X1 + a2X2 + · · · + ar−1Xr−1 ≥ arXr + c, where the Xi are variables, c is a constant, and the ai are non-negative constants [13].
To generate solvable max-closed CSP instances, we selected a random assignment to all of the variables, and then generated random inequalities of the
form above, keeping only those that were satisfied by this fixed assignment. This
ensured that the system of inequalities had at least one solution. To generate
unsolvable max-closed CSP instances, we generated the inequalities without imposing this condition; if the resulting set was solvable, another set was generated.
As with all of the results presented in this paper, we solved the generated problem
instances using Gecode (version 1.3), G12 (version 0.8.1), FznTini, and Minion
(version 0.8RC1) and the times given are elapsed times on a Lenovo 3000 N200
laptop with an Intel Core 2 Duo processor running at 1.66GHz and 2GB of RAM.
These timings exclude the time required to translate the input from MiniZinc
to FlatZinc (for input to Gecode, G12 and FznTini) or from Essence′ to Minion
input format. (In the special case of FznTini, times include the additional time
required to translate from FlatZinc to DIMACS CNF format.) Average times
over three runs with different generated instances were taken, but the variability
was found in all cases to be quite small.
Predictably, FznTini performs very poorly on these inequalities, which it has
to translate into (large) sets of clauses. Standard CSP solvers should do well on
these instances, because the efficient algorithm for solving max-closed instances
is based on achieving arc-consistency, and all standard constraint solvers do this
by default. Our results confirm that the standard CSP solvers do indeed all
perform well on these instances, although the G12 solver was noticeably less
efficient than the other two.
4
0/1/all constraints
Our second example of a language-based restriction ensuring tractability involves the 0/1/all constraints introduced and shown to be tractable in [6].
Definition 4.1 ([6]). Let x1 and x2 be variables. Let A be a subset of possible
values for x1 and B be a subset of possible values for x2 .
– A complete binary constraint is a constraint R(x1, x2) of the form A × B.
– A permutation constraint is a constraint R(x1, x2) which is equal to {(v, π(v)) | v ∈ A} for some bijection π : A → B.
– A two-fan constraint is a constraint R(x1, x2) where there exists v ∈ A and w ∈ B with R(x1, x2) = ({v} × B) ∪ (A × {w}).
A 0/1/all constraint is either a complete constraint, a permutation constraint, or a two-fan constraint.
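For concreteness, the three constraint types of Definition 4.1 can be written out as explicit relations; the following short Python sketch does exactly that (the domains and the bijection are example values only).

def complete(A, B):
    return {(a, b) for a in A for b in B}

def permutation(A, pi):
    return {(a, pi[a]) for a in A}           # pi is a bijection from A onto B

def two_fan(A, B, v, w):
    assert v in A and w in B
    return {(v, b) for b in B} | {(a, w) for a in A}

A, B = {1, 2, 3}, {4, 5, 6}
print(permutation(A, {1: 5, 2: 4, 3: 6}))
print(two_fan(A, B, v=2, w=6))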
What is particularly interesting about this form of constraint, for our purposes,
is that the efficient algorithm for 0/1/all constraints is based on achieving pathconsistency [6], which is not implemented in standard constraint solvers.
We wrote a generator for satisfiable CSP instances with 0/1/all constraints of
various kinds on n variables. To ensure satisfiability we first generate a random
assignment and then add only those 0/1/all constraints that satisfy the initial
assignment. We generated instances for various choices of the parameter n and
domain size m, and solved these using Gecode, G12, FznTini, and Minion. All
the solvers performed very well, especially Gecode. We also generated unsatisfiable instances with 0/1/all constraints on just a small number of variables,
leaving all other variables unconstrained. FznTini's results on these unsatisfiable instances were inconclusive [12]. The G12 solver reported ‘no solutions’ in
0.2 seconds, but Minion and Gecode could not solve this problem within 15 min.
When the domain size was decreased to two, none of the solvers could verify that
this simple unsatisfiable instance had no solutions within 15 minutes. The problem seems to be the fixed default variable ordering. None of the standard solvers
focus the search on the few variables that are restricted; having no constraint
between two variables is treated in the same way as having a complete constraint. Once the unsatisfiable instances were embedded in satisfiable instances,
the performance of all the solvers was as good as before.
These results suggest that standard CSP solvers can handle random collections of 0/1/all constraints very effectively, even without specialised algorithms.
However, they appear to be poor at focusing search on more highly constrained
regions, which is thought to be one of the strengths of the current generation of
SAT-solvers. This suggests an obvious target for improvement in adapting the
variable ordering to the specific features of the input instance.
5
Bounded-width structures
For our final example, we consider classes of CSP instances which are tractable
because of the way that the constraint scopes are chosen. In other words, we consider structural restrictions.
For any CSP instance, the scopes of all the constraints can be viewed as
the hyperedges of an associated hypergraph whose vertices are the variables.
This hypergraph is called the structure of the CSP instance. One very simple
condition which is sufficient to ensure tractability is to require the structure to
have a tree decomposition [11], with some fixed bound on the maximum number
of vertices in any node of the tree. Such structures are said to have bounded
width. However, the efficient algorithm for CSP instances with bounded width
structures is based on choosing an appropriate variable ordering, and imposing
a level of consistency proportional to the width [7]. None of the standard CSP
solvers incorporate such algorithms, so it is not at all evident whether they can
solve bounded width instances efficiently.
To investigate this question we wrote a generator for a family of specific
CSP instances with a very simple bounded-width structure. The instances we
generate are specified by two parameters, w and m. They have (mw + 1) ∗ w
variables arranged in groups of size w, each with domain {0, ..., m}. We impose
a constraint of arity 2w on each pair of successive groups, requiring that the
sum of the values assigned to the first of these two groups should be larger than
the sum of the values assigned to the second. This ensures that a solution exists
and satisfies the following conditions: the difference between the sum of values
assigned to each successive group is 1, and the sum of the values assigned to the
last group is zero.
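A sketch of the instance family just described, built as plain data (emitting actual MiniZinc or Minion input is omitted); the naming of variables is an illustrative choice.

def bounded_width_instance(w, m):
    """(m*w + 1) groups of w variables, each with domain {0,...,m}, and a
    constraint between each pair of successive groups requiring the first
    group's sum to exceed the second group's sum."""
    n_groups = m * w + 1
    groups = [[f"x_{g}_{i}" for i in range(w)] for g in range(n_groups)]
    domain = list(range(m + 1))
    constraints = [("sum_greater", groups[g], groups[g + 1]) for g in range(n_groups - 1)]
    return groups, domain, constraints

groups, dom, cons = bounded_width_instance(w=2, m=3)
print(len(groups), "groups,", len(cons), "constraints")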
It turns out that the runtimes of Gecode and Minion grow rapidly with
problem size. The runtimes for the G12 solver do not increase so fast for these
specific instances, but if we reverse the inequalities, then they do increase in
the same way, although in this case Gecode and Minion perform much better.
Somewhat surprisingly, FznTini seems to be able to solve all of these instances
fairly efficiently, even though they contain arithmetic inequalities which have to
be translated into fairly large sets of clauses.
It is concluded that an important opportunity to improve the performance of
CSP solvers would be in finding an efficient way of taking advantage of instance
structure by adapting the variable ordering or other aspects of the search process
to the particular instance. Moreover, as the ordering can be set in the input
file, the question arises as to whether those adjustments could be automatically
identified by the translators as part of the pre-processing.
6
Conclusions
We believe that the results presented in this paper have established that the
various ideas about different forms of tractable CSP instances presented in the
theoretical literature can provide a fruitful source of inspiration for the design of
challenging benchmarks. The initial applications of these ideas, presented in the
previous sections, have already identified significant differences between solvers
in their ability to exploit salient features of the problem instances they are given.
There are a number of technical difficulties to overcome in developing useful
benchmark instances. First of all, unlike SAT solvers, there is no standard input
language for CSP solvers. The translation from MiniZinc to FlatZinc, or from
Essence′ to Minion, can sometimes obscure the nature of an essentially simple
problem, and hence badly affect the efficiency of the solution. On the other
hand, the cost of standardisation is a loss of expressive power. We have seen in
Section 3 that translating simple forms of constraints such as linear inequalities
into CNF may be very inefficient, and may lose the important features of the
constraints which guarantee tractability. We suggest that a better awareness of
the factors of a problem specification that can ensure tractability could lead to
better translation software, which ensures that such features are preserved.
Even when they have been successfully captured in an appropriate specification language, and input to a constraint solver, it can be the case that theoretically tractable instances may still be solved very inefficiently in practice. We
have seen in Sections 4, and especially in Section 5, that when the tractability is
due to a property that requires a higher level of consistency than arc-consistency
to exploit, instances may be very challenging for standard solvers. Finding effective automatic ways to improve the variable orderings and value orderings used
by a solver according to specific relevant features of the input instance seems a
promising first step which has not been sufficiently pursued.
Summing up, in order to improve the performance of constraint solvers we
need effective benchmarks which can explore that performance over a range of
different problem types with different characteristics. One way to systematically
develop such benchmarks is to use the insights from the theoretical study of
constraint satisfaction. Benchmarks derived in this way can be simple enough
to analyse in detail, and yet challenging enough to reveal specific weaknesses in
solver techniques. This paper has begun to explore the potential of this approach.
References
1. G12/MiniZinc constraint solver. Software available at http://www.g12.cs.mu.oz.au/minizinc/download.html.
2. Gecode constraint solver. Software available at http://www.gecode.org/.
3. A. Bulatov, A. Krokhin, and P. Jeavons. Classifying the complexity of constraints
using finite algebras. SIAM Journal on Computing, 34(3):720–742, 2005.
4. A. Bulatov and M. Valeriote. Recent results on the algebraic approach to the CSP.
In Complexity of Constraints, volume 5250 of Lecture Notes in Computer Science,
pages 68–92. Springer, 2008.
5. D. Cohen, P. Jeavons, and M. Gyssens. A unified theory of structural tractability
for constraint satisfaction problems. Journal of Computer and System Sciences,
74:721–743, 2007.
6. M.C. Cooper, D.A. Cohen, and P.G. Jeavons. Characterising tractable constraints.
Artificial Intelligence, 65:347–361, 1994.
7. V. Dalmau, Ph. Kolaitis, and M. Vardi. Constraint satisfaction, bounded treewidth,
and finite-variable logics. In Proceedings of CP’02, volume 2470 of Lecture Notes
in Computer Science, pages 310–326. Springer-Verlag, 2002.
8. E.C. Freuder. A sufficient condition for backtrack-bounded search. Journal of the
ACM, 32:755–761, 1985.
9. A. Frisch, W. Harvey, C. Jefferson, B. Martínez-Hernández, and I. Miguel. The
essence of ESSENCE: A constraint language for specifying combinatorial problems.
In Proceedings of IJCAI’05, pages 73–88, 2005.
10. I. Gent, C. Jefferson, and I. Miguel. Minion: A fast scalable constraint solver. In
Proceeedings ECAI 2006, pages 98–102. IOS Press, 2006. Software available at
http://minion.sourceforge.net/.
11. M. Grohe. The structure of tractable constraint satisfaction problems. In MFCS
2006, Lecture Notes in Computer Science, 4162, pages 58–72. Springer, 2006.
12. J. Huang. Universal Booleanization of constraint models. In Proceedings of CP’08,
volume 5202 of Lecture Notes in Computer Science, pages 144–158. Springer, 2008.
13. P.G. Jeavons and M.C. Cooper. Tractable constraints on ordered domains. Artificial Intelligence, 79(2):327–339, 1995.
14. N. Nethercote, P. Stuckey, R. Becket, S. Brand, G. Duck, and G. Tack. MiniZinc:
Towards a standard modelling language. In Proceedings of CP’07, volume 4741 of
Lecture Notes in Computer Science, pages 529–543. Springer, 2007.
15. Andrea Rendl. TAILOR - tailoring Essence′ constraint models to constraint solvers.
Software available at http://www.cs.st-andrews.ac.uk/~andrea/tailor/.
A simple efficient exact algorithm based on independent
set for Maxclique problem
Chu Min Li and Zhe Quan (student)
MIS, EA4290, Université de Picardie Jules Verne, 33 rue St. Leu 80039 Amiens, France
Abstract. In this paper we propose a simple and efficient exact algorithm for the
famous Maxclique problem, called iMaxClique, using a new and powerful upper bound based on a partition of a graph into independent subsets. Experimental results show that iMaxClique is very fast on DIMACS Maxclique benchmarks
and solves six instances remaining open in [9].
Keywords: Branch-and-Bound, Independent set, Maxclique
1 Introduction
Consider an undirected graph G=(V , E), where V is a set of n vertices {v1 , v2 , ..., vn }
and E is a set of m edges. Edge (vi, vj) with i ≠ j is said to connect vertices vi and vj.
The graph is undirected because we do not distinguish (vi , vj ) from (vj , vi ). A clique
of G is a subset C of V such that every two vertices in C are connected by an edge in
E. The maximum clique problem (Maxclique for short) consists in finding a clique of
G of the largest cardinality. Maxclique is known to be NP-hard and is very important in
many real-world applications.
There are mainly two types of algorithms to solve the Maxclique problem: heuristic
algorithms including greedy construction algorithms and stochastic local search algorithms (see e.g. [8,1,3,5,6,10]), and exact algorithms including branch and bound algorithms (see e.g. [4,9]). Heuristic algorithms are able to solve large and hard Maxclique
problems but are unable to claim the optimality of their solutions. Exact algorithms may
fail to solve large and hard Maxclique problems, but are sure of the optimality of the
solutions they are able to find.
An independent set of a graph is a subset I of V such that no two vertices in I are
connected by an edge. The maximum independent set problem consists in finding
an independent set of G of the largest cardinality. Maxclique is closely related to the
maximum independent set problem, because a clique of G is an independent set of its
complementary graph Ḡ, and vice versa, where Ḡ is defined to be (V, Ē) with Ē={(vi,
vj) | (vi, vj) ∈ V × V, i ≠ j and (vi, vj) ∉ E}.
In this paper, we propose a simple and efficient branch and bound algorithm for
Maxclique. We call it iMaxClique because it uses a new upper bound based on an
independent set partition of G. In Section 2, we present iMaxClique and the new upper
bound. In Section 3, we compare iMaxClique with the best state-of-the-art exact
algorithms [4,9] to our knowledge on the widely used DIMACS Maxclique benchmark¹.
The experimental results show that iMaxClique is significantly faster and is able to
close 6 of the 14 problems remaining open in [4] and [9]. We conclude in Section 4.
¹ Available at http://cs.hbg.psu.edu/txn131/clique.html
Algorithm 1 iMaxClique(G, C, LB), a branch and bound algorithm for Maxclique
Input: A graph G=(V, E), a clique C, and the cardinality LB of the largest clique found so far
Output: A maximum clique of G
begin
  if |V|=0 then return C;
  UB ← overestimation(G)+|C|;
  if LB ≥ UB then return ∅;
  select the vertex v of minimum degree from G;
  C1 ← iMaxClique(Gv, C∪{v}, LB);
  if |C1| > LB then LB ← |C1|;
  C2 ← iMaxClique(G\v, C, LB);
  if |C1| ≥ |C2| then return C1; else return C2;
end
2 A Branch and Bound Maxclique Solver
A clique C of a graph G=(V, E) is maximal if no other vertex of G can be added
to C to form a larger clique. C is maximum if no clique of larger cardinality exists
in G. G can have several maximum cliques. Let V′ be a subset of V; the subgraph G′
of G induced by V′ is defined as G′=(V′, E′), where E′={(vi, vj) ∈ E | vi, vj ∈ V′}.
Given a vertex v of G, the set of neighbor vertices of v is denoted by N(v)={v′ | (v, v′) ∈
E}. The cardinality |N(v)| of N(v) is called the degree of v. Gv denotes the subgraph
induced by N(v), and G\v denotes the subgraph induced by V\{v}; G\v is obtained
by removing v and all edges connecting v from G.
2.1 A Basic branch and bound Maxclique algorithm
The space of all possible cliques of a graph G can be represented by a binary search
tree. A basic branch and bound algorithm for Maxclique compares all these cliques
and outputs a maximum clique, by exploring the search tree in a depth-first manner.
Algorithm 1 shows the pseudo-code of this algorithm, inspired by the branch and
bound algorithm MaxSatz for the MaxSAT problem proposed in [7].
At every node, the algorithm works with a graph G=(V , E), a clique C of the initial
input graph under construction, and the cardinality of the largest clique found so far
(called lower bound LB). At the root, G is the input graph, C is the empty set ∅, and
LB is 0. At the other nodes, G is a subgraph of the initial input graph, C is disjoint from
G but every vertex of G is connected to every vertex of C in the initial input graph. In
other words, adding any clique of G into C forms a clique of the initial input graph.
The aim of the algorithm is to add a clique of maximum cardinality of G into C.
If G is empty, then a maximal clique C of the initial input graph is found and
returned. Otherwise the algorithm compares LB with the sum of the cardinality of C
and an overestimation of the maximum cardinality of the cliques in G. This sum is called
the upper bound (UB).
If LB < UB, the algorithm tries to find a clique of larger cardinality than LB. For
this purpose, it selects a vertex v in G and implicitly partitions all cliques in G into
two sets, each of which corresponds to a branch of the current tree node: all cliques
containing v and all cliques not containing v. The algorithm is then recursively called
for each branch to obtain the largest clique in each set. The cardinality of these two
cliques is compared and the larger one is returned.
If LB ≥ UB, a larger clique clearly cannot be found from G and C. The algorithm
should then prune the subtree rooted at the current node and backtrack to a higher level
in the search tree.
The estimation of a tight upper bound (UB) of the maximum clique cardinality is
essential for the performance of iMaxClique, since a tight upper bound allows
iMaxClique to prune a subtree even for a large G. Algorithm iMaxClique uses a
new upper bound presented in the next subsection.
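To make the recursion of Algorithm 1 concrete, the following minimal Python sketch mirrors its structure. The adjacency-set graph representation and the trivial placeholder bound are our own illustrative assumptions, not the authors' implementation; the placeholder overestimation should be replaced by Algorithm 2.

# Minimal sketch of Algorithm 1 (iMaxClique) on a graph given as a dict
# mapping each vertex to the set of its neighbours.  The overestimation()
# helper (Section 2.2) is assumed to exist; here a trivial bound |V| keeps
# the sketch self-contained.

def overestimation(graph):
    return len(graph)                          # placeholder: replace by Algorithm 2

def subgraph(graph, vertices):
    """Subgraph induced by `vertices`."""
    return {v: graph[v] & vertices for v in vertices}

def imaxclique(graph, clique=frozenset(), lb=0):
    if not graph:                              # |V| = 0: C is maximal
        return clique
    ub = overestimation(graph) + len(clique)   # upper bound
    if lb >= ub:                               # prune the subtree
        return frozenset()
    v = min(graph, key=lambda u: len(graph[u]))                       # min-degree vertex
    c1 = imaxclique(subgraph(graph, graph[v]), clique | {v}, lb)      # cliques with v
    lb = max(lb, len(c1))
    c2 = imaxclique(subgraph(graph, graph.keys() - {v}), clique, lb)  # cliques without v
    return c1 if len(c1) >= len(c2) else c2

# Example: a triangle plus a pendant vertex; the maximum clique is {0, 1, 2}.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(imaxclique(g))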
2.2 A new upper bound for MaxClique
The upper bound used in iMaxClique is computed as overestimation(G)+|C|, where
the overestimation(G) function is based on the following property:
Proposition 1. Let ω(G) denote the cardinality of the maximum clique of the graph G.
If G can be partitioned into k independent sets, then ω(G) ≤ k.
Proof. Let C be a maximum clique of G and S1 , S2 , ..., Sk be the k independent
sets partitioning G. At most one vertex in an independent set Si can belong to C (i.e.
|Si ∩C|≤1 ∀i), because all vertices of C are connected but vertices in Si are not connected. So the cardinality of C is at most k.
Algorithm 2 shows the pseudo-code of overestimation(G), which works with the
complementary graph Ḡ of G and the iClique(Ḡ, ∅) function to find a maximal independent set of G. Note that a clique in Ḡ is an independent set in G. The function
overestimation(G) repeatedly calls the iClique function to partition G into independent
sets and returns the number of independent sets.
Algorithm 2 overestimation(G), an overestimation of the maximum cardinality of a
clique of G
Input: A graph G=(V , E)
Output: the number of independent sets in a partition of G into independent sets
begin
overestimation ← 0;
while Ḡ is not empty do
C ← iClique(Ḡ, ∅);
remove every vertex of C and its edges from Ḡ;
overestimation ← overestimation +1;
return overestimation;
end
The iClique function is shown in Algorithm 3. It returns a maximal clique of
G, since every time a vertex v is added into C, all vertices not connected to v are
eliminated from G. Notice that iClique is similar to iMaxClique, except that it does
not backtrack, since there is only one recursive call.
Algorithm 3 iClique(G, C), a simple algorithm to find a maximal clique
Input: A graph G=(V , E), a clique C
Output: a maximal clique of G
begin
if |V |=0 then return C;
select the vertex v of minimum degree from G;
return iClique(Gv , C∪{v});
end
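The following small Python sketch illustrates Algorithms 2 and 3 together; the complement-graph construction and the adjacency-set representation are illustrative assumptions on our part, not the authors' code.

# Sketch of Algorithms 2 and 3: partition G into maximal independent sets by
# repeatedly extracting a maximal clique of the complement graph Ḡ.

def complement(graph):
    """Complement Ḡ of a graph given as {vertex: set of neighbours}."""
    vs = set(graph)
    return {v: vs - graph[v] - {v} for v in graph}

def iclique(graph):
    """Algorithm 3: greedily grow a maximal clique (no backtracking)."""
    clique = set()
    while graph:
        v = min(graph, key=lambda u: len(graph[u]))     # min-degree vertex
        clique.add(v)
        keep = graph[v]                                  # keep only neighbours of v
        graph = {u: graph[u] & keep for u in keep}
    return clique

def overestimation(graph):
    """Algorithm 2: number of maximal independent sets covering G,
    an upper bound on the maximum clique cardinality (Proposition 1)."""
    gbar = complement(graph)
    count = 0
    while gbar:
        indep = iclique(gbar)            # clique of Ḡ = independent set of G
        gbar = {u: gbar[u] - indep for u in gbar if u not in indep}
        count += 1
    return count

g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(overestimation(g))   # any clique of g has at most this many vertices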
3 Comparative Evaluation of iMaxClique
Regin proposed in [9] a branch-and-bound algorithm for Maxclique using an upper
bound based on a matching algorithm. This upper bound roughly corresponds to the
number of independent sets of cardinality 2 in G. The upper bound is fast to compute
and can be used to eliminate some vertices in G after each branching as follows: select a
vertex v and compute the upper bound for the subgraph induced by all neighbors of v; if
the upper bound is smaller than or equal to LB, then v is eliminated from G. This so-called
filtering process is applied to a subset of vertices and is intensively used in Regin's
algorithm. Regin's algorithm also uses the diving technique in conjunction with a MIP
approach, and a new search strategy to select the next vertex v expanding the current
clique, applying the idea of the not set proposed in [2].
Fahle [4] proposed a branch-and-bound algorithm for Maxclique using an upper
bound based on an estimation of the number of colors needed to color G. The drawback
of this upper bound compared with Regin's approach is the time needed to compute
such a bound [9]. The filtering process is also intensively used to eliminate some vertices
after branching. Note that the set of vertices having the same color is necessarily
independent, but not necessarily maximal, which is to be compared with the sequence
of maximal independent sets computed by overestimation(G) shown in Algorithm 2.
In Table 1 we compare iMaxClique with Regin's algorithm and Fahle's algorithm
on the famous DIMACS benchmark for Maxclique, excluding the 14 instances
remaining open in [9] and the very easy instances (those that are solved by all
three algorithms in less than 1 second); the runtimes for Regin's algorithm and
Fahle's algorithm are taken from [9]. The runtime of iMaxClique (denoted iMaxC
in the table) is obtained on a computer similar to the one used in [9], i.e. a Pentium IV
at 2.0 GHz with 512 MB of memory under Linux, and includes the reading of the graphs
from the input files, which may not be negligible for large graphs. We also show the
improvement of iMaxClique over Regin's algorithm (gap), computed as (Regin's runtime −
iMaxClique's runtime)/iMaxClique's runtime; for brock200_1, for instance,
gap = (10.72 − 6.61)/6.61 ≈ 0.62.
Except for six relatively easy instances (hamming10-2, johnson16-2-4, p-hat500-1,
p-hat1500-1, san200_0.9_2, and sanr200_0.9, each solved within a few minutes),
iMaxClique is significantly (up to 17 times) faster than Regin's and Fahle's
algorithms for all the remaining instances in Table 1. It appears that the gain of
iMaxClique is more important for larger and denser graphs. For example, iMaxClique
is 66% faster than Regin's algorithm for the brock200 instances and is 118% faster for
the brock400 instances on average.
Table 1. Runtime in seconds on DIMACS Maxclique benchmarks
Name            n     m       density  ω    Fahle      Regin      iMaxC     gap
brock200_1      200   14834   0.7454   21   92.72      10.72      6.61      0.62
brock200_3      200   12048   0.6054   15   2.23       0.86       0.42      1.04
brock200_4      200   13089   0.6577   17   8.18       2.13       1.27      0.67
brock400_1      400   59723   0.7484   27   fail       11,340.8   5,448.65  1.08
brock400_2      400   59786   0.7492   29   fail       7,910.6    2,423.43  2.26
brock400_3      400   59681   0.7479   31   fail       4,477.23   3,525.16  0.27
brock400_4      400   59765   0.7489   33   fail       6,051.77   2,244.49  1.69
hamming10-2     1024  518656  0.9902   512  5.16       1.04       49.71     *
hamming8-4      256   20864   0.6392   16   6.11       4.19       0.65      5.44
johnson16-2-4   120   5460    0.7647   8    7.91       3.80       5.49      *
keller4         171   9435    0.6491   11   2.53       0.5        0.21      1.38
p-hat300-2      300   21928   0.4889   25   3.01       0.59       0.46      0.28
p-hat300-3      300   33390   0.7445   36   856.67     40.71      31.09     0.30
p-hat500-1      500   31569   0.2531   9    0.6        2.30       0.71      2.23
p-hat500-2      500   62946   0.5046   36   203.93     32.69      18.38     0.77
p-hat500-3      500   93800   0.7519   50   fail       12,744.7   2,085.79  5.11
p-hat700-1      700   60999   0.2493   11   2.67       6.01       2.35      1.55
p-hat700-2      700   121728  0.4976   44   2,086.63   255.79     143.61    0.78
p-hat1000-1     1000  122253  0.2448   10   16.43      27.80      15.71     0.76
p-hat1000-2     1000  244799  0.4901   46   fail       16,845.7   7,343.64  1.29
p-hat1500-1     1500  284923  0.2534   12   119.77     480.84     237.05    1.02
MANN_a27        378   70551   0.9901   126  10,348.87  18.48      17.12     0.08
san1000         1000  250500  0.5015   15   3044.09    102.80     24.24     3.24
san200_0.7_1    200   13930   0.7000   30   1.57       0.36       0.06      5.00
san200_0.9_1    200   17910   0.9000   70   62.61      1.04       0.28      2.71
san200_0.9_2    200   17910   0.9000   60   1930.90    2.62       3.35      *
san200_0.9_3    200   17910   0.9000   44   194.96     182.70     11.83     14.44
san400_0.5_1    400   39900   0.5000   13   6.74       1.19       0.31      2.83
san400_0.7_1    400   55860   0.7000   40   425.99     23.28      3.48      5.68
san400_0.7_2    400   55860   0.7000   30   159.72     67.53      3.68      17.35
san400_0.7_3    400   55860   0.7000   22   617.07     273.23     41.62     5.56
san400_0.9_1    400   71820   0.9000   100  7,219.53   1,700      210.62    7.07
sanr200_0.7     200   13868   0.6969   18   24.99      4.30       2.65      0.62
sanr200_0.9     200   17863   0.8976   42   fail       150.08     164.22    *
sanr400_0.5     400   39984   0.5011   13   23.09      17.12      13.92     0.22
sanr400_0.7     400   55869   0.7001   21   15,925     3,139.11   1,813.68  0.73
For graphs with the same number of vertices, it appears that the gain of iMaxClique
grows quickly when the density of the graphs is high. For example, iMaxClique is 77%
faster than Regin's algorithm for p-hat1000-1 (density=0.2448), and is 129% faster for
p-hat1000-2 (density=0.4901). Furthermore, iMaxClique is 77% faster than Regin's
algorithm for p-hat500-2 (density=0.5046), and is more than five times faster for
p-hat500-3 (density=0.7519).
When a graph is large and dense, estimating the number of disjoint maximal independent
sets appears to give a much tighter upper bound on the maximum cardinality of a clique
than computing a maximal matching as in Regin's algorithm or computing an upper bound
on the number of colors needed to color the graph as in Fahle's algorithm. This explains
the performance of iMaxClique compared with Regin's and Fahle's algorithms for large
and dense graphs.
Thanks to the new upper bound, iMaxClique solves, for the first time to our knowledge,
6 instances (out of 14) remaining open in [9]. Table 2 summarizes the performance
of iMaxClique on these huge and dense instances on a Mac Pro machine with a 2.8 GHz
Intel Xeon processor and 4 GB of memory under Mac OS X. The 8 other open
instances could not be solved after one day (24 hours = 86,400 seconds) of computation.
Table 2. Performance of iM axClique in seconds on hard DIMACS Maxclique benchmarks
Name        n     m       density  ω    runtime
brock800_1  800   207505  0.6493   23   76,242
brock800_2  800   208166  0.6513   24   70,296
brock800_3  800   207333  0.6487   25   50,151
brock800_4  800   207643  0.6497   26   40,013
MANN_a45    1035  533115  0.9963   345  10,921
p-hat700-3  700   183010  0.7480   62   27,377
4 Conclusion
We have presented a simple and efficient branch and bound algorithm for Maxclique. The
algorithm is called iMaxClique because it uses a new powerful upper bound based on
a partition of a graph into independent sets. The performance of iMaxClique is evaluated
on the DIMACS Maxclique benchmarks and compared with the best state-of-the-art exact
algorithms to our knowledge. The results are very encouraging: iMaxClique
is significantly faster and has been able to close six instances remaining open in [9].
References
1. R. Battiti and M. Protasi. Reactive local search for the maximum clique problem. Algorithmica, vol. 29, No. 4, pp. 610-637, 2001.
2. C. Bron, J.A.G.M. Kerbosch, and H. J. Schell. Finding cliques in an undirected graph. Tech.
Rep. Technological U. of Eindhoven, The Netherlands.
3. S. Busygin. A New Trust Region Technique for the Maximum Clique Problem. Report
submitted, available at http://www.busygin.dp.ua, 2002.
4. T. Fahle. Simple and fast: Improving a branch-and-bound algorithm for maximum clique. In
Proceedings of ESA-2002, pp. 485-498, 2002.
5. A. Grosso, M. Locatelli, F. Della Croce. Combining swaps and node weights in an adaptive
greedy approach for the maximum clique problem. Journal of Heuristics, Vol. 10, pp. 135-152, 2004.
6. K. Katayama, A. Hamamoto, H. Narihisa. Solving the maximum clique problem by k-opt
local search. Proc. of the 19th Annual ACM Symposium on Applied Computing (SAC-2004),
Nicosia, Cyprus, Vol. 2, pp. 1021-1025, March 14-17, 2004.
7. C. M. Li, F. Manyà, and J. Planes. New inference rules for Max-SAT. Journal of Artificial
Intelligence Research, 30:321-359, 2007.
8. W. Pullan, H. H. Hoos. Dynamic Local Search for the Maximum Clique Problem. Journal
of Artificial Intelligence Research, Vol. 25, pp. 159-185, 2006.
9. J.-C. Regin. Solving the maximum clique problem with constraint programming. In Proceedings of CPAIOR’03, Springer, LNCS 2883, pp. 634-648, 2003.
10. J. I. van Hemert and C. Solnon. A Study into Ant Colony Optimization, Evolutionary
Computation and Constraint Programming on Binary Constraint Satisfaction Problems. In
Proceedings of Evolutionary Computation in Combinatorial Optimization (EvoCOP 2004),
pp. 114-123, 2004.
Exploring Local Acyclicity within
Russian Doll Search
Margarita Razgon (PhD student) and
Gregory M. Provan (supervisor)
{m.razgon|g.provan}@cs.ucc.ie
Department of Computer Science, University College Cork, Ireland
Abstract. In this paper we present two algorithms that improve Russian Doll Search (rds) using local acyclicity without making any assumption about the treewidth of a considered wcsp instance. In each iteration
of computing a lower bound, the first algorithm uses one acyclic constraint subgraph as a special case of our method rds-mdac [12], while
the second algorithm explores a number of disjoint acyclic constraint
subgraphs. We empirically demonstrate that the proposed algorithms
outperform rds-mdac and rds over all the problem domains we studied.
Keywords: Weighted Constraint Satisfaction Problem, Russian Doll
Search, Branch-and-Bound, local inconsistency, acyclicity.
1 Introduction
The efficiency of Branch-and-Bound methods for Weighted Constraint Satisfaction
Problems strongly depends on the method of computing a lower bound associated with
the current node of the search tree. There are two main approaches to computing lower
bounds: (1) based on counting local inconsistencies of the given constraint network (cn)
[14, 5, 6, 8, 2, 7, 3]; and (2) based on Russian Doll Search (rds) [13, 9, 10]. The
advantage of the first approach is that it takes into account the current domains of
variables, but the drawback is that it cannot explore deep interactions between the
variables. The rds-based methods have the opposite properties. They do explore deep
interactions between variables, due to computing the optimal solution weights for
subproblems of the given wcsp. However, these weights are computed based on the
initial domains of variables, and hence they may be much smaller than the weights
based on the current domains.
It is natural to consider a hybrid technique that would benefit from the advantages of
both approaches. Such an algorithm, rds-mdac, has been introduced by us in [12]. In
particular, we partition the set of unassigned variables into two subsets. Then we
evaluate the lower bound of the projection to one of the subsets using the directed
arc-inconsistency counts [14, 5], while the projection to the other subset is evaluated
by rds [13].
In this paper we generalize this approach by presenting a generic framework combining
the local-inconsistency-based lower bound evaluation and the rds-based one. The
framework allows us to combine rds with the majority of known consistency propagation
algorithms [8, 2, 7, 3]. Then we specify the framework so that the set of unassigned
variables processed by a local-inconsistency-based method forms an acyclic sub-cn, so
that the optimal solution weight for the wcsp induced by these variables can be computed
in polynomial time. We call the resulting algorithm rds-tree. rds-tree explores local
acyclicity of the constraint graph of the given wcsp, without making any assumption
about its treewidth and density [4]. We show that rds-tree outperforms rds-mdac and rds
even on quite dense instances of wcsps.
Inspired by the above results, we ask ourselves if it is possible to explore other types
of local consistency where it is not necessary that the algorithm catches a large acyclic
sub-cn of the given wcsp. We answer this question positively. In particular, we design
an algorithm rds-cascade that can benefit from the existence of a large number of small
disjoint acyclic sub-cns (i.e. a cascade of such sub-cns). We show that rds-cascade even
outperforms rds-tree.
The rest of the paper is organized as follows. In Section 2 we give the necessary
background. In Section 3 we describe our algorithms rds-tree and rds-cascade. In
Section 4 we report our experiments.
2 Background
Let Z be a binary Constraint Network (cn) with the set {v1 , . . . , vn } of variables.
We denote the projection of Z to {vi , . . . , vn } by Zi . Let D(vi ) be a domain of
values of vi . An assignment of Z is a pair (vi , vali ) such that vali ∈ D(vi ).
The constraints of Z are represented by the set of conflicting pairs of values (or
conflicts) of the given cn. A set of assignments, at most one for each variable,
is a partial solution of Z. A partial solution that assigns all the variables of
Z is a solution of Z. In a Weighted Constraint Satisfaction Problem (wcsp), conflicts
are assigned importance weights, and the weight of a solution is the sum of the weights
of the violated conflicts. The task of a wcsp is to find the optimal solution
of Z (having the smallest weight). A wcsp can be represented by its constraint
graph, which contains one vertex per variable and one edge connecting any pair
of vertices whose variables have at least one conflict between their values.
When a wcsp is being solved by Branch-and-Bound (b&b), two numbers LB
and UB are maintained during solving, which are, respectively, a lower and an
upper bound on the optimal solution weight. If LB ≥ UB then b&b backtracks.
Russian Doll Search (rds) [13] is a procedure that performs n successive
searches on nested subproblems, i.e. rds solves a wcsp first for Zn , then for
Zn−1 , Zn−2 , . . . , and finally, for Z1 , for the fixed order v1 , . . . , vn of variables.
Assume that rds solves Z1 , and the current partial solution P assigns variables
v1 , . . . , vi−1 and has weight F . Let v be an unassigned variable and val ∈ D(v).
We denote by Conf(v, val, P) the sum of weights of conflicts of (v, val) with P.
Let MinConf(v, P) = min_{val∈D(v)} Conf(v, val, P). We denote by Connect the
sum of MinConf(v, P) over all unassigned variables. Let Y be the weight of an
optimal solution of Zi obtained before. Then LB(rds) = F + Connect + Y.
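To ground these definitions, here is a small Python sketch of Conf, MinConf, Connect and LB(rds); the dictionary-based encoding of conflict weights, domains and partial solutions is an illustrative assumption of ours, not the representation used in [12, 13].

# A binary wcsp is encoded by `weight`, a dict mapping a frozenset of two
# (variable, value) pairs to the weight of that conflict (absent pairs have
# weight 0); `domains` maps each variable to its current domain.

def conflict_weight(weight, a, b):
    return weight.get(frozenset((a, b)), 0)

def conf(weight, v, val, partial):
    """Conf(v, val, P): total weight of the conflicts of (v, val) with P."""
    return sum(conflict_weight(weight, (v, val), item) for item in partial.items())

def min_conf(weight, domains, v, partial):
    """MinConf(v, P) = min over val in D(v) of Conf(v, val, P)."""
    return min(conf(weight, v, val, partial) for val in domains[v])

def lb_rds(weight, domains, partial, unassigned, y):
    """LB(rds) = F + Connect + Y, with F the weight of the partial solution P
    and Y the precomputed optimal weight of the relevant rds subproblem."""
    items = list(partial.items())
    f = sum(conflict_weight(weight, items[a], items[b])
            for a in range(len(items)) for b in range(a + 1, len(items)))
    connect = sum(min_conf(weight, domains, v, partial) for v in unassigned)
    return f + connect + y

# Tiny usage example with hypothetical weights: x is assigned, y and z are not.
w = {frozenset({("x", 0), ("y", 0)}): 2, frozenset({("x", 0), ("y", 1)}): 1}
print(lb_rds(w, {"y": [0, 1], "z": [0]}, {"x": 0}, ["y", "z"], y=0))   # prints 1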
Note that Y takes into account only the initial domains of variables. To consider the
current domains as well, let Y be the optimal solution weight of Zk for some k > i
(i.e. for a subproblem smaller than Zi). For variables vi, ..., vk−1 we compute the sum
of directed arc-inconsistency (DAC) counts, i.e. we evaluate the lower bound according
to the mdac algorithm [14, 5]. In particular, for each (vj, val) such that i ≤ j ≤ k − 1
and val ∈ D(vj), we compute dac_{vj,val}, the number of variables vr (r > j) whose
current domains conflict with (vj, val). Then we set DM(vj, val, P) = Conf(vj, val, P) +
dac_{vj,val}, MinDM(vj, P) = min_{val∈D(vj)} DM(vj, val, P), and
SumDM = Σ_{j=i}^{k−1} MinDM(vj, P). Then
LB(rds-mdac) = F + Connect + SumDM + Y is a valid lower bound considered
by our algorithm rds-mdac [12].
3 Exploring local acyclicity within RDS
Consider again the last paragraph of Section 2, and let dac^r_{vj,val} be defined
analogously to dac_{vj,val}, with the only difference that only arc-inconsistent variables
vr with r ≥ k are counted. Let Z′ be the wcsp induced by the current domains of
variables vi, ..., vk−1 (i.e. the weights of conflicts between them are as in the original
wcsp). Let DM(v, val, P) = Conf(v, val, P) + dac^r_{v,val} be the initial weight of
(v, val) for each assignment in Z′. Denote by C an arbitrary LB on the optimal
solution weight of Z′ (e.g., it can be computed by [8, 2, 7, 3]). Then it can be
shown that LB = F + Connect + C + Y is a valid LB on the optimal weight
of an extension of P. This method is a generic technique for computing a lower
bound in the rds framework. One special case of this method is rds-mdac [12].
We now present another special case.
Let k be the largest possible number such that the variables vi, ..., vk−1 induce an
acyclic subgraph of the constraint graph of Z, i.e. the constraint graph of Z′ is acyclic.
Let C be the optimal weight of a solution of Z′, which can be computed efficiently.
We call the algorithm which computes LB(rds-tree) = F + Connect + C + Y in this way
rds-tree. If Z′ involves all the unassigned variables, we update UB ← min(UB, F + C)
and backtrack immediately. LB(rds-tree) is generally larger than LB(rds-mdac), because
rds-mdac provides only an approximation of the weight of Z′, while rds-tree gives us
the exact optimal weight of Z′. It is not hard to show that the complexity of one
iteration of rds-tree is O(ed²) (as for rds-mdac), where e is the number of constraints
between the unassigned variables and d is the maximal domain size. It is also worth
mentioning that, due to the fixed order of variables explored by rds-tree (as well as by
other rds-based algorithms), the range of variables being included in Z′ can be
identified at the preprocessing stage.
The kind of local acyclicity explored by rds-tree depends on the existence of a
sufficiently large induced acyclic subgraph of the constraint graph of the given wcsp.
Another interesting type of local acyclicity occurs when the constraint graph has many
disjoint acyclic subgraphs. This type of local acyclicity is explored in our algorithm,
rds-cascade, described below. At the preprocessing stage, rds-cascade partitions all
the variables into subsets T1, ..., Tt (i.e. a cascade of sub-cns), so that each subset
induces an acyclic subgraph. The variables are statically ordered in such a way that each
subset includes consecutive variables according to this order. Let dac^f_{v,val} be
defined analogously to dac_{v,val}, with the only difference being that for each v ∈ Tm,
only variables that belong to Tm+1, ..., Tt are considered. Let Zm be the wcsp induced
by the variables of Tm, where for each assignment (v, val) of v ∈ Zm,
DM(v, val, P) = Conf(v, val, P) + dac^f_{v,val} is considered as the initial weight of
this assignment. Let WTm be the weight of an optimal solution of Zm (which can be
computed efficiently). Then LB(rds-cascade) = F + Σ_{m=1}^{t} WTm is a valid lower
bound on the optimal weight of an extension of P. Note that the rds part does not
participate in computing the lower bound. However, rds-cascade applies rds as a
filtering procedure, i.e., when for a given value (v, val) we have to decide whether or
not to remove this value from the current domain of v, (v, val) is temporarily added to
P and then the lower bound is computed using rds.
The complexity of rds-cascade is O(ed²), where e is the number of constraints between
the unassigned variables and d is the maximal domain size. Indeed, given the cascade
T1, ..., Tt, let e = (e1 + e2 + · · · + et) + (e^{up}_1 + e^{up}_2 + · · · + e^{up}_{t−1}),
where em (m ∈ {1, ..., t}) is the number of constraints between variables of Tm only and
e^{up}_m (m ∈ {1, ..., t − 1}) is the number of constraints between variables of Tm and
the following Tm+1, ..., Tt. Computing the initial weights of the values in each Tm takes
O(e^{up}_m · d²), and computing the weight WTm of each Tm takes O(em · d²),
analogously to rds-tree. Hence the resulting complexity is
O((e1 + e2 + · · · + et) · d² + (e^{up}_1 + e^{up}_2 + · · · + e^{up}_{t−1}) · d²) = O(ed²).
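As an illustration of the preprocessing step that builds the cascade, the sketch below greedily splits the ordered variables into consecutive clusters whose induced constraint subgraphs are acyclic, using a union-find cycle test. This greedy strategy and the edge representation are our own assumptions; the paper does not specify how the cascade is constructed.

# `edges` is the set of constraint-graph edges, each a frozenset of two distinct
# variables; the scan closes the current cluster as soon as adding the next
# variable (in the static order) would create a cycle inside the cluster.

def acyclic_cascade(order, edges):
    clusters, cluster = [], []
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for v in order:
        new_edges = [e for e in edges if v in e and (e - {v}).issubset(cluster)]
        parent[v] = v
        ok = True
        for e in new_edges:                 # union v with its cluster neighbours
            (u,) = e - {v}
            ru, rv = find(u), find(v)
            if ru == rv:                    # cycle detected
                ok = False
                break
            parent[ru] = rv
        if ok:
            cluster.append(v)
        else:                               # close the cluster, start a new one
            clusters.append(cluster)
            cluster = [v]
            parent = {v: v}
    clusters.append(cluster)
    return clusters

order = ["a", "b", "c", "d"]
edges = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]}
print(acyclic_cascade(order, edges))   # -> [['a', 'b'], ['c', 'd']]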
4 Experiments
We now compare rds-tree and rds-cascade to rds [13] and rds-mdac [12].
For all the tested algorithms, we order the variables in decreasing order of their
degrees in the constraint graph. We use the static value ordering heuristic, first
choosing the value that the variable had in the optimal assignment found on
the previous subproblem, then choosing the first available value [13]. All the
algorithms are implemented in C++, and the experiments are performed on a
2.66GHz Intel Xeon CPU with 4.4GB RAM using the GNU g++ compiler. We
measure the search efforts in terms of the number of backtracks and the CPU-time. We run these algorithms on two benchmarks: Binary Random Max-CSPs
and Radio Link Frequency Assignment Problems (RLFAP).
For Binary Random Max-CSPs, we generate cns given four parameters [11]:
(1) the number of variables n, (2) the domain size dom, (3) density p1 (the
probability of a constraint between two given variables), (4) tightness p2 (the
probability of a conflict between two values of the given pair of variables). We
fix three parameters (n, dom, and p1 ∈ {0.9, 0.5, 0.1}), and vary p2 over the range
[0.65, 1]. For every tuple of parameters, we impose a time limit of 200 seconds,
and report results as the average of 50 instances. Due to lack of space, we describe
empirical results, omitting their graphic presentation.
rds-mdac outperforms rds in the number of backtracks for all the considered values
of p1 and over the whole range of p2. The rate of improvement in backtracks by
rds-mdac over rds increases as p1 decreases: 2.5 times for p1 = 0.9, 3 times for
p1 = 0.5, 6 times for p1 = 0.1. The improvement in CPU-time by rds-mdac compared
with rds is achieved for middle and low values of density p1 over the whole range of
p2: a reduction of up to 20% for p1 = 0.5 and up to 5 times for p1 = 0.1.
The comparison of the number of backtracks between rds-mdac, rds-tree,
and rds-cascade is as follows. rds-tree outperforms rds-mdac for middle
and low values of density p1 over the whole range of p2 , achieving a reduction
of up to 15% for p1 = 0.5 and up to 20 times for p1 = 0.1. rds-cascade
outperforms both rds-mdac and rds-tree for all the considered values of p1
over the whole range of p2 . rds-cascade outperforms rds-tree by 70% on
average for p1 = 0.9 and p1 = 0.5, and up to 12 times for p1 = 0.1.
The comparison of the CPU-time is as follows. For high values of density p1,
both rds-tree and rds-cascade improve on rds-mdac by 30% on average over
the whole range of p2 , but only rds-cascade reduces the CPU-time of rds by
15% for high values of p2 starting from p2 = 0.93. For middle and low values of
density p1 : rds-tree outperforms rds-mdac by up to 30% for p1 = 0.5 and 6
times on average for p1 = 0.1, and rds-cascade outperforms rds-tree by up
to 35% on average for p1 = 0.5 and up to 7 times for p1 = 0.1.
Thus, for Binary Random Max-CSPs, the “winner” is rds-cascade, followed
by rds-tree (the second best), rds-mdac (the third) and rds (the fourth).
We also compare the average LB for one iteration of the algorithms rds-mdac
and rds-cascade, without backtracking. We choose a quarter of the variables randomly to form a current partial solution P, assign the variables in P randomly, then compute LB by the corresponding formulas of rds-mdac and rds-cascade. The experiments empirically show that LB(rds-cascade) is higher
than LB(rds-mdac) by up to 50% for all the considered values of density (0.9,
0.5, 0.1), and over the whole range of tightness.
The task of RLFAP [1] is to assign frequencies (values) to each of the radio links
(variables) in the given communication network in such a way that the links may
operate together without noticeable interference (unary and binary constraints).
For benchmarking we use five CELAR 6 sub-instances (SUB0, SUB1, SUB2,
SUB3, SUB4) [1]. The empirical results are as follows. rds-mdac outperforms
rds significantly in search efforts for all sub-instances. The best result of rds-mdac
over rds is achieved for SUB4: an improvement of 84 times in backtracks and 28 times
in CPU-time. rds-tree outperforms rds-mdac in search efforts for some sub-instances
(SUB1, SUB3, SUB4), achieving a reduction of up to 8 times in backtracks and up to
6 times in CPU-time. rds-cascade outperforms rds-mdac in search efforts for all
sub-instances, reducing backtracks by up to 10 times and the CPU-time by up to 8 times.
The best performance in the
search efforts among all the algorithms is achieved by rds-cascade (for SUB0,
SUB2, SUB4) or by rds-tree (for SUB1, SUB3).
5 Conclusion
In this paper we present two new algorithms, rds-tree and rds-cascade, which use
local acyclicity to improve rds when computing the LB. rds-tree considers the "small"
subproblems with unassigned variables only, as rds-mdac [12] does, but replaces the
mdac evaluation of the other unassigned variables by the exact computation of the
optimal weight of the acyclic sub-cn on these variables. rds-cascade divides all
unassigned variables into clusters that have acyclic constraint subgraphs, computes the
optimal weight of each cluster in polynomial time, and sums up the obtained optimal
weights of the clusters. The empirical evaluation shows that rds-tree and rds-cascade
outperform rds-mdac in all search efforts, pointing towards the conclusion that the
exact computation of the optimal weight of even a part of the variables significantly
improves the LB evaluation.
References
1. B. Cabon, S. de Givry, L. Lobjois, T. Schiex, and J. P. Warners. Radio
Link Frequency Assignment. Constraints, 4(1):79–89, 1999.
2. M. C. Cooper and T. Schiex. Arc consistency for soft constraints. Artif. Intell.,
154(1-2):199–227, 2004.
3. S. de Givry, F. Heras, M. Zytnicki, and J. Larrosa. Existential Arc Consistency:
Getting closer to full arc consistency in weighted CSPs. In IJCAI, pages 84–89,
2005.
4. J. Larrosa and R. Dechter. Boosting Search with Variable Elimination in Constraint Optimization and Constraint Satisfaction Problems. Constraints, 8(3):303–
326, 2003.
5. J. Larrosa and P. Meseguer. Exploiting the Use of DAC in MAX-CSP. In CP,
pages 308–322, 1996.
6. J. Larrosa, P. Meseguer, and T. Schiex. Maintaining Reversible DAC for Max-CSP.
Artif. Intell., 107(1):149–163, 1999.
7. J. Larrosa and T. Schiex. In the quest of the best form of local consistency for
Weighted CSP. In IJCAI, pages 239–244, 2003.
8. J. Larrosa and T. Schiex. Solving weighted CSP by Maintaining Arc Consistency.
Artif. Intell., 159(1-2):1–26, 2004.
9. P. Meseguer and M. Sánchez. Specializing Russian Doll Search. In CP, pages
464–478, 2001.
10. P. Meseguer, M. Sánchez, and G. Verfaillie. Opportunistic Specialization in Russian
Doll Search. In CP, pages 264–279, 2002.
11. P. Prosser. An empirical study of phase transitions in Binary Constraint Satisfaction Problems. Artif. Intell., 81(1-2):81–109, 1996.
12. M. Razgon and G. M. Provan. Adding Flexibility to Russian Doll Search. In
ICTAI, Vol. 1, pages 163–171, 2008.
13. G. Verfaillie, M. Lemaître, and T. Schiex. Russian Doll Search for Solving Constraint Optimization Problems. In AAAI/IAAI, Vol. 1, pages 181–187, 1996.
14. R. J. Wallace. Directed Arc Consistency Preprocessing. In Constraint Processing,
Selected Papers, pages 121–137, 1995.
Optimal Solutions for Conversational
Recommender Systems Based on Comparative
Preferences and Linear Inequalities
Walid Trabelsi¹ (student), Nic Wilson¹ and Derek Bridge²
¹ Cork Constraint Computation Centre, ² Department of Computer Science
University College Cork, Ireland
w.trabelsi@4c.ucc.ie, n.wilson@4c.ucc.ie, d.bridge@cs.ucc.ie
1 Introduction
Recommenders are automated tools that deliver selective information matching
personal preferences. These tools have become increasingly important to help
users find what they need from the massive amount of media, data and services
currently flooding the Internet.
In (Bridge and Ricci, 07) [1] a conversational product recommender system
is described in which a user repeatedly reconfigures, revises and then submits a
query until she finds the product she wants (see Section 2). A key component of
such a system is the pruning of queries that are judged to be less interesting to the
user than other queries. The approach of [1] assumes that the user's preferences
are determined by a sum of weights, which leads to pruning based on deduction
over linear inequalities (see Section 3). We have developed a new approach for
modelling the user's preferences, based on comparative preference statements
(see Section 4), which can be used as an alternative method for pruning less
interesting queries. Such an approach also allows the possibility of reasoning
with a broad range of other input statements, such as conditional preferences
and tradeoffs.
2 Information Recommendation
In this section we describe the recommender system setup, based on [1].
A given use case is characterized by a set of products. Products are defined
through a set V of n variables representing the features possibly included in
a given product. Each variable Fi is defined on a discrete and finite domain
denoted by D(Fi). In the following we assume every variable has a domain
D(Fi) = {fi, f̄i}, where fi represents that the feature is present and f̄i represents
that the feature is absent. Thus a product is represented by an assignment to V. For
example, if V = {F1, F2}, then the assignment f1f̄2 refers to a product containing
the feature F1 but not F2. The set of assignments to V is written as D(V).
The user and the system engage in a dialogue. At each point in the dialogue
there is a current query. The user has to choose a new query, which will become
the next current query. The system will generate a set of queries for the user to
choose between. There are three stages to producing this set of queries:
1. The set of queries which are close, in a particular sense, to the current query
are generated; these are called the Candidates.
2. Queries which are unsatisfiable are eliminated; the remaining queries are
called the Satisfiables.
3. Queries which are judged to be dominated by (i.e., worse than) another
of the satisfiable queries are eliminated, leaving the Undominated ones. We
consider two approaches for this last stage; the first one is based on deduction
with linear inequalities (see Section 3); the second one uses a new approach
for reasoning with comparative preferences (Section 4).
The user then chooses some element of the Undominated to be the new Current
Query. This choice of one query over the other possible choices is used to induce
information about the user's preferences; this information is used in Stage 3 for
eliminating queries at future points in the dialogue.
2.1 Generating the Candidates
The candidate queries are chosen, based on the method in [1], to be the set of
queries which are obtained from the current query by one of three moves: Add
concerns adding one feature to the current query, Switch consists of replacing a
feature by another one, and Trade consists of replacing one feature by two other
features. For instance, if Add(Fi) is selected then the advisor is aware of the user's
desire to add the feature Fi not yet present. Choosing Switch(Fi, Fj) means the
user is trying to replace the feature Fi by the feature Fj previously absent. By a
Trade(Fi, Fk, Fl) action the user wants to replace the feature Fi by the features
Fk and Fl previously absent. A small sketch of this candidate generation follows.
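The minimal Python sketch below enumerates the Add, Switch and Trade candidates; representing queries as sets of feature names and enumerating pairs with itertools are our own illustrative choices, not the implementation of [1].

from itertools import combinations

def candidates(current, all_features):
    absent = set(all_features) - current
    cands = []
    for f in absent:                                   # Add(Fi)
        cands.append(current | {f})
    for f in current:                                  # Switch(Fi, Fj)
        for g in absent:
            cands.append((current - {f}) | {g})
    for f in current:                                  # Trade(Fi, Fk, Fl)
        for g, h in combinations(absent, 2):
            cands.append((current - {f}) | {g, h})
    return cands

print(candidates({"F1"}, ["F1", "F2", "F3"]))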
2.2 Checking Satisfiability
The satisfiable queries are those for which there exists a product which has all the
features present in the query (it may also have other features not present in the
query). With the list of products being stored explicitly in a database, this can
be done by checking through the products explicitly. For configurable products,
where the set of products is represented as a set of solutions of a Constraint
Satisfaction Problem, satisfiability of a query can be checked by determining if
the CSP has solutions containing all the features in the query (which can be
checked by checking satisfiability of an augmented CSP). There can still be a
large number of queries for the user to consider. It is desirable to prune queries
which we believe the user will consider worse (of less interest) than another
of the queries.
3 Eliminating Queries Using Deduction of Linear Inequalities
As described above, we need a method to prune dominated queries, i.e., queries
which we believe will be less preferred by the user than others. We consider two
different approaches for doing this. The first one, which is a slightly amended
version of the approach used in [1], is based on deduction of linear inequalities
(denoted by LP).
It is assumed that the user assigns, for each feature Fi, a certain amount of
utility wi representing the value of that feature being included in the final product.
Therefore every user has a profile represented by a weight vector denoted
by w = (w1, ..., wn). The approach assumes the system does not know the values
of the wi; it only reasons about them as unknown variables.
Each move performed by the user is considered as feedback and makes the
advisor induce some user preferences between the product features. These preferences will be defined as linear inequalities on the user’s weights. For instance,
the system might infer that one selected feature seems to be more interesting, for the user, than the features not yet present. For example, suppose that
V = {F1, F2, F3}, and the current query contains no features (q = ∅). The
user chooses the new query {F1} based on the move Add(F1), hence preferring this
move to Add(F2 ) and Add(F3 ), suggesting that feature 1 is preferred to feature 2,
and to feature 3. As a consequence the advisor induces the constraints w1 ≥ w2
and w1 ≥ w3 . The structure storing the set of these inequalities is called the
User Model, which in this case equals {w1 ≥ w2 , w1 ≥ w3 }. The User Model is
updated and enriched by new inequalities as soon as the user performs moves.
The advisor can now exploit the User Model to eliminate queries in Satisfiables,
leaving the set of optimal queries: all queries which are not dominated by another one.
For a query α, let W(α) be the sum of the weights of the features in α.
With this approach, α dominates β if the set of linear inequalities contained in
the User Model implies the constraint W(α) ≥ W(β). A query β in Satisfiables is
eliminated if there exists some query α in Satisfiables which strictly dominates
β, i.e., such that α dominates β but β does not dominate α.
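The dominance test can be phrased as a small linear program. The sketch below uses scipy to minimise W(α) − W(β) over the cone of non-negative weight vectors satisfying the User Model: dominance holds exactly when this minimum is not unbounded below. The scipy-based encoding, and the restriction to homogeneous wi ≥ wj inequalities, are our own assumptions rather than the implementation of [1].

from scipy.optimize import linprog

def dominates(alpha, beta, user_model, n_features):
    # objective: minimise  sum_{i in alpha} w_i - sum_{i in beta} w_i
    c = [0.0] * n_features
    for i in alpha:
        c[i] += 1.0
    for i in beta:
        c[i] -= 1.0
    # each (i, j) in user_model means w_i >= w_j, i.e. -w_i + w_j <= 0
    a_ub, b_ub = [], []
    for i, j in user_model:
        row = [0.0] * n_features
        row[i], row[j] = -1.0, 1.0
        a_ub.append(row)
        b_ub.append(0.0)
    res = linprog(c, A_ub=a_ub or None, b_ub=b_ub or None,
                  bounds=[(0, None)] * n_features)
    return res.status == 0 and res.fun >= -1e-9

# The user has preferred feature 0 to features 1 and 2: w0 >= w1, w0 >= w2.
model = [(0, 1), (0, 2)]
print(dominates({0}, {1}, model, 3))   # True: w0 >= w1 is implied
print(dominates({1}, {2}, model, 3))   # False: nothing relates w1 and w2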
4 Eliminating Queries Using Reasoning with Comparative Preferences
We describe here a different approach for defining dominance among queries,
based on an approach for reasoning with comparative preferences (denoted by
CPR).
4.1 Comparative preferences
Comparative preferences are concerned with the relative preference of alternatives, in particular, statements implying that one alternative is preferred to
another. Such statements can be relatively easy to reliably elicit: often it is easier to judge that one alternative is preferred to another than it is to allocate
particular grades of preference to the alternatives.
We describe a language of comparative preference statements, based on [2].
Though it is a rather simple language, it is more expressive than perhaps the
best known languages for reasoning with comparative preferences: CP-nets [3],
TCP-nets [4], and the formalism defined in [5].
A language of comparative preference statements. Let P, Q and T be subsets of
V. Let p be an assignment to P, and let q be an assignment to Q. The language
includes the statement p ≥ q||T. The meaning of this statement is that a complete
assignment α is preferred to β if α extends p, β extends q, and α and β agree
on the set of variables T.
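The semantics of a single statement p ≥ q||T translates directly into code; the following Python sketch checks it for two complete assignments (the actual deduction from a set Γ of statements uses the polynomial-time inference of [2], which is not reproduced here).

def extends(assignment, partial):
    return all(assignment.get(var) == val for var, val in partial.items())

def preferred_by(statement, alpha, beta):
    """Does the statement (p, q, T) make alpha preferred to beta?"""
    p, q, t = statement
    return (extends(alpha, p) and extends(beta, q)
            and all(alpha[var] == beta[var] for var in t))

# Example with V = {F1, F2}: "f1 is at least as good as not-f1, all else equal".
stmt = ({"F1": 1}, {"F1": 0}, {"F2"})
print(preferred_by(stmt, {"F1": 1, "F2": 0}, {"F1": 0, "F2": 0}))   # True
print(preferred_by(stmt, {"F1": 1, "F2": 1}, {"F1": 0, "F2": 0}))   # False: differ on F2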
Given a set of comparative preference statements Γ , and two complete assignments α and β, we need to determine if Γ implies that α is preferred to β. With
the standard kind of inference, this problem of determining the preference of an
outcome α over another outcome β has been shown to be PSPACE-complete
[6], even for the case of CP-nets. In our application, efficient deduction is very
important in order for the system to have a quick response time. The standard
kind of inference is also rather conservative. In [2], a less conservative kind of
inference is defined, which allows deduction in polynomial time; we use this
deductive method in approach here.
4.2 Expressing induced preference using comparative preference statements
In our setting it is always at least as desirable for a product to have a feature
as not. This can be represented by comparative preference statements of the
following forms:
fi ≥ f̄i || V − {Fi}.
This is saying that a query α which contains feature fi is at least as preferred
as a query β which is identical to α except for not containing fi.
Consider the situation where the user has chosen to add feature i instead
of feature j, which, with the linear inequalities approach, is represented by the
constraint wi ≥ wj . There are a number of different options for generating
preference statements for this situation.
1. Let q be the current query, and let q^i be the current query q with the feature
fi added. A rather conservative approach is to just model the preference of
feature i over feature j by the preference statement q^i ≥ q^j || ∅, which just
expresses a preference for q^i over q^j.
2. fi ≥ f̄i || V − {Fi, Fj} is saying that the presence or not of the feature Fi
is more important than the choice of Fj. Thus, whatever the state of the
feature Fj in the query, the user will prefer Fi to be present in the query so
that it is included in the final product.
3. fi f̄j ≥ f̄i fj || V − {Fi, Fj}; this is saying that the simultaneous presence of
the feature Fi and absence of the feature Fj is more interesting than the
simultaneous absence of the feature Fi and presence of the feature Fj (all
else being equal). For instance, if the user has to choose between two products
having the same features except for Fi and Fj, knowing the first contains Fi and
not Fj and the second contains Fj and not Fi, she will prefer the first one.
From this list we can easily appreciate the flexibility of comparative preferences when representing the user preferences.
4.3 Preliminary Experimentation and Results
The preliminary experimentation involved computing the pruning rate of the new
approach (CPR) and the one based on deduction of linear inequalities (LP). Tests
consist of engaging simulated users in a dialogue with the two types of advisors
(CPR and LP) and recording at each step of the dialogue the number of queries
pruned.
We performed several experiments: each one deals with a number (100) of
simulated users interacting with an advisor adopting either the LP or the CPR
approach to help select optimal queries, by pruning dominated queries. The
products used during the tests belong to three databases of hotels. We tested the
performance of the preference forms described above. These tests reveal that
the CPR approach prunes much more than the LP approach. In the following
table, we report the results (expressed as percentages) of three experiments
using the preference forms 1, 2 and 3 for the CPR approach. (Obviously the
LP approach's performance is not influenced by the preference form adopted;
the variation in the bottom row is due to random variation.)
       Form 1   Form 2   Form 3
CPR    87.62    91.23    89.09
LP     72.91    74.05    71.66
The above results show that the advice given by the CPR approach to the
user is more focused than the advice given by the LP approach. For example, when
applying the first preference form, 87.62% of the available queries are eliminated
by the CPR approach, compared with only 72.91% by the LP approach.
This allows the user to focus only on the most interesting queries according to
her preferences.
5 Discussion
Both the linear inequalities and the comparative preferences approaches have
been implemented, and we are currently in the process of comparing them experimentally, based on three databases of hotels as products. The advice should
contain confirmed feasible and non-dominated queries so that real users get satisfied
as soon as possible, since customers are very quickly annoyed and/or bored [7].
We have enhanced the reliability and efficiency of information recommendation by
providing an efficient comparative preference representation and inference engine, so
as to improve the elicitation of user preferences and perform efficient inference that
helps the user navigate to a suitable product.
A potential advantage of the comparative preferences approach is that it
allows also explicit preference statements to be made by the user which are
context-dependent.
We are planning in the future to address the related problem of finding optimal solutions of a Constraint Satisfaction Problem with respect to a set of comparative preference statements. This will involve incorporating our comparative
preferences deduction mechanism in a branch-and-bound search, and developing
specialist pruning mechanisms.
References
1. Bridge, D., Ricci, F.: Supporting product selection with query editing recommendations. In: Proceedings of ACM Recommender Systems 2007. (2007)
2. Wilson, N.: An efficient deduction mechanism for expressive comparative preference
languages. In: Proceedings of the Nineteenth International Joint Conference on
Artificial Intelligence (IJCAI-09). (2009)
3. Boutilier, C., Brafman, R.I., Domshlak, C., Hoos, H., Poole, D.: CP-nets: A tool
for reasoning with conditional ceteris paribus preference statements. Journal of
Artificial Intelligence Research 21 (2004) 135–191
4. Brafman, R., Domshlak, C., Shimony, E.: On graphical modeling of preference and
importance. Journal of Artificial Intelligence Research 25 (2006) 389–424
5. Wilson, N.: Extending CP-nets with stronger conditional preference statements. In:
Proceedings of AAAI-04. (2004) 735–741
6. Goldsmith, J., Lang, J., Truszczyński, M., Wilson, N.: The computational complexity of dominance and consistency in CP-nets. In: Proceedings of IJCAI-05. (2005)
144–149
7. Raskutti, B., Zukerman, I.: Generating queries and replies during information-seeking interactions. International Journal of Human Computer Studies 47 (1997)
689–734
A multithreaded solving algorithm for QCSP+
Jérémie Vautard (student)¹ and Arnaud Lallouet²
¹ Université d'Orléans — LIFO, BP6759, F-45067 Orléans
jeremie.vautard@univ-orleans.fr
² Université de Caen-Basse Normandie — GREYC, BP 5186, 14032 Caen
arnaud.lallouet@info.unicaen.fr
Abstract. This paper presents some ideas about multi-threading QCSP
solving procedures. We introduce a first draft of a multi-threaded algorithm for
solving QCSP+ and give some leads for future work on parallel solving of
quantified problems.
1 Introduction
Quantified constraint satisfaction problems (QCSP) have been studied for several
years, and many search procedures [7, 5], consistency definitions [6, 4] and
propagation algorithms [3, 1] have been proposed to solve them. However,
while several attempts have been made to build a parallel solver for CSPs, we
have not found any parallel approach for solving quantified problems. This is what
we try to do in this paper. First, we propose a sketch of a quite general framework
for solving QCSP+ problems (introduced in [2]) using a multi-threaded approach.
Then, we discuss several leads that can be explored in this domain.
2 The QCSP+ framework
2.1 Formalism
Variables, constraints and CSPs.
Let V be a set of variables. Each v ∈ V has a domain Dv. For a given
W ⊆ V, we denote by DW the set of tuples on W, i.e. the Cartesian product of the
domains of all the variables of W.
A constraint c is a pair (W, T), W being a set of variables and T ⊆ DW a set
of tuples. The constraint is satisfied for the values of the variables of W which
form a tuple of T. If T = ∅, the constraint is said to be empty and can never be
satisfied. On the other hand, a constraint such that T = DW is full and will be
satisfied whatever values its variables take. W and T are also denoted by var(c)
and sol(c).
A CSP is a set C of constraints. We denote by var(C) the set of its variables, i.e.
∪c∈C var(c), and by sol(C) the set of its solutions, i.e. the set of tuples on var(C)
satisfying all the constraints of C. The empty CSP (denoted ⊤) is true whereas
a CSP containing an empty constraint is false and denoted ⊥.
Quantified problems.
A quantified set of variables (or qset) is a pair (q, W) where q ∈ {∀, ∃} is
a quantifier and W a set of variables. We call a prefix a sequence of qsets
[(q0, W0), ..., (qn−1, Wn−1)] such that (i ≠ j) → (Wi ∩ Wj = ∅). We denote
var(P) = ∪_{i=0}^{n−1} Wi. A QCSP is a pair (P, G) where P is a prefix and G, also
called the goal, is a CSP such that var(G) ⊆ var(P).
A restricted quantified set of variables or rqset is a triple (q, W, C) where
(q, W) is a qset and C a CSP. A QCSP+ is a pair Q = (P, G) where P is a prefix
of rqsets such that ∀i, var(Ci) ⊆ ∪_{j=0}^{i} Wj. Moreover, var(G) ⊆ var(P) still
holds.
Solution.
A QCSP (P, G) where P = [(∃, W0), (∀, W1), ..., (∃, Wn)] represents the following logic formula: ∃W0 ∈ DW0 ∀W1 ∈ DW1 ... ∃Wn G
Thus, a solution of a quantified problem can no longer be a simple assignment of
all the variables: in fact, the goal has to be satisfied for all the values the
universally quantified variables may take. Intuitively, such a problem can be seen
as a "game" where an existential player tries to satisfy all the constraints of G
while a universal player aims at violating one of them, each player assigning the
variables in turn, in the order defined by the prefix. The solution must represent
the strategy that the existential player should adopt to be sure that, whatever its
opponent does, the goal will always be satisfied. This strategy can be represented
as a family of Skolem functions that give a value to an existential variable as a
function of the preceding universal ones, or by the set of every possible scenario
(i.e. total assignment of the variables) of the strategy. In this paper, we use this
latter representation and organize the set of scenarios in a tree: a root node
represents the whole problem, then, inductively:
– if the next qset (qi, Wi) is universal, the current node gets as many sons as
there are tuples in DWi. Each node is tagged with one of these tuples;
– if the next qset is existential, the current node gets one unique son, tagged
by an element of DWi.
Thus, each complete branch of this tree corresponds to a total assignment of
the variables of the problem. If every branch of such a tree corresponds to an
assignment satisfying G, then it is indeed a solution of the problem.
QCSP+ restricts the "moves" of each player to assignments that satisfy the
CSPs of the rqsets. The logic formula represented is:
∃W0 C0 ∧ (∀W1 C1 → (∃W2 ... G))
In this case, the notion of solution is the same, except that the restrictions
have to be taken into account: for a universal rqset (∀, Wi, Ci), the current node's
sons correspond to the solutions of Ci. For an existential rqset (∃, Wj, Cj), the
son must be tagged by an element such that the partial branch corresponds to
an assignment that satisfies Cj.
2.2 A basic solving procedure
One simple way to solve quantified problems is to adapt the classical backtracking algorithm for CSPs (a minimal sketch is given after the list):
– first, perform a propagation algorithm on the problem. If an inconsistency
is detected, return false.
– pick the leftmost quantified set of variables and enumerate the possible
values of its variables, dividing the problem into as many subproblems.
• in the universal case, solve all the subproblems. If one of them is false,
return false. Else, group all the corresponding substrategies and return
them.
• in the existential case, solve each subproblem until one of them does not
return false. If such a subproblem exists, create a node containing the
values that led to the corresponding subproblem, attach the substrategy
returned by the subproblem and return the whole. Otherwise,
return false.
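The following minimal Python sketch follows the recursion described above for a plain QCSP whose prefix quantifies one variable at a time; the propagation step is omitted and the goal is given as a predicate over complete assignments, both simplifying assumptions of ours.

# The strategy is returned as nested dicts keyed by (variable, value) pairs,
# or False when the (sub)problem has no winning strategy.

def solve(prefix, goal, assignment=None):
    assignment = dict(assignment or {})
    if not prefix:
        return {} if goal(assignment) else False
    (quant, var, domain), rest = prefix[0], prefix[1:]
    if quant == "forall":
        strategy = {}
        for val in domain:                 # every value must lead to a solution
            sub = solve(rest, goal, {**assignment, var: val})
            if sub is False:
                return False
            strategy[(var, val)] = sub
        return strategy
    else:                                  # "exists": one value suffices
        for val in domain:
            sub = solve(rest, goal, {**assignment, var: val})
            if sub is not False:
                return {(var, val): sub}
        return False

# exists x forall y : x != y is unsatisfiable, but forall y exists x : x != y
# has a winning strategy.
print(solve([("exists", "x", [0, 1]), ("forall", "y", [0, 1])],
            lambda a: a["x"] != a["y"]))                       # False
print(solve([("forall", "y", [0, 1]), ("exists", "x", [0, 1])],
            lambda a: a["x"] != a["y"]))                       # strategy tree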
3 Multithreaded solving: a first attempt
The multithreaded solving method we propose is based on a central data structure
called the manager, which manages several (single-threaded) solvers called workers: a
partial strategy is maintained, in which some nodes do not yet have a substrategy
attached. Each of these nodes corresponds to a task that a worker can solve. Formally,
a task consists of a pair (Q, τ) where Q is a QCSP+ and τ the partial assignment of
the variables of Q corresponding to the branch of the partial strategy where the
task is attached. Once a worker has finished solving a task, it returns its result
(either a sub-strategy or false), which will be taken into account by the manager
to update the partial strategy. Then, the worker receives another task to solve.
Whenever the queue of to-do tasks empties, the manager sends a signal to one
or several workers to stop their current task and send a partial result. Such a
result consists of a partial sub-strategy containing "unfinished" nodes, which
constitute as many new pending tasks.
Once the whole problem is solved (i.e. there is no to-do task left and no worker
is still running), each worker thread is killed, and the result can be
returned. Here is a description of the procedures and signals used in this framework:
Workers.
Each worker is a thread with a very simple main loop. This loop fetches a task
from the Manager and tries to solve it by calling an internal (single-threaded)
Solve method. The Manager can also answer with a WAIT pseudo-task,
which will cause the worker to sleep until a task becomes available (by calling
a special method of the Manager), or with a STOP pseudo-task, which will kill
the thread. Once the solver finishes, its result is sent to the Manager. This main
loop is described in Figure 1.
A worker is able to catch two signals called Terminate and Send partial. Both
indicate that the search procedure should stop, but the former means that the
task has become useless while the latter calls for returning a partial result, along
with a list of remaining "sub-tasks".
The Solve method can inherit from any original search procedure, but must
be modified in order to take the signals into account. Figure 2 shows an adaptation
of a basic QCSP+ solving procedure which can be interrupted by these signals.
Finally, a worker provides some methods so that other processes know which
subproblem it is working on.
Procedure Main
  loop
    task = Manager.fetchWork()
    if task == STOP then
      Exit
    else if task == WAIT then
      Manager.wait()
    else
      result = Solve(task)
      Manager.returnWork(result)
    end if
  end loop

Fig. 1. The worker main loop
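As an illustration only, the loop of figure 1 could be realised with native threads as follows. The Manager interface (fetchWork, returnWork, wait), the STOP and WAIT pseudo-tasks and the signal names follow the description above, while every other name is a hypothetical placeholder; this Python sketch is not the authors' implementation.

    import threading

    STOP, WAIT = object(), object()          # pseudo-tasks returned by the Manager

    class Worker(threading.Thread):
        """A worker repeatedly fetches a task from the Manager and solves it."""

        def __init__(self, manager, solve):
            super().__init__()
            self.manager = manager
            self.solve = solve                        # single-threaded search procedure
            self.terminate = threading.Event()        # the Terminate signal
            self.send_partial = threading.Event()     # the Send partial signal

        def run(self):
            while True:
                task = self.manager.fetch_work(self)
                if task is STOP:                      # the search is over: kill the thread
                    return
                if task is WAIT:                      # no pending task: sleep in the Manager
                    self.manager.wait(self)
                    continue
                self.terminate.clear()
                self.send_partial.clear()
                result = self.solve(task, self.terminate, self.send_partial)
                self.manager.return_work(self, task, result)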
Manager.
The manager is an object that contains and builds the winning strategy of the
problem. During search, this winning strategy is incomplete and some nodes are
replaced by tasks that remain to be solved. We call this list of remaining tasks ToDo.
A Manager is also aware of the list Current Workers of the workers currently
solving a task and maintains a list of sleeping workers that should be woken
when tasks become available for solving. Unless said otherwise, the Manager's
methods are called by a worker.
The fetchWork method withdraws a task from the ToDo list and returns
it. If ToDo is empty, it sends the signal Send partial to one worker from
Current Workers and returns WAIT. If Current Workers is also empty, the search
has ended and STOP is returned.
The returnWork method attaches the returned sub-strategy and cuts the
branches that are no longer necessary (for example, all the brothers of a complete
substrategy of an existential node, or directly the father node of a universal
substrategy as soon as one of its subproblems has been found inconsistent). Each
worker that was solving a node on a cut branch is sent the Terminate signal,
and the workers contained in the sleeping list are awoken.
Procedure Solve_u([(∀, W, C)|P′], G)
  STR := ∅
  SC := set of solutions of C
  while SC ≠ ∅ do
    choose t ∈ SC; SC := SC − t
    if Signal Terminate then
      return STOP
    end if
    CURSTR := Solve((P′, G)[W ← t])
    if CURSTR = Fail then
      return Fail
    else
      STR := STR ∪ CURSTR
    end if
    if Signal Send partial then
      return PartialResult(STR, SC)
    end if
  end while
  return STR

Procedure Solve_e([(∃, W, C)|P′], G)
  SC := set of solutions of C
  while SC ≠ ∅ do
    choose t ∈ SC; SC := SC − t
    if Signal Terminate then
      return STOP
    end if
    CURSTR := Solve((P′, G)[W ← t])
    if CURSTR ≠ Fail then
      if Signal Send partial then
        return PartialResult(tree(t, CURSTR), SC)
      else
        return tree(t, CURSTR)
      end if
    end if
  end while
  return Fail

Fig. 2. Search procedure (universal case on top, existential case below).
Finally, the wait method records the calling thread in the Sleeping list and puts it to
sleep until it is woken by the previous method.
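A possible shape of these Manager methods, continuing the Python sketch given after figure 1 (and therefore reusing its STOP and WAIT markers and the workers' Event attributes), could be the following. It is again only a hypothetical sketch of the described behaviour, not the actual implementation; the strategy-update and branch-cutting logic of returnWork is left abstract.

    import threading
    from collections import deque

    class Manager:
        """Holds the partial strategy, the ToDo tasks and the worker bookkeeping."""

        def __init__(self, initial_task):
            self.lock = threading.Lock()
            self.todo = deque([initial_task])     # tasks waiting for a worker
            self.current_workers = {}             # worker -> task currently being solved
            self.sleeping = []                    # workers waiting for a task

        def fetch_work(self, worker):
            with self.lock:
                if self.todo:
                    task = self.todo.popleft()
                    self.current_workers[worker] = task
                    return task
                if self.current_workers:          # ask a busy worker for a partial result
                    next(iter(self.current_workers)).send_partial.set()
                    return WAIT
                return STOP                       # no task and no running worker: search over

        def return_work(self, worker, task, result):
            with self.lock:
                self.current_workers.pop(worker, None)
                # Attach 'result' to the partial strategy, cut the useless branches,
                # send Terminate to the workers on cut branches and push the pending
                # sub-tasks of a partial result onto self.todo (details omitted here).
                for w in self.sleeping:           # wake the sleeping workers
                    w.wake.set()
                self.sleeping.clear()

        def wait(self, worker):
            with self.lock:
                worker.wake = threading.Event()
                self.sleeping.append(worker)
            worker.wake.wait()                    # sleep until return_work wakes us up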
4 Work leads
Search heuristics.
The time taken by a search procedure to solve a CSP greatly depends on the
heuristics used to choose which subproblem should be explored first. Unfortunately, most parallel approaches tend to be incompatible with these heuristics,
which degrades solving performance. QCSPs (and QCSP+s) reduce the alternatives,
as a solver has to follow the order of the prefix, and the impact of the heuristics
used to perform the remaining choices is unclear, because they were not originally
tailored for QCSPs. In 2008, Verger and Bessière presented in [8] a promising
heuristic for QCSP+ that accelerates solving by several orders of magnitude on
some problems. Parallelizing the search might, as for CSPs, diminish the benefit
of these heuristics.
Task priority.
In the presented parallel approach, each task could be given a priority, in order
to minimize the solving of pointless subproblems. The method used to compute this
priority will most likely have a strong impact on the solving time: indeed, whether solving a given subproblem
is pointless may depend on the result of another task, and the
priority given to the tasks should take that into account. For example, it sounds
reasonable to give top priority to the leftmost universal nodes, as every branch from
a universal node must be verified anyway, a single failure cutting the whole
subproblem. After that, solving the rightmost pending existential tasks should help
to complete the construction of sub-strategies. A possible priority key along these lines is sketched below.
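The following Python key is a hypothetical illustration of such a policy, not a recommendation derived from experiments; the task attributes it uses (the quantifier of the node the task hangs from and its depth in the prefix, 0 being leftmost) are placeholders.

    def priority(task):
        """Smaller tuple = served earlier (illustrative key only)."""
        if task.quantifier == "forall":
            return (0, task.depth)        # leftmost universal tasks first
        return (1, -task.depth)           # then rightmost existential tasks

    # The Manager could simply keep its ToDo list sorted by this key:
    #   todo.sort(key=priority)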
Several kinds of workers.
The workers run independently of each other, and their communication
with the Manager is not specific to a particular search procedure. Thus, using
the algorithm of figure 2 in the workers is not mandatory: any procedure able to
return the sub-strategy of a problem is a priori appropriate. However, algorithms
that cannot return partial work and generate remaining tasks should not be
used, as they tend to prematurely drain the ToDo list, forcing other workers
into sleep mode. The sketch below illustrates what returning partial work can look like.
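As a sketch of what “returning partial work” can mean in practice, here is how the existential case of figure 2 might look in the same Python style, with the two Event signals passed in by the worker; Tree, PartialResult, FAIL, STOP and solve_rest are placeholder names introduced for this example only.

    from collections import namedtuple

    FAIL, STOP = object(), object()
    Tree = namedtuple("Tree", "value substrategy")
    PartialResult = namedtuple("PartialResult", "strategy pending_values")

    def solve_exists(values, solve_rest, terminate, send_partial):
        """Existential node that can stop early and hand back its remaining work.

        values      : solutions of the restricting constraint still to try;
        solve_rest  : function solving the rest of the problem under one value;
        terminate / send_partial : threading.Event signals set by the Manager.
        """
        while values:
            v = values.pop(0)
            if terminate.is_set():                    # the task has become useless
                return STOP
            sub = solve_rest(v)
            if sub is not FAIL:
                if send_partial.is_set():             # interrupted: the untried values
                    return PartialResult(Tree(v, sub), values)   # become pending tasks
                return Tree(v, sub)                   # complete winning substrategy
        return FAIL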
5 Conclusion
Solving quantified constraint satisfaction problems in parallel is new, and
even this contribution is far from complete. Thus, while the approach looks promising,
it still needs to prove its worth. We presented here a fairly simple and general
framework that has to be implemented and tested against traditional solvers
before definitive conclusions can be drawn.
References
1. Marco Benedetti, Arnaud Lallouet, and Jérémie Vautard. Reusing CSP propagators
for QCSPs. In Francisco Azevedo, Pedro Barahona, François Fages, and Francesca
Rossi, editors, CSCLP, volume 4651 of Lecture Notes in Computer Science, pages
63–77. Springer, 2006.
2. Marco Benedetti, Arnaud Lallouet, and Jérémie Vautard. QCSP made practical
by virtue of restricted quantification. In Manuela M. Veloso, editor, IJCAI, pages
38–43, 2007.
3. Lucas Bordeaux, Marco Cadoli, and Toni Mancini. CSP properties for quantified
constraints: definitions and complexity. In Manuela M. Veloso and Subbarao Kambhampati, editors, AAAI, pages 360–365. AAAI Press / The MIT Press, 2005.
4. Lucas Bordeaux and Eric Monfroy. Beyond NP: arc-consistency for quantified constraints. In Pascal Van Hentenryck, editor, CP, volume 2470 of Lecture Notes in
Computer Science, pages 371–386. Springer, 2002.
5. Ian P. Gent, Peter Nightingale, and Kostas Stergiou. QCSP-Solve: a solver for quantified constraint satisfaction problems. In Leslie Pack Kaelbling and Alessandro
Saffiotti, editors, IJCAI, pages 138–143. Professional Book Center, 2005.
6. Peter Nightingale. Consistency for quantified constraint satisfaction problems. In
Peter van Beek, editor, CP, volume 3709 of Lecture Notes in Computer Science,
pages 792–796. Springer, 2005.
7. Guillaume Verger and Christian Bessiere. BlockSolve: a bottom-up approach for
solving quantified CSPs. In Proceedings of CP'06, pages 635–649, Nantes, France,
2006.
8. Guillaume Verger and Christian Bessiere. Guiding search in QCSP+ with back-propagation. In Proceedings of CP'08, pages 175–189, Sydney, Australia, 2008.